    1. Reviewer #1 (Public review):

      I enjoyed reading this long but compelling account of the new (generalised) version of the Hierarchical Gaussian filter (HGF). Effectively, it describes an extension of the HGF to accommodate the influence of latent states on volatility - and vice versa. This paper describes a generalisation that has been made available to the community via the TAPAS software. This contribution will be of special interest to people in computational psychiatry, where the application of the HGF has been the most prevalent.

      I thought the background, motivation, description and illustration of the scheme were excellent. The paper is rather long; however, it serves as a useful technical reference.

      There are two issues that I think the authors need to address.

      (1) The first is the failure to properly relate the current scheme to standard implementations of Bayesian filtering under hierarchical state-space models.

      (2) The second is that whilst the paper is well-written, some of the mathematical notation is cluttered. Furthermore, I think that the authors need to motivate the otherwise overengineered description of the requisite variational message passing and decomposition into update steps.

      I think that the authors can address both of these issues by including a technical section in the introduction, relating the HGF to state-of-the-art in the broader field of Bayesian filtering and predictive coding. They can then explain the benefits of the particular generative model - to which the HGF is committed - by drilling down on the update scheme and its implementation in the remainder of the paper.

      I was underwhelmed by the account of predictive coding and its relationship to Bayesian filtering. I think that the authors should suppress the references to predictive coding in the recent machine learning literature. Rather, the presented narrative should emphasise the fact that predictive coding and Bayesian filtering are the same thing. The authors could then explain where the hierarchical Gaussian filter fits within Bayesian filtering and why its particular form lends itself to the variational updates they subsequently derive.

      The authors could add something like the following to the introduction (accompanying PDF has the equations). There is a summary of what follows in the Wikipedia entry on generalised filtering, in particular, its relationship to predictive coding (https://en.wikipedia.org/wiki/Generalized_filtering).

      Relationship to Existing Work

      Technically, the hierarchical Gaussian filter is a Bayesian filter under a hierarchical state-space model. The most general form of these models can be expressed as stochastic differential or difference equations as follows, cf. Equation 9 in Feldman and Friston (2010):

      This functional form implies a hierarchical decomposition into hierarchical levels (l) that are linked through latent causes (v), with dynamics among latent states (x) at each level. From the perspective of the HGF, the state-dependency of state (z) and observation (e) noise at each level is a key feature. The variance (i.e., inverse precision) of the random fluctuations z is known as volatility, which - in a hierarchical setting - can depend upon latent causes and states at higher levels. The variational inversion of these models - sometimes called variational or generalised filtering - finds a number of important applications: a key example here is dynamic causal modelling, typically in the analysis of imaging timeseries. In this setting, unknown or latent states, parameters and precisions are updated in variational steps by minimising variational free energy (a variational bound on negative log marginal likelihood).

      In engineering, the simplest form of generalised filtering is known as a Kalman filter, in which all the equations are linear, and volatility is assumed to be constant. In neurobiology, there is an intimate relationship between generalised filtering and predictive coding: predictive coding was originally introduced for timeseries analysis and compression of sound files (Elias, 1955). Subsequently, the implicit filtering or compression scheme was considered as a description of neuronal processing in the retina (Srinivasan et al., 1982) and then cortical hierarchies (Mumford, 1992; Rao, 1999; Rao and Ballard, 1999). The formal equivalence between predictive coding and Kalman filtering was noted in (Rao, 1999). Kalman filtering itself was then recognised as a special case of generalised filtering that could be read as predictive coding in the brain (Friston and Kiebel, 2009). The estimation of precision in these predictive coding schemes has been associated with endogenous (Feldman and Friston, 2010) and exogenous (Kanai et al., 2015) attention; i.e., with and without state dependency, respectively. Subsequently, precision estimation or uncertainty quantification has become a key focus in computational psychiatry.
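      The link between Kalman filtering and predictive coding drawn above can be sketched in a few lines. The following is a minimal illustration, assuming a scalar random-walk state with constant process variance (constant volatility) and Gaussian observation noise; the function name `kalman_step` and all parameter values are hypothetical and are not taken from the manuscript or from TAPAS. The point is only that the update is a prediction plus a gain-weighted prediction error.

```python
def kalman_step(mean, var, obs, process_var, obs_var):
    """One step of a scalar Kalman filter for a random-walk state (illustrative)."""
    # Prediction: a random walk keeps the mean, while uncertainty grows
    # by the constant process variance (i.e., constant volatility).
    pred_mean = mean
    pred_var = var + process_var
    # Prediction error: mismatch between the observation and the prediction.
    pred_error = obs - pred_mean
    # Kalman gain: how much to trust the observation relative to the prediction.
    gain = pred_var / (pred_var + obs_var)
    # Update: shift the prediction by the gain-weighted prediction error.
    new_mean = pred_mean + gain * pred_error
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var


# Assumed toy input: four identical observations of 1.0.
mean, var = 0.0, 1.0
for obs in [1.0, 1.0, 1.0, 1.0]:
    mean, var = kalman_step(mean, var, obs, process_var=0.1, obs_var=1.0)
    print(f"mean={mean:.3f} var={var:.3f}")
```

      In this sketch the gain depends only on the two variances, so precision is never itself inferred; schemes that estimate state-dependent precision, as discussed above, go beyond this simplest case.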

      In machine learning, there have been recent attempts to implement predictive coding via the minimisation of variational free energy under generative models with the functional form of conventional neural networks: e.g., (Millidge et al., 2022; Salvatori et al., 2022). However, much of this work is nascent and does not deal with dynamics or volatility. There is an interesting exception in machine learning, namely, transformer architectures, where the attention heads can be read as implementing a form of Kalman gain, namely, estimating state-dependent precision, e.g., (Buckley and Singh, 2024).

      Within this general setting, the HGF emphasises the importance of precision estimation or uncertainty quantification by committing to a particular functional form for the generative model that can be summarised as follows:

      "We will unpack this form below and show how it leads to a remarkably compact and efficient Bayesian belief updating scheme. We will appeal implicitly to variational message passing on factor graphs (Dauwels, 2007; Friston et al., 2017; Winn and Bishop, 2005) to decompose message passing between nodes and, crucially, within-node computations. These computations furnish a scalable and flexible form of generalised Bayesian filtering. In principle, this scheme inherits all the biological plausibility of belief propagation and variational message passing in cortical hierarchies (Friston et al., 2017)."

      It might be worth the authors [re-]reading the abstracts of the above papers, for a clearer sense of how those in computational neuroscience and state-space modelling (but not machine learning) think about predictive coding and its relationship to Bayesian filtering. They could then go through the manuscript, nuancing their discussion of the intimate relationship between variational Bayes, generalised filtering, predictive coding and hierarchical Gaussian filtering.

    2. eLife Assessment

      This paper describes a valuable extension of the Hierarchical Gaussian Filter (HGF) to accommodate the influence of volatility and value, among others. The authors present convincing evidence that the model can recover the generative structure of simulated data. There is not strong evidence that the new model provides a superior account of existing empirical phenomena, and the HGF could be better embedded in the larger filtering and predictive coding literature. This contribution will be of special interest to people in computational psychiatry, where the application of the hierarchical Gaussian filter has been the most prevalent.

    3. Reviewer #2 (Public review):

      Summary:

      The authors introduce a generalised HGF featuring: (1) volatility coupling (rate of change) and value coupling (phasic or autoregressive drift), plus 'noise coupling', which is a volatility parent of an outcome state; (2) parameters: volatility coupling κ, tonic volatility ω, value coupling α, tonic drift ρ, and ±autoregressive drift λ; (3) inputs at irregular intervals (but still discrete time steps, unlike continuous-time belief evolution in predictive coding); (4) states with multiple parents, or parents with multiple child states; (5) value parents that by default have a volatility parent, and volatility parents that have a value parent (or none); (6) linear or non-linear (including ReLU) functions; and (7) beliefs that can be any exponential family distribution (including binary and categorical), so the scheme can also model POMDPs.

      They describe the 3 steps involved in updating (for both value and volatility): (1) prediction; (2) posterior update, which entails passing both the precision-weighted prediction error (pwPE) and the prediction precision from the lower to the upper node (the latter is not found in other predictive coding schemes); and (3) prediction error. NB: this makes the network modular, so nodes can be added or removed without recomputing all the update equations.
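      The three steps can be sketched for a single value-coupled parent and child. This is a minimal illustration under stated assumptions, not the authors' implementation or the TAPAS API: it assumes Gaussian beliefs, linear value coupling with a hypothetical coupling strength `coupling`, and a fixed variance increment `tonic_variance` in the prediction step, and it omits volatility coupling entirely. All names and numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    mean: float       # posterior mean of the belief
    precision: float  # posterior precision (inverse variance)


def predict(node, tonic_variance):
    """Step 1 (prediction): carry the mean forward; variance grows by a fixed amount."""
    pred_mean = node.mean
    pred_precision = 1.0 / (1.0 / node.precision + tonic_variance)
    return pred_mean, pred_precision


def update_parent(parent_pred_mean, parent_pred_precision,
                  child_pred_precision, child_pe, coupling=1.0):
    """Step 2 (posterior update) of a value parent.

    The child passes up two quantities: its prediction precision and its
    prediction error. Together they form the precision-weighted prediction error.
    """
    post_precision = parent_pred_precision + coupling ** 2 * child_pred_precision
    weight = coupling * child_pred_precision / post_precision
    post_mean = parent_pred_mean + weight * child_pe
    return Node(post_mean, post_precision)


def prediction_error(post_mean, pred_mean):
    """Step 3 (prediction error): posterior minus prediction, to be sent upwards."""
    return post_mean - pred_mean


# Assumed toy values throughout.
parent = Node(mean=0.0, precision=1.0)
parent_pred_mean, parent_pred_precision = predict(parent, tonic_variance=1.0)
child_pred_mean = parent_pred_mean  # with coupling = 1 the child inherits the parent's prediction
child_pred_precision = 2.0          # assumed prediction precision of the child
child_post_mean = 1.0               # assumed child posterior after it has absorbed an input
pe = prediction_error(child_post_mean, child_pred_mean)
parent = update_parent(parent_pred_mean, parent_pred_precision,
                       child_pred_precision, pe)
print(parent)
```

      Because each function only needs quantities from the node itself and from its immediate neighbour, adding or removing a node in this sketch does not require rewriting the other updates, which is the modularity noted above.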

      They give some examples of models working using simulated data: (1) sharing of parent nodes can generalise an update from one context to another (2) sharing of child nodes enables multisensory cue combination (e.g. auditory-visual, or interoceptive-exteroceptive).
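      The cue-combination example rests on a standard result for Gaussian beliefs: when two independent cues inform the same latent state, the combined estimate is the precision-weighted average of the cues, and the precisions add. The sketch below illustrates only that textbook result under a flat prior; the function name `combine_cues` and the numbers are hypothetical and do not reproduce the simulations in the paper.

```python
def combine_cues(means, precisions):
    """Precision-weighted combination of independent Gaussian cues (flat prior assumed)."""
    total_precision = sum(precisions)
    combined_mean = sum(m * p for m, p in zip(means, precisions)) / total_precision
    return combined_mean, total_precision


# Assumed toy cues: an imprecise auditory cue at 0.0 and a more precise visual cue at 10.0.
mean, precision = combine_cues(means=[0.0, 10.0], precisions=[1.0, 3.0])
print(mean, precision)
```

      The combined estimate lies closer to the more precise cue, and the combined precision exceeds that of either cue alone.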

      The authors further discuss a potential shortcoming of the HGF - its discretisation of timesteps - which is less naturalistic but nevertheless makes it very amenable to fitting trial-wise experimental data. They propose to extend the HGF to modelling within-step dynamics in future, which could make testable continuous time neuronal predictions.

      Strengths:

      Overall, I think the paper is excellent - it contributes an important extension to a popular modelling tool which substantially increases the number of potential applications. It is well written, and I have almost no criticisms to make.

      Weaknesses:

      The authors state that this generalised HGF will "make it easy to build large networks with considerable hierarchical depth", comparable to neural network architectures. However, the examples they give are extremely simple; it would be good to see a more complex one.

    4. Reviewer #3 (Public review):

      Summary:

      In this paper, Weber and colleagues develop a generalization of the HGF, a widely used modeling tool. The generalization allows coupling between latent variables that was not possible in the original HGF. The resulting inference algorithm invites a predictive coding interpretation. The modular structure allows the construction of complex models out of simpler building blocks.

      Strengths:

      Overall, I think this is a valuable technical contribution, which will have applications to neuroscience, behavior, and psychiatry. It is mathematically rigorous, and the exposition is, for the most part, clear. It also comes with open-source software, so it should be a valuable resource to the modeling community.

      Weaknesses:

      My main concern is that the way that this paper is written will only be accessible and interesting to a niche audience interested in particular kinds of approximate inference schemes. The paper doesn't draw out the implications until the very end, so it's hard for readers to understand the motivation for certain modeling choices. It also requires readers to work through many pages of math before getting to applications. The applications themselves are very abstract.

    1. eLife Assessment

      This important study investigates how structurally diverse cardenolide toxins in tropical milkweed, especially mixtures containing nitrogen- and sulfur-containing variants, influence monarch caterpillar feeding, growth, and toxin sequestration. The experiments solidly demonstrate that chemical diversity within a single group of plant toxins can have combined effects on even highly specialized herbivores that differ from the effects of each toxin alone. However, the mixture design does not fully separate true diversity effects from the influence of the N,S-cardenolides themselves, and the ecological basis for the chosen natural ratios remains weakly justified. As a result, the broader conclusions would require more fully justified concentration regimes, mixture treatments that exclude N,S-cardenolides, and tests on living plants and non-adapted herbivores to firmly support the proposed coevolutionary interpretation.

    2. Reviewer #1 (Public review):

      Summary:

      In the ecological interactions between wild plants and specialized herbivorous insects, structural innovation-based diversification of secondary metabolites often occurs. In this study, Agrawal et al. utilized two milkweed species (Asclepias curassavica and Asclepias incarnata) and the specialist Monarch butterfly (Danaus plexippus) as a model system to investigate the effects of two N,S-cardenolides (formed through structural diversification and innovation in A. curassavica) on the growth, feeding, and chemical sequestration of D. plexippus, compared to other conventional cardenolides. Additionally, the study examined how cardenolide diversification resulting from the formation of N,S-cardenolides influences the growth and sequestration of D. plexippus. On this basis, the research elucidates the ecophysiological impact of toxin diversity in wild plants on the detoxification and transport mechanisms of highly adapted herbivores.

      Strengths:

      The study is characterized by the use of milkweed plants and the specialist Monarch butterfly, which represent a well-established model in chemical ecology research. On one hand, these two organisms have undergone extensive co-evolutionary interactions; on the other hand, the butterfly has developed a remarkable capacity for toxin sequestration. The authors, building upon their substantial prior research in this field and earlier observations of structural evolutionary innovation in cardenolides in A. curassavica, proposed two novel ecological hypotheses. While experimentally validating these hypotheses, they introduced the intriguing concept of a "non-additive diversity effect" of trace plant secondary metabolites when mixed, contrasting with traditional synergistic perspectives, in their impact on herbivores.

      Weaknesses:

      The manuscript has two main weaknesses. First, as a study reliant on the control of compound concentrations, the authors did not provide sufficient or persuasive justification for their selection of the natural proportions (and concentrations) of cardenolides. The ratios of these compounds likely vary significantly across different environmental conditions, developmental stages, pre- and post-herbivory, and different plant tissues. The ecological relevance of the "natural proportions" emphasized by the authors remains questionable. Furthermore, the same compound may even exert different effects on herbivorous insects at different concentrations. The authors should address this issue in detail within the Introduction, Methods, or Discussion sections.

      Second, the study was conducted using leaf discs in an in vitro setting, which may not accurately reflect the responses of Monarch butterflies on living plants. This limitation undermines the foundation for the novel ecological theory proposed by the authors. If the observed phenomena could be validated using specifically engineered plant lines (such as those created through gene editing, knockdown, or overexpression of key enzymes involved in the synthesis of specific N,S-cardenolides), the findings would be substantially more compelling.

    3. Reviewer #2 (Public review):

      I have reviewed both the original and revised version of this manuscript and while no additional experiments were added, I find the interpretations and discussion of the limitations of the study have improved. This is appreciated.

      My original concern regarding the mixture treatments largely remains. Figure 4 nicely shows that the mixtures are more potent than the average of all single compounds. However, Fig S3 shows that the effects of mixtures are not significantly different from the effects of at least one single N,S compound (voruscharin or uscharin) across all measured growth/sequestration responses. Essentially, the effects of single N,S compounds are similar to those of mixtures (which also contain N,S compounds).

      While the current results are certainly interesting as presented, in my view the main takeaway of the manuscript would be more compelling if it could be demonstrated that it isn't simply the presence of N,S compounds in the mixtures driving the observations. For example, does a mixture of all compounds except voruscharin or uscharin still have stronger growth/sequestration effects compared to single non-N,S compounds?

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In the ecological interactions between wild plants and specialized herbivorous insects, structural innovation-based diversification of secondary metabolites often occurs. In this study, Agrawal et al. utilized two milkweed species (Asclepias curassavica and Asclepias incarnata) and the specialist Monarch butterfly (Danaus plexippus) as a model system to investigate the effects of two N,S-cardenolides (formed through structural diversification and innovation in A. curassavica) on the growth, feeding, and chemical sequestration of D. plexippus, compared to other conventional cardenolides. Additionally, the study examined how cardenolide diversification resulting from the formation of N,S-cardenolides influences the growth and sequestration of D. plexippus. On this basis, the research elucidates the ecophysiological impact of toxin diversity in wild plants on the detoxification and transport mechanisms of highly adapted herbivores.

      Strengths:

      The study is characterized by the use of milkweed plants and the specialist Monarch butterfly, which represent a well-established model in chemical ecology research. On one hand, these two organisms have undergone extensive co-evolutionary interactions; on the other hand, the butterfly has developed a remarkable capacity for toxin sequestration. The authors, building upon their substantial prior research in this field and earlier observations of structural evolutionary innovation in cardenolides in A. curassavica, proposed two novel ecological hypotheses. While experimentally validating these hypotheses, they introduced the intriguing concept of a "non-additive diversity effect" of trace plant secondary metabolites when mixed, contrasting with traditional synergistic perspectives, in their impact on herbivores.

      Weaknesses:

      The manuscript has two main weaknesses. First, as a study reliant on the control of compound concentrations, the authors did not provide sufficient or persuasive justification for their selection of the natural proportions (and concentrations) of cardenolides. The ratios of these compounds likely vary significantly across different environmental conditions, developmental stages, pre- and post-herbivory, and different plant tissues. The ecological relevance of the "natural proportions" emphasized by the authors remains questionable. Furthermore, the same compound may even exert different effects on herbivorous insects at different concentrations. The authors should address this issue in detail within the Introduction, Methods, or Discussion sections.

      Second, the study was conducted using leaf discs in an in vitro setting, which may not accurately reflect the responses of Monarch butterflies on living plants. This limitation undermines the foundation for the novel ecological theory proposed by the authors. If the observed phenomena could be validated using specifically engineered plant lines (such as those created through gene editing, knockdown, or overexpression of key enzymes involved in the synthesis of specific N,S-cardenolides), the findings would be substantially more compelling.

      Reviewer #2 (Public review):

      This study examined the effects of several cardenolides, including N,S-ring containing variants, on sequestration and performance metrics in monarch larvae. The authors confirm that some cardenolides, which are toxic to non-adapted herbivores, are sequestered by monarchs and enhance performance. Interestingly, N,S-ring-containing cardenolides did not have the same effects and were poorly sequestered, with minimal recovery in frass, suggesting an alternate detoxification or metabolic strategy. These N,S-containing compounds are also known to be less potent defences against non-adapted herbivores. The authors further report that mixtures of cardenolides reduce herbivore performance and sequestration compared to single compounds, highlighting the important role of phytochemical diversity in shaping plant-herbivore interactions.

      Overall, this study is clearly written, well-conducted and has the potential to make a valuable contribution to the field. However, I have one major concern regarding the interpretations of the mixture results. From what I understand of the methods, all tested mixtures contain all five compounds. As such, it is not possible to determine whether reduced performance and sequestration result from the complete mixture or from the presence of a single compound, such as voruscharin for performance and uscharin for sequestration. For instance, if all compounds except voruscharin (or uscharin) were combined, would the same pattern emerge? I suspect not, since the effects of the individual N,S-containing compounds alone are generally similar to those of the full mixture (Figure S3). By taking the average of all single compounds, the individual effects of the N,S-containing ones are being inflated by the non-N,S-containing ones (in the main text, Figure 4). In the mix, of course, they are not being 'diluted', as they are always present. This interpretation is further supported by the fact that in the equimolar mix, the relative proportion of voruscharin decreases (from 50% in the 'real mix'), and the target measurements of performance and sequestration tend to increase in the equimolar mix compared to the real mix.
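      The averaging argument in the paragraph above can be made concrete with a small arithmetic sketch. The numbers below are entirely hypothetical performance scores in arbitrary units, chosen only to illustrate the logic; they are not data from the study, and the compound labels are placeholders. The sketch assumes three non-N,S and two N,S single-compound treatments, and a mixture whose effect equals that of a single N,S compound.

```python
# Hypothetical larval performance scores (arbitrary units); NOT data from the study.
single_effects = {
    "non_NS_1": 10.0,
    "non_NS_2": 10.0,
    "non_NS_3": 10.0,
    "NS_1": 6.0,
    "NS_2": 6.0,
}
# Hypothetical mixture score, assumed to be driven entirely by its N,S components.
mixture_effect = 6.0

# Averaging across all five singles pulls the N,S scores towards the non-N,S ones.
mean_of_singles = sum(single_effects.values()) / len(single_effects)
# Lowest-performance single N,S treatment, for a like-for-like comparison.
lowest_single_ns = min(v for k, v in single_effects.items() if k.startswith("NS"))

print("mean of singles:", mean_of_singles)
print("mixture below mean of singles:", mixture_effect < mean_of_singles)
print("mixture below single N,S compound:", mixture_effect < lowest_single_ns)
```

      Under these assumed numbers the mixture looks more potent than the average single compound even though it is no more potent than a single N,S compound, which is why a mixture treatment excluding the N,S compounds would be needed to separate the two explanations.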

      Despite this issue, the discussion of mixtures in the context of plant defence against both adapted and non-adapted herbivores is fascinating and convincing. The rationale that mixtures may serve as a chemical tool-kit that targets different sets of herbivores is compelling. The non-N,S cardenolides are effective against non-adapted herbivores and the N,S-containing cardenolides are effective against adapted herbivores. However, the current experiments focus exclusively on an adapted species. It would be especially interesting to test whether such mixtures reduce overall herbivory when both adapted and non-adapted species are present.

      It remains possible that mixtures, even in the absence of voruscharin or uscharin, genuinely reduce sequestration or performance; however, this would need to be tested directly to address the abovementioned concern.

      Thanks for these insightful reviews and your summary assessment. We certainly agree that ours was a laboratory study with a single specialized insect, and that both mixture types contained all five compounds (controlling for total toxin concentration). Thus, our conclusion that naturally occurring toxins (within the cardenolide class) have non-additive combined effects on the specialized, sequestering monarch is constrained by our experimental conditions. In our assay we used two mixture types: equimolar and “natural” proportions. We acknowledge that the natural proportions will vary with the age, damage history, etc. of the host plant, Asclepias curassavica. Our proportions were based on growing the plants a few different times under variable conditions. Although we did not conduct these experiments on non-adapted insects, we discuss a related experiment that was conducted with wild-type and genetically engineered Drosophila (Lopez-Goldar et al. 2024, PNAS). In sum, we appreciate the reviewers’ comments.

      Recommendations for the authors:

      Reviewing Editor Comments:

      (i) More convincingly justify the choice and ecological relevance of the "natural" cardenolide ratios, (ii) Clarify the interpretation of mixture effects, and (iii) more explicitly discuss the limitations of leaf-disc assays and the absence of non-adapted herbivores in light of the broader coevolutionary claims.

      Thank you for these suggestions. We have added several sentences of text to the Discussion section to make these points.

      Reviewer #1 (Recommendations for the authors):

      (1) Statistical analysis is missing from Figure 3 and Figure S3, making it difficult to assess the significance of the data.

      Much of the data in Fig. 3 is meant for descriptive presentation, with the main statistical analysis (the contrast between N,S and non-N,S cardenolides) given in the main text of the Results. We have also added treatment differences between the sequestration efficiencies to the figure.

      (2) To help readers intuitively understand how certain results (such as ECD and sequestration efficiency) were calculated, the authors can provide the equations used for these computations.

      Thank you, this was given in the Methods and we have added it to the Results on first mention as well.

      (3) For Figure 4, we suggest presenting the results of the equal mixture treatment and the realistic mixture treatment separately, rather than averaging the results from these two types of treatments.

      We understand and appreciate this comment – all of the treatment means are given in Fig. S3. For this particular figure we have opted to stick with the binary comparison (singles vs. mixed) to maximize replication for statistical tests (typically n = 25 vs. 10).

      Reviewer #2 (Recommendations for the authors):

      Given the interpretations and discussion generally, I feel the manuscript would benefit from either additional experiments (mixtures w/o N-S compounds), inclusion of non-adapted herbivore performance, or reframing of the explicit interpretations from your findings.

      We have added some caveats to the text but not added any additional experiments.

      Also, are concentrations above the IC50 for all treatments/mixtures? Perhaps this could be calculated from the information presented, but it may be best to state it explicitly.

      This is an interesting question. IC50’s are estimated from in vitro assays (with the enzyme and toxins in microplate wells) and so are not translatable to foliar concentrations. As indicated in the text, we chose cardenolide levels based on foliar concentrations to match A. curassavica.

      Some minor points:

      (1) Although the intact N,S-ring-containing compounds are recovered in low amounts in frass (and not sequestered), is there evidence of N,S-ring components being otherwise traceable in the frass? For example, can excess S or N be detected in frass? This could provide insight into differential detoxification or reincorporation of these elements, potentially explaining variation between voruscharin and uscharin.

      Great question! We have not been able to detect breakdown products. In other experiments we have conducted mass spectrometric analysis of bodies and frass, but have not been able to find the features representing breakdown products. Nonetheless, as mentioned below, the main conversion products are evident and measurable, as in this study.

      (2) As a point of curiosity, is there evidence of interconversion between such compounds? For instance, if monarchs are fed only voruscharin, can other cardenolides be detected in their tissues?

      Yes, we have tried to make this more clear in the text. Both uscharin and voruscharin are converted to calotropin and calactin.

    1. eLife Assessment

      This important study introduces an experimental approach for studying Drosophila oviposition rhythms and identifies the subset of circadian clock neurons that mediate the circadian control of oviposition. The authors try to resolve a known noisy rhythm and provide convincing evidence by using statistical averaging techniques which help reduce this noise but at the cost of variation across individual rhythms. To this end, including the time series of representative individuals for all genotypes tested would have helped in interpreting some of the results. This paper will be of interest to anyone interested in insect ovarian physiology, circadian biology, and reproductive fitness.

    2. Joint Public review:

      Summary

      Riva et al. introduce a semi-automatic setup for measuring Drosophila melanogaster oviposition rhythms and use it to map the timekeeping function underlying egg laying rhythms to a subset of clock cells. Using a combination of neurogenetic manipulations and referencing the publicly available female hemi-brain connectome dataset, they narrow the critical circuit down to possibly two of the three CRYPTOCHROME expressing lateral-dorsal neurons (LNds). Their findings suggest that different overlapping sets of clock neurons may control different behavioral rhythms in D. melanogaster.

      This work will be of interest to researchers interested in the circadian regulation of oviposition in D. melanogaster (and possibly other insects), a phenomenon which has been left relatively under-explored. The construction of a semi-automated setup which can be made relatively cheaply using available motors and 3D printed molds provides a useful model for obtaining longer records of oviposition activity. The analysis of noisy oviposition timeseries, however, may require revisiting both the methods used for sampling eggs laid per female as well as the analytical tools used to clean up and analyze individual records, because simple averaging can lead to incorrect conclusions regarding the underlying nature of the rhythm.

      Strengths

      Additional experiments were carried out for this revised version of the manuscript that strengthen their original findings. These include: using a dominant negative form of the circadian clock gene, cycle, to disrupt the circadian clock, which provides additional support for the role of CRY+ LNds in generating the circadian rhythm of oviposition; reassessing the functionality of PDF neurons and showing that they seem to be important for maintaining the circadian period of egg laying; using the per01 mutation to show the role of period locus function in the control of the circadian rhythm of oviposition. The authors also point to some potentially interesting connectome data that suggest hypotheses regarding the neuronal circuit linking daily timekeeping to oviposition, which will require further validation in future studies. The videos and pictures demonstrate the working of the semi-automated egg collection setup, which should help others create similar devices.

      Weaknesses:

      The major weaknesses of this work result from the noisy nature of the data.

      They include:

      (1) Problems associated with averaging: The authors intended to focus on the oviposition clock in individual females; however, due to the inherent noise in the oviposition rhythm, they had to resort to averaging across Lomb-Scargle periodograms generated from individual time-series. They then tested whether the averaged periodogram contains a significant frequency. However, this reduction in noise also reduces the ability to compare differences in power of the rhythm across individuals. Furthermore, this method makes it especially difficult to distinguish the contribution of subsets of the circuit to the proportion of rhythmic flies and the power of the rhythm. In this revised version the authors use two manipulations to disrupt the molecular clock, which could have different success rates based on the type and number of cells targeted. Unfortunately, the type of averaging used prevents the detection of any such effects. It is to be noted that, indeed, individual-level differences in period between the PdfDicer-Gal4 > perRNAi and UAS-perRNAi lines help the authors to establish that there is a significant reduction in period length when the molecular clock is abolished in PDF cells. These individual measurements are now very helpful in discerning the effect of manipulations carried out on different circadian neural subsets, some of which could have been missed if only averages were considered.

      (2) Sensitivity to sample size: Averaging reduces the effect of random background noise but noise reduction is dependent upon sample size. Comparing genotypes with different sample sizes in addition to varying signal to noise ratios (which might also change with neural manipulations) makes it difficult to estimate how much of the rhythm structure is contributed by a given neuronal subset; thus, whenever possible, comparisons should be made between groups that include similar numbers of flies. This problem is compounded when the averaged periodogram is composed of both rhythmic and weakly rhythmic individuals. For instance, in the main text the reported value of period length of pdfDicer-Gal4 > perRNAi is 20.74h (see also Fig 2J) but in the Supplementary figure 2S1 this is close to 22h, while the values reported for the control are largely similar (24.35h in Fig 2H versus ~24h in Fig 2S1). A difference of 3.6h between control and experimental flies is much greater than 2h. Which estimate (average versus individual) is more reliable in predicting the behavior of these flies is difficult to determine without further experiments.
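
      The dependence on sample size can be made explicit with a short derivation. It is only a sketch under assumptions that are not taken from the manuscript: every fly shares the same underlying signal s(t) (same period and phase), and the noise terms are independent across flies with a common variance \sigma^2. The symbols E_k, s, \varepsilon_k, n_1 and n_2 are introduced here purely for illustration.

```latex
% Record of fly k at sampling time t_i: shared signal plus independent noise
E_k(t_i) = s(t_i) + \varepsilon_k(t_i), \qquad
\operatorname{Var}[\varepsilon_k(t_i)] = \sigma^2

% Average over n flies: the signal is unchanged, the noise variance shrinks
\bar{E}_n(t_i) = \frac{1}{n}\sum_{k=1}^{n} E_k(t_i)
              = s(t_i) + \bar{\varepsilon}_n(t_i), \qquad
\operatorname{Var}[\bar{\varepsilon}_n(t_i)] = \frac{\sigma^2}{n}

% Residual noise of two groups with n_1 and n_2 flies
\frac{\operatorname{sd}[\bar{\varepsilon}_{n_1}]}{\operatorname{sd}[\bar{\varepsilon}_{n_2}]}
  = \sqrt{\frac{n_2}{n_1}}
```

      Under these assumptions, a group of 10 flies retains twice the residual noise of a group of 40, so periodogram power from averages of unequal size is not directly comparable.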

      (3) Based on the newly provided data for individual fly periodograms the reader can visually evaluate the rhythmicity associated with each genotype. Such visual inspection did not reveal any clear difference between the proportion of rhythmic individuals between experimental and parental GAL4 and/or UAS controls, except for experiments using per01 mutant animals. This is surprising since if these circuits are controlling the oviposition rhythm, perturbing them should affect most individuals in a similar way.

      In summary, although the authors have implicated CRY+ LNds in the generation of a circadian rhythm in oviposition it is not clear looking at individual readouts if this manipulation is rendering flies arrhythmic or changing the period of the clock slightly, such that there is increased variation in period length at the individual level which is not being captured by the low signal to noise ratio and in the average gives a flattened output as a result. Thus, while the manipulations done to the clock in these neurons might indeed affect the circadian nature of the oviposition rhythm it is still rather difficult to determine if they are indeed the sole clock cells generating this rhythm especially when nearby PDF+ cells also affect period length. Nevertheless, the connectomic data do show that they are very close to the OviIN neurons, placing them at an important juncture of transmitting circadian time information to the downstream oviposition circuit. Overall, the authors have achieved some of their aims, although the analysis methods leave some of their inferences open to speculation.

      Other comments

      Disrupting the clock in the 5th sLNv and 3 Cry+ LNds (and weakly in a small subset of DN1) affected egg-laying. Although the work emphasizes the importance of the LNd, the role of the 5th sLNv should be discussed.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Joint Public review:

      Weaknesses:

      (1) Controls for the genetic background are incomplete, leaving open the possibility that the observed oviposition timing defects may be due not to targeted knockdown of the period (per) gene but to the GAL4, Gal80, and UAS transgenes themselves. To resolve this issue the authors should determine the egg-laying rhythms of the relevant controls (GAL4/+, UAS-RNAi/+, etc); this only needs to be done for those genotypes that produced an arrhythmic egg-laying rhythm.

      (2) Reliance on a single genetic tool to generate targeted disruption of clock function leaves the study vulnerable to associated false positive and false negative effects: a) The per RNAi transgene used may only cause partial knockdown of gene function, as suggested by the persistent rhythmicity observed when per RNAi was targeted to all clock neurons. This could indicate that the results in Fig 2C-H underestimate the phenotypes of targeted disruption of clock function. b) Use of a single per RNAi transgene makes it difficult to rule out that off-target effects contributed significantly to the observed phenotypes. We suggest that the authors repeat the critical experiments using a separate UAS-RNAi line (for period or for a different clock gene), or, better yet, use the dominant negative UAS-cycle transgene produced by the Hardin lab (https://doi.org/10.1038/22566).

      We have followed the referee's advice, repeating the experiments with the dominant negative UAS-cyc<sup>DN</sup>. They nicely confirm our conclusions: the abolition of the cellular clock in LNd neurons eliminates the rhythmicity of oviposition. The results are presented in Fig. 3 of the new manuscript, panels H to N. We thank the reviewer for this suggestion that has definitely improved our paper, since it allows us to confirm our result using both a different driver and a different UAS sequence. In addition, we included the required GAL4 controls, which can be found in Panels E, L of the figure as well as average egg-laying profiles for all genotypes involved (Panels B, D, F, I, K and M). Regarding the MB122Bsplit-Gal4>UAS-per<sup>RNAi</sup> experiment, we moved it to a supplementary figure (Figure 3S1). The paragraph where the new Figure 3 is discussed has been modified accordingly.

      (3) The egg-laying profiles obtained show clear damping/decaying trends which necessitates careful trend removal from the data to make any sense of the rhythm. Further, the detrending approach used by the authors is not tested for artifacts introduced by the 24h moving average used.

      The method used for the assessment of rhythmicity is now more fully explained and tested in the supplementary material. In particular, the issue of trend removal is treated in the second section of the SM, and the absence of "artifacts" (interpreted as the possibility of deciding that a signal is rhythmic when it is not, or vice versa) shown in figs. S3 to S5.

      (4) According to the authors the oviposition device cannot sample at a resolution finer than 4 hours, which will compel any experimenter to record egg laying for longer durations to have a suitably long time series which could be useful for circadian analyses.

      The choice of sampling every 4 hours is not due to a limitation imposed by the device used. In fact the device can be programmed to move at whatever times are desired. As mentioned in the Material and Methods section, "more frequent sampling gives rise to less consistent rhythmic patterns", because the number of eggs sampled at each time slot becomes too small. In particular, we have tested sampling at intervals of 2 hours, and we have observed that this doubles the work performed by the experimenter but does not lead to an improvement in the assessment of rhythmicity.

      (5) Despite reducing the interference caused by manually measuring egg-laying, the rhythm does not improve the signal quality such that enough individual rhythmic flies could be included in the analysis methods used. The authors devise a workaround by combining both strongly and weakly rhythmic (LSpower > 0.2 but less than LSpower at p < 0.05) data series into an averaged time series, which is then tested for the presence of a 16-32h "circadian" rhythm. This approach loses valuable information about the phase and period present in the individual mated females, and instead assumes that all flies have a similar period and phase in their "signal" component while the distribution of the "noise" component varies amongst them. This assumption has not yet been tested rigorously and the evidence suggests a lot more variability in the inter-fly period for the egg-laying rhythm.

      As stressed in the paper, and in the new Supplementary Material, the individual egg records are very noisy, which in general precludes the extraction of any information about the underlying period and phase. The workaround we (and others, e.g. Howlader et al. 2006) have used is analyzing average egg records for each genotype. Even though this implies assuming the same period and phase for all individuals, we have observed, using experiments with synthetic data, that small variations in individual periods (of the same amount as those present in real experiments where the period of some flies can be assessed individually) still allow us to use our method to decide if the genotype is rhythmic or not. This issue is discussed at length in the new Supplementary Material. There we also discuss an experiment with real flies, showing the individual records, and the corresponding periodograms, for each fly, for a rhythmic (Fig. S14) and an arrhythmic genotype (Fig. S17).
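
      The kind of synthetic-data check described above can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a 24 h cosine shared by all flies, Gaussian noise, 4 h sampling over 6 days, and a Lomb-Scargle power normalized to lie between 0 and 1. All function names and parameter values are hypothetical.

```python
import math
import random


def lomb_scargle_power(times, values, period):
    """Normalized Lomb-Scargle power (0 to 1) at a single trial period."""
    omega = 2.0 * math.pi / period
    mean = sum(values) / len(values)
    resid = [v - mean for v in values]
    total = sum(r * r for r in resid)
    if total == 0.0:
        return 0.0
    # Time offset tau that makes the sine and cosine terms orthogonal.
    s2 = sum(math.sin(2.0 * omega * t) for t in times)
    c2 = sum(math.cos(2.0 * omega * t) for t in times)
    tau = math.atan2(s2, c2) / (2.0 * omega)
    cos_t = [math.cos(omega * (t - tau)) for t in times]
    sin_t = [math.sin(omega * (t - tau)) for t in times]
    yc = sum(r * c for r, c in zip(resid, cos_t))
    ys = sum(r * s for r, s in zip(resid, sin_t))
    cc = sum(c * c for c in cos_t)
    ss = sum(s * s for s in sin_t)
    power = 0.0
    if cc > 0.0:
        power += yc * yc / cc
    if ss > 0.0:
        power += ys * ys / ss
    return power / total


def peak(times, values, periods):
    """Return (period, power) of the highest periodogram peak."""
    powers = [lomb_scargle_power(times, values, p) for p in periods]
    best = max(range(len(periods)), key=lambda i: powers[i])
    return periods[best], powers[best]


def simulate_fly(times, rng, period=24.0, amplitude=1.0, noise_sd=1.5):
    """Hypothetical record: shared cosine signal plus independent Gaussian noise."""
    return [amplitude * math.cos(2.0 * math.pi * t / period) + rng.gauss(0.0, noise_sd)
            for t in times]


rng = random.Random(0)
times = [4.0 * i for i in range(36)]            # 6 days sampled every 4 h
periods = [16.0 + 0.25 * i for i in range(65)]  # scan 16 h to 32 h
flies = [simulate_fly(times, rng) for _ in range(20)]

# Average the records time point by time point, then analyze the average.
average = [sum(col) / len(col) for col in zip(*flies)]
avg_period, avg_power = peak(times, average, periods)

individual_powers = sorted(peak(times, fly, periods)[1] for fly in flies)
median_individual_power = individual_powers[len(individual_powers) // 2]
print(avg_period, avg_power, median_individual_power)
```

      In this toy setting the averaged record shows a much clearer peak than a typical individual record; it does not address the reviewers' concern about flies whose periods or phases genuinely differ.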

      (6) This variability could also depend on the genotype being tested, as the authors themselves observe between their Canton-S and YW wild-type controls for which their egg-laying profiles show clearly different dynamics. Interestingly, the averaged records for these genotypes are not distinguishable but are reflected in the different proportions of rhythmic flies observed. Unfortunately, the authors also do not provide further data on these averaged profiles, as they did for the wild-type controls in Figure 1, when they discuss their clock circuit manipulations using perRNAi. These profiles could have been included in Supplementary figures, where they would have helped the reader decide for themselves what might have been the reason for the loss of power in the LS periodogram for some of these experimental lines.

      We have added the individual periodograms of the arrhythmic lines to the Supplementary material (Figs. 3S2, 3S5 and panel G of Fig. 3S1), where they can be compared with their respective controls (Figs 3S3, 3S4, 3S6, 3S7 and panel F of Fig. 3S1).

      (7) By selecting 'the best egg layers' for inclusion in the oviposition analyses an inadvertent bias may be introduced and the results of the assays may not be representative of the whole population.

      We agree that the results may be biased for 'the best egg layers'. We remark however, that the flies that have been left out lay very few eggs, some of them even laying no eggs on a whole day. For these flies it is difficult to understand how one can even speak of egg laying rhythmicity (let alone how one can experimentally assess it). Thus, we think it might be misleading to speak of results as "representative of the whole population". Furthermore, it is even possible that the very concept of egg laying rhythmicity makes little sense if flies do not lay enough eggs.

      (8) An approach that measures rhythmicity for groups of individual records rather than separate individual records is vulnerable to outliers in the data, such as the inclusion of a single anomalous individual record. Additionally, the number of individual records that are included in a group may become a somewhat arbitrary determinant for the observed level of rhythmicity. Therefore, the experimental data used to map the clock neurons responsible for oviposition rhythms would be more convincing if presented alongside individual fly statistics, in the same format as used for Figure 1.

      In general, we have checked that there are no "outliers", in the sense of flies that lay many more eggs than the others in the experiment. But maybe the reviewer is referring to the possibility that a few rhythmic flies make the average rhythmic. This issue is addressed in the supplementary material, at the end of section "Example of rhythmicity assessment for a synthetic experiment". In short, we found that eliminating some of the most rhythmic flies from a rhythmic population makes the average a bit less rhythmic, but still significantly so. Conversely, if these flies are transferred to an arrhythmic population, the average is still non rhythmic.

      Regarding "the number of individual records that are included in a group may become a somewhat arbitrary determinant for the observed level of rhythmicity", we stress that we have not performed a selection of flies for the averages. All of the flies tested are included in the average, independently of their individual rhythmicity, provided only that they lay enough eggs.

      (9) The features in the experimental periodogram data in Figures 3B and D are consistent with weakened complex rhythmicity rather than arrhythmicity. The inclusion of more individual records in the groups might have provided the added statistical power to demonstrate this. Graphs similar to those in 1G and 1I, might have better illustrated qualitative and quantitative aspects of the oviposition rhythms upon per knockdown via MB122B and Mai179; Pdf-Gal80.

      We are aware that in the studies of the rhythmicity of locomotor activity the presence of two significant peaks is usually interpreted as a “complex rhythm”, i.e. as evidence of the existence of two different mechanisms producing two different rhythms in the same individual. In our case, since the periodograms we show assess the rhythmicity of the average time series of several individuals, the two non-significant peaks could also correspond to the periods of two different subpopulations of individuals. However, a close examination of the individual periodograms, now provided as Supplementary Figures 3S2 to 3S9, does not show any convincing evidence of any of these two possibilities.

      Another possibility could be that such peaks are simply an artifact of the method in the analysis of time series that consist of very few cycles and also few points per cycle. In the supplementary material we show that this can indeed happen. Consider, for example, periodograms 2 and 4 in Fig. S12 of the SM. Even though both of them display two non significant peaks, these periodograms correspond to two synthetic time series that are completely arrhythmic.

      We have added to the manuscript a paragraph discussing the issue of possible bimodality (next to last paragraph in subsection "The molecular clock in Cry+ LNd neurons is necessary for rhythmic egg-laying").

      Wider context:

      The study of the neural basis of oviposition rhythms in Drosophila melanogaster can serve as a model for the analogous mechanisms in other animals. In particular, research in this area can have wider implications for the management of insects with societal impact such as pests, disease vectors, and pollinators. One key aspect of D. melanogaster oviposition that is not addressed here is its strong social modulation (see Bailly et al., Curr Biol 33:2865-2877.e4. doi:10.1016/j.cub.2023.05.074). It is plausible that most natural oviposition events do not involve isolated individuals, but rather groups of flies. As oviposition is encouraged by aggregation pheromones (e.g., Dumenil et al., J Chem Ecol 2016 https://link.springer.com/article/10.1007/s10886-016-0681-3) its propensity changes upon the pre-conditioning of the oviposition substrates, which is a complication in assays of oviposition rhythms that periodically move the flies to fresh substrate.

      We agree that social modulation can be important for oviposition, as has been shown in the paper cited by the reviewer. But we think that, in order to understand the contribution of social modulation to oviposition, it is important to know, as a reference for comparisons, what the flies do when they are isolated. Our aim in this work has been to provide such a reference.

      Recommendations for the authors:

      (1) The weaknesses identified in the Public review could be addressed as follows: etc.

      We have followed the suggestions of the editor and addressed each of the weaknesses mentioned (see details above).

      (2) Could the authors comment on their choice of using individual flies for their assay rather than (small) groups of flies? Is it possible that their assay would produce less noisy results with the latter?

      First we want to emphasize that our aim here was to assess the presence of individual rhythmicity, free from any external influences, whether arising from environmental external cues (such as light or temperature changes) or by social interactions (with other females or males). However, we were also curious about the behavior when males were put in the same chamber with each female. We performed a few tests and the results were very similar to what we obtained with single females.

      (3) Minor points:

      (a) Line 57-58 - "around 24 h and a peak near night onset (Manjunatha et al., 2008). Egg-laying rhythmicity is temperature-compensated and remains invariant despite the nutritional state": Rephrase to something simpler like temperature and nutrition compensated.

      Corrected.

      (b) Line 56-57 - "The circadian nature of this behavior was revealed by its persistence under DD with a period around 24 h and a peak near night onset (Manjunatha et al., 2008)." A better reference here would be to Sheeba et al, 2001 for preliminary investigations into the egg-laying rhythms of individual flies and McCabe and Birley, 1998 for groups of flies under LD12:12 and DD.

      Suggestion accepted.

      (c) Line 65-67 - "We determined..... molecular clock in the entire clock network reduced the LNv did not." This suggests that it was unknown until now that LNv does not have a role, whereas Howlader et al 2006 already suggested that. The reader becomes aware of this at a later part of the manuscript. Please revise.

      This has been revised, and the citation to Howlader et al 2006 added to the new sentence.

      (d) Line 67 - "impairing the molecular clock in the entire clock network reduced the circadian rhythm of.."; saying "Reduced the power of the circadian rhythm" might be better phrasing.

      Suggestion accepted.

      (e) Line 72 - using the Janelia hemibrain dataset.

      Corrected.

      (f) Line 72 typo "ussing", should be 'using'.

      Corrected.

      (g) Line 94: why is the periodic signal the same for all on the first day of DD?

      It is well known that in LD conditions activity is driven by the environmental light-dark cycle, which entrains the endogenous circadian clock of all flies. Even after the transition to DD, the effects of this entrainment persist for a few days, allowing the individual rhythmic patterns set by the light-dark cycle to remain synchronized for at least a few cycles. We are assuming that the same happens with oviposition. A sentence has been added explaining this (beginning of third paragraph of subsection "Egg-laying is rhythmic when registered with a semiautomated egg collection device").

      (h) Figure 1A-D, Were all flies included or only rhythmic flies? Please make this clear. How do you distinguish rhythmic and arrhythmic flies in Figure 1E? Their representative individual plots of egg number graphs are required. Why was the number of flies under DD decreased from 20 to 18?

      Throughout the paper, the analysis of average rhythmicity has been performed including all flies, since we postulate that even flies that individually can be classified as non rhythmic have a rhythm that is corrupted by noise, and that this noise can be partially subtracted by performing an average. The explanation of the characterization of rhythmic and arrhythmic individuals is in the Methods section, under the Data Analysis subsection. This is now fully developed in the Supplementary material, where the individual plots for some of the genotypes are included.

      Regarding the question of the number of flies having "decreased from 20 to 18?", there is a misunderstanding here. The results depicted in Figure 1, and in particular in panel E, correspond to two different experiments: one performed only in LD (7 days, n=20), and a second one performed for 5 days in DD, with one previous day in LD (n=18).

      (i) Figure E and K, Are n=20, 18, and n=30, 22 the total numbers of flies including both rhythmic and nonrhythmic? If so, it would be better to put them in the column, not in the rhythmic column.

      The figure has been corrected.

      (j) Line 107-108, please provide a citation for this statement.

      We have added two references: Shindey et al. 2016, and Deppisch et al. 2022.

      (k) Figure 1, 2, etc., please write a peak value inside the periodogram graph. This makes comparison easier.

      The peak values have been added in all Figures.

      (l) Line 184-185, Figure 2F, tau appears shorter in Clk4.1>perRNAi flies than in control, which suggests that DNp1 may play a role?

      As explained in the Supplementary Material, the particularities of oviposition records (discrete values, noise, few samples per period, etc.) preclude an accurate determination of the period if the record is considered as rhythmic. In particular, Fig. S4 shows that differences of 1 hour between the real and the estimated periods are not unusual.

      (m) Figure 4. Why are 2 controls shown? Please explain. Are they the same strains?

      The two controls shown are the UAS control and the GAL4 control. This information has now been added to the figure.

      (n) Line 314 'that' should be 'than'?

      Corrected.

      (o) Line 73-74 - Phrasing is not clear in: "LNds and oviposition neurons, consisting with, the essential role of LNds neurons in the control of this behavior."

      Corrected.

      (p) Line 81-84 - "the experiments particularly demanding and labor-intensive. In this approach, eggs are typically collected every 4 hours (sometimes also every 2 hours), which usually implies transferring the fly to a new vial or extracting the food with the eggs and replacing it with fresh food in the same vial (McCabe and Birley, 1998; Menon et al., 2014)." McCabe and Birley had an automated egg collection device designed for groups of flies, which sampled eggs laid every hour for 6 days. Please remove this reference in this context.

      Reference removed.

      (q) Line 91-92 - "The assessment of oviposition rhythmicity is challenging because the decision of laying an egg relies on many different internal and external factors making this behavior very noisy." This sentence makes it appear that 'assessment' is the limitation. Even locomotor activity is governed by many internal and external factors, yet we can obtain very robust rhythms. The sentence that follows is also not easy to digest. Can the authors frame the idea better?

      We have rewritten the corresponding paragraph in order to make it more clear (second paragraph of the Results section). Additionally, the Supplementary Material contains now a more detailed explanation and analysis of the method used.

      (r) Line 104-107 - rhythmic (with a period close to 24 h, Figure 1F) although the average egg record is strongly rhythmic with a period around 24 h (Figure 1B). Under DD condition, individual rhythmicity percentages are the same as in LD (Figure 1E) and their average record is also very rhythmic with a period of 24 h (Figure 1D). 'Strongly rhythmic' and 'very rhythmic' are less indicative of what is happening with the oviposition rhythm and can be phrased as robust instead, with a focus on their power measured.

      We have accepted the suggestion.

      (s) Line 108-110 - "Thus, egg-laying displays a much larger variability than locomotor activity, compounding the difficulty of observing the influence of the circadian clock on this behavior." The section discussed here does not illustrate the variability in egg-laying as much as the lack of robustness of the rhythm. The variation in rhythmicity going from CS flies (~70% rhythmic) to yw flies (~50% rhythmic) showcases the variability in this rhythm and how it is difficult to observe when compared to locomotor rhythms, which are usually consistently >90% rhythmic across multiple genotypes. These lines can be placed after the discussion about yw and perS flies. Moreover, previous studies using individual flies have reported that egg-laying rhythm is more variable than others Figure 1, Sheeba et al 2001.

      We have accepted the suggestion, replacing "Thus, egg-laying displays a much larger variability than locomotor activity..." by "This shows that, at the individual level, egg-laying is much less robust than locomotor activity ..."

      (t) Figure 1. Genotype notation within the figure panels is not consistent with the accepted / conventional notation or with the main text or legend notations throughout the manuscript.

      We are sorry for this mistake. We have corrected the genotype names in Figures and text in order to make notation consistent across the paper.

      (u) Supplementary Figure 1 Legend. Error in upper right corner? Not left corner? The photo does not clearly show the apparatus. The authors may wish to consider clearer images and more details about the apparatus including details of the 3D printing of the device and perhaps even include a short video where the motor moves the flies to a new chamber (This is only a suggestion to advertise the apparatus, not related to the review of the manuscript). They could also provide information about what fraction of females survived till the end of each trial when 21 flies were examined with 4-hour sampling across 4-5 cycles.

      In general, more than 80% of the females are alive at the end of a one week oviposition experiment. We have added this information in the Methods section at the end of the corresponding subsection ("Automated egg collection device"). Regarding the egg-collection device, we have replaced the photographs in what is now Supplementary Figure 1S1, and added a short supplementary movie showing its operation.

      (v) The results depicted in Figure 2B are that of averaged time series. Hence the reader does not know 'the fact' that knocked-down animals are not completely rhythmic. Is the "not completely arrhythmic" in reference to flies with a power > 0.2 (weakly rhythmic) in their egg-laying rhythm or to the presence of ~40% of male flies (Supplementary Table 1) with a locomotor rhythm after perRNAi silencing of most of their clock neurons? This is confusing because no intermediate category of flies is discussed in Figure 2. Please edit for clarity.

      We were referring to the rhythmicity of the genotype, not of the individuals. We have rewritten the corresponding paragraph in order to make it clearer (last paragraph of the first subsection of the Results section).

      (w) Line 173 - ablation or electrically silencing all PDF+ neurons (Howlader et al., 2006). There were no experiments carried out using electrical silencing of PDF+ neurons in the referenced paper.

      We are sorry for this mistake. This has been corrected (we have deleted the mention to electrical silencing).

      (x) Line 173 - Shortening of period by nearly 3 hours cannot be considered minor.

      We agree, and we have deleted the word "minor".

      (y) Line 332-333 - "We also disrupted the molecular clock (or electrically silenced) in PDF-expressing neurons as well as in the DN1p group with no apparent effect on egg-laying rhythms". There was period shortening observed for pdf GAL4 > perRNAi manipulation so there was an effect on the egg-laying rhythm. Additionally, perRNAi based silencing does not electrically silence PDF neurons as the kir 2.1 was expressed only using Clk4.1 GAL4 in the Dn1ps. This line should be rewritten.

      We have rewritten the paragraph mentioned (third paragraph of the Discussion) in order to make it more accurate.

      (4) Page 22 - Data Analysis

      "Since the number of eggs laid by a mated female tend to show a downward trend, we proceeded as follows, in order to detrend the data (see the Supplementary Material for further details). First, a moving average of the data is performed, with a 6 point window, and a new time series T is obtained. In principle, T is a good approximation to the trend of the data. Then, a new, detrended, time series D is generated by pointwise dividing the two series (i.e. D(i)=E(i)/T(i), where i indexes the points of each series)." Can the authors provide a reference for this method of detrending? Smoothing can frequently introduce artifacts in the data and give incorrect period estimates. Additionally, the trend visible in the data, especially in Figure 1, suggests a linear decay that can be easily subtracted. Also, there is no discussion of detrending in the Supplementary material attached.
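
      The quoted detrending step (a 6-point moving average T followed by the pointwise ratio D(i)=E(i)/T(i)) can be sketched as follows. The quoted text does not say how the window is aligned or how the series edges are treated, so the centered, edge-truncated window and the handling of a zero trend below are assumptions, and the egg counts are invented for illustration.

```python
def moving_average(values, window=6):
    """Moving average T; the window is roughly centered and truncated at the edges
    (an assumption: the quoted text does not specify the alignment)."""
    n = len(values)
    half = window // 2
    trend = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half)  # covers `window` points wherever the series allows
        segment = values[lo:hi]
        trend.append(sum(segment) / len(segment))
    return trend


def detrend(eggs, window=6):
    """Detrended series D(i) = E(i) / T(i).

    Points where the trend is zero (no eggs in the whole window) are set to 0.0,
    which is a choice made here, not one stated in the source.
    """
    trend = moving_average(eggs, window)
    return [e / t if t > 0 else 0.0 for e, t in zip(eggs, trend)]


# Hypothetical egg counts sampled every 4 h (6 points per day), declining over time.
eggs = [12, 3, 2, 4, 9, 14, 10, 2, 1, 3, 7, 11, 8, 1, 1, 2, 5, 8]
detrended = detrend(eggs)
print([round(d, 2) for d in detrended])
```

      With 4 h sampling a 6-point window spans one full day, so the ratio removes the slow decline while leaving variation within a day; whether this introduces artifacts is the question the reviewer raises.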

      We are sorry for the confusion with the Supplementary materials. The method used for subtracting both noise and trend from the data is now fully explained in the new Supplementary Material. All the issues raised by the reviewer in this comment have been addressed there.

      (5) Figure by figure

      Page - Type (Figure or text) - Comment

      (a) Page 6 Figure 1C There is remarkable phase coherence seen in the average egg laying time series for CS flies 5 days into DD and as the authors note in Lines 94-95 in the text "Under light-dark (LD) conditions, or in the first days of DD, it can be that the periodic signal is the same for all flies". Since this observation is crucial to constructing the figures seen later in the paper, a note should be made about why this rhythm could persist across flies, so deep into DD.

      As mentioned above, we have added a couple of lines explaining why we think that the assumption of a synchronized periodic signal is reasonable, at least during the first cycles (second paragraph of the first subsection of section Results).

      (b) Figure 1 G The effect of period/phase decoherence seems to be showing up here in the average profile for yw flies as they seem to completely dampen out after 2 days in DD and yet have a 24-hour rhythm in the averaged periodogram. The authors should make a note here if the LS periodogram is over-representing the periodicity of the first few days in DD or if comparing the first 3 vs. the last 3 days in DD gives different results.

      The dampening observed in average oviposition records is a product of the dampening of the oviposition records, which is a well-known phenomenon, probably caused by the depletion of sperm in the female spermatheca. One of the aims of the method used in the paper was to avoid the bias introduced by this dampening, by means of a detrending procedure. This is explained in the Materials and Methods, and now full details are given in the new Supplementary Materials.

      (c) Figure 1E, K Is this data pooled across 2-3 experiments, as discussed in lines 500-01 under 'Statistical Analysis'? Also, what test is being performed to check for differences between proportions here, seeing as there are no error bars to denote error around a mean value and no other viable tests mentioned in Statistical Analysis?

      We are sorry for this omission. For the comparison of proportions we used the 'N-1' Chi-squared test. We have added a sentence detailing this at the end of the Statistical analysis section.
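
      For readers unfamiliar with it, the 'N-1' Chi-squared test for two proportions is the Pearson chi-squared statistic of the 2x2 table multiplied by (N-1)/N, referred to a chi-squared distribution with one degree of freedom. The sketch below is a generic implementation of that formula, not the authors' code; the function name and the counts in the usage example are hypothetical.

```python
import math


def n_minus_1_chi_squared(rhythmic_a, total_a, rhythmic_b, total_b):
    """'N-1' chi-squared test comparing two proportions.

    Returns (statistic, p_value), with the p-value taken from a chi-squared
    distribution with 1 degree of freedom.
    """
    a, b = rhythmic_a, total_a - rhythmic_a   # group A: rhythmic / not rhythmic
    c, d = rhythmic_b, total_b - rhythmic_b   # group B: rhythmic / not rhythmic
    n = total_a + total_b
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero; the test is undefined")
    # Pearson statistic with N replaced by N - 1 in the numerator.
    chi2 = (a * d - b * c) ** 2 * (n - 1) / denom
    # Survival function of chi-squared with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 14 of 20 flies rhythmic in one group, 11 of 22 in another.
chi2, p = n_minus_1_chi_squared(14, 20, 11, 22)
print(round(chi2, 3), round(p, 3))
```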

      (d) Figure 1 F, L Can the total number of weakly and strongly rhythmic values be indicated in the scatter plot?

      Corrected.

      (e) Figure 1F, L (legend) Is the Chi-squared test being performed on the proportion values of Figure 1(E, K) or for Figure 1(F, L)?

      The chi-squared test mentioned was used for Fig1 F-L. As explained above, for the comparison of proportions we used the 'N-1' Chi-squared test. This has now been added to the legend of the figure.

      (f) Page 8 Figure 2B Seeing as individual flies with a LS periodogram power < 0.2 are considered weakly rhythmic in Figure 1 F, L can Clk856 > perRNAi flies on average also be considered weakly rhythmic, as the peak in the periodogram is above 0.3?

      We prefer to use the weakly rhythmic class only for individual flies. Nevertheless, we agree that this periodogram shows that the genotype analyzed is not completely arrhythmic, and that this might be due to some remaining individual rhythmicity. As mentioned above, we have rewritten the last paragraph of the first subsection of section Results in order to discuss this.

      (g) Figure 2D Can the authors comment on why there is a shorter period rhythm when PDF neurons have a dysfunctional clock, whereas previous evidence (Howlader et al., 2004) suggested that these neurons play no role in egg-laying rhythm? They should also refer to McCabe and Birley, 1998 to see if their results (where they observed a shorter period of ~19h with groups of per0 flies), might be of interest in their interpretations.

      We have added a line commenting on this in the corresponding subsection ("LNv and DN1 neurons are not necessary for egg-laying rhythmicity") of the Results, as well as a discussion of this in the third paragraph of the Discussion. In a nutshell, even though Howlader et al. did not find a shortening when PDF neurons are ablated, they did find it in pdf01 flies.

      (h) Figure 2 F, H As the authors mention in their Discussion on Page 16, lines 340-45, the manipulation of DN1p neurons might abolish the circadian rhythm in oogenesis as reported by Zhang et al, which is why they looked at this circuit driven by Clk4.1 neurons and comment that "The persistence of the rhythm of oviposition implies that it is not based on the availability of eggs but is instead an intrinsic property of the motor program". However, no change in fecundity is reported for either kir2.1 or perRNAi-based manipulations of these neurons, to help the reader understand if egg availability (at the level of egg formation) is playing any role in the downstream (and seemingly independent) act of egg laying. The authors should report if they see any change in total fecundity for either set of flies w.r.t their respective controls. Also, is the reduction in power seen with electrical silencing vs perRNAi expression of any relevance? Does the percentage of rhythmic flies change between these two manipulations?

      In the line mentioned by the reviewer, what we meant is that our results show that the rhythm of oviposition does not seem to be based on the rhythmic production of oocytes, which is not necessarily connected with the total number of eggs produced. We have modified the corresponding line in the paper in order to avoid this misunderstanding. Regarding the "reduction in power" mentioned, it must be stressed that, in general, the height of the peak is correlated with the fraction of rhythmic individuals. The problem is that this fraction is a much noisier output, and that is the reason why we have chosen to work with periodograms of averages.
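
To make the notion of periodogram peak height concrete, the sketch below computes a normalized Lomb-Scargle power (between 0 and 1) over a grid of trial periods. It is an illustration only, assuming the classical mean-subtracted formulation; it is not the authors' pipeline, it omits their detrending step and significance thresholds, and the synthetic record and all function names are hypothetical.

```python
import math

def lomb_scargle_power(times, values, period):
    """Normalized Lomb-Scargle power (0 to 1) at one trial period."""
    mean = sum(values) / len(values)
    y = [v - mean for v in values]
    omega = 2 * math.pi / period
    # The time offset tau makes the sine and cosine terms orthogonal.
    s2 = sum(math.sin(2 * omega * t) for t in times)
    c2 = sum(math.cos(2 * omega * t) for t in times)
    tau = math.atan2(s2, c2) / (2 * omega)
    cos_terms = [math.cos(omega * (t - tau)) for t in times]
    sin_terms = [math.sin(omega * (t - tau)) for t in times]
    yc = sum(yi * ci for yi, ci in zip(y, cos_terms))
    ys = sum(yi * si for yi, si in zip(y, sin_terms))
    cc = sum(ci * ci for ci in cos_terms)
    ss = sum(si * si for si in sin_terms)
    total = sum(yi * yi for yi in y)
    # Fraction of the variance explained by a sinusoid at this period.
    return (yc * yc / cc + ys * ys / ss) / total

def periodogram(times, values, periods):
    """Return (period, power) pairs for each trial period."""
    return [(p, lomb_scargle_power(times, values, p)) for p in periods]

# Synthetic record: 24 h rhythm, hourly bins over 5 days, every 5th bin missing.
times = [t for t in range(120) if t % 5 != 0]
values = [math.sin(2 * math.pi * t / 24) for t in times]
periods = [16 + 0.5 * k for k in range(33)]  # 16 h to 32 h in 0.5 h steps
best_period, best_power = max(periodogram(times, values, periods), key=lambda pp: pp[1])
print(best_period, round(best_power, 3))
```

Applying `periodogram` to an averaged time series, rather than to each fly separately, corresponds to the "periodograms of averages" described above; the noise in real egg-laying records would lower the peak well below the value obtained for this noise-free example.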

      (i) Figure 2 E and G, a loss of rhythmicity could also be due to a decrease in fecundity in the experimental lines. Since the number of eggs laid for each genotype is already known, can the authors show statistically relevant comparisons between the experimental lines and their respective controls? In this vein, can the averaged time series profiles also be provided for all the genotypes tested (as seen previously in Figure 1 A, C, G, I), perhaps in the supplementary?

      We did not focus on fecundity in the present work. However, our observations do not seem to show any definite relationship with rhythmicity. We plan to address the issue of fecundity more systematically in future work. The averaged time series profiles have now been added to the figure.

      (j) Scatter plots showing the average period and SEM as seen in Figure 1 (F, L) would help in understanding if these manipulations have any effect on variation in the period of the egg-laying rhythm across flies. Particularly for pdf GAL4 > perRNAi flies which have a net shorter period, (but this might vary across the 34 flies tested).

      We have added a Supplementary Figure (2S1) that shows that the shortening of the oviposition period can also be observed at the individual level. We have also added a line commenting on this in the corresponding subsection ("LNv and DN1 neurons are not necessary for egg-laying rhythmicity") of the Results, as well as a discussion of this in the third paragraph of the Discussion.

      (k) Page 11 Figure 3B Does the presence of two peaks in the LS periodogram at a power > 0.2 indicate the presence of weakly rhythmic flies with both a short(20h) and a long(~27h) period component or either one? The short-period peak is nearly at p < 0.05 level of significance. So then, do most of the flies in MB122B GAL4 > perRNAi line show a weakly rhythmic shorter period?

      (l) Figure 3D A similar peak is observed again at 20h (LS power > 0.2 and nearly at p < 0.05 significance level again) and a different longer one at (~30h) though this one is almost near 0.2 on the power scale. Given the consistency of this feature in both LNd manipulations, the authors should comment on whether this is driven by variation in periods detected or the presence of complex rhythms (splitting or change in period) in the oviposition time series for these lines.

      (m) Figure 3 General scatter plots showing average period ± SEM could help explain the bimodality seen in the periodograms. Additionally indicating just how many flies are weakly rhythmic vs. strongly rhythmic can also help to illustrate how important the CRY+ LNds are to the oviposition rhythm's stability.

      For these three comments (k, l and m), we note that the issue of bimodality has been addressed above, in our response to Weakness 9.

      (o) Figure 4B Same as comments under Figure 1, what is the statistical test done to compare the proportions for these three genotypes?

      As mentioned above, for the comparison of proportions we used the 'N-1' Chi-squared test. We have added a sentence detailing this at the end of the Statistical analysis section.

      (p) Figure 4C Are all flies significantly rhythmic? The authors should also provide an averaged LS periodogram measure for each genotype, to help illustrate the difference in power between activity-rest and egg-laying rhythms.

      Yes, the points represent periods of (significantly) rhythmic flies. This has been added to the caption, to avoid misunderstandings. The differences that arise when assessing rhythmicity in activity records vs. egg-laying records are addressed at length in the Supplementary Material (see e.g. Fig S1).

      (q) Page 15 Figure 5 - general As the authors discuss the possible contribution of DN1ps to evening activity and control over oogenesis rhythm, investigating the connections of the few that are characterized in the connectome (or lack thereof) with the Oviposition neurons, can help illustrate the distinct role they play in the female Drosophila's reproductive rhythm.

      This information was in the text and the Supplementary Tables. Lines 273-275 of the old manuscript read: "The full results are displayed in Supplementary Tables 2 and Table 3, but in short, we found that whereas there are no connections between LNv or DN1 neurons and oviposition neurons..."

      (r) Minor: The dark shading of the circles depicting some of the clusters makes it difficult to read. Consider changing the colors or moving the names outside the circles.

      Figure corrected.

      (s) Line 38: The estimated number of clock neurons has been revised recently (https://www.biorxiv.org/content/10.1101/2023.09.11.557222v2.article-info).

      Thank you for the reference. We have corrected the number of clock neurons in the Introduction of the new manuscript.

    1. eLife Assessment

      This study presents a valuable finding on the mutational order for common alterations in colorectal cancer. The evidence of in vitro growth assays comparing mutations is solid, although inclusion of biological replicates for the transcriptional assessments and in vivo experiments would have strengthened the study. The work will be of interest to scientists working in the field of colon cancer.

    2. Reviewer #1 (Public review):

      Summary:

      In this study, Li et al. used genetically engineered murine intestinal organoids to investigate how the temporal order of oncogenic mutations influences cell state and tumourigenicity of colorectal epithelial cells. By sequentially introducing Apc and Trp53 loss-of-function mutations in alternate orders within a Kras^G12D background, the authors generated isogenic organoid lines for both in vitro and in vivo characterisation. Bulk RNA-seq reveals expected transcriptional changes with relatively modest differences between the two triple-mutant configurations (KAT vs KTA). The key finding emerges from transplantation assays: while KAT and KTA organoids show equivalent tumourigenic potential in immunodeficient mice, only KAT organoids form tumours in immunocompetent hosts (5/10 vs 0/10), suggesting that mutation order shapes susceptibility to immune-mediated clearance. The experiments are well-executed, and the conclusions are generally supported by the data.

      Strengths:

      The experimental system is well-designed for the question. By combining a Kras^G12D transgenic background with sequential CRISPR-mediated knockout of Apc and Trp53 in alternate orders, the authors generated truly isogenic organoid lines that differ only in mutational sequence. This is technically non-trivial and provides a clean platform for dissecting order effects, a question otherwise difficult to address experimentally.

      The authors performed comprehensive baseline characterisation of these organoids, including morphological and histological assessment, quantification of organoid-forming efficiency and proliferation, and bulk RNA-seq profiling. While these analyses revealed no major differences between KAT and KTA organoids, and the observed enhancement of epithelial stemness upon Apc loss and proliferative advantage conferred by Trp53 loss are largely expected, the systematic nature of this characterisation establishes a useful methodological template for future organoid-based studies.

      The authors further investigated the functional impact of mutational order using subcutaneous transplantation assays. By comparing tumour formation in immunodeficient versus immunocompetent hosts, the authors uncover a genuinely unexpected finding: KAT and KTA organoids behave equivalently in the absence of adaptive immunity, but diverge dramatically when immune pressure is applied (KAT: 5/10; KTA: 0/10). This observation is arguably the most compelling aspect of the study and opens an interesting line of inquiry.

      Weaknesses:

      The authors acknowledge that initiating with Kras^G12D does not reflect the typical human sporadic CRC trajectory, where APC loss is usually the first event. While this design choice was pragmatic, it means the observed order effects are contextualised within an artificial starting point. It remains unclear whether the Apc/Trp53 order would matter in a Kras-wild-type background, or whether the Kras-driven cellular state is a prerequisite for these phenotypes to emerge.

      Subcutaneous implantation provides a tractable readout of tumourigenicity, but the cutaneous immune microenvironment differs substantially from that of the intestinal mucosa. Given that the central claim concerns immune-mediated selection, orthotopic transplantation would more directly test whether the observed order effects hold in a physiologically relevant context.

      The ssGSEA comparison involves only 14 ATK tumours, and the key comparisons (Figure 6E) yield borderline significance (p=0.052). More fundamentally, since mutation order cannot be inferred from the clinical samples, the authors are correlating organoid-derived IFN signatures with tumour immunophenotypes without direct evidence that these patients' tumours followed a KAT-like trajectory. The reasoning becomes circular: KAT organoids define the signature used to identify KAT-like clinical tumours.

      Furthermore, the most striking finding of the study, that KTA organoids fail to form tumours in immunocompetent hosts while KAT organoids can, lacks a mechanistic follow-up. The transcriptomic differences between KAT and KTA are modest when cultured as monocultures, yet their in vivo fates diverge dramatically. The authors do not address why these subtle intrinsic differences translate into such divergent immune susceptibility, nor do they characterise the immune response adequately (beyond limited CD4/CD8 IHC at tumour peripheries).

    3. Reviewer #2 (Public review):

      Summary:

      This study addresses an important and timely question in colorectal cancer biology by systematically examining the effects of the common driver mutations APC, KRAS G12D, and TP53 in murine colorectal organoids, with particular emphasis on how the order of APC and TP53 acquisition influences tumor phenotype. These mutations are well known to be frequent, truncal, and often co-occurring in colorectal cancer. While it is increasingly appreciated that mutational order can shape tumor behavior, studies directly comparing the phenotypic consequences of alternative APC-TP53 mutation orders remain rare. This work, therefore, addresses a relevant and timely question.

      Strengths:

      A major strength of the study is its focus on previously unexplored biology, combined with the generation of multiple isogenic murine organoid models with controlled mutational sequences. The authors employ careful and robust quality control of the CRISPR-mediated alterations, and the inclusion of both in vitro and in vivo experiments strengthens the relevance of the work.

      Weaknesses:

      There are, however, several limitations that should be considered when interpreting the findings. First, KRAS G12D activation is used as the initiating alteration, whereas APC loss is generally believed to be the initiating event in most human colorectal cancers. Second, the analysis is restricted to comparing only two mutation orders (KAT versus KTA), which limits the breadth of conclusions that can be drawn about mutation ordering more generally. Finally, key RNA-sequencing and in vivo experiments rely on a single isogenic line, which substantially constrains interpretability.

      The aim of the study was to systematically investigate how mutation accumulation and order influence colorectal cancer initiation. While the data suggest that the relative timing of APC and TP53 loss may be particularly important for tumor initiation, the absence of biological replication makes it difficult to draw robust conclusions. Engraftment efficiency and tumor behavior can be influenced by many factors for a single clone, including additional passenger mutations acquired during culturing, as well as epigenetic differences that are independent of the engineered mutations.

    4. Author response:

      Public Reviews: 

      Reviewer #1 (Public review): 

      Summary: 

      In this study, Li et al. used genetically engineered murine intestinal organoids to investigate how the temporal order of oncogenic mutations influences cell state and tumourigenicity of colorectal epithelial cells. By sequentially introducing Apc and Trp53 loss-of-function mutations in alternate orders within a Kras^G12D background, the authors generated isogenic organoid lines for both in vitro and in vivo characterisation. Bulk RNA-seq reveals expected transcriptional changes with relatively modest differences between the two triple-mutant configurations (KAT vs KTA). The key finding emerges from transplantation assays: while KAT and KTA organoids show equivalent tumourigenic potential in immunodeficient mice, only KAT organoids form tumours in immunocompetent hosts (5/10 vs 0/10), suggesting that mutation order shapes susceptibility to immune-mediated clearance. The experiments are well-executed, and the conclusions are generally supported by the data. 

      Strengths: 

      The experimental system is well-designed for the question. By combining a Kras^G12D transgenic background with sequential CRISPR-mediated knockout of Apc and Trp53 in alternate orders, the authors generated truly isogenic organoid lines that differ only in mutational sequence. This is technically non-trivial and provides a clean platform for dissecting order effects, a question otherwise difficult to address experimentally. 

      The authors performed comprehensive baseline characterisation of these organoids, including morphological and histological assessment, quantification of organoid-forming efficiency and proliferation, and bulk RNA-seq profiling. While these analyses revealed no major differences between KAT and KTA organoids, and the observed enhancement of epithelial stemness upon Apc loss and proliferative advantage conferred by Trp53 loss are largely expected, the systematic nature of this characterisation establishes a useful methodological template for future organoid-based studies. 

      The authors further investigated the functional impact of mutational order using subcutaneous transplantation assays. By comparing tumour formation in immunodeficient versus immunocompetent hosts, the authors uncover a genuinely unexpected finding: KAT and KTA organoids behave equivalently in the absence of adaptive immunity, but diverge dramatically when immune pressure is applied (KAT: 5/10; KTA: 0/10). This observation is arguably the most compelling aspect of the study and opens an interesting line of inquiry. 

      We greatly appreciate your positive comments on our study.

      Weaknesses: 

      The authors acknowledge that initiating with Kras^G12D does not reflect the typical human sporadic CRC trajectory, where APC loss is usually the first event. While this design choice was pragmatic, it means the observed order effects are contextualised within an artificial starting point. It remains unclear whether the Apc/Trp53 order would matter in a Kras-wild-type background, or whether the Kras-driven cellular state is a prerequisite for these phenotypes to emerge. 

      We agree with the reviewer that initiating tumorigenesis with Kras<sup>G12D</sup> does not fully recapitulate the most common trajectory of sporadic human CRC, where APC loss typically occurs first. We noted this point in the original Discussion and will state it more explicitly in the Introduction of the revised manuscript.

      Our experimental design was intended to establish a controlled and genetically tractable system to interrogate the principle of mutation order effects. In this context, Kras<sup>G12D</sup> activation provides a stable oncogenic baseline that facilitates sequential genome engineering and comparison of isogenic lines.

      Although APC loss is frequently the initiating event, a recent study has suggested that Kras<sup>G12D</sup> priming can reshape the selective landscape for subsequent driver events, including Apc alterations (PMID: 41339549). Consistent with this notion, our data indicate that Kras<sup>G12D</sup> activation induces a permissive oncogenic cellular state that may influence the phenotypic consequences of later mutations. We therefore speculate that the Kras<sup>G12D</sup>-primed context may contribute to the observed order-dependent effects.

      We agree that testing Apc/Trp53 order in a Kras-wild-type background would be an important future direction, and we will point this out explicitly in the revised Discussion.

      Subcutaneous implantation provides a tractable readout of tumourigenicity, but the cutaneous immune microenvironment differs substantially from that of the intestinal mucosa. Given that the central claim concerns immune-mediated selection, orthotopic transplantation would more directly test whether the observed order effects hold in a physiologically relevant context. 

      In the present study, we employed subcutaneous transplantation, which is a widely used platform to assess tumorigenic potential under controlled immune conditions. This approach offers high reproducibility and straightforward tumor monitoring, and has been broadly applied in organoid-based cancer studies in both immunodeficient (PMID: 23273993, 23776211, 32209571, 33055221) and immunocompetent (PMID: 32209571, 33055221, 41672595) settings.

      Importantly, our primary goal was to determine whether mutation order influences susceptibility to immune-mediated clearance, rather than to model the full complexity of the intestinal niche. The clear divergence between KAT and KTA specifically in immunocompetent hosts supports the existence of intrinsic mutation order-dependent immune vulnerability.
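
As a rough numerical check on a divergence of this kind, a two-sided Fisher's exact test can be applied to the engraftment counts quoted in the review (KAT: 5/10; KTA: 0/10). The sketch below is an illustration only: the choice of test and the function name `fisher_exact_two_sided` are assumptions, and this is not an analysis reported by the authors.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1 with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum over all tables that are no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= observed * (1 + 1e-9))

# Rows: KAT, KTA; columns: tumour formed, no tumour (counts quoted in the review).
p = fisher_exact_two_sided(5, 5, 0, 10)
print(f"two-sided p = {p:.4f}")
```

With ten animals per group, the test has limited power, so a result of this kind is best read alongside the replication across independently derived clones rather than on its own.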

      Nevertheless, we fully agree with the reviewer that orthotopic transplantation would provide a more physiologically relevant immune microenvironment. We will explicitly discuss this limitation and highlight orthotopic validation as an important future direction in the revised Discussion.

      The ssGSEA comparison involves only 14 ATK tumours, and the key comparisons (Figure 6E) yield borderline significance (p=0.052). More fundamentally, since mutation order cannot be inferred from the clinical samples, the authors are correlating organoid-derived IFN signatures with tumour immunophenotypes without direct evidence that these patients' tumours followed a KAT-like trajectory. The reasoning becomes circular: KAT organoids define the signature used to identify KAT-like clinical tumours. 

      We thank the reviewer for raising this important point. We would like to clarify that our intention was not to infer the actual mutation order in clinical samples, which indeed cannot be reliably reconstructed from bulk tumor RNA-seq data.

      Instead, our goal was to determine whether the transcriptional programs distinguishing KAT and KTA organoids could be observed in human CRC cohorts. In this context, the organoid-derived IFN-related signature was used as a molecular reference to assess potential clinical relevance, rather than to classify tumors by evolutionary trajectory.

      We agree that the statistical significance in Figure 6E is modest (p = 0.052), and we will revise the text to present this analysis more cautiously, as a suggestive trend rather than definitive evidence. We will also state this limitation explicitly in the revised manuscript to avoid overinterpretation.

      Furthermore, the most striking finding of the study, that KTA organoids fail to form tumours in immunocompetent hosts while KAT organoids can, lacks a mechanistic follow-up. The transcriptomic differences between KAT and KTA are modest when cultured as monocultures, yet their in vivo fates diverge dramatically. The authors do not address why these subtle intrinsic differences translate into such divergent immune susceptibility, nor do they characterise the immune response adequately (beyond limited CD4/CD8 IHC at tumour peripheries). 

      We thank the reviewer for this important point. We agree that the mechanistic basis underlying the differential immune susceptibility between KAT and KTA remains incompletely resolved.

      A practical limitation of the current study is that KTA grafts failed to establish tumors in immunocompetent hosts, which precluded downstream histological and immune profiling of established lesions. As a result, in vivo immune characterization of KTA grafts is nearly impossible.

      Nevertheless, our transcriptomic analyses indicate that KAT and KTA organoids differ in interferon-response and immune-related programs prior to transplantation, and those differentially expressed genes were consistently preserved in tumor cells derived from immunodeficient hosts. These results suggest that intrinsic, tumor-cell-autonomous differences may influence immune recognition and clearance.

      We will expand the Discussion to outline several non-mutually exclusive mechanisms that could account for this phenotype, including altered interferon responsiveness, differential antigen presentation capacity, and changes in tumor cell-intrinsic immune visibility programs. These hypotheses are consistent with the transcriptional differences observed prior to transplantation and provide a framework for future mechanistic investigation. We agree that deeper immune profiling (e.g., immune infiltrate composition, antigen presentation status, and functional immune assays) will be important to fully elucidate the mechanism and represents a key direction for future work.

      Reviewer #2 (Public review): 

      Summary: 

      This study addresses an important and timely question in colorectal cancer biology by systematically examining the effects of the common driver mutations APC, KRAS G12D, and TP53 in murine colorectal organoids, with particular emphasis on how the order of APC and TP53 acquisition influences tumor phenotype. These mutations are well known to be frequent, truncal, and often co-occurring in colorectal cancer. While it is increasingly appreciated that mutational order can shape tumor behavior, studies directly comparing the phenotypic consequences of alternative APC-TP53 mutation orders remain rare. This work, therefore, addresses a relevant and timely question. 

      Strengths: 

      A major strength of the study is its focus on previously unexplored biology, combined with the generation of multiple isogenic murine organoid models with controlled mutational sequences. The authors employ careful and robust quality control of the CRISPR-mediated alterations, and the inclusion of both in vitro and in vivo experiments strengthens the relevance of the work.

      We greatly appreciate your positive comments on our study.

      Weaknesses: 

      There are, however, several limitations that should be considered when interpreting the findings. First, KRAS G12D activation is used as the initiating alteration, whereas APC loss is generally believed to be the initiating event in most human colorectal cancers.

      We sincerely thank the reviewer for their insightful comments regarding the initiation of tumorigenesis with a Kras mutation rather than the more canonical Apc loss, a point also raised by Reviewer #1. We fully agree that the Apc-first order represents the most prevalent sequence in human colorectal cancer (CRC). We will explain the rationale for our experimental design more clearly in the revised Introduction, as outlined in our response to Reviewer #1.

      Second, the analysis is restricted to comparing only two mutation orders (KAT versus KTA), which limits the breadth of conclusions that can be drawn about mutation ordering more generally.

      We thank the reviewer for pointing out this limitation. However, as a proof of concept, the study of Apc and Trp53 loss, two major oncogenic events in CRC, serves as a biologically meaningful starting point for dissecting order-dependent effects. Although comparing all six possible mutation orders of these three driver genes would be of great significance, generating and thoroughly characterizing all genotypes represents a substantial undertaking beyond the scope of this initial study.

      Finally, key RNA-sequencing and in vivo experiments rely on a single isogenic line, which substantially constrains interpretability. 

      The aim of the study was to systematically investigate how mutation accumulation and order influence colorectal cancer initiation. While the data suggest that the relative timing of APC and TP53 loss may be particularly important for tumor initiation, the absence of biological replication makes it difficult to draw robust conclusions. Engraftment efficiency and tumor behavior can be influenced by many factors for a single clone, including additional passenger mutations acquired during culturing, as well as epigenetic differences that are independent of the engineered mutations.

      We thank the reviewer for raising this concern. We apologize for not presenting our data sources clearly. For all major in vitro and in vivo assays of double and triple mutants (KA/KT/KAT/KTA), we analyzed at least two independently derived clones per genotype. These independent clones harbor distinct mutations in the target genes and were treated as biological replicates throughout the study.

      To improve clarity and transparency, we will revise the relevant figures and figure legends to explicitly indicate the clonal origin of each data point.

    1. eLife Assessment

      This study presents an important investigation of how people approach and avoid uncertainty, with a particular focus on the effects of overall uncertainty. They find that individuals approach uncertainty to a point, but when uncertainty is particularly high, they avoid it. The results are interpreted under a cognitive cost-resource rational framework. The methods are convincing, using appropriate and current methodologies.

    2. Reviewer #1 (Public review):

      This manuscript reports on the behavior of participants playing a game to measure exploration. Specifically, participants completed a task with blocks of exploratory choices (choosing between two 'tables', and within each table, two 'card decks', each of which had a specific probability of showing cards with one color versus another) and test choices, where participants were asked to choose which of the two decks per table had a higher likelihood of one color. Blocks differed on how long (how many trials) the exploration phase lasted. Participants' choices were fit to increasingly complex models of next-trial exploration. Participants' choices were best fit by an intermediate model where the difference in uncertainty between tables influenced the choice. Next, the authors investigated factors affecting whether participants sought out or avoided uncertainty, their choice reaction times, and the relationship of these measures with performance during the test phase of each block. Participants were uncertainty-seeking (exploratory) under most levels of overall uncertainty but became less uncertainty-seeking at high levels of total uncertainty. Participants with a stronger tendency to approach uncertainty at lower levels of total uncertainty were more accurate in the test phase, while the tendency to avoid uncertainty when total uncertainty was high was also weakly positively related to test accuracy. In terms of reaction times, participants whose reaction times were more related to the level of uncertainty, and who deliberated longer, performed better. The individual tendency to repeat choices was related to avoidance of uncertainty under high total uncertainty and better test performance. Lastly, choices made after a longer lag were less affected by these measures.

    3. Author response:

      The following is the authors’ response to the original reviews

      We would like to sincerely thank the editor and reviewers for their thoughtful and constructive feedback on our manuscript. We are grateful not only for the close reading and insightful suggestions, but also for the open and generous way in which the reviewers engaged with our work. In revising the manuscript, we have clarified how our contribution is situated within the existing literature, conducted additional analyses to examine individual differences in exploration strategies, expanded and refined our description of the DDM analyses, and added correlations between strategies and other behavioral measures. We have also clarified methodological points, such as the estimation of thresholds, and provided new supplementary figures and analyses where appropriate. In several places, we have modified and qualified our interpretations in line with the reviewers’ comments. We believe these changes have significantly strengthened the manuscript, and we are grateful for the scientific dialogue with the reviewers.

      Review 1 (Public review):

      This manuscript reports on the behavior of participants playing a game to measure exploration. Specifically, participants completed a task with blocks of exploratory choices (choosing between two 'tables', and within each table, two 'card decks', each of which had a specific probability of showing cards with one color versus another) and test choices, where participants were asked to choose which of the two decks per table had a higher likelihood of one color. Blocks differed on how long (how many trials) the exploration phase lasted. Participants' choices were fit to increasingly complex models of next-trial exploration. Participants' choices were best fit by an intermediate model where the difference in uncertainty between tables influenced the choice. Next, the authors investigated factors affecting whether participants sought out or avoided uncertainty, their choice reaction times, and the relationship of these measures with performance during the test phase of each block. Participants were uncertainty-seeking (exploratory) under most levels of overall uncertainty but became less uncertainty-seeking at high levels of total uncertainty. Participants with a stronger tendency to approach uncertainty at lower levels of total uncertainty were more accurate in the test phase, while the tendency to avoid uncertainty when total uncertainty was high was also weakly positively related to test accuracy. In terms of reaction times, participants whose reaction times were more related to the level of uncertainty, and who deliberated longer, performed better. The individual tendency to repeat choices was related to avoidance of uncertainty under high total uncertainty and better test performance. Lastly, choices made after a longer lag were less affected by these measures.

      The authors note that their paradigm, which does not provide immediate rewarding feedback, is novel. However, the resulting behavior appears similar to other exploratory learning tasks, so it's unclear what this task design adds - besides perhaps showing that exploratory behavior is similar across types of reward environments. Several papers have shown that cognitive constraints modulate exploration (PMIDs: 30667262, 24664860, 35917612, 35260717); although this paper provides novel insights, it does not situate its findings in the context of this prior literature. As a result, what it adds to the literature is difficult to discern.

      We are grateful for your thoughtful reading of our paper and for pointing us to these relevant references. We appreciate the need to clarify how our work is situated within the existing literature. In brief, the novelty of our paper lies in measuring exploration in contexts where it is not in direct competition with the need to exploit knowledge for reward. This approach enables us to include orders of magnitude more exploration trials. With this increased power, we were able, for the first time, to distinguish between competing algorithms for addressing uncertainty, and we identified a novel tendency to avoid uncertainty when overall uncertainty is high. We now state this more clearly in the discussion section and cite the suggested papers.

      “While the literature on exploration is expansive, the paradigm presented here extends it in important ways. Researchers of reinforcement learning have previously examined exploration in the context of reward-seeking decisions. Using such paradigms as the bandit task (Schulz and Gershman, 2019), it was demonstrated that humans don't always choose the option they believe will yield the most reward, but also make random and directed choices with the aim of exploring other uncertain options (Schulz and Gershman, 2019; Wilson et al., 2014). Recently, studies using the bandit task have lent empirical support to the notion that exploration is difficult, as participants explore less under time pressure or cognitive load (Brown et al., 2022; Otto et al., 2014; Cogliati Dezza et al., 2019; Wu et al., 2022). Crucially, this literature has focused on cases where reward can be gained on each trial (Brown et al., 2022; Cohen et al., 2007; Daw et al., 2006; Schulz and Gershman, 2019; Song et al., 2019; Tversky and Edwards, 1966; Wilson et al., 2014; Wu et al., 2022). In such tasks, the motivation to exploit current knowledge predominates over exploration, rendering it rare and difficult to measure (Findling et al., 2019). In contrast, our task was designed to remove the impetus to immediately exploit current knowledge, and as a result we were able to observe many exploratory choices. With this increased experimental power, we were able to compare different algorithms approximating the goal of approaching uncertainty, and describe how and when humans avoid uncertainty instead of approaching it.”

      Reviewer #1 (Recommendations For The Authors):

      Are all participants best fit by the delta uncertainty model? Since other parts of the paper focus on individual differences, it would be useful to examine if people differ in the computational complexity of their exploration strategies and if this difference relates to other behavior.

      We thank you for this helpful suggestion, which prompted us to conduct additional analyses. To address your question, we summarized point-wise predictive accuracy for each participant and compared it across the three models. The results are presented in the new Supplements 2 and 3 to Figure 6.

      These analyses show that, for the vast majority of participants, uncertainty was favored over exposure as a choice strategy, and for a sizable majority, it was also favored over EIG. As detailed in Figure 6 and its supplements, 125 participants were best described by uncertainty relative to EIG, 58 by EIG, and 11 showed inconclusive results. Similarly, 96 participants were better fit by uncertainty than exposure, while an additional 72 had negative exposure coefficients (consistent with uncertainty-based choice). Exposure was supported for 26 participants.

      We also examined how these strategies relate to other behavioral measures. Exposure was not strongly linked to test performance. EIG, by contrast, showed a positive association with test performance, perhaps because it is more closely correlated with uncertainty. Importantly, however, across posterior predictive checks in the main text and supplements, approaching uncertainty continues to provide the best overall description of participants’ strategies.

      The authors construct a hierarchy of exploratory strategies. Perseveration/switching is also an explore/exploit strategy that would lie above random exploration in the authors' hierarchy.

      We chose not to place perseveration within the hierarchy, as from a normative perspective it is not, strictly speaking, an exploration strategy. At its extreme, perseveration would lead a participant to repeatedly sample only one option, leaving the others entirely unexplored. Switching is represented in the hierarchy by the equating exposure strategy, as the two are very similar.

      For the analyses examining uncertainty seeking vs. aversion by total uncertainty, how was the cut point determined? Did this differ across people?

      Thank you for highlighting the need for greater clarity on this point. The threshold was indeed fitted to the data and varied significantly across participants (see Table 6 in Appendix 3). For each participant, the threshold marks the point at which behavior shifts from approaching to avoiding uncertainty. This threshold is a key factor underlying individual differences in the tendency to avoid uncertainty when overall uncertainty is high, as illustrated in the analyses of Figure 6 and related results. We now make this point clearer in the methods section:

      “To quantify how the influence of Δ-uncertainty on choice varied with overall uncertainty, we fit a multilevel piecewise logistic regression model. This model estimated a threshold in overall uncertainty, treated as a free parameter, and allowed the slope of Δ-uncertainty on choice to differ below and above this threshold. Below the threshold, a positive slope reflects a tendency to approach uncertainty; above the threshold, a negative interaction captures the tendency to avoid Δ-uncertainty with higher values of overall uncertainty.”
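      The piecewise structure described in this passage can be illustrated with a minimal single-participant sketch. It is not the fitted multilevel model: the function name, the parameter names, and the choice to let the slope change linearly with overall uncertainty above the threshold are all assumptions made for illustration.

```python
import math


def p_choose_uncertain(delta_unc, overall_unc, threshold,
                       slope_below, interaction_above, intercept=0.0):
    """Probability of choosing the table favoured by a positive delta-uncertainty.

    Hypothetical parametrisation (an assumption, not the authors' code):
    below `threshold` the slope on delta-uncertainty is `slope_below`;
    above it, the slope shifts by `interaction_above` for every unit of
    overall uncertainty in excess of the threshold.
    """
    excess = max(0.0, overall_unc - threshold)          # zero below the threshold
    slope = slope_below + interaction_above * excess    # piecewise-linear slope
    logit = intercept + slope * delta_unc
    return 1.0 / (1.0 + math.exp(-logit))
```

      With a positive `slope_below` and a negative `interaction_above`, the same positive Δ-uncertainty makes a choice more likely at low overall uncertainty and less likely once overall uncertainty is far enough above the threshold. In the actual analysis the threshold and slopes were estimated per participant within a multilevel model, which this sketch omits.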

      More details on the DDM analyses are needed - it's not clear how the outputs of the DDM correspond to what is stated in the text in the results.

      We agree that the section detailing the DDM analyses could be clarified. We analyzed two key parameters of the DDM: the drift rate, which we interpret as reflecting the efficacy of deliberation over uncertainty, and the bound separation, which corresponds to the tendency to deliberate rather than respond quickly. Our results show that good learners exhibit both higher drift rates and higher bounds. When participants repeat a previous choice, both the drift rate and bounds are lower. We changed the way we report the results:

      “We found that RTs indeed varied in relation to the absolute value of Δ-uncertainty as expected (b=0.69, 95% PI=[0.58,0.78]). Crucially, a stronger dependence of RT on the absolute value of Δ-uncertainty predicted better performance at test (drift-rate and test performance association b=0.81, 95% PI=[0.58,1.07]). We further found that participants who tended to deliberate longer for the sake of accuracy also tended to perform better at test (bound height and test performance association b=1.46, 95% PI=[0.58,2.34]; Figure 8c). In summary, participants who were better at deliberating about uncertainty during exploration, and who deliberated for longer, performed better at test. Thus, making good exploratory choices that lead to efficient learning involves prolonged deliberation.”

      We also provide a detailed explanation of this correspondence in the Methods section:

      “The DDM explains RTs as the culmination of three interpretable terms. The first is the efficacy of a participant’s thought process in furnishing relevant evidence for the decision - in our case the efficacy of choosing according to Δ-uncertainty (the drift rate in DDM parlance). The second term governs the participant’s speed-accuracy tradeoff by determining how much evidence they require to commit to a decision. This can also be thought of as how long a participant is willing to deliberate when a decision is difficult (bound height). Finally, the portion of the RT not linked to the deliberation process is captured by a third term (non-decision time).”
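      As a rough illustration of how the three terms combine to produce an RT, the following is a minimal simulation sketch of a generic drift diffusion process. It is not the hierarchical model fitted in the paper: the function names, the unit noise scale, the symmetric starting point, and the default parameter values are assumptions chosen for illustration.

```python
import math
import random


def simulate_ddm_trial(drift, bound_sep, non_decision, rng, dt=0.001):
    """Simulate one trial; return (rt_in_seconds, hit_upper_bound).

    Evidence starts midway between two bounds placed at +/- bound_sep / 2
    and accumulates with mean rate `drift` and unit-variance noise
    (assumed conventions, not taken from the paper).
    """
    half = bound_sep / 2.0
    evidence, elapsed = 0.0, 0.0
    step_sd = math.sqrt(dt)
    while abs(evidence) < half:
        evidence += drift * dt + step_sd * rng.gauss(0.0, 1.0)
        elapsed += dt
    # Non-decision time is added on top of the deliberation time.
    return non_decision + elapsed, evidence > 0


def mean_rt(drift, bound_sep, non_decision=0.3, n_trials=200, seed=0):
    """Average simulated RT over `n_trials` trials with a fixed seed."""
    rng = random.Random(seed)
    rts = [simulate_ddm_trial(drift, bound_sep, non_decision, rng)[0]
           for _ in range(n_trials)]
    return sum(rts) / len(rts)
```

      In this sketch, widening the bounds lengthens simulated RTs, a higher drift rate shortens them and makes upper-bound responses more frequent, and the non-decision time sets a floor under every RT. In the paper the drift rate is tied to Δ-uncertainty, which this generic version leaves out.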

      The authors note that "the three choice strategies prescribe different table choices on most trials" but (from what I can see) only provide a representative participant's plot in Figure 2. What was the overall correlation of predicted choices from the three models?

      Thank you for pointing out this oversight. The correlations are now shown in the supplement to Figure 2. In brief, correlations between exposure and the other two strategies are low, while the correlation between EIG and uncertainty is moderate. These dependencies motivated our decision to fit a separate logistic regression model for each strategy and to compare strategies using formal model comparison and posterior predictive checks, rather than including them all in a single regression model.

      It appears that the models are all constructed to predict table choices and not card deck choices. Can the authors clarify this? If so, what role do the card deck choices have?

      Indeed, the manuscript focuses on table choices, as these are the choices of primary interest from an exploration perspective. It is most straightforward to define the three exploration strategies with respect to table choices, whereas for deck choices it is not clear how to define EIG with respect to performance at test. The hierarchical structure of the task was originally chosen to increase complexity, with the goal of creating a rich task that engages cognitive resources. We have not formally tested this assumption, and do not expect that the patterns we observe should be absent in a flat version of the task.

      Reviewer 2 (Public review):

      Summary:

      This paper focuses on an interesting question that has puzzled psychologists for decades, that is, why do people demonstrate a mix of uncertainty approach and avoidance behavior, given the fact that reducing uncertainty could always gain information and seems beneficial? This paper designed a novel task to demonstrate behavioral signatures of uncertainty approaching and avoidance during the exploration phase within the same task at both a within-subject and between-subject level. On the algorithmic level, this paper compared four different implementations of uncertainty-guided exploration and found that the model sensitive to relative uncertainty provides the best fit for human behavior compared to its counterparts using expected information gain or past exposure. This paper then links people's uncertainty attitude with accuracy and finds that uncertainty avoidance during exploration does not impair task performance, implying that uncertainty avoidance may be the output of a resource-rational decision-making process. To examine this account, this paper uses reaction time as an independent proxy of costly deliberation and shows that people deliberate shorter when engaging in repetitive choice, which presumably saves cognitive resources. Finally, the paper shows that people's tendency to engage in repetitive choice correlates with their tendency to avoid uncertainty, which supports the argument that avoiding uncertainty could be a strategy developed under the constraint of limited cognitive resources.

      Strengths:

      One of the highlights of this paper, as mentioned in the previous paragraph, is that the authors can establish the existence of the uncertainty approach and avoidance behavior within the same task whereas previous work usually focuses on one of them. This dissociation allows the authors to examine what situational factor is related to the emergence of the act of avoiding uncertainty, and extract parameters describing participants' attitude towards uncertainty during baseline as well as during situations where uncertainty avoidance is more common. Besides documenting the existence of uncertainty avoidance behavior, this paper also tried to explain this behavior under the resource-rational framework and has carefully quantified different aspects (e.g., accuracy; choice speed) of participants' behavior as well as examined their relationships. Though more experiments are needed to fully understand human uncertainty avoidance behavior, this paper has provided both empirical and theoretical contributions toward a mechanistic understanding of how people balance approaching and avoiding uncertainty.

      Weaknesses:

      I have a couple of concerns related to this paper. First, there seems to exist an anticorrelation between total uncertainty and absolute relative uncertainty (Figure 5 panel C, Δ-uncertainty is restricted to a small range when total uncertainty is high). It seems to be a natural product of the exploration process since the high total uncertainty phase is usually the period where the participant knows little about either option, leading to a less distinguishable relative uncertainty. However, it remains unknown whether the documented uncertainty avoidance still applies when extrapolating to larger absolute relative uncertainty.

      We sincerely thank you for your close reading of our manuscript and for highlighting its strengths. In the paradigm we study, overall and relative uncertainty are not anticorrelated. While the two are related—as in any finite-information exploration task, where the value of overall uncertainty constrains the possible range of relative uncertainty—they are not correlated and can therefore be used as predictors in a single regression model. We agree that strategies could differ substantially in a (near) infinite-information setting, such as when people seek semantic knowledge. The advantage of a finite-information task is its tractability, which enables the computational analyses we conducted. That said, the inherently greater intractability of an infinite-information task would likely alter human strategies, as it poses challenges both to participants and to researchers.

      It would be great if the experiment allows for a manipulation of uncertainty in the middle of the experiment (e.g., introducing a new deck/informing that one deck has been updated)

      We agree, and look forward to probing this question in the future. We’ve added the point to our discussion section:

      “Our theoretical analysis and experiments leave several open questions. One concerns the relationship between overall uncertainty and time on task: in our paradigm, overall uncertainty was correlated with the number of cards observed. Although our findings remain robust when trial number is included as a covariate in the regression models, future work could more directly disentangle these factors by orthogonalizing overall uncertainty and elapsed time. This might be achieved, for instance, by manipulating overall uncertainty within a game—such as by introducing new tables or altering outcome probabilities mid-round.”

      Relatedly, the current 'threshold' of uncertainty avoidance behavior, if I understand correctly, is found by empirically fitting participants' data. This brings the question: can we predict when people will demonstrate uncertainty avoidance behavior before collecting any data? Or, is it possible that by measuring some metrics related to cognitive cost sensitivity, we could predict the proportion of choices that participants will show uncertainty-avoidant behavior?

      Thank you again for probing our thinking further. The threshold of uncertainty is indeed fitted on an individual basis using a hierarchical model. We believe there should be ways to predict it. In the current data, we find that it is correlated with the baseline tendency to approach uncertainty: in other words, participants who perform better show a slightly stronger tendency to avoid uncertainty when overall uncertainty is high. This underscores the complexity of identifying correlates of a coping strategy, as it is intricately linked to the difficulty being coped with. We speculate that working memory capacity may play an important role in this strategy, as well as the interplay between working memory–based learning and slower incremental learning mechanisms. Beyond speculation, however, we currently have no data to test these ideas.

      Finally, regarding the analysis of different behavior patterns in the game, it seems that the authors try to link repetitive behavior, uncertainty attitude, and accuracy together by testing the correlation between the two of them. I wonder whether other multivariate statistical methods e.g., mediation analysis, will be better suited for this purpose.

      This was a very insightful comment. We revisited the data and fitted test performance using a multiple regression model, predicting performance from the three exploration-phase strategies simultaneously: baseline tendency to approach uncertainty, tendency to avoid uncertainty when overall uncertainty is high, and tendency to repeat previous choices. When adjusting for the baseline tendency to approach, we find that the tendency to avoid uncertainty is indeed associated with a slight decrement in test performance. However, in our sample, the better learners, who are more effective at approaching uncertainty, also tend to avoid it when overall uncertainty is high. This nuance highlights the point discussed earlier. We find similar results when fitting the data with a mediation model, but we favour the multiple regression approach, since we have no strong convictions about which exploration strategy causes another. We have detailed this analysis in the main text and have accordingly modified and qualified our interpretation of this finding:

      “In contrast, the relationship between the tendency to avoid uncertainty and test performance was more nuanced. In both samples, participants who were more inclined to approach uncertainty also tended to avoid it when overall uncertainty was high (r=0.43, p=5.42 × 10<sup>-10</sup>). Accordingly, avoidance was positively correlated with test performance at the population level (b=1.18, 95% PI=[0.80, 1.58]; Figure 7b; see Methods for parameter estimation). However, once we adjusted for the tendency to approach, avoidance was reliably associated with worse test performance (b=-0.83, 95% PI=[-1.28,-0.40]).”

      Reviewer #2 (Recommendations For The Authors):

      Could the authors elaborate more on why the negative relationship between exposure and choice (Figure 4a) is a natural phenomenon under the relative uncertainty model?

      Indeed, we believe this is a natural phenomenon under the uncertainty model. When simulating an uncertainty-driven agent, the negative relationship arises naturally. We interpret this as the agent repeatedly pursuing tables that are more difficult to learn—those with smaller probability differences. The agent is drawn to these tables precisely because they are harder to master. By contrast, an EIG-driven agent would not repeatedly return to tables that are too difficult to learn. We have revised the Results section to make this point clearer:

      “The simulations demonstrate that the surprising negative correlation between choice and Δ-exposure is an epiphenomenon of uncertainty-driven exploration: agents repeatedly return to harder-to-learn tables, gaining more exposure to them precisely because they remain more uncertain about these tables.”

      It would be great if the authors could provide the correlation between different uncertainty estimates to help the readers have a better sense of how different these estimates are.

      We’ve added this information in the supplement to Figure 2. In brief, correlations between exposure and the other two strategies are low, while the correlation between EIG and uncertainty is moderate. These dependencies motivated our decision to fit a separate logistic regression model for each strategy and to compare strategies using formal model comparison and posterior predictive checks, rather than including them all in a single regression model.

    1. eLife Assessment

      This important study convincingly demonstrates how bacterial cells can modulate outer membrane-peptidoglycan tethering by expressing two different Lpp homologs with distinct cross-linking efficiencies, revealing that Salmonella typhimurium LppB forms disulfide-based homodimers (or heterotrimers with Lpp when present) and is covalently attached to peptidoglycan primarily via the L,D-transpeptidase LdtB at residue K58. The evidence supporting the authors' claims is solid, including the regulatory role of LppB dimerization for its abundance in E. coli and its ability to inhibit Lpp/A crosslinking to peptidoglycan, although additional analysis and quantification of muropeptides in wild-type E. coli overexpressing LppB would further strengthen the findings. Overall, the work will be of great interest to microbiologists studying cell envelope biogenesis.

    2. Reviewer #1 (Public review):

      Summary:

      Pierre Despas et al. studied the role of Salmonella typhimurium LppB in outer membrane tethering. Using E. coli Δlpp mutant the authors showed that Salmonella LppB is covalently attached to PG through K58 and that these crosslinks are formed by the L,D-transpeptidase LdtB, primarily. Additionally, authors demonstrate that LppB forms homodimers via a disulfide bond through C57, but when Lpp is present it can also form heterotrimers with it. Thus, suggesting a regulatory role in Lpp-PG crosslinking.

      Strengths:

      In my view, this is a nice piece of work that expands our understanding of the role of lpp homologs. The experiments were well-designed and executed, the manuscript is well-written and the figures are well-presented.

      Weaknesses:

      I have some suggestions to give a clearer message, because I think a few images don't reflect much of what the authors wrote.

      It'd be helpful for readers to see the phylogenetic tree of the rest of the organisms that harbor LppB homologs and Lpp.

      Increased expression of LppB under low pH is subtle. This result would benefit from quantifying the blots (Fig. S1) and performing statistical analysis.

      Similarly, the SDS-EDTA sensitivity result (Fig. S2) is not convincing; the image doesn't seem to show isolated colonies at low pH (Fig. S2B). Please measure CFU/mL and report endpoint growth graphs instead. Statistical analysis should also be presented.

      The reduction to PG crosslinking of the C57R mutant is unclear (Fig 4B lane 22). The authors state: "suggesting that additional features of the LppB C-terminal region underlie its reduced efficiency." Does this mean additional amino acids play a role? Did the authors try to substitute Cys with other amino acid residues like Ala or Ser and quantify protein levels to find a mutant with similar expression levels? Do these have less crosslinking too?

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript by Pierre Despas and co-workers, reports the biochemical characterization of LppB a peculiar Lpp (Braun's lipoprotein) homolog found in Salmonella enterica. S. enterica encodes two Lpp homologs LppA and LppB: while LppA and Lpp function similarly, the role of LppB is less clear. LppB shares with Lpp the C-terminal Lys needed for covalent attachment to peptidoglycan (PG) but diverges in residues that precede the terminal Lys featuring a Cys residue at the penultimate position. By using E. coli as a surrogate model, the authors show that LppB can be covalently linked to PG via the terminal Lys residues and that the penultimate Cys residue can be used to form homodimer species when expressed alone and heterotrimeric complexes when co-expressed with Lpp. Interestingly, LppB expressed in E. coli seems to be stabilized at acidic pH a condition Salmonella encounters in macrophage phagosomes. Finally, based on decreased intensity of LppB-PG crosslinked bands as LppB expression increases the authors suggest that LppB is able to negatively modulate the outer membrane-peptidoglycan connectivity.

      Strengths:

      The manuscript is interesting, describes a novel strategy employed by bacteria to fine-tune outer membrane-PG attachment and provides new insights into how envelope remodeling processes can contribute to bacterial fitness and pathogenicity.

      Weaknesses:

      The analysis and quantification of muropeptides formed in E. coli strains overexpressing LppB would strengthen the main conclusion of the manuscript.

    4. Reviewer #3 (Public review):

      Summary:

      The manuscript is interesting, and it is clearly written. While the experiments are well executed, a general flaw is that the LppA/B analyses are done in the E. coli K12 host as surrogate for Salmonella enterica. For the mechanistic and molecular analyses of LppB a surrogate host is certainly adequate, yet it limits extrapolation of the physiological implications of LppB in the natural context.

      Strengths:

      The work convincingly demonstrates that LppB forms disulfide-based dimers and that it is crosslinked to PG via LdtB in E. coli. Moreover, dimerisation is required for LppB abundance in E. coli and LppB can inhibit crosslinking of Lpp/A to PG in E. coli.

      Weaknesses:

      Regarding the key conclusion of the work: while it is shown that LppB is oxidized in E. coli, whether envelope integrity (or OMV production) changes arise from switches in oxidation of the LppB cysteines remains to be shown, for E. coli let alone in the native host Salmonella. Does expression of LppB influence Lpp/A activity or OM tethering in E. coli? Since the inhibition of the Lpp/A linking to PG is not affected by the oxidation state of LppB, the abstract/title implies redox-control of envelope integrity which is a bit misleading and an overstatement. Both are features of LppB: i.e. it dimerizes through disulfide bond formation and it reduces PG binding of Lpp/A through trimerisation. However, no link between the two is shown.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Pierre Despas et al. studied the role of Salmonella typhimurium LppB in outer membrane tethering. Using E. coli ∆lpp mutant the authors showed that Salmonella LppB is covalently attached to PG through K58 and that these crosslinks are formed by the L,D-transpeptidase LdtB, primarily. Additionally, authors demonstrate that LppB forms homodimers via a disulfide bond through C57, but when Lpp is present it can also form heterotrimers with it. Thus, suggesting a regulatory role in Lpp-PG crosslinking.

      Strengths:

      In my view, this is a nice piece of work that expands our understanding of the role of lpp homologs. The experiments were well-designed and executed, the manuscript is well-written and the figures are well-presented.

      Weaknesses:

      I have some suggestions to give a clearer message, because I think a few images don't reflect much of what the authors wrote.

      We thank Reviewer #1 for this important comment. We agree that several figures could more directly illustrate the points made in the text. In a revised version, we intend to revise the relevant figure panels and legends to better align the visual message with the conclusions, and we will adjust the corresponding text to explicitly state what each figure demonstrates and how the data support our interpretation. We anticipate that these changes will improve clarity and strengthen the alignment between figures and text.

      It'd be helpful for readers to see the phylogenetic tree of the rest of the organisms that harbor LppB homologs and Lpp.

      We thank Reviewer #1 for this suggestion. We examined the distribution of Lpp-family proteins across closely related Enterobacteriaceae. While species such as Escherichia fergusonii, Shigella flexneri and Shigella dysenteriae encode Lpp as well as a paralogous small lipoprotein (YqhH, see Fig. S7), we find that LppB-like orthologs (equivalent to lppB from Salmonella) appear, to our knowledge, to be restricted to Salmonella species. Because LppB shows this lineage-specific distribution, inclusion of a broader phylogenetic tree would primarily highlight its restricted presence rather than provide additional evolutionary insight. We will clarify this point in the revised manuscript.

      Increased expression of LppB under low pH is subtle. This result would benefit from quantifying the blots (Fig. S1) and performing statistical analysis.

      We thank Reviewer #1 for this observation. We agree that the increase in LppB levels at acidic pH appears modest. We will carefully reassess this result across independent experiments and, where technically appropriate, provide quantitative information to better document the magnitude of the effect. Additionally, we will revise the text to more accurately describe the observed difference.

      Similarly, the SDS-EDTA sensitivity result (Fig. S2) is not convincing; the image doesn't seem to show isolated colonies at low pH (Fig. S2B). Please measure CFU/mL and report endpoint growth graphs instead. Statistical analysis should also be presented.

      We thank Reviewer #1 for this suggestion. We agree that the SDS-EDTA sensitivity assay presented in Fig. S2 could benefit from a more quantitative assessment. We will perform CFU/mL measurements from independent biological replicates to better quantify the observed differences and include statistical analysis when appropriate. In addition, we will revise the corresponding text to more accurately reflect the magnitude of the phenotype.

      The reduction to PG crosslinking of the C57R mutant is unclear (Fig 4B lane 22). The authors state: "suggesting that additional features of the LppB C-terminal region underlie its reduced efficiency." Does this mean additional amino acids play a role? Did the authors try to substitute Cys with other amino acid residues like Ala or Ser and quantify protein levels to find a mutant with similar expression levels? Do these have less crosslinking too?

      We thank Reviewer #1 for this important comment. As correctly noted, the reduced abundance of the LppB<sub>C57R</sub> variant likely contributes to its reduced level of peptidoglycan-crosslinked species. Therefore, we cannot formally distinguish whether the reduced peptidoglycan crosslinking reflects decreased intrinsic crosslinking efficiency or simply reduced protein abundance and stability. We will revise the text to clarify this point and explicitly acknowledge this limitation. The C57R substitution was chosen because arginine is present at the equivalent position in the Salmonella LppA homolog, allowing us to assess the functional consequences of a naturally occurring sequence variation between Lpp-family members. While substitutions such as C57A or C57S could further dissect the specific contribution of the cysteine residue, our use of the C57R substitution provides direct insight into the functional implications of this naturally occurring difference between Lpp homologs.

      Reviewer #2 (Public review):

      Summary:

      The manuscript by Pierre Despas and co-workers, reports the biochemical characterization of LppB a peculiar Lpp (Braun's lipoprotein) homolog found in Salmonella enterica. S. enterica encodes two Lpp homologs LppA and LppB: while LppA and Lpp function similarly, the role of LppB is less clear. LppB shares with Lpp the C-terminal Lys needed for covalent attachment to peptidoglycan (PG) but diverges in residues that precede the terminal Lys featuring a Cys residue at the penultimate position. By using E. coli as a surrogate model, the authors show that LppB can be covalently linked to PG via the terminal Lys residues and that the penultimate Cys residue can be used to form homodimer species when expressed alone and heterotrimeric complexes when co-expressed with Lpp. Interestingly, LppB expressed in E. coli seems to be stabilized at acidic pH a condition Salmonella encounters in macrophage phagosomes. Finally, based on decreased intensity of LppB-PG crosslinked bands as LppB expression increases the authors suggest that LppB is able to negatively modulate the outer membrane-peptidoglycan connectivity.

      Strengths:

      The manuscript is interesting, describes a novel strategy employed by bacteria to fine-tune outer membrane-PG attachment, and provides new insights into how envelope remodeling processes can contribute to bacterial fitness and pathogenicity.

      Weaknesses:

      The analysis and quantification of muropeptides formed in E. coli strains overexpressing LppB would strengthen the main conclusion of the manuscript.

      We thank Reviewer #2 for this insightful comment. We agree that quantitative analysis of muropeptides in E. coli strains expressing LppB would strengthen the main conclusion. This point was also raised in the editorial assessment and by Reviewer #3, underscoring its importance. In a revised version, we plan to perform muropeptide profiling by HPLC, coupled where appropriate to mass spectrometry, to quantitatively assess peptidoglycan composition in the relevant strains.

      Reviewer #3 (Public review):

      Summary:

      The manuscript is interesting, and it is clearly written. While the experiments are well executed, a general flaw is that the LppA/B analyses are done in the E. coli K12 host as a surrogate for Salmonella enterica. For the mechanistic and molecular analyses of LppB, a surrogate host is certainly adequate, yet it limits extrapolation of the physiological implications of LppB in the natural context.

      Strengths:

      The work convincingly demonstrates that LppB forms disulfide-based dimers and that it is crosslinked to PG via LdtB in E. coli. Moreover, dimerization is required for LppB abundance in E. coli and LppB can inhibit crosslinking of Lpp/A to PG in E. coli. 

      Weaknesses:

      Regarding the key conclusion of the work: while it is shown that LppB is oxidized in E. coli, whether envelope integrity (or OMV production) changes arise from switches in oxidation of the LppB cysteines remains to be shown, for E. coli let alone in the native host Salmonella. Does expression of LppB influence Lpp/A activity or OM tethering in E. coli? Since the inhibition of the Lpp/A linking to PG is not affected by the oxidation state of LppB, the abstract/title implies redox-control of envelope integrity which is a bit misleading and an overstatement. Both are features of LppB: i.e. it dimerizes through disulfide bond formation and it reduces PG binding of Lpp/A through trimerization. However, no link between the two is shown.

      We thank Reviewer #3 for this important comment and for highlighting the need to clarify the relationship between LppB oxidation, oligomerization, and its effect on peptidoglycan crosslinking. We agree that while our data demonstrate that LppB forms disulfide-linked oligomers and that LppB expression reduces Lpp/A attachment to peptidoglycan, our current results do not establish a direct causal link between the oxidation state of LppB and its ability to modulate outer membrane–peptidoglycan tethering. Therefore, we will revise the manuscript to avoid implying redox-dependent control of envelope integrity and to more clearly present these as distinct but potentially related properties of LppB.

    1. eLife Assessment

      Foucault and colleagues examine how human adaptive learning depends on the structure of the learning task. The authors provide useful findings clarifying the differences in how people learn in environments that are continuously versus discontinuously changing. While they provide solid evidence for most conclusions, support for some of the claims is incomplete in the current form.

    2. Reviewer #1 (Public review):

      Foucault and colleagues examine how people's belief updating in a predictive inference task depends on qualitative differences in generative structure, in particular focusing on two generative structures frequently employed in learning and belief updating tasks (changepoints and random walks). While behavior and normative predictions for these structures have been explored many times in different tasks and settings, these exact structures have, to the best of my knowledge, never been explored in the same study and modeling framework for direct comparison. The authors use ideal observer models coupled with a response bias module to make predictions for what structure-appropriate adaptive learning would look like across the two conditions, then they ran an experiment to test behavioral predictions for the two structures under different levels of stochasticity. The authors present evidence that stochasticity changes learning for two qualitatively different reasons, and that, depending on which of these factors dominates, it can have different effects on learning. They show that human participants showed qualitative trends consistent with adjusting their structural assumptions of the task to guide learning and adjusting their assessments of stochasticity.

      The experiment was well designed and executed, and the paper was well written. The findings from the study are largely consistent with other work in the field, but there are a few advances that go beyond previously established findings, most notably a nuanced examination of how stochasticity affects learning behavior, which has the potential to provide an explanation for a notable discrepancy in the field (Pulco and Browning 2025; Piray and Daw 2024). The paper has notable strengths in its use of computational models to generate qualitative predictions that are evaluated in empirical behavioral data.

      The current paper has a few weaknesses. It makes strong claims regarding the impacts of stochasticity on optimal learning that were difficult to evaluate, given a lack of clarity on the exact modeling that was implemented, and that were incompletely supported by the existing analysis. The paper also lacks statistical support for some of its claims and evaluates models only through their ability to reproduce summary measures, rather than through direct model fitting.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript by Foucault, Weber, and Hunt examines human learning behavior across change-point and continuously changing environments. The authors suggest that humans normatively adjust their learning dynamics to the current environmental dynamics. Moreover, they argue that humans not only track the means of the outcome-generating process, but also the variance, which extends recent work in this domain. The present results suggest that human learners are well able to distinguish the two moments and adjust their behavior accordingly.

      Strengths:

      (1) The paper is clearly written, and the figures demonstrate the results well. The authors clearly explain the two key results and their implications for the field.

      (2) The paper uses a common modeling framework for the two environments. This makes it less likely that differences in learning behavior between the two environments are driven by general model properties rather than the specific learning mechanisms.

      Weaknesses:

      (1) Interpretation in terms of normative learning

      (1.1) Perseveration and paddle movement

      The model presented in the main manuscript is equipped with a response-probability mechanism that controls whether the paddle is updated. Especially on smaller prediction errors, the paddle is often not updated (perseveration). I wonder whether this mechanism truly reflects normative updating behavior or rather a heuristic strategy. Not moving the paddle is non-normative. A fully Bayesian model would hardly ever show a learning rate of exactly zero (one could argue only when the error is itself zero or after a massive number of trials). This is partly apparent in Supplementary Figure 1, where the lowest learning rates are around alpha = 0.2 (change-point environment) and 0.5 (random walk).

      Supplementary Figure 1 shows the learning rate for the normative model without the response-probability mechanism. Primarily in the random-walk environment, but to some extent also in the change-point condition, the shape of the learning rate changes quite dramatically compared to Figure 4. In the random-walk environment, the learning rate appears relatively stable, with a value slightly larger than 0.5. In the change-point case, the learning rate is somewhat higher in the range of smaller prediction errors. Doesn't this speak against the interpretation that the model in the main manuscript is really behaving in a purely normative fashion? The tendency to perseverate might reflect a simplified strategy, which is sometimes described as "satisficing". That is, in line with the authors' description of the mechanism, perseveration occurs when it seems "good enough" (Simon, 1956), which has been demonstrated in a belief updating context before (Bruckner et al., 2025; Gershman, 2020; Nassar et al., 2021).

      Supplementary Figure 3 suggests that humans show quite a lot of this type of behavior. It indicates that in the change-point condition, in only 20% of the trials in the minimal prediction error range, participants update their prediction (i.e., in 80% of these trials, they perseverate on the previous prediction). This update probability increases as a function of the prediction error. In the random-walk condition, update probabilities are higher, starting at around 40% and also increasing as a function of the error.

      Indeed, Supplementary Figure 4 suggests that the shape of the learning rate for true update trials is much shallower for humans and the "perseverative" model compared to the model in Supplementary Figure 1. This suggests that the curve in Figure 4 (main manuscript), hinting at a continuous increase in the learning rate, could be the result of a mixture of perseveration (alpha = 0) and higher learning rates compared to the normative model without the response-probability mechanism.

      (1.2) Control models

      One might reply that the response-probability mechanism just adds noise, while the actual learning mechanism is still normative. However, a standard Rescorla-Wagner model with the same response-probability mechanism might also show increasing apparent learning rates as a function of prediction error (when perseveration trials and regular update trials are averaged as a function of the prediction error).

      Therefore, I suggest adding a control analysis with a Rescorla-Wagner model. One version with the same response mechanism yielding perseveration, and one standard Rescorla-Wagner model without this mechanism. This should help identify how well the present analyses can distinguish true learning-rate dynamics from averaging artifacts due to perseveration.
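
      The averaging concern raised in (1.2) can be illustrated with a minimal simulation sketch. It is not the authors' model: the hazard rate, the noise level, and the exponential form of the update probability are hypothetical choices made only for illustration. A delta-rule learner with a fixed learning rate is run with and without a perseveration rule, and the apparent learning rate (update divided by prediction error) is averaged within bins of absolute prediction error.

```python
import math
import random


def simulate_delta_rule(n_trials=20000, alpha=0.3, perseverate=True, seed=0):
    """Fixed-learning-rate delta rule in a change-point environment.

    Returns a list of (absolute prediction error, apparent learning rate).
    All environment and response parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    belief, mean = 0.0, 0.0
    samples = []
    for _ in range(n_trials):
        if rng.random() < 0.1:              # hypothetical change-point hazard rate
            mean = rng.uniform(-10.0, 10.0)
        outcome = rng.gauss(mean, 1.0)      # hypothetical outcome noise (SD = 1)
        pe = outcome - belief               # prediction error
        if pe == 0.0:
            continue                        # apparent learning rate is undefined
        # Hypothetical response rule: the chance of moving grows with |pe|.
        p_update = 1.0 - math.exp(-abs(pe) / 2.0) if perseverate else 1.0
        update = alpha * pe if rng.random() < p_update else 0.0
        samples.append((abs(pe), update / pe))
        belief += update
    return samples


def mean_apparent_lr(samples, lo, hi):
    """Average apparent learning rate for trials with lo <= |pe| < hi."""
    vals = [lr for pe, lr in samples if lo <= pe < hi]
    return sum(vals) / len(vals)


if __name__ == "__main__":
    for flag in (False, True):
        s = simulate_delta_rule(perseverate=flag)
        small = mean_apparent_lr(s, 0.0, 1.0)
        large = mean_apparent_lr(s, 4.0, float("inf"))
        print(f"perseverate={flag}: small-PE bin {small:.3f}, large-PE bin {large:.3f}")
```

      Under these assumptions, the learner without perseveration yields the same apparent learning rate in every bin, whereas adding perseveration makes the binned average rise with prediction error even though the underlying learning rate never changes.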

      (1.3) Discussion of the possibility of non-normative learning mechanisms

      Given the considerations above, I suggest a more balanced discussion of potential non-normative influences on learning, in particular, perseveration. Several previous papers have similarly shown that perseveration prominently characterizes human learning and decision-making (Bruckner et al., 2025; Gershman, 2020; Nassar et al., 2021), and in my opinion, it would be relevant to discuss how normative and non-normative mechanisms might jointly shape learning.

      (2) Model description

      The Bayesian model is quite central to the paper. However, the mathematical details are sparse, and I did not fully understand the differences between the model variants and how they were implemented. In particular, what approximations were used to make the model tractable? And how does the variance inference work? Is the learning rate directly computed, similar to the Nassar model, or is it derived from updates and prediction errors?

      (3) Apparent learning rates in humans

      The main learning-rate analyses compute the fraction of updates and prediction errors. For quality assurance, it would be useful to see a few supplementary histograms of the apparent learning rates. It would be great to have one plot across all participants and a few example plots for single participants. These analyses will reveal the distribution of learning rates and the proportion at the boundaries, which can sometimes be a source of bias.

      References:

      Bruckner, R., Nassar, M. R., Li, S.-C., & Eppinger, B. (2025). Differences in learning across the lifespan emerge via resource-rational computations. Psychological Review, 132(3), 556-580. https://doi.org/10.1037/rev0000526.

      Gershman, S. J. (2020). Origin of perseveration in the trade-off between reward and complexity. Cognition, 204, 104394. https://doi.org/10.1016/j.cognition.2020.104394.

      Nassar, M. R., Waltz, J. A., Albrecht, M. A., Gold, J. M., & Frank, M. J. (2021). All or nothing belief updating in patients with schizophrenia reduces precision and flexibility of beliefs. Brain, 144(3), 1013-1029. https://doi.org/10.1093/brain/awaa453.

      Simon, H. A. (1956). Rational choice and the structure of the environment. Psychological Review, 63(2), 129-138. https://doi.org/10.1037/h0042769.

    4. Reviewer #3 (Public review):

      Summary:

      This paper uses a single Bayesian modelling framework to derive specific predictions for making inference, either with assumptions of a change-point structure or a gradually changing structure across tasks.

      Strengths:

      The paper nicely summarizes the slightly different subliteratures that have studied human behavior with models that only assume a single underlying task structure. The diagnostic predictions from the models are presented clearly, and the human data are nicely consistent with the model predictions.

      As the authors discuss themselves, this work opens the door to many questions on the structured learning of inferring (from experience or verbal instructions) which meta-model is most appropriate to use.

      Weaknesses:

      Alignment between models and human behavior is mostly qualitative; the models are not fit to individual data (which could, for instance, uncover interesting differences between individuals).

      There is no consideration of the possibility that individuals may not fully use one or the other meta-model (of gradual change vs changepoints), but instead a hybrid. Fits of the models to data may help uncover if some people (e.g., the 10% in experiment 2 that were best matched by the CP model?) use a slightly different mix of strategies than the one suggested by the verbal instructions received (which may cause the pattern in Figure 6d, which looks to have featured both models).

    5. Author response:

      We thank the reviewers for their constructive feedback and careful evaluation of our manuscript. We are encouraged that the study was viewed as well designed and clearly presented, that its computational modeling approach was recognized as a strength, and that the key findings were appreciated. We agree that some claims would benefit from additional support and clarification. Below, we outline the main revisions we will undertake to strengthen the manuscript and address the points raised in the reviews. These revisions are intended to strengthen the evidential support for our conclusions and clarify aspects of the results and modeling.

      (1) Statistical support.

      Some claims were judged to lack sufficient statistical support [Reviewer 1]. In the revised manuscript, we will carefully review all inferential claims and ensure that they are supported by appropriate statistical analyses. Where necessary, we will implement additional statistical tests and expand statistical reporting to ensure that differences between conditions, models, or behavioral measures are formally evaluated and that key aspects of the data are appropriately described.

      (2) Modeling clarification.

      Some aspects of the modeling were considered insufficiently clear, particularly regarding how the models were implemented [Reviewers 1 and 2]. We will expand the Methods section to provide a clearer and more complete description of the Bayesian models and their implementation. In particular, we will clarify that full probability distributions were computed (without reduced approximations such as those used in simplified Bayesian variants), and that the only approximation concerns numerical discretization of continuous state spaces at fine resolution. We will clarify that variance is part of the joint multidimensional state space and is inferred jointly with the mean. We will also explicitly state that apparent learning rates are derived from predicted paddle responses in the same way as for participants, and are not directly computed within the Bayesian inference process.
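
      As a rough illustration of what joint inference over mean and variance on a discretized state space can look like, the following sketch computes a grid posterior over the mean and standard deviation of a Gaussian outcome distribution. It is a deliberately simplified, stationary example under a flat prior: it omits the change-point and random-walk dynamics of the models described above, and the grid ranges, resolution, and function names are assumptions made for illustration only.

```python
import math
import random


def grid_posterior(observations, mu_grid, sigma_grid):
    """Joint posterior over (mu, sigma) on a grid, assuming a flat prior."""
    log_post = [[0.0 for _ in sigma_grid] for _ in mu_grid]
    for x in observations:
        for i, mu in enumerate(mu_grid):
            for j, sigma in enumerate(sigma_grid):
                # Gaussian log-likelihood up to an additive constant.
                log_post[i][j] += -math.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(max(row) for row in log_post)
    post = [[math.exp(v - peak) for v in row] for row in log_post]
    total = sum(sum(row) for row in post)
    return [[p / total for p in row] for row in post]


def posterior_means(post, mu_grid, sigma_grid):
    """Posterior expectations of mu and sigma from the joint grid."""
    mu_hat = sum(mu * sum(row) for mu, row in zip(mu_grid, post))
    sigma_hat = sum(
        sigma * post[i][j]
        for i in range(len(mu_grid))
        for j, sigma in enumerate(sigma_grid)
    )
    return mu_hat, sigma_hat


if __name__ == "__main__":
    rng = random.Random(1)
    data = [rng.gauss(2.0, 0.5) for _ in range(200)]    # synthetic outcomes
    mu_grid = [-5.0 + 0.1 * i for i in range(101)]      # hypothetical grid
    sigma_grid = [0.1 * (j + 1) for j in range(30)]     # hypothetical grid
    post = grid_posterior(data, mu_grid, sigma_grid)
    print(posterior_means(post, mu_grid, sigma_grid))
```

      In a sketch of this kind the only approximation is the grid resolution; the mean and the standard deviation are inferred together because the posterior is defined over their joint grid rather than over each quantity separately.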

      (3) Model fitting.

      The absence of direct model fitting to individual participants was identified as a limitation [Reviewers 1 and 3]. In response, we will implement individual-level model fitting (to the extent feasible in practice) and conduct formal model comparison based on the fitted models. We will further validate the fitted models by examining whether they reproduce the main behavioral signatures observed in the data.

      (4) Normative interpretation and control analyses.

      The interpretation of the models as normative was questioned in light of the response-probability mechanism [Reviewer 2]. In the revision, we will clarify the distinction between the normative inference component of the model and the response-level mechanism. We will revise the framing of the results accordingly and ensure that normative claims are restricted to the inference component. We will also expand the discussion to integrate relevant literature on perseveration and satisficing, and clarify how normative and non-normative mechanisms may jointly shape behavior. In addition, following the reviewer’s suggestion, we will include control analyses using standard Rescorla–Wagner models, with and without the response-probability mechanism, to evaluate whether the observed signatures can be accounted for by simpler learning rules.

      (5) Additional points.

      We will also address the additional points raised in the reviews. Specifically, we will include supplementary histograms of apparent learning rates [Reviewer 2]. We will provide additional clarification and analyses regarding the effects of stochasticity on learning [Reviewer 1]. Finally, we will explore hybrid or mixture models and strategies and expand the discussion of this possibility [Reviewer 3].

      We believe that these revisions will substantially strengthen the support for our claims and address the concerns raised in the current assessment. We are grateful for the reviewers’ engagement with our work and for their comments, which will allow us to significantly improve the clarity and strength of the manuscript.

    1. eLife Assessment

      This study offers a valuable analysis of how moment-to-moment fluctuations in arousal are associated with structured, non-uniform patterns of brain-wide functional connectivity during wakefulness. Using data-driven analyses of resting-state and naturalistic fMRI with eye tracking, the authors present convincing evidence that arousal is a dynamic, continuous process that shapes brain activity in a structured way beyond a simple global effect. However, the strength of the conclusions is limited by a reliance on specific analytical choices and the need for additional controls and robustness analyses. This paper sheds light on the link between brain activity and ongoing fluctuations in arousal and will be of interest to researchers studying large-scale brain functional organization and links between the brain and body.

    2. Reviewer #1 (Public review):

      Summary:

      In this study, the authors aim to characterize how moment-to-moment fluctuations in arousal during wakefulness shape large-scale functional brain connectivity. Using pupil diameter as an index of arousal and high-field functional imaging, they seek to determine whether arousal-related modulation of connectivity is uniform across the brain or organized into structured patterns, and whether such patterns show hemispheric asymmetry. The work further aims to assess whether these organizational features generalize across resting-state and naturalistic viewing conditions.

      Strengths:

      The study addresses an important and timely question regarding how spontaneous variations in arousal influence whole-brain communication during wakefulness. The dataset is rich, combining high-field imaging with concurrent physiological measurements, and the analyses are ambitious in scope. A key strength is the attempt to move beyond region-based effects and to describe arousal-related modulation at the level of large-scale connectivity organization. The comparison across rest and movie viewing provides useful context and suggests a degree of consistency across behavioral states.

      Weaknesses:

      First, a central claim is that arousal modulates functional connectivity in a hemispherically asymmetric and community-specific manner. Although structured asymmetries are demonstrated at the group level, it remains unclear whether these effects reflect a stable neurobiological principle or arise from high-dimensional, connection-wise analyses that are sensitive to sampling variability. Given the interpretive weight placed on hemispheric lateralization, stronger evidence of robustness and individual-level consistency would be necessary to support this conclusion.

      Second, all analyses are based on ultra-high-field imaging. The manuscript does not address whether the reported arousal-related patterns, including the community structure and hemispheric asymmetries, are expected to be reproducible at standard field strengths. It therefore remains unclear whether the findings depend critically on the use of high-field data or whether they would generalize to more widely available datasets, limiting the broader applicability of the results.

      Third, arousal-connectivity coupling is assessed using zero-lag correlations between pupil diameter and time-resolved connectivity estimates. Physiological and hemodynamic considerations suggest that pupil-linked arousal and blood-based imaging signals may exhibit systematic temporal delays. The absence of analyses examining sensitivity to such delays raises the possibility that the reported coupling patterns depend on a specific temporal alignment assumption.

      Fourth, the estimation of time-resolved connectivity relies on a single choice of sliding-window length. The manuscript does not examine whether the reported patterns are stable across different window sizes. Given ongoing concerns about parameter dependence in time-resolved connectivity analyses, sensitivity analyses would be important to establish that the findings are not artifacts of a particular analytical choice.

      Finally, the identification of seven connectivity communities is a central result, yet the justification for this choice relies primarily on a single clustering quality measure. In practice, evaluation of clustering solutions typically draws on multiple complementary criteria, including measures of compactness and separation, approaches for selecting the number of clusters, and assessments of stability under resampling. Without such complementary evaluations, it is difficult to determine whether the reported community structure reflects a stable organizational feature or sensitivity to specific methodological decisions.

    3. Reviewer #2 (Public review):

      Summary:

      This manuscript addresses a clear and widely relevant question: how ongoing fluctuations in alertness during wakefulness relate to large-scale patterns of coordinated brain activity. The authors combine high-field magnetic resonance imaging with simultaneous pupil measurements, and they compute an edgewise measure of arousal-related coupling for every pair of regions. Their main contribution is to show that arousal-related coupling is low-dimensional and organized into seven reproducible "connectivity communities", each with characteristic network pair compositions. A secondary contribution is the observation that these communities exhibit systematic but community-specific hemispheric asymmetries, including a striking left/right dissociation within the ventral attention network, where the left side participates broadly across communities while the right side forms a more cohesive, segregated arousal-responsive module. A final contribution is cross-context generalization: the same organizational structure and lateralization signatures are largely preserved during naturalistic movie watching.

      Strengths:

      (1) The paper moves beyond state contrasts and quantifies arousal-related modulation continuously within wakefulness, directly addressing a gap highlighted in the Introduction.

      (2) The hemispheric asymmetry result is not framed as a crude global dominance effect; the authors explicitly test and argue that the key signal lies in structured spatial heterogeneity rather than mean shifts.

      (3) The cross paradigm replication in movie watching is a strong design choice and supports the claim that the organizational motifs are not limited to unconstrained rest.

      Weaknesses:

      (1) Arousal effects on BOLD signals and on pupil size can have different delays, so it would be valuable to test lagged relationships (for example, shifting the pupil series forward and backward) to show that the main community structure and lateralization results are not sensitive to an arbitrary temporal alignment.

      (2) Pupil diameter covaries with blinks, eye closure, and other factors that can covary with head motion and physiological noise. The Methods include substantial quality control and denoising, including motion regression and scrubbing, plus exclusions for eye closure.

      (3) The dataset is described in terms of runs retained (for example, 485 resting runs), and runs are treated as observations in clustering after z-scoring across runs. If multiple runs come from the same individuals, the manuscript would benefit from explicitly showing that results replicate at the participant level (for example, community structure stability within participant across runs, and participant-level summary statistics used for inference), rather than relying primarily on pooled run-level patterns.

      (4) Time-resolved connectivity is estimated using a 30-second sliding window and 5-second step. It is reasonable to wonder whether the same conclusions hold with alternative estimators that do not rely on fixed windows. The Discussion acknowledges this limitation, but adding a small robustness analysis would make the paper more definitive.
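
      The lag and window-size checks suggested in points (1) and (4) could be sketched roughly as follows. This is an illustrative outline, not the authors' pipeline: window and step are given in samples (the conversion from seconds depends on the repetition time, which is not assumed here), the pupil series is assumed to have already been averaged into the same windows as the connectivity series, and all function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sliding_window_fc(a, b, window, step):
    """Windowed correlation between two regional time series (in samples)."""
    return [
        pearson(a[s:s + window], b[s:s + window])
        for s in range(0, len(a) - window + 1, step)
    ]


def lagged_corr(pupil, fc, lag):
    """Correlate pupil[t] with fc[t + lag]; positive lag: fc follows pupil."""
    if lag >= 0:
        p, f = pupil[:len(pupil) - lag], fc[lag:]
    else:
        p, f = pupil[-lag:], fc[:len(fc) + lag]
    n = min(len(p), len(f))
    return pearson(p[:n], f[:n])


def lag_profile(pupil, fc, max_lag):
    """Coupling as a function of lag, from -max_lag to +max_lag windows."""
    return {lag: lagged_corr(pupil, fc, lag) for lag in range(-max_lag, max_lag + 1)}
```

      Repeating the downstream analyses over a small set of window lengths and over the lags returned by `lag_profile` would indicate whether the community structure and lateralization results depend on the zero-lag, single-window choice.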

    4. Reviewer #3 (Public review):

      Summary:

      The paper investigates neural fluctuations underlying arousal using a combination of resting state/naturalistic movie watching fMRI and eye tracking data. The authors have used several data-driven approaches, including time-varying sliding window analyses and clustering methods, to characterize large-scale brain organization and hemispheric asymmetries associated with arousal fluctuations. This is an interesting study framing arousal as a dynamic, continuously varying process rather than a discrete state. Overall, the manuscript is well written and provides sufficient methodological and analytical detail accompanied by an explanation of results. However, several conceptual and methodological issues require clarification or further discussion to strengthen the interpretation and robustness of the findings.

      Strengths:

      This is an interesting study framing arousal as a dynamic, continuously varying process rather than a discrete state. Overall, the manuscript is well written and provides sufficient methodological and analytical detail accompanied by an explanation of results.

      Weaknesses:

      (1) A major limitation of the study is the limited discussion of subcortical regions, which play a central role in arousal regulation according to extensive prior literature. Although the current analyses focus primarily on cortical organization, the authors should include a brief discussion of how their findings relate to subcortical arousal systems.

      (2) While sliding window methods can capture temporal changes in functional organization, they have limitations in characterizing moment-to-moment neural fluctuations. In particular, results can be highly sensitive to window length and step size. The manuscript would benefit from (a) a clearer discussion of these methodological limitations, (b) justification for the chosen window length and step size, and (c) a sensitivity analysis demonstrating whether the main findings are robust across different parameter choices.

      (3) The authors use k-means clustering to identify groups of brain regions and refer to these groupings as "communities." However, in general, community detection typically refers to graph-based algorithms that identify modules based on connectivity structure (e.g., modularity maximization). The clusters derived from k-means in feature space are not necessarily equivalent to graph-theoretic communities. The authors should explicitly clarify this distinction and adjust terminology accordingly to avoid conceptual ambiguity.

    1. eLife Assessment

      The new development of Neuroplex, a pipeline that links projection-defined neuronal identity to in vivo calcium activity within the same animal, is a valuable contribution to the field of neuroscience and beyond. The strength of evidence is judged to be solid, as the methods, data, and analyses broadly support the stated claims.

    2. Reviewer #1 (Public review):

      Genetically encoded fluorescent proteins expressed in specific cell types allow recognising them in vivo and, if the protein is a functional indicator, as in the case of genetically encoded calcium indicators (GECIs), to record activity from the same cellular ensemble. Ideally, if proteins (fluorophores) have perfectly distinct spectral properties, signals can be distinguished from as many cell types as the number of employed fluorophores. In practice, fluorescent proteins have non-negligible crosstalk both in absorption and emission bands. In addition, fluorescence contribution of each fluorophore normally varies from cell to cell and therefore spectral properties of cells expressing two or more proteins are different. The work of Phillips et al. addresses this challenge. The authors present an approach defined as "Neuroplex", allowing identification of up to nine cell types from the same number of fluorophores. The fingerprint of each cell is then associated with functional fluorescence from the GECI GCaMP, allowing recording calcium activity from that specific cell. The method is implemented in vivo using head-mounted miniscopes.

      The authors used a mouse line expressing GCaMP in cortical pyramidal neurons and developed an experimental pipeline. First, they injected the nine AAV viruses, causing expression of fluorophores in a different brain area. The idea was not to image that area, but a non-infected medial prefrontal cortex (mPFC) section where neurons could be infected by their axons projecting in an injected area, in this way being identified by their targeting region(s). A GRIN lens, allowing spectral analysis, was mounted in the mPFC section, and GCaMP fluorescence was then recorded during behavioural tasks and analysed to identify regions of interest (ROIs) corresponding to neuron somata. After functional imaging, the head of the mouse was fixed, spectral analysis was performed, and after necessary correction for chromatic distortions, the fluorophore contribution was determined for each ROI (neuron) from where GCaMP signals were detected. Notably, the procedures for estimation and correction of chromatic aberration and light transmission (described in Figure 2) were a major challenge in their technical achievements. The selection of the nine fluorophores was another big effort. This was done by combining computer simulations and direct measurement of spectra from individual proteins expressed in HEK293 cells. It is important to say that the authors could simulate arbitrary combinations of two or more different fluorophores and evaluate the ability of their algorithm to detect the correct proteins against wrong estimations of false-negative (absence of an expressed protein) or false-positive (presence of a non-expressed protein). Not surprisingly, this ability decreases with the level of GCaMP expression. The authors underline that most errors were false-negatives, which have a milder impact in terms of result interpretation, but the rate of false positives was, nevertheless, relevant in detecting a second fluorophore from a cell expressing only one protein. 
      The experimental profiles of fluorophores were dependent both on the specific fluorescent protein and on the projecting area, and the distribution of double-labelled neurons did not match anatomical evidence. This result should be taken as a limitation of the present pioneering experiments, presented as proof-of-principle of the approach, but Neuroplex may provide far improved precision under different experimental conditions.

      In my view, the work of Phillips et al. represents a significant advance in the state-of-the-art of the field. The rigorous analysis of limitations in the use of Neuroplex must be considered an important guideline for future uses of this approach.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript introduces Neuroplex, a pipeline that integrates miniscope Ca²⁺ imaging in freely moving mice with multiplexed confocal and spectral imaging to infer projection identities of recorded neurons. This technical approach is promising and could broaden access to projection-resolved population imaging. However, the core quantitative analyses apply a winner-take-all single-label assignment per neuron even when multiple fluorophores exceed threshold, with additional labels treated descriptively as "secondary hits." While the authors acknowledge and simulate dual labeling, the extent to which this single-label decision rule affects subtype fractions and behavioural comparisons remains uncertain without a multi-label (or probabilistic) sensitivity analysis and propagation of classification uncertainty.

      Strengths:

      (1) Conceptual advance and practicality: Decoupling acquisition from identity readout constitutes an innovative approach that is, in principle, applicable in laboratories currently using single-color miniscopes.

      (2) Engineering thoroughness: The manuscript offers detailed consideration of GRIN optics, spectral libraries, registration procedures, and simulations that address signal-to-noise ratio, background, and class imbalances.

      (3) Immediate community value: If demonstrated to be robust, the pipeline could enable projection-resolved analyses without reliance on specialized multicolor miniscopes.

      Weaknesses:

      (1) Single-label assignment in the main analyses: When multiple fluorophores exceed threshold for a neuron/ROI, the workflow applies a winner-take-all rule and assigns a single label (the fluorophore with the largest standardized beta), while additional above-threshold fluorophores are retained only as "secondary hits." This is a reasonable specificity-first choice, but because cortical excitatory neurons can collateralize, collapsing dual-threshold ROIs to one identity may under-represent dual-projecting cells and could bias estimated subtype fractions and behavioural comparisons.

      (2) Dual-label detection is acknowledged but remains descriptive in vivo: the manuscript explicitly discusses the possibility of dual projection, evaluates dual-fluorophore detection in simulations (including performance under realistic noise/background), and reports in vivo rates of secondary hits. However, these dual-threshold events are not incorporated as co-identities in the main statistical analyses, making it difficult to judge how robust the principal biological conclusions are to the single-label decision rule.

      (3) Uncertainty is not propagated: False-positive/false-negative rates from simulations and uncertainty from registration/segmentation are not carried forward into quantitative confidence bounds on subtype proportions or behaviour-by-subtype effects.
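
      The sensitivity analysis requested in points (1) and (2) can be illustrated with a minimal sketch. It compares subtype fractions under a winner-take-all rule with those under a multi-label rule on the same ROIs. The fluorophore names, beta values, and threshold below are hypothetical placeholders, not values from the manuscript, and the functions are illustrative rather than the authors' pipeline.

```python
from collections import Counter

def assign_labels(betas, threshold):
    """Return (winner, hits) for one ROI.

    betas: dict mapping fluorophore name -> standardized beta (hypothetical values).
    threshold: cut-off above which a fluorophore counts as detected (assumed).
    hits is sorted from largest to smallest beta; winner is the first hit or None.
    """
    hits = sorted(
        (name for name, beta in betas.items() if beta > threshold),
        key=lambda name: betas[name],
        reverse=True,
    )
    winner = hits[0] if hits else None
    return winner, hits

def subtype_fractions(rois, threshold, multi_label):
    """Fraction of ROIs carrying each label under the chosen decision rule."""
    counts = Counter()
    for betas in rois:
        winner, hits = assign_labels(betas, threshold)
        if multi_label:
            counts.update(hits)       # every above-threshold label counts
        elif winner is not None:
            counts[winner] += 1       # only the largest beta counts
    return {name: count / len(rois) for name, count in counts.items()}

# Four toy ROIs; the second one exceeds threshold for both fluorophores.
rois = [
    {"A": 3.0, "B": 0.5},
    {"A": 2.5, "B": 2.1},
    {"A": 0.2, "B": 2.8},
    {"A": 0.1, "B": 0.3},
]
wta = subtype_fractions(rois, threshold=2.0, multi_label=False)
multi = subtype_fractions(rois, threshold=2.0, multi_label=True)
print("winner-take-all:", wta)
print("multi-label:    ", multi)
```

      In this toy case the fraction assigned to "B" doubles when the dual-threshold ROI is counted under both labels, which is the kind of shift a sensitivity analysis would need to quantify on the real data.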

    4. Reviewer #3 (Public review):

      This manuscript presents Neuroplex, a technically rigorous and carefully validated pipeline that links miniscope calcium imaging in freely behaving animals with high-dimensional fluorophore-based cell-type identification using in vivo multiplexed spectral confocal imaging through the same implanted GRIN lens. The work overcomes a major practical limitation of head-mounted microscopy by enabling the identification of up to nine projection-defined neuronal populations within the same animal, without post-fixation histology. The approach is well motivated and supported by extensive calibration and simulation. While the biological results are primarily illustrative, the methodological contribution is clear and likely to be broadly useful.

      Major comments

      (1) The approach relies on the assumption that fluorophore identity assigned during anesthetized confocal imaging accurately reflects the identity of neurons recorded during prior behavioural sessions. While the use of the same GRIN lens and in vivo co-registration mitigates many concerns, the manuscript would benefit from a more explicit discussion, or empirical demonstration, if available, of the stability of fluorophore assignments across time. Even limited repeat spectral imaging in a subset of animals would strengthen confidence in longitudinal applicability.

      (2) Fluorophore identity is determined using thresholding of linear unmixing coefficients relative to an empirically defined baseline, followed by a second adaptive pass for over-represented fluorophores. While this heuristic is extensively validated via simulations, it remains ad hoc from a statistical perspective. The authors should more explicitly justify this choice and discuss its limitations relative to probabilistic or likelihood-based classifiers, particularly with respect to uncertainty estimation at the single-ROI level.

      (3) Identifiability of fluorophores is demonstrated empirically, but the manuscript does not explicitly quantify spectral separability (e.g., similarity metrics between basis spectra or conditioning of the unmixing matrix). A brief analysis of spectral independence or sensitivity of beta estimates to noise would provide mathematical reassurance, especially given the reliance on linear regression in a high-dimensional feature space.

      (4) The spectral unmixing treats CNMF-derived ROIs as fixed supports. I wonder whether ROI boundaries, neuropil contamination, and partial overlap can introduce structured uncertainty that could bias spectral estimates. If so, the authors should acknowledge this dependency more explicitly and discuss how ROI quality or overlap might influence false negatives or false positives, particularly in densely labelled regions.

      (5) The manuscript reports meaningful rates of secondary fluorophore detection, but also nontrivial false-positive rates for secondary labels under realistic conditions. The authors appropriately caution against over-interpretation, but the Discussion should more clearly delineate when dual-label assignments are likely to be biologically interpretable versus methodologically ambiguous, and how experimental design (e.g., fluorophore pairing) should be optimized accordingly.

      (6) I suspect that Neuroplex will be most effective in certain regimes (moderate convergence, bright and spectrally distinct fluorophores) and less reliable in others. A more explicit discussion of best practices, anticipated failure modes, and experimental scenarios where the method may be inappropriate would increase the practical value of the paper for adopters.
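
      As an illustration of the separability analysis suggested in comment (3), the sketch below computes pairwise cosine similarity between basis spectra and reports the most similar pair, which is where unmixing is most likely to confuse labels. The three spectra are made-up four-channel vectors and the fluorophore names are placeholders; nothing here is taken from the manuscript's spectral library.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two spectra (1.0 means identical shape)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar_pair(spectra):
    """Return (name_a, name_b, similarity) for the least separable pair."""
    names = sorted(spectra)
    best = None
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            sim = cosine_similarity(spectra[a], spectra[b])
            if best is None or sim > best[2]:
                best = (a, b, sim)
    return best

# Hypothetical emission profiles sampled in four spectral channels.
spectra = {
    "fluor_1": [1.0, 0.5, 0.0, 0.0],
    "fluor_2": [0.0, 0.5, 1.0, 0.0],
    "fluor_3": [0.0, 0.0, 0.5, 1.0],
}
pair = most_similar_pair(spectra)
print(pair)
```

      Cosine similarity is only one possible metric; the condition number of the unmixing matrix, also mentioned in the comment, would capture joint rather than pairwise dependence.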

    1. eLife Assessment

      This important study describes long-range serial dependence of performance on a visual texture discrimination training task that manipulated conditions to induce differing degrees of location transfer of learning. The authors re-analyzed a previously published behavioral data set, generating compelling evidence from converging approaches that serial dependence effects can persist across multiple days post-training, and are impacted by whether training promotes more or less location transfer. Although underlying mechanisms for these processes remain unclear, these results will interest neuroscientists in general by informing our understanding of the importance of temporal integration to long-term perceptual learning and its propensity towards specificity or generalizability.

    2. Reviewer #1 (Public review):

      This paper presents a reanalysis of a large existing dataset to examine whether serial dependence effects (systematic influences of recent stimulus history on current perceptual judgments) are associated with generalization in perceptual learning. The central hypothesis is that extended, longer-range history effects (beyond the most recent trials) are beneficial for transfer across locations. The authors reanalyze data from a texture discrimination task in which observers discriminated peripheral target orientation against a line background, with performance quantified by stimulus-onset asynchrony thresholds. Three training conditions were compared: a fixed single-location condition, a two-location alternating condition, and a dummy-trial condition with frequent target-absent trials. Transfer was assessed after training at new locations. Serial dependence was quantified using history-sequence analyses and linear mixed-effects models estimating bias weights across stimulus lags, with summary measures distinguishing recent (1-3 trials back) and more distant (4-6 trials back) dependencies.
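
      To make the notion of a lag-specific history effect concrete, here is a deliberately simplified sketch. It is not the mixed-effects analysis described above: it measures, for a binary task, how much the probability of responding "1" shifts depending on the stimulus shown a given number of trials back. The trial sequence is a toy example constructed so that each response copies the previous stimulus; all names are illustrative.

```python
def lag_bias(stimuli, responses, lag):
    """P(response = 1 | stimulus `lag` trials back = 1) minus
    P(response = 1 | stimulus `lag` trials back = 0).

    Returns None when either conditioning set is empty.
    """
    after_one = [responses[t] for t in range(lag, len(responses))
                 if stimuli[t - lag] == 1]
    after_zero = [responses[t] for t in range(lag, len(responses))
                  if stimuli[t - lag] == 0]
    if not after_one or not after_zero:
        return None
    return sum(after_one) / len(after_one) - sum(after_zero) / len(after_zero)

# Toy sequence: every response simply repeats the previous trial's stimulus.
stimuli = [1, 0, 1, 1, 0, 0, 1, 0]
responses = [0] + stimuli[:-1]

# Bias profile across lags 1..3 (the "recent" window in the summary above).
profile = {lag: lag_bias(stimuli, responses, lag) for lag in (1, 2, 3)}
print(profile)
```

      A regression-based estimate, as used in the paper, would additionally control for the current stimulus and for correlations between lags, and a mixed-effects version would pool these weights across observers.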

      The authors report extended serial dependence effects, persisting up to 6-10 trials back, with substantial cumulative bias that remains stable across multiple days of training and is not correlated with overall performance thresholds. Recent history effects are stronger for faster responses, suggesting a contribution from decision- or response-related processes, whereas more distant effects decline within sessions, potentially reflecting adaptation dynamics. Critically, longer-range serial dependence is significantly stronger in training conditions that promote generalization than in the single-location condition. Individual differences in the strength and decay profile of distant history effects predict the magnitude of transfer across locations, whereas recent history effects do not. History effects are also correlated across trained locations, suggesting stable individual differences.

      The authors interpret longer-range serial dependence as reflecting integrative processes that extract task-relevant structure over time, thereby supporting generalization, while shorter-range effects are attributed to more transient mechanisms such as priming or decision-level bias. The discussion connects these findings to Bayesian accounts of perceptual stability and to concepts of overfitting in machine learning.

      The study offers a novel and thoughtful link between short-term serial dependence and long-term generalization in perceptual learning, helping bridge two literatures that are often treated separately. The large dataset enables robust estimation of individual differences, and the use of mixed-effects modeling appropriately accounts for variability across observers. The empirical distinction between recent and more distant history effects is well-supported and adds important nuance to interpretations of serial dependence. Converging evidence from both group-level comparisons and individual-level correlations strengthens the central conclusions.

      Several limitations should be addressed. First, the study relies entirely on previously collected data, without experimental manipulations designed to selectively isolate serial dependence mechanisms. Filtering choices, while theoretically motivated, may amplify history effects in ways that are difficult to quantify. Second, sequential dependencies can arise from multiple sources, including gradual updating of internal weight structures, adaptation processes, and history-dependent biases in decision-making. The current analyses do not clearly separate these contributions, limiting mechanistic attribution of long-range effects. Third, the conclusions are based on a single perceptual task, leaving open questions about generality across paradigms. Finally, while the discussion references computational ideas, no explicit modeling is provided to test whether plausible learning rules can jointly account for the observed history profiles and transfer effects.

      The findings align with theoretical frameworks that conceptualize perceptual learning as gradual reweighting of stable sensory representations at the decision stage (e.g., Petrov et al., 2005). Trial-by-trial updates in these models naturally give rise to sequential dependencies and sensitivity to training statistics. The observation that longer-range history effects predict generalization is consistent with broader temporal integration supporting more flexible learning, while narrower integration may lead to specificity. The results also indicate that multiple mechanisms, including decision-level biases and adaptation, may coexist with reweighting processes, highlighting the value of hybrid accounts.

      In summary, this is a careful and data-rich reanalysis that highlights a potentially important role for serial dependence in enabling generalization during perceptual learning. While the underlying mechanisms remain underspecified, the evidence supporting the reported associations is strong, and the work provides a valuable empirical foundation for further experimental and modeling efforts.

    3. Reviewer #2 (Public review):

      This manuscript investigates how people's perceptual reports are influenced by events and trials in the past, and how this long-range dependence relates to broader learning across locations in a visual learning task. The authors present clear and internally consistent analyses showing that extended temporal integration is associated with greater generalization of learning. The study is thought-provoking and may contribute meaningfully to understanding how short-term influences and long-term improvement interact, although several interpretational points would benefit from clarification.

      Strengths:

      (1) The manuscript identifies unusually long-range perceptual biases extending up to ten trials back, which is a striking and potentially important finding.

      (2) The association between strong long-range dependence and greater learning generalization is clearly documented and supported by consistent analyses.

      (3) The dataset is large and rich, and the authors apply repeated and well-controlled analyses that give confidence in the stability of the effects.

      (4) The writing is generally clear, and the manuscript raises interesting conceptual links between temporal integration and generalization of learning.

      Weaknesses / Points Requiring Clarification:

      (1) The manuscript repeatedly equates generalization with increased efficiency, but this relationship is not universally true. In some populations or tasks, excessive generalization can reduce task-specific efficiency. The authors should discuss this context-dependence to clarify when generalization is beneficial versus detrimental.

      (2) Serial dependence is also present, though smaller, in the central fixation task. It remains unclear whether this bias could contribute to the serial dependence observed in the main task. The authors should clarify whether the two biases are independent or whether the central-task bias might partially influence orientation judgments in the main task.

      (3) Several figure captions and labels contain minor inconsistencies in formatting and terminology. Careful proofreading would improve clarity.

    4. Reviewer #3 (Public review):

      This reanalysis of a classic study of visual perceptual learning in a texture discrimination task convincingly demonstrates the presence of sequential dependence effects, commonly seen in response time analyses in 2-alternative tasks, on response accuracy in the texture task in the visual periphery and in a simultaneous central letter report at fixation. Overall, this paper provides a new and interesting analysis of the effects of sequential dependencies from trial to trial on performance, learning, and generalizability in perceptual learning.

      Strengths:

      This new analysis of sequential dependency effects (SDEs) extends commonly observed sequential effects in two-choice reaction times to accuracy and relates them to response accuracy during visual learning in a frequently used perceptual learning task. The paper makes a convincing case that different conditions known to impact generalization of learning to a second visual location also express quantitatively distinct n-back SDEs.

      Weaknesses:

      Most of the new analyses emphasize the effects of SDEs, including trials designed to enhance the size of the effects, specifically when the current trial is low visibility, and the prior trial is of high visibility. Unless there is an argument that learning and subsequent generalization primarily occur in low-visibility trials, the presentation should also include displays and an emphasized discussion of analysis for all trials, unfiltered.

    1. eLife Assessment

      This study provides valuable evidence regarding our expectations about task difficulty and how this might influence proactive attention. The findings suggest that anticipated demands enhance the strength of attentional selection at cued locations. The evidence is solid but not definitive, as the conclusions rely on the absence of changes in spatial breadth and would benefit from clearer statistical justification and a more cautious interpretation of alternative mechanisms.

    2. Reviewer #1 (Public review):

      Summary:

      The authors attempt to use a combination of behavioural and EEG analyses in order to investigate whether expectation of task difficulty influences spatial focus narrowing in the context of a spatially cued task, alongside an expected attention-related amplitude effect. This distinguishes the experiment from previous tasks, which looked at this potential spatial narrowing in the context of more non-cued diffuse attention tasks. The authors present two major findings:

      (1) Behaviourally, they analysed the effects of cue validity and difficulty expectation on response accuracy, and found that participants displayed an effect of difficulty expectation in validly cued trials, showing relatively enhanced performance in Hard Expectation trials, but no effect of expectation in invalidly cued trials.

      (2) Inverted encoding modelling on broadband EEG showed greater pre-target attentional processing in the Hard Expectation blocks. They go on to show that this enhancement comes in the form of greater amplitude of the Channel Tuning Functions (CTFs) approximately 300 to 400 ms post-cue, in the absence of any spatial tuning specificity enhancement (as would be evident in a difference in CTF fit width).

      Together, these results provide valuable findings for those investigating the separable effects of expectation and attention on target detection in visual search.

      Strengths:

      (1) This is a very solidly performed experiment and analysis, with different streams of evidence convincingly pointing in the same direction, i.e. a gain effect of Expectation in the absence of a spatial tuning effect.

      (2) EEG is competently analysed and interpreted, and the paper is well written and simple in its motivation.

      (3) The authors report appropriately on the results in the Discussion, without overreaching.

      Weaknesses:

      I mainly have a few minor issues for the authors to clarify, which I will leave to Recommendations. However, a few analyses need further work:

      (1) The GLMM analysis reports very large degrees of freedom (34542; pages 6 and 7). I assume this is the number of trials minus the number of parameters? This would imply that random slopes were not modelled in the analyses. However, looking at the Methods, it is reported that they were modelled. The authors should clarify exactly what was done here and why, including the LMM model.

      (2) Figure 4 shows an "example CTF fit". Why only one? You could put transparent lines in the background for each individual fit, followed by the grand average, or show each fit in the supplementary section?

    3. Reviewer #2 (Public review):

      Summary:

      The authors set out to determine whether people can adjust how narrowly or broadly they focus attention in advance based on expectations about how difficult an upcoming visual task will be. Specifically, they aimed to test whether expecting a more demanding search leads to a narrower focus of attention or instead strengthens attention at the relevant location without changing its spatial extent.

      Strengths:

      The study addresses a timely and interesting question about how expectations influence the preparation of attention before a task begins. The experimental design is well-suited to isolating anticipatory effects by manipulating expectations about task difficulty independently of moment-to-moment stimulus information. The manuscript is clearly written, and the methods are described in sufficient detail to support transparency and reproducibility.

      Weaknesses:

      Despite the strengths of the design and the merit of the work, I have a few concerns regarding the analysis and the interpretation of the results.

      (1) I was somewhat confused by aspects of the behavioural analysis. I may be mistaken, but fixed effects in generalised mixed-effects models are more commonly reported using Wald statistics with beta coefficients rather than F statistics, and the very large degrees of freedom reported here are difficult to interpret. In particular, they appear closer to trial counts than to the number of participants, which raises questions about how statistical uncertainty is being estimated. This concern is compounded by the fact that different statistical approaches appear to yield different conclusions: the generalised mixed-effects models and the pairwise t-tests reported in the figure caption do not fully align. Moreover, the latter are not described in the Methods, and the justification for using them in the figure is not provided. Taken together, this makes it difficult to assess the strength of the behavioural evidence. The reported effects of expectation on behaviour also appear small, and there is no clear cost at uncued locations. This limited behavioural footprint makes it difficult to determine how robust the proposed preparatory mechanism is. It also complicates the interpretation of the neural findings as reflecting a general strategy for optimising task preparation.

      (2) A central premise of the study is that, if observers proactively narrow their attentional focus when expecting difficult search, this should be reflected in sharper spatial tuning profiles. This prediction is presented as a diagnostic test of whether expectations modulate attentional scope. However, the absence of such sharpening is later taken as evidence that expectations do not alter spatial extent and instead operate exclusively through gain modulation. This inference may be stronger than the data allow. The lack of an observed difference in tuning width does not necessarily rule out changes in attentional scope, particularly if such changes are subtle, temporally limited, or not well captured by the spatial resolution of the approach. As a result, while the findings are consistent with a gain-based account, they do not definitively exclude the possibility that expectations also influence spatial extent, and the logic linking the original prediction to the final conclusion would benefit from a more cautious interpretation.

      (3) The difference between easy and hard searches in the CTF slope is taken as evidence for enhanced preparatory spatial attention under high expected difficulty. However, these differences could also reflect broader changes in alertness or motivational state between blocks. The behavioural results show a small overall increase in accuracy in expect-hard blocks, which may be consistent with a more general increase in task engagement rather than a spatially specific preparatory mechanism. Although the authors decompose slope differences into amplitude and width parameters, the interpretation still relies on ruling out alternative, more global explanations for enhanced signal strength or reduced variability. This leaves some ambiguity as to whether the observed modulation reflects a specific adjustment of preparatory attention or a more general change in task state.

    1. eLife Assessment

      This useful manuscript addresses a stability issue for long-term chronically implanted array recordings and electrolytic lesioning, which is relevant to both basic science and translational research. The authors provide a systematic scanning electron microscopy (SEM) analysis of explanted arrays, evaluating electrode damage and sharing extensive datasets accessible through interactive plots. The strength of the evidence is solid, but it can be improved by performing additional analyses on complementary neurophysiology, functional, or histological data.

    2. Reviewer #1 (Public review):

      Summary:

      This work presents a GUI with SEM images of 11 Utah arrays (8 of which were explanted, and 4 of which were used for creating cortical lesions).

      Strengths:

      Visual comparison of electrode tips with SEM images, showing that electrolytic lesioning did not appear to cause extra damage to electrodes.

      Weaknesses:

      Given that the analysis was conducted on explanted arrays, and no functional or behavioural in vivo data or histological data are provided, any damage to the arrays may have occurred after explantation, making the results limited and inconclusive (firstly, that there was no significant relationship between degree of electrode damage and use of electrolytic lesioning, and secondly, that electrodes closer to the edge of the arrays showed more damage than those in the center).

      Overall, these results add new data and reference images to the field, although the insights that can conclusively be drawn are limited due to the low number of electrodes used and the lack of in vivo, histological, and impedance data.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This work presents a GUI with SEM images of 11 Utah arrays (8 of which were explanted, and 4 of which were used for creating cortical lesions).

      Strengths:

      Visual comparison of electrode tips with SEM images, showing that electrolytic lesioning did not appear to cause extra damage to electrodes.

      Weaknesses:

      Given that the analysis was conducted on explanted arrays, and no functional or behavioural in vivo data or histological data are provided, any damage to the arrays may have occurred after explantation. This makes the results limited and inconclusive (firstly, that there was no significant relationship between degree of electrode damage and use of electrolytic lesioning, and secondly, that electrodes closer to the edge of the arrays showed more damage than those in the center).

      We agree insofar as we could not fully control the circumstances of each array during explantation. However, array explantation is potentially damaging, but not universally damaging, as demonstrated by some largely intact arrays in this paper. If electrolytic lesioning were damaging to the array, that damage would be observable. All arrays examined in this paper were carefully stored as described in the paper. All analyses of this type require an explant surgery [?????]. Our conclusions remain as strong as any of the results of these analyses.

      Overall, these results do not add new insight to the field, although they do add more data and reference images.

      We respectfully disagree, as there is no extant SEM analysis on electrode arrays used for lesioning.

      Reviewer #2 (Public review):

      In this study, the authors used scanning electron microscopy (SEM) to image and analyze eleven Utah multielectrode arrays (including eight chronically implanted in four macaques). Four of the eight arrays had previously been used to deliver electrolytic lesions. Each intact electrode was scored in five damage categories. They found that damage disproportionately occurred to the outer edges of arrays. Importantly, the authors conclude that their electrolytic lesioning protocol does not significantly increase material degradation compared to normal chronic use without lesion. Additionally, the authors have released a substantial public dataset of single-electrode SEM images of explanted Utah arrays. The paper is well-written and addresses an important stability issue for long-term chronically implanted array recordings and electrolytic lesioning, which is relevant to both basic science and translational research. By comparing lesioning and non-lesioning electrodes on the same array and within the same animal, the study effectively controls for confounds related to the animal and surgical procedures. The shared dataset, accessible via interactive plots, enhances transparency and serves as a valuable reference for future investigations. Below, we outline some major and minor concerns that could help improve the work.

      Major concerns:

      (1) Electrode impedance is a critical measurement to evaluate the performance of recording electrodes. It would be helpful if the authors could provide pre-explant and post-explant impedance values for each electrode alongside the five SEM damage scores. This would allow the readers to assess how well the morphological scores align with functional degradation.

      We agree that electrode impedance is very important in determining electrode performance. However, due to the multi-year, multi-subject nature of this work, we unfortunately do not have these data.

      (2) The lesion parameters differ across experiments and electrodes. It would be helpful if the authors could evaluate whether damage scores (and/or impedance changes) correlate with total charge, current amplitude, duration, or frequency.

      Thank you for this recommendation. We have included additional analyses in Supplementary Materials.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) ‘Both in vitro and in vivo testing of electrode arrays revealed environmental damage to these materials, such as cracking, textural defects, and degradation in response to the brain’s temperature and salinity [32]. The immune response of the brain also damages the electrodes due to effects like glial scarring (gliosis) and inflammation [33, 34]. This damage may be exacerbated by the surgical techniques used during implantation, which include pushing the electrode array into cortex and tethering the implant to the skull [33, 35, 36].’

      In the above text, several relevant references have been left out, e.g.:

      Barrese et al., 2013

      Patel et al., 2023

      Woeppel et al., 2021

      Chen et al., 2023

      Bjanes et al., 2025

      Thank you for this recommendation. This section has been updated.

      (2) ‘Aggressive electrical stimulation is known to dissolve platinum-based electrodes [37, 38]. Other studies have shown iridium oxide to be more resistant to stimulation-related damage, but not completely insusceptible [39, 40].’ Reference number 25 is relevant here.

      Thank you for this recommendation. This section has been updated.

      (3) ‘F’s and C’s PMd arrays were used for electrolytic lesioning experiments Monkey U was implanted with three 96-channel arrays; two in M1 and one in PMd.’ There seems to be a punctuation mark missing.

      Thank you for this recommendation. This section has been updated.

      (4) Methods: How much charge was injected via the electrodes that were used for lesioning? What current amplitudes, voltages, durations, and number of pulses were used? If more than 1 pulse was applied, what were the frequencies? Was the pulse cathode-only/ anode/only? What were the electrode impedance values at the time of stimulation? How many electrodes were used for lesioning at any given moment? How long after lesioning did the arrays remain in the tissue?

      Thank you for your questions. An additional supplemental table (Supplemental Table 6) detailing specific NHP lesion parameters has been added. A summary of the lesion procedure (DC, bipolar, two electrodes at a time) has also been included in Methods. All arrays remained in the subject until explant; the interval between lesioning and explant ranged from hours (same-day lesion and explant) to several years. Further details on the lesioning procedure are available in citation [?]. Explant dates are available in Supplemental Table 1. Unfortunately, we do not have the impedance values at the time of lesioning, as this is not a measure we record frequently after implant, though we agree the data would be useful to have.

      (5) Caption for Figure 1: ‘All array images are displayed with the wire bundle to the right side.’ I recommend adding this text from Figure 2 to the caption of Figure 1: ‘electrode tips facing viewer’.

      Thank you for this recommendation. This section has been updated.

      (6) ‘Electrodes used for electrolytic lesioning are denoted with blue dots.’ Was stimulation carried out across all these electrodes simultaneously?

      No, stimulation was not carried out across all electrodes simultaneously. Electrodes were stimulated in pairs to create lesions, and lesions were performed on different days. We have updated our Methods section to reflect this. See the Methods section and citation [?] for more details.

      (7) For the control array, in Figure 1: ‘Click each column to view a close-up of the 5th row (from top to bottom) of electrodes:’. It would be clearer to state: ‘Click each column to view a close-up of a single electrode in the 5th row (from top to bottom):’.

      Thank you for this recommendation. This section has been updated.

      (8) Figure 2 caption: ‘Blank electrodes and electrodes with shank fractures are ignored and displayed in black, as they are not scored.’. What is a ‘blank’ electrode?

      A ‘blank’ electrode is an electrode on the array that physically exists but is not wire bonded at time of manufacture to produce recordings. The corner electrodes of the Utah array are all blank electrodes. We have updated this wording to ‘unwired’ for clarity.

      (9) I recommend incorporating Supplementary Figure 1 into Figure 2, so that the reader can immediately see where the rings are, without referring to the Supplementary Materials.

      Thank you for this recommendation. We have chosen to keep these figures separate for stylistic reasons.

      (10) Supplementary Figures: The figures should have the word ‘Supplementary’ in the title, i.e., ‘Supplementary Figure X,’ not just ‘Figure X.’

      Thank you for this recommendation. These captions have been updated.

      (11) Throughout the results, the text is overly focused on the type of statistical test used and the p-values, e.g.: ‘When comparing lesioning and non-lesioning electrodes within the same array, each of the two nonparametric statistical tests (Mann-Whitney U-test, Levene Test) returned insignificant p-values for each category of damage as well as for total damage scores for all four arrays used in lesioning experiments.’.

      To make the findings more digestible for the reader, the text should be rephrased in terms of whether the metrics being compared were significantly different or not. E.g.: ‘For each category of damage, as well as for the total damage score, no significant difference was found between electrodes that were or were not used for lesioning (either the mean or the variance of the scores).’.

      Thank you for this recommendation. We have rephrased the text to reflect this note.

      (12) ‘In Monkey H, the Mann-Whitney U test resulted in an insignificant p-value for coating cracks and parylene C delamination scores, while the Levene test resulted in an insignificant p-value for abnormal debris, coating cracks, and parylene C cracking scores. In Monkey F, the Mann-Whitney U test resulted in an insignificant p-value for parylene C delamination scores, while the Levene test resulted in an insignificant p-value for coating cracks, parylene C delamination, and parylene C cracking scores. In Monkey U, the Mann-Whitney U test resulted in significant p-values for all scores, while the Levene test resulted in an insignificant p-value for abnormal debris, tip breakage, and coating cracks scores. Finally, in Monkey C, the Mann-Whitney U test resulted in an insignificant p-value for parylene C delamination and parylene C cracking scores, while the Levene test resulted in an insignificant p-value for abnormal debris, parylene C delamination, and parylene C cracking scores.’

      To point out another example, this chunk of text is highly repetitive and is unnecessary, as the reader can simply refer to Supplementary Table 4. It should be completely rephrased and summarized, to deliver the key message, i.e. briefly describe what kinds of damage occurred for which arrays. Also, what is the point of the two statistical tests? What are the authors trying to conclude?

      Thank you for this recommendation. We have rephrased and pared down the text to reflect this note.

      (13) Discussion: ‘Similarly, other work did not show significant differences in SEM-visible degradation between both platinum and iridium oxide coated electrodes used for stimulation [24, 25].’ What differences are being referred to here? Differences in degradation between stimulated Pt versus stimulated IrOx electrodes? Or between stimulated Pt and unstimulated Pt electrodes? Stimulated IrOx and unstimulated IrOx? Or something else?

      Thank you for your questions. We are comparing platinum against iridium oxide in this sentence. The wording of our original text has been updated to clarify our intention.

      (14) Supplementary Tables: P-values lower than .05, .01, and .001 should simply be replaced with <.05, <.01, and <.001. The alpha value after a Bonferroni correction should be stated somewhere in each table or table caption.

      Thank you for this recommendation. We have edited the tables to reflect this note.
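      The reporting convention requested in point (14) can be sketched as follows. This is an illustrative helper only: the function names and the number of tests (six) are assumptions, not values taken from the manuscript or its tables.

```python
def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def format_p(p: float) -> str:
    """Report p as '<.001', '<.01', or '<.05'; otherwise print the rounded value."""
    for threshold, label in ((0.001, "<.001"), (0.01, "<.01"), (0.05, "<.05")):
        if p < threshold:
            return label
    return f"{p:.2f}".lstrip("0")


# Hypothetical example: six tests reported in one table.
corrected_alpha = bonferroni_alpha(0.05, 6)
table_entries = [format_p(p) for p in (0.0004, 0.03, 0.2)]
```

      The corrected alpha would then be stated once per table or caption, while each cell carries only the thresholded p-value.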

      (15) Title: ‘Material Damage to Multielectrode Arrays after Electrolytic Lesioning is in the Noise’ I don’t understand what the title means. What is in the noise? And what is ‘the noise’?

      “In the noise” is a colloquialism referring to how background information (“noise”) may obscure or distract from other features. This title conveys how material damage to multielectrode arrays due to electrolytic lesioning is largely obscured by the general damage observed on multielectrode arrays after implant and explant.

      (16) This reference has been left out altogether: Chen et al., 2014. The effect of chronic intracortical microstimulation on the electrode-tissue interface.

      Thank you, this reference is now included.

      Reviewer #2 (Recommendations for the authors):

      (1) The number of lesion electrodes is low, especially since there are only 2-10 lesion electrodes on three of the four arrays, yielding limited statistical power.

      We agree that the low number of lesioned electrodes limits statistical power. However, due to ethical considerations, it is unlikely for arrays to contain much more than this number of lesion electrodes.

      (2) The dataset includes both platinum and iridium oxide-coated electrodes. A direct comparison of their damage profiles would be informative.

      Thank you for this recommendation. We have included this additional analysis in Supplementary Materials.

      (3) It is unclear what “is in the Noise” in the title means without reading the manuscript. It is helpful to improve the clarity of the title.

      Thank you for this recommendation.

      (4) Please spell out “PMd” and “M1” at first mention to facilitate reading.

      Thank you for this note. The text has been updated to reflect this recommendation.

    1. eLife Assessment

      This important study presents single-unit activity collected during model-based (MB) and model-free (MF) reinforcement learning in non-human primates. The dataset was carefully collected, and the statistical analyses, including the modeling, are rigorous. The evidence convincingly supports different roles for particular cortical and subcortical areas in representing key variables during reinforcement learning.

    2. Reviewer #1 (Public review):

      Summary:

      Using single-unit recording in 4 regions of non-human primate brains, the authors tested whether these regions encode computational variables related to model-based and model-free reinforcement learning strategies. While some of the variables seem to be encoded by all regions, there is clear evidence for stronger encoding of model-based information in anterior cingulate cortex and caudate.

      Strengths:

      The analyses are thorough, the writing is clear, the work is well-motivated by prior theory and empirical studies.

      Weaknesses:

      The authors have adequately addressed my prior comments.

    3. Reviewer #2 (Public review):

      Summary:

      The authors investigate single-neuron activity in rhesus macaques during model-based (MB) and model-free (MF) reinforcement learning (RL). Using a well-established two-step choice task, they analyze neural correlates of MB and MF learning across four brain regions: the anterior cingulate cortex (ACC), dorsolateral PFC (DLPFC), caudate, and putamen. The study provides strong evidence that these regions encode distinct RL-related signals, with ACC playing a dominant role in MB learning and caudate updating value representations after rare transitions. The authors apply rigorous statistical analyses to characterize neural encoding at both population and single-neuron levels.

      Strengths:

      (1) The research fills a gap in the literature, which has been limited in directly dissociating MB vs. MF learning at the single unit level and across brain areas known to be involved in reinforcement learning. This study advances our understanding of how different brain regions are involved in RL computations.

      (2) The study used a two-step choice task (Miranda et al., 2020), which was previously established for distinguishing MB and MF reinforcement learning strategies.

      (3) The use of multiple brain regions (ACC, DLPFC, caudate, and putamen) in the study enabled comparisons across cortical and subcortical structures.

      (4) The study used multiple GLMs, population-level encoding analyses, and decoding approaches. With each analysis, they conducted the appropriate controls for multiple comparisons and described their methods clearly.

      (5) They implemented control regressors to account for neural drift and temporal autocorrelation.

      (6) The authors showed evidence for three main findings:

      (a) ACC as the strongest encoder of MB variables from the four areas, which emphasizes its role in tracking transition structures and reward-based learning. The ACC also showed sustained representation of feedback that went into the next trial.

      (b) ACC was the only area to represent both MB and MF value representations.

      (c) The caudate selectively updates value representations when rare transitions occur, supporting its role in MB updating.

      (7) The findings support the idea that MB and MF reinforcement learning operate in parallel rather than strictly competing.

      (8) The paper also discusses how MB computations could be an extension of sophisticated MF strategies.

      Weaknesses:

      (1) There is limited evidence for a causal relationship between neural activity and behavior. The authors cite previous lesion studies, but causality between neural encoding in ACC, caudate, and putamen and behavioral reliance on MB or MF learning is not established.

      (2) There is a heavy emphasis on ACC versus other areas, but it is unclear how much of this signal drives behavior relative to the caudate.

      (3) The authors mention the monkeys were overtrained before recording, which might have led to a bias in MB versus MF strategy.

      (4) The authors have responded to the weaknesses appropriately in the manuscript.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Using single-unit recording in 4 regions of non-human primate brains, the authors tested whether these regions encode computational variables related to model-based and model-free reinforcement learning strategies. While some of the variables seem to be encoded by all regions, there is clear evidence for stronger encoding of model-based information in the anterior cingulate cortex and caudate.

      Strengths:

      The analyses are thorough, the writing is clear, and the work is well-motivated by prior theory and empirical studies.

      Weaknesses:

      My comments here are quite minor.

      The correlation between transition and reward coefficients is interesting, but I'm a little worried that this might be an artifact. I suspect that reward probability is higher after common transitions, due to the fact that animals are choosing actions they think will lead to higher reward. This suggests that the coefficients might be inevitably correlated by virtue of the task design and the fact that all regions are sensitive to reward. Can the authors rule out this possibility (e.g., by simulation)?

      We fully agree with the reviewer that the task design has in-built correlations between transition and reward, and thus the correlation between neural selectivity for feedback and transition (Figure 3E) may be due to the different reward expectation after common or rare transitions. We did try to make this point in the manuscript:

      This suggests that the brain treats being diverted away from your current objective equivalent to losing reward, which is sensible as the subject would normally expect lower rewards on rare trials if their reward-seeking behaviour was efficient.

      We have now reworded this statement to make the point more clearly and to avoid suggesting that any non-reward-related encoding is involved:

      “As the reward expectation will be higher on common compared to rare trials, this demonstrates that the brain encodes being diverted to an area with a lower reward expectation equivalent to actually receiving a low reward (and vice versa).”

      We have also adjusted the significance test of this correlation to use a circular permutation test that accounts for correlations between the regressors. This test still found there to be significant correlation in all areas.

      We have described this new permutation test in Methods:

      “For comparing correlations between weights for different features (i.e., between transition and reward coding, Figure 3E), the null distribution of correlations observed in circularly shifted data was compared to the correlation seen in the actual data. This accounts for any correlations between features that existed in the task by preserving the structure of the design matrices.”

      And updated the text in Results accordingly:

      “All regions, but particularly ACC, encoded a common transition (at the time of transition) similar to a high reward (at the time of feedback), as there was a positive correlation between the coefficients for reward and transition (the transition parameter was signed such that common and rare transitions were equivalent to high and low rewards, respectively) (ACC r=0.4963, DLPFC r=0.3273, caudate r=0.4712, putamen, r=0.5052; all p<0.002 except DLPFC where p=0.006, circular permutation test; Figure 3E, S5).”
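      The logic of the circular permutation test described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the response does not state what was shifted or by how much, so the sketch assumes that each neuron's firing rates are rotated by a random offset against fixed regressors, and it replaces the full GLM with single-regressor least-squares slopes. All function names are hypothetical.

```python
import random


def slope(x, y):
    """Least-squares slope of y on a single regressor x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def rotate(seq, k):
    """Circularly shift a list by k positions."""
    k %= len(seq)
    return seq[-k:] + seq[:-k]


def weight_correlation(neurons, transition, reward, shifts=None):
    """Correlate transition and reward slopes across neurons.

    Each neuron is a pair (rates_at_transition, rates_at_feedback) with one
    value per trial. A non-zero shift rotates that neuron's data against the
    regressors, which themselves are left untouched.
    """
    if shifts is None:
        shifts = [0] * len(neurons)
    b_trans = [slope(transition, rotate(n[0], s)) for n, s in zip(neurons, shifts)]
    b_rew = [slope(reward, rotate(n[1], s)) for n, s in zip(neurons, shifts)]
    return pearson(b_trans, b_rew)


def circular_permutation_p(neurons, transition, reward, n_perm=500, seed=0):
    """Observed correlation and two-sided p-value from circularly shifted data."""
    rng = random.Random(seed)
    n_trials = len(transition)
    observed = weight_correlation(neurons, transition, reward)
    exceed = 0
    for _ in range(n_perm):
        shifts = [rng.randrange(1, n_trials) for _ in neurons]
        r = weight_correlation(neurons, transition, reward, shifts)
        exceed += abs(r) >= abs(observed)
    return observed, (exceed + 1) / (n_perm + 1)
```

      Because the regressors are never shuffled, any correlation between transition and reward that is built into the task is present in every null sample, which is the property the response relies on.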

      The explore/exploit section seems somewhat randomly tacked on. Is this really relevant? If yes, then I think it needs to be integrated more coherently.

      We thank the reviewer for this comment. We agree that the motivation for the explore/exploit analysis was not sufficiently clear in the original version.

      Our aim was not to introduce this as a separate or tangential effect, but rather to highlight how the task’s reward structure (with outcome levels stable for 5–9 trials) naturally created alternating periods favoring exploitation of a known high-value option versus exploration when outcomes changed. This feature of the task is tightly linked to MB-RL computations, as it requires integration of state-transition knowledge and updating across trials.

      Importantly, we showed earlier in the manuscript that ACC encoded state-transition structure (i.e., common versus rare transition) and MB-value estimates (at the choice epoch). Here, we aimed to highlight that the same region also modulated choice encoding as a function of whether the subject was in an exploratory or exploitative regime, a distinction that itself depends on knowledge of state transitions and outcomes. We have revised this section to better integrate it into the main logic of the paper:

      “In our task, the outcome level (high, medium, low) of each second-stage stimulus remained the same for 5-9 trials before potentially changing. This design naturally created periods where subjects could ‘exploit’ the same Choice 1 to maximize reward for several trials; and other periods where they had to ‘explore’ different second-stage stimuli to optimize reward (as contingencies shifted). In classical MB-RL, the transition between reward states can be learned by keeping counts of observed transitions from a current state-action pair to a subsequent state, yielding a maximum-likelihood estimate of the environment’s dynamics [42]. In fact, knowledge about the reward contingency schedule could support decision-making in both exploitation – by enabling efficient choice when rewards are stable; and exploration – by guiding alternative behaviour most likely to yield improved outcomes (this is different from MF learning, where exploration is more random since the agent lacks explicit state-transition knowledge).

      We thus repeated our decoding analysis of choice 1 stimulus identity, but this time limited trials to those where they had not received a high reward for the previous two trials (‘explore’ trials), and those where the previous two rewards had been the highest level (‘exploit’ trials). All regions encoded choice 1 for some duration of the choice epoch for both explore (p<0.002 in all cases, permutation test; Figure 7A) and exploit (p<0.002 in all cases; Figure 7B) conditions, but decoding accuracy was strongest in ACC. Choice 1 was less strongly decoded – particularly in ACC – in the former condition compared to the latter (p<0.002 for at least 140 ms in all cases, permutation test on differences observed; Figure 7C); and, also during exploitation, the ACC encoded choice 1 before the choice was even presented to the subject (Figure S8). This pre-choice ACC encoding in exploit trials may reflect the need to allocate cognitive (or attentive) resources to features – i.e., choice 1 stimulus identity – that are most certain predictors of important outcomes. As a control, we also decoded the direction of the Choice 1 (where choice was indicated via joystick movement), which was randomised each trial and therefore orthogonal to the stimulus that was chosen. Again, all four regions encoded its direction in both explore (p<0.002 in all cases; Figure 7D) and exploit (p<0.002 in all cases; Figure 7E). However, there were minimal differences in the strength of the representation between explore and exploit conditions (ACC, p=0.088, cluster-based permutation test; DLPFC p=0.016; caudate p=0.32; putamen p=1; Figure 7F). Therefore, exploit behaviour specifically upregulated relevant task parameters that were worth remembering across trials.”
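      The count-based learning of transitions mentioned in the revised text (keeping counts of observed transitions from a state-action pair to a subsequent state) can be illustrated with a minimal sketch. The class name, state labels, and the 7:3 split below are hypothetical and are not taken from the task or the fitted model.

```python
from collections import defaultdict


class TransitionModel:
    """Counts observed (state, action) -> next_state transitions."""

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, state, action, next_state):
        """Record one experienced transition."""
        self.counts[(state, action)][next_state] += 1

    def probability(self, state, action, next_state):
        """Maximum-likelihood estimate: count divided by total for that pair."""
        seen = self.counts.get((state, action))
        if not seen:
            return None  # no data yet for this state-action pair
        return seen.get(next_state, 0) / sum(seen.values())


# Hypothetical experience: 7 common and 3 rare transitions after one choice.
model = TransitionModel()
for _ in range(7):
    model.observe("stage1", "choose_left", "state_A")
for _ in range(3):
    model.observe("stage1", "choose_left", "state_B")
p_common = model.probability("stage1", "choose_left", "state_A")
```

      With these counts the estimate for the common transition is 7/10, and it would sharpen as more trials are observed.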

      Reviewer #2 (Public review):

      Summary:

      The authors investigate single-neuron activity in rhesus macaques during model-based (MB) and model-free (MF) reinforcement learning (RL). Using a well-established two-step choice task, they analyze neural correlates of MB and MF learning across four brain regions: the anterior cingulate cortex (ACC), dorsolateral PFC (DLPFC), caudate, and putamen. The study provides strong evidence that these regions encode distinct RL-related signals, with ACC playing a dominant role in MB learning and caudate updating value representations after rare transitions. The authors apply rigorous statistical analyses to characterize neural encoding at both population and single-neuron levels.

      Strengths:

      (1) The research fills a gap in the literature, which has been limited in directly dissociating MB vs. MF learning at the single unit level and across brain areas known to be involved in reinforcement learning. This study advances our understanding of how different brain regions are involved in RL computations.

      (2) The study used a two-step choice task (Miranda et al., 2020), which was previously established for distinguishing MB and MF reinforcement learning strategies.

      (3) The use of multiple brain regions (ACC, DLPFC, caudate, and putamen) in the study enabled comparisons across cortical and subcortical structures.

      (4) The study used multiple GLMs, population-level encoding analyses, and decoding approaches. With each analysis, they conducted the appropriate controls for multiple comparisons and described their methods clearly.

      (5) They implemented control regressors to account for neural drift and temporal autocorrelation.

      (6) The authors showed evidence for three main findings:

      (a) ACC as the strongest encoder of MB variables from the four areas, which emphasizes its role in tracking transition structures and reward-based learning. The ACC also showed sustained representation of feedback that went into the next trial.

      (b) ACC was the only area to represent both MB and MF value representations.

      (c) The caudate selectively updates value representations when rare transitions occur, supporting its role in MB updating.

      (7) The findings support the idea that MB and MF reinforcement learning operate in parallel rather than strictly competing.

      (8) The paper also discusses how MB computations could be an extension of sophisticated MF strategies.

      Weaknesses:

      (1) There is limited evidence for a causal relationship between neural activity and behavior. The authors cite previous lesion studies, but causality between neural encoding in ACC, caudate, and putamen and behavioral reliance on MB or MF learning is not established.

      We agree with the reviewer that the present study does not establish causal relationships, and we do not claim otherwise in the manuscript. Our work was designed as a comprehensive characterization of neural activity across ACC, DLPFC, caudate, and putamen during reward-seeking decision-making. By systematically comparing MB- and MF- RL signals across these regions, we provide new insights into the division of labor and cooperative interactions within cortico-striatal networks.

      While causal manipulations (e.g., lesions, inactivations, stimulation) are indeed required to directly establish necessity or sufficiency, correlational studies such as ours play a crucial role in identifying where and how computationally relevant signals are represented. Importantly, our findings align with and extend prior causal work, for example showing that ACC and striatal lesions disrupt MB control. Thus, our study contributes a detailed functional mapping of MB and MF RL encoding across multiple nodes of this circuit, which serves as an important foundation for future causal investigations (e.g., using transcranial ultrasound stimulation).

      (2) There is a heavy emphasis on ACC versus other areas, but it is unclear how much of this signal drives behavior relative to the caudate.

      We appreciate the reviewer's observation regarding this matter. Our intention was not to place a heavy emphasis on ACC; rather, this emerged naturally from the data. The ACC demonstrated considerably more robust and enduring neural activity compared to other brain regions – for instance, reward-related signals in the ACC continued well beyond individual trials (Fig. 2A-B), and encoding of state transitions remained active from the initial transition through to the feedback phase (Fig. 3A-B). By comparison, distinctions among other regions were less pronounced, which naturally resulted in the ACC receiving greater attention in our analytical findings.

      We acknowledge that the caudate plays an essential and complementary role in driving behavior, and we believe that this is emphasized in the two key subsections of our “Results”. First, caudate neurons encoded model-based choice values (Fig. 4A, 4C) and uniquely remapped these values following rare transitions (Fig. 5), reflecting flexible adjustment of action values. Second, decoding analyses showed that both ACC and caudate populations predicted first-stage choices (Fig. 6C), linking their activity directly to behavioral decisions. In the Discussion section, we also highlight that “the distinctive caudate signal of updating (flipping) the value estimates of the currently experienced option on rare trials” goes beyond a “general temporal-difference RPE” and rather supports “the role of caudate in MB valuation”.

      (3) The role of the putamen is somewhat underexplored here.

      Our analyses were conducted in an identical manner across all four recorded regions (ACC, DLPFC, caudate, and putamen), and we consistently reported the results for putamen alongside the others. For example, in the Results section we describe how “both caudate and putamen encoded the reward from the previous trial negatively during the feedback period of the current trial” (Fig. 2F-G), and that “all regions had a significant population of neurons that encoded MB-, but not MF-, derived value” including putamen (Fig. 4F). Similarly, we show that putamen, like caudate, encoded a dopamine-like RPE signal at feedback (“both caudate and putamen neurons clearly responded at feedback with the parametric features of a dopamine-like RPE”; Discussion). These findings align with previous work linking the putamen to MF learning and are discussed explicitly in the context of MF-MB dissociations. We therefore believe that the putamen was not underexplored, but rather that its contribution was more circumscribed relative to ACC and caudate because the signals observed were quantitatively weaker and less distinctive for MB computations.

      (4) The authors mention the monkeys were overtrained before recording, which might have led to a bias in the MB versus MF strategy.

      We agree that extensive training can influence the balance between MB and MF in choice behaviour and neuronal responses.

      In a previous comprehensive behavioral analysis of the same dataset (Miranda et al., 2020, PLoS Computational Biology - ref. 36, Figure S6B) we showed that both MB and MF strategies contributed to behavior, with MB dominance stable across weeks of testing – supporting that overtraining did not eliminate MF influences (but rather stabilized a mixed strategy with robust MB contributions).

      In the same manuscript, we have also: i) cautioned the readers when comparing our results to data from the original human studies; ii) acknowledged that our extensive training cannot address earlier phases of learning in which sensitivity to the task structure is first acquired; and iii) also provided task-related reasons for such MB dominance – as training made the transition structure well learned (making MB computationally less costly and faster to implement) and the non-stationary outcomes favored the flexibility of MB strategies.

      In the present manuscript, we also have acknowledged that overtraining may have shifted neural signals toward stronger MB representations, or alternatively enabled more sophisticated task representations:

      “On the other hand, MF-based estimates were neither as striking nor as specific to striatal regions as expected and observed in previous studies [18]. The monkeys were extensively trained on the task before recordings commenced, which may have caused a shift towards both MB behaviour and MB value representation within the striatum. Alternatively, this training may have allowed more sophisticated representations to occur, such as using latent states to expand the task space [54].”

      Importantly, we strongly believe that this possibility does not detract from our main finding that both MB and MF signals were present across regions, with ACC showing the strongest multiplexing of the two.

      (5) The GLM3 model combines MB and MF value estimates but does not clearly mention how hyperparameters were optimized to prevent overfitting. While the hybrid model explains behavior well, it does not clarify whether MB/MF weighting changes dynamically over time.

      We appreciate this comment and would like to note that, for completeness, we have on several occasions directed the reader to our prior behavioural analysis of the same dataset (Miranda et al., 2020, PLoS Computational Biology, ref 36). In that work, we provide a full and detailed description of both the task and the computational modeling approach (see particularly the “Model fitting procedures” section). Furthermore, our model-fitting was grounded in the MF/MB RL framework used in the original human two-step study (Daw et al., 2011); and the fitting procedures also followed previous studies (Huys et al., 2011).

      Hyperparameters, including the MB/MF weighting parameter (ω), were estimated by maximum likelihood under two complementary approaches, with priors providing regularization across sessions. First, we performed a fixed-effects analysis, in which parameters were estimated independently for each session by maximizing the likelihood separately. Second, we conducted a mixed-effects analysis, treating parameters as random effects across sessions within each subject. The prior reduces the risk of overfitting by constraining parameters based on their empirical distributions, rather than allowing unconstrained session-by-session estimates. Finally, all model-fitting procedures were verified on surrogate generated data.

      With regard to dynamic weighting, our approach – consistent with most two-step studies – assumed ω to be constant across trials within each session. This was a deliberate choice, both for comparability with prior work and because our subjects were extensively trained, making session-level stability of strategy weights a reasonable assumption. Indeed, our analyses showed no systematic drift in ω across sessions, suggesting that MB/MF balance was stable over sessions. While approaches that allow dynamic ω estimation are possible, we believe such extensions would likely have minimal impact in the current dataset.
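      For readers unfamiliar with the weighting parameter discussed above, the following sketch shows how a fixed ω is commonly used to mix MB and MF action values before a softmax choice rule in two-step studies. It is an assumption that the fitted model takes this general form; the response does not give the equations, and details such as perseveration terms or separate inverse temperatures are omitted. The function names and example values are hypothetical.

```python
import math


def hybrid_values(q_mb, q_mf, omega):
    """Weighted mixture of model-based and model-free action values."""
    if not 0.0 <= omega <= 1.0:
        raise ValueError("omega must lie in [0, 1]")
    return [omega * mb + (1.0 - omega) * mf for mb, mf in zip(q_mb, q_mf)]


def softmax(values, beta):
    """Choice probabilities with inverse temperature beta."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(beta * (v - peak)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical values for two first-stage options; omega is held constant
# within a session, as in the response above.
q_net = hybrid_values(q_mb=[0.8, 0.2], q_mf=[0.4, 0.6], omega=0.7)
choice_probs = softmax(q_net, beta=3.0)
```

      With ω = 1 the mixture reduces to purely MB values and with ω = 0 to purely MF values, which is why a single fitted ω per session summarises the balance between the two strategies.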

      (6) It was unclear from the task description whether the images used changed periodically or how the transition effect (e.g., in Figure 3) could be disambiguated from a visual response to the pair of cues.

      All images were kept constant across sessions. Common/Rare transitions themselves were not explicitly cued, but rather each second-stage state was associated with a specific background colour, followed ~1s later by the presentation of two specific second-stage choice cues (Figure 1B). Hence the subject could infer whether they were transitioned down a Rare or Common path by the background colour, which can be disambiguated in time from the visual responses to the second-stage cues. We’ve updated the Results text to make this clearer:

      “Tracking the state-transition structure of the task is imperative for solving the task as a MB-learner. All four regions encoded whether the current trial’s first-stage choice transitioned to the common or rare second-stage state (which could be inferred by a change in background colour immediately after choice indicating which second stage state they had just entered, Figure 1A).”

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Figure 7 appears to be missing.

      We thank the reviewer for pointing this out. Figure 7 was inadvertently omitted in the previous version and has now been included in the revised manuscript.

      (2) No stats reported in the section on explore/exploit.

      We apologise for this oversight. This section now also reports the relevant statistics:

      “We thus repeated our decoding analysis of choice 1 stimulus identity, but this time limited trials to those where they had not received a high reward for the previous two trials (‘explore’ trials), and those where the previous two rewards had been the highest level (‘exploit’ trials). All regions encoded choice 1 for some duration of the choice epoch for both explore (p<0.002 in all cases, permutation test; Figure 7A) and exploit (p<0.002 in all cases; Figure 7B) conditions, but decoding accuracy was strongest in ACC. Choice 1 was less strongly decoded – particularly in ACC – in the former condition compared to the latter (p<0.002 for at least 140 ms in all cases, permutation test on differences observed; Figure 7C); and, also during exploitation, the ACC encoded choice 1 before the choice was even presented to the subject (Figure S8). This pre-choice ACC encoding in exploit trials may reflect the need to allocate cognitive (or attentive) resources to features – i.e., choice 1 stimulus identity – that are most certain predictors of important outcomes. As a control, we also decoded the direction of the Choice 1 (where choice was indicated via joystick movement), which was randomised each trial and therefore orthogonal to the stimulus that was chosen. Again, all four regions encoded its direction in both explore (p<0.002 in all cases; Figure 7D) and exploit (p<0.002 in all cases; Figure 7E). However, there were minimal differences in the strength of the representation between explore and exploit conditions (ACC, p=0.088, cluster-based permutation test; DLPFC p=0.016; caudate p=0.32; putamen p=1; Figure 7F).”

      (3) Make sure that error bars are explained in all figure captions where appropriate.

      We apologise that this information was absent. Error bars always represent the standard error of the mean. This has now been added to all relevant figure legends.

      Reviewer #2 (Recommendations for the authors):

      Overall, I think this is a great manuscript and was presented clearly and succinctly. I have some minor suggestions:

      (1) Typo: Abstract "ACC, DLPFC, caudate and striatum" I think should be "caudate and putamen".

      We have amended this incorrect reference in the introduction:

      “One such task that does enable the dissociation of MB and MF computations is Daw et al. (2011)’s ‘two-step’ task [18]. It contains a probabilistic transition between task states to uncouple MF learners (who would assign credit to which state was rewarded regardless of the transition) from MB learners (who would appropriately assign credit based on the reward and transition that occurred). Rodents [19], monkeys [36], and humans [18] all use MB-like behaviour to solve the task. Evidence in rodents suggests dorsal anterior cingulate cortex (ACC) tracks rewards, states, and the probabilistic transition structure, and that ACC is essential in implementing a MB-strategy [37]. Here, we compare primate single neuron activity of 4 different subregions implicated in reward-based learning and choice (ACC, dorsolateral PFC (DLPFC), caudate, and putamen) during performance of the classic two-step task, and demonstrate signatures of MB-RL primarily in ACC, and MF-RL signatures most notably in putamen.”

      (2) Could the authors provide a rationale for why they did the single-level encoding the way they did, instead of running an ANOVA?

      We thank the reviewer for this point. We are not entirely certain which specific ANOVA approach is being suggested, but our rationale for using a GLM-based encoding analysis is that such an approach allows us to model continuous, trial-by-trial variables (e.g., value signals, prediction errors, transitions) while simultaneously controlling for multiple correlated predictors. This approach is widely used in systems neuroscience (particularly in decision-making research), offering analytical flexibility and comparability with prior approaches.

      (3) How were the 20 iterations for decoding decided? That seems low.

      We do not agree that 20 repetitions of 5-fold cross-validation is low. The error bars in panels 6C-E demonstrate how low the variance was across these 20 repetitions. We performed statistics on the average of these low-variance repetitions, using a permutation test in which the 20 repetitions were repeated a further 500 times.
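
The procedure described above can be sketched as follows: decoding accuracy is averaged over repeated k-fold cross-validation, and the same averaged statistic is recomputed on label-shuffled data to build a null distribution. This is a minimal sketch under stated assumptions. The nearest-class-mean decoder, the one-dimensional toy data, and the `(exceed + 1) / (n_perm + 1)` p-value convention are illustrative choices, not the authors' pipeline.

```python
import random
import statistics

def kfold_accuracy(x, y, k, rng):
    """One repetition of k-fold cross-validation with a nearest-class-mean decoder."""
    idx = list(range(len(x)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        means = {}
        for label in set(y[i] for i in train):
            means[label] = statistics.fmean(x[i] for i in train if y[i] == label)
        for i in test:
            pred = min(means, key=lambda c: abs(x[i] - means[c]))
            correct += pred == y[i]
    return correct / len(x)

def repeated_cv(x, y, k=5, repeats=20, seed=0):
    """Average accuracy over several random k-fold partitions."""
    rng = random.Random(seed)
    return statistics.fmean(kfold_accuracy(x, y, k, rng) for _ in range(repeats))

def permutation_p(x, y, n_perm=500, k=5, repeats=20, seed=1):
    """Compare the observed repeated-CV accuracy with a label-shuffled null."""
    rng = random.Random(seed)
    observed = repeated_cv(x, y, k=k, repeats=repeats)
    exceed = 0
    for p in range(n_perm):
        shuffled = list(y)
        rng.shuffle(shuffled)
        if repeated_cv(x, shuffled, k=k, repeats=repeats, seed=p + 2) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)

# Hypothetical, well-separated toy data: 15 trials per class.
x = [0.1 * i for i in range(15)] + [5.0 + 0.1 * i for i in range(15)]
y = [0] * 15 + [1] * 15
observed, p_value = permutation_p(x, y, n_perm=99)
print(observed, p_value)
```

Averaging over repetitions reduces the dependence of the accuracy estimate on any single random partition, and the permutation step asks how often shuffled labels reach the same averaged accuracy.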

      (4) It was unclear to me how the authors reached the conclusion "Thus, caudate activity appeared to represent the value of the state the subject was currently in." when the state value wasn't computed directly. I don't see how encoding the chosen and unchosen option is the same as the state the animal is in, which should also incorporate where the animal is in a block of trials or session, and the knowledge regarding the chosen and unchosen option.

      We agree with this point and have tempered this statement:

      “Thus, caudate’s encoding of an option’s value also reflected the availability of the option.”

      (5) Figures 1C, D, and E were not legible to me even at 200% zoom.

      We apologise for this oversight. We’ve now updated panels 1C-E to a more readable size.

      (6) There is a Figure 2H in the figure legend, but the panel appears to be missing from Figure 2.

      This text has been removed.

      (7) Figure 2: It would've been nice to see F and G for all areas.

      We have now added this data as additional panels in Figure 2.

      (8) Figure 3: How is the transition disambiguated from a visual response to the set of images?

      This was indicated by the background changing colour to that of the learned second stage state before the actual choices were presented. We’ve updated the Results text to make this clearer:

      “Tracking the state-transition structure of the task is imperative for solving the task as a MB-learner. All four regions encoded whether the current trial’s first-stage choice transitioned to the common or rare second-stage state (which was indicated by a change in background colour before the second stage choices were presented, Figure 1A).”

      (9) Figure 4F: Is this collapsed across time points? So neurons that were significant at any time? I'm confused how Figure 4A relates to 4F, as 4A shows much lower percentages of significant neurons.

      Figure 4F counts the total number of neurons that had a significant period of encoding at any time point over the epoch (as assessed with a length-based permutation test), whereas Figure 4A shows the number of significantly encoding neurons at any one time point. Investigating this further, we found that the encoding was dynamic, with different neurons encoding different parts of the epoch. We have now added a new supplementary figure to highlight this and refer to it in Results:

      “Examination of the strongest signal observed, ACC’s encoding of MB Q-values, showed a dynamic pattern with different neurons encoding the signal at different parts of the epoch (Figure S6). When aggregating the number of significant coders throughout the epoch, and examining the specificity of MB versus MF coding, we found that all regions had a significant population of neurons that encoded MB-, but not MF-, derived value (30, 18.72, 23 and 24% of neurons in ACC, DLPFC, caudate and putamen respectively; all p<0.0014 binomial test against 10% (as the strongest response to either of the two options was used); Figure 4F).”
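
The difference between counting neurons at each time point and counting them across the whole epoch, and the binomial test against a 10% chance level, can be illustrated with a small sketch. The significance mask, the neuron counts, and the one-sided exact test are assumptions for illustration. The mask is taken as given here, whereas the authors derived significant periods from a length-based permutation test.

```python
import math

def binomial_tail(k, n, p):
    """Exact one-sided P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def fraction_at_each_time(sig):
    """sig[neuron][time] is True where that neuron is significant in that time bin."""
    n = len(sig)
    return [sum(row[t] for row in sig) / n for t in range(len(sig[0]))]

def fraction_at_any_time(sig):
    """Fraction of neurons that are significant in at least one time bin."""
    return sum(any(row) for row in sig) / len(sig)

# Hypothetical dynamic code: each neuron is significant in a different time bin.
sig = [
    [True, False, False, False],
    [False, True, False, False],
    [False, False, True, False],
    [False, False, False, True],
]
print(fraction_at_each_time(sig))  # per-time-point view (as in a time-resolved panel)
print(fraction_at_any_time(sig))   # aggregated view (as in an epoch-wide count)

# Hypothetical count: 30 of 100 neurons significant, tested against a 10% chance level.
print(binomial_tail(30, 100, 0.10))
```

When different neurons carry the signal at different times, the aggregated fraction can greatly exceed the fraction seen at any single time point, which is the pattern the response describes.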

      (10) Data/ code could be made publicly available instead of upon request.

      All data and code to reproduce figures are now available at https://github.com/jamesbutler01/TwoStepExperiment. The manuscript has been updated to reflect this:

      Data and materials availability:

      All data and code to reproduce figures are available at https://github.com/jamesbutler01/TwoStepExperiment.

    1. eLife Assessment

      This valuable study utilizes a newly developed approach to culture T gondii bradyzoites in myotubes, and then takes advantage of the antiparasitic compound collection known as the Pathogen Box, to find compounds that target both tachyzoite and bradyzoite forms of the parasite. A set of compounds yielding patterns consistent with targeting the mitochondrial bc1 complex was explored further, with solid evidence for changes in ATP production in bradyzoites to support the conclusions about the importance of this complex. The paper will be interesting for parasitologists studying drug discovery of apicomplexan parasites.

    2. Reviewer #1 (Public review):

      Summary:

      The authors' goal was to advance the understanding of metabolic flux in the bradyzoite cyst form of the parasite T. gondii, since this is a major form of transmission of this ubiquitous parasite, but very little is understood about cyst metabolism and growth.

      Nonetheless, this is an important advance in understanding and targeting bradyzoite growth.

      Strengths:

      The study used a newly developed technique for growing T. gondii cystic parasites in a human muscle-cell myotube format, which enables culturing and analysis of cysts. This enabled screening of a set of anti-parasitic compounds to identify those that inhibit growth in both vegetative (tachyzoite) forms and bradyzoites (cysts). Three of these compounds were used for comparative Metabolomic profiling to demonstrate differences in metabolism between the two cellular forms.

      One of the compounds yielded a pattern consistent with targeting the mitochondrial bc1 complex and suggests a role for this complex in metabolism in the bradyzoite form, an important advance in understanding this life stage.

      Weaknesses:

      Studies such as these provide important insights into the overall metabolic differences between different life stages, and they also underscore the challenge of interpreting individual patterns caused by metabolic inhibitors due to the systemic level of some of the targets, so that some observed effects are indirect consequences of the inhibitor action. While the authors make a compelling argument for focusing on the role of the bc1 complex, there are some inconsistencies in the patterns that underscore the complexity of metabolic systems.

    3. Reviewer #2 (Public review):

      Summary:

      A particular challenge in treating infections caused by the parasite Toxoplasma gondii is to target (and ultimately clear) the tissue cysts that persist for the lifetime of an infected individual. The study by Maus and colleagues leverages the development of a powerful in vitro culture system for the cyst-forming bradyzoite stage of Toxoplasma parasites to screen a compound library for candidate inhibitors of parasite proliferation and survival. They identify numerous inhibitors capable of inhibiting both the disease-causing tachyzoite and the cyst-forming bradyzoite stages of the parasite. To characterize the potential targets of some of these inhibitors, they undertake metabolomic analyses. The metabolic signatures from these analyses lead them to identify one compound (MMV1028806) that interferes with aspects of parasite mitochondrial metabolism. In the revised version of the manuscript, the authors present convincing evidence that MMV1028806 targets the mitochondrial electron transport chain (ETC) of the parasite (although they don't identify the actual target in the ETC). The revised manuscript also nicely addresses my other criticisms of the original version. Overall, the study presents an exciting approach for identifying and characterizing much-needed inhibitors for targeting tissue cysts in these parasites.

      Strengths:

      The study presents convincing proof-of-principle evidence that the myotube-based in vitro culture system for T. gondii bradyzoites can be used to screen compound libraries, enabling the identification of compounds that target the proliferation and/or survival of this stage of the parasite. The study also utilizes metabolomic approaches to characterize metabolic 'signatures' that provide clues to the potential targets of candidate inhibitors. In addition to insights into candidate bradyzoite inhibitors, the study also provides new insights into the physiological role of the mitochondrial electron transport chain of bradyzoites, and raises a host of interesting questions around the functional roles of mitochondria in this stage of the parasite.

      Weaknesses:

      In the revised manuscript, the authors have included additional oxygen consumption rate data that indicate that MMV1028806 targets the mitochondrial electron transport chain (ETC). These data are convincing. On line 481, the authors state that "treatments with ATQ, BPQ, MMV1028806, and antimycin A resulted in substantially reduced oxygen consumption levels relative to the DMSO control and suggest indeed a blockage of the mETC consistent with the inhibition of the bc1-complex." The OCR assay the authors use is still only an indirect measure of bc1 activity. Given that most OCR-inhibiting compounds in T. gondii are bc1 inhibitors, it is possible (and perhaps likely) that MMV1028806 is targeting this complex. However, the data cannot rule out that it is targeting another component of the ETC (or potentially even a TCA cycle enzyme). Without a direct test that MMV1028806 inhibits bc1 complex activity, the authors should be more cautious in their interpretation (e.g. by acknowledging the limitations of their conclusion, or acknowledging other possible targets). Similarly, the conclusion on line 622 that "... we confirmed the bc1-complex as a target" is overstating the findings. The phrasing on lines 683-695 is more appropriate: "... suggesting that it also targets complex III or a functionally linked site within the mitochondrial electron transport chain."

    4. Reviewer #3 (Public review):

      Summary:

      The authors described an exciting 400-drug screening using a MMV pathogen box to select compounds that effectively affect the medically important Toxoplasma parasite bradyzoite stage. This work utilises a bradyzoites culture technique that was published recently by the same group. They focused on compounds that affected directly the mitochondria electron transport chain (mETC) bc1-complex and compared with other bc1 inhibitors described in the literature such as atovaquone and HDQs. They further provide metabolomics analysis of inhibited parasites which serves to provide support for the target and to characterise the outcome of the different inhibitors.

      Strengths:

      This work is important as, until now, there are no effective drugs that clear cysts during T. gondii infection. So, the discovery of new inhibitors that are effective against this parasite stage in culture and thus have the potential to battle chronic infection is needed. The further metabolic characterization provides indirect target validation and highlights different metabolic outcomes for different inhibitors. The latter forms the basis for new studies in the field to understand the mode of inhibition and mechanism of bc1-complex function in detail.

      The authors focused on the function of one compound, MMV1028806, that is demonstrated to have a similar metabolic outcome to buparvaquone. Furthermore, the authors evaluated the importance of ATP production in tachyzoite and bradyzoite stages and under atovaquone/HDQ treatment.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors' goal was to advance the understanding of metabolic flux in the bradyzoite cyst form of the parasite T. gondii, since this is a major form of transmission of this ubiquitous parasite, but very little is understood about cyst metabolism and growth. Nonetheless, this is an important advance in understanding and targeting bradyzoite growth.

      Strengths:

      The study used a newly developed technique for growing T. gondii cystic parasites in a human muscle-cell myotube format, which enables culturing and analysis of cysts. This enabled the screening of a set of anti-parasitic compounds to identify those that inhibit growth in both vegetative (tachyzoite) forms and bradyzoites (cysts). Three of these compounds were used for comparative Metabolomic profiling to demonstrate differences in metabolism between the two cellular forms.

      One of the compounds yielded a pattern consistent with targeting the mitochondrial bc1 complex and suggests a role for this complex in metabolism in the bradyzoite form, an important advance in understanding this life stage.

      Weaknesses:

      Studies such as these provide important insights into the overall metabolic differences between different life stages, and they also underscore the challenge of interpreting individual patterns caused by metabolic inhibitors due to the systemic level of some of the targets, so that some observed effects are indirect consequences of the inhibitor action. While the authors make a compelling argument for focusing on the role of the bc1 complex, there are some inconsistencies in the patterns that underscore the complexity of metabolic systems.

      We agree with reviewer #1 that metabolic fingerprints are complex to interpret and we did try to approach this problem by including mock treatment and non-metabolic inhibitors as controls. We address specific concerns below.

      Reviewer #2 (Public review):

      Summary:

      A particular challenge in treating infections caused by the parasite Toxoplasma gondii is to target (and ultimately clear) the tissue cysts that persist for the lifetime of an infected individual. The study by Maus and colleagues leverages the development of a powerful in vitro culture system for the cyst-forming bradyzoite stage of Toxoplasma parasites to screen a compound library for candidate inhibitors of parasite proliferation and survival. They identify numerous inhibitors capable of inhibiting both the disease-causing tachyzoite and the cyst-forming bradyzoite stages of the parasite. To characterize the potential targets of some of these inhibitors, they undertake metabolomic analyses. The metabolic signatures from these analyses lead them to identify one compound (MMV1028806) that interferes with aspects of parasite mitochondrial metabolism. The authors claim that MMV1028806 targets the bc1 complex of the mitochondrial electron transport chain of the parasite, although the evidence for this is indirect and speculative. Nevertheless, the study presents an exciting approach for identifying and characterizing much-needed inhibitors for targeting tissue cysts in these parasites.

      Strengths:

      The study presents convincing proof-of-principle evidence that the myotube-based in vitro culture system for T. gondii bradyzoites can be used to screen compound libraries, enabling the identification of compounds that target the proliferation and/or survival of this stage of the parasite. The study also utilizes metabolomic approaches to characterize metabolic 'signatures' that provide clues to the potential targets of candidate inhibitors, although it falls short of identifying the actual targets.

      Weaknesses:

      (1) The authors claim to have identified a compound in their screen (MMV1028806) that targets the bc1 complex of the mitochondrial electron transport chain (ETC). The evidence they present for this claim is indirect (metabolomic signatures and changes in mitochondrial membrane potential) and could be explained by the compound targeting other components of the ETC or affecting mitochondrial biology or metabolism in other ways. In order to make the conclusion that MMV1028806 targets the bc1 complex, the authors should test specifically whether MMV1028806 inhibits bc1-complex activity (i.e. in a direct enzymatic assay for bc1 complex activity). Testing the activity of MMV1028806 against other mitochondrial dehydrogenases (e.g. dihydroorotate dehydrogenase) that feed electrons into the ETC might also provide valuable insights. The experiments the authors perform also do not directly measure whether MMV1028806 impairs ETC activity, and the authors could also test whether this compound inhibits mitochondrial O2 consumption (as would be expected for a bc1 inhibitor).

      We thank the reviewer for highlighting this important aspect. To further investigate the effect of MMV1028806 on the mETC, we adapted a commercial oxygen consumption assay and demonstrated that MMV1028806, like Atovaquone and Buparvaquone, inhibits the ETC, leading to reduced oxygen consumption similar to Antimycin A, which inhibits the bc1-complex. These results are now included in the revised manuscript (Methods, lines 210–233; Results, lines 460–468).

      (2) The authors claim that compounds targeting bradyzoites have greater lipophilicity than other compounds in the library (and imply that these compounds also have greater gastrointestinal absorbability and permeability across the blood-brain barrier). While it is an attractive idea that lipophilicity influences drug targeting against bradyzoites, the effect seems pretty small and is complicated by the fact that the comparison is being made to compounds that are not active against parasites. If the authors are correct in their assertion that lipophilicity is a major determinant of bradyzoicidal compounds compared to compounds that target tachyzoites alone, you would expect that compounds that target tachyzoites alone would have lower lipophilicity than those that target bradyzoites. It would therefore make more sense to (statistically) compare the bradyzoicidal and dual-acting compounds to those that are only active in tachyzoites (visually the differences seem small in Figure S2B). This hypothesis would be better tested through a structure-activity relationship study of select compounds (which is beyond the scope of the study). Overall, the evidence the authors present that high lipophilicity is a determinant of bradyzoite targeting is not very convincing, and the authors should present their conclusions in a more cautious manner.

      Thank you for raising this excellent point. We statistically compared the compounds active only against tachyzoites with the bradyzoicidal and dually active compounds and indeed found no significant difference (P = 0.06). We altered the Results text (lines 367-368) and the Figure S2B caption to mention this explicitly.

      (3) Page 11 and Figure 7. The authors claim that their data indicate that ATP is produced by the mitochondria of bradyzoites "independently of exogenous glucose and HDQ-target enzymes." The authors cite their previous study (Christiansen et al, 2022) as evidence that HDQ can enter bradyzoites, since HDQ causes a decrease in mitochondrial membrane potential. Membrane potential is linked to the synthesis of ATP via oxidative phosphorylation. If HDQ is really causing a depletion of membrane potential, is it surprising that the authors observe no decrease in ATP levels in these parasites? Testing the importance of HDQ-target enzymes using genetic approaches (e.g. gene knockout approaches) would provide better insights than the ATP measurements presented in the manuscript, although would require considerable extra work that may be beyond the scope of the study. Given that the authors' assay can't distinguish between ATP synthesized in the mitochondrion vs glycolysis, they may wish to interpret their data with greater caution.

      We thank the reviewer for addressing this important point. The enzymatic assay used in our study cannot distinguish whether ATP is produced via glycolysis or mitochondrial respiration. However, we minimized glycolytic ATP production in bradyzoites by starving them for one week without glucose. After this period, amylopectin stores are depleted, forcing the parasites to utilize glutamine via the GABA shunt to fuel the TCA cycle and generate ATP predominantly through respiration. While minor ATP production via gluconeogenic fluxes cannot be excluded, the main ATP supply under these conditions is expected to originate from the mitochondrial electron transport chain. Indeed, ATP levels are lower in HDQ-treated bradyzoites, which we attribute to the compound’s impact on electron-supplying enzymes upstream of the bc1 complex, although this inhibition is not sufficient to fully abolish ATP production as observed with Atovaquone treatment.

      Reviewer #3 (Public review):

      Summary:

      The authors describe an exciting 400-drug screening using a MMV pathogen box to select compounds that effectively affect the medically important Toxoplasma parasite bradyzoite stage. This work utilises a bradyzoites culture technique that was published recently by the same group. They focused on compounds that affected directly the mitochondria electron transport chain (mETC) bc1-complex and compared them with other bc1 inhibitors described in the literature such as atovaquone and HDQs. They further provide metabolomics analysis of inhibited parasites which serves to provide support for the target and to characterise the outcome of the different inhibitors.

      Strengths:

      This work is important as, until now, there are no effective drugs that clear cysts during T. gondii infection. So, the discovery of new inhibitors that are effective against this parasite stage in culture and thus have the potential to battle chronic infection is needed. The further metabolic characterization provides indirect target validation and highlights different metabolic outcomes for different inhibitors. The latter forms the basis for new studies in the field to understand the mode of inhibition and mechanism of bc1-complex function in detail.

      The authors focused on the function of one compound, MMV1028806, that is demonstrated to have a similar metabolic outcome to buparvaquone. Furthermore, the authors evaluated the importance of ATP production in tachyzoite and bradyzoite stages and under atovaquone/HDQ treatment.

      Weaknesses:

      Although the authors did experiments to identify the metabolomic profile of the compounds and suggested the bc1 complex as the main target of MMV1028806, they did not provide experimental validation for that.

      In our updated manuscript, we performed additional experiments, such as an oxygen consumption assay, to further qualify the bc1 complex as the target. We also toned down some of our statements to make sure that no false claims are made.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Introduction: It would be helpful to briefly describe what the pathogen Box is, what compounds are in it, and the rationale for using a drug screen to better understand mitochondrial function in cysts.

      Thank you for this suggestion. We added an introduction to the MMV Pathogen Box and outlined the rationale for our experimental approach in lines 90 to 99.

      Please explain why dual-active drugs were useful for understanding differences, rather than just seeking drugs that might target bradyzoites alone.

      We focused on dually active compounds for two reasons. First, these are the most promising and potent targets to develop drugs against. Both stages might occur simultaneously and these dually active drugs may eliminate the need for treatment with a drug combination. Second, we speculated that monitoring the responses to inhibition of the same process in both parasite stages would reveal its functional consequences. Dually active compounds enable this direct comparison. Bradyzoite-specific compounds may be interesting from a developmental perspective but may require a reverse genetic follow-up to compare differences between stages. The lack of a well-established inducible expression system in bradyzoites that allows short term and synchronized knock-down makes metabolomic approaches difficult. We added these two points in brief to the results section (line 378 – 381).

      Figure 4: this is a very important figure in understanding the significance of the work, but it is not well described in the legend. Even if these graphics have been used in other manuscripts, it would be helpful to provide better annotation in the figure legend.

      Thank you for pointing this out. We expanded the figure legend to explain the isotopologue data in more detail (lines 793 to 802).

      B,D: Explain what the three columns for each drug category represent.

      Addressed

      C,E: Explain what isotopologues are, what the M+ notation means, and what the pie charts represent. Other main figures have suitable legends.

      Addressed

      Discussion: there are several places where the reasoning is a bit hard to follow, and rearrangement to provide a clear logical flow would be helpful. In particular, the reasoning for why HDQ impairs active but non-essential processes could be laid out more clearly.

      We added additional clarifications to the discussion section and re-wrote the HDQ paragraph. We hope that our reasoning is now easier to follow.

      Abbreviations: A list of abbreviations for the entire manuscript would be helpful.

      This is a good idea and we now provide an abbreviations list.

      Minor typos:

      P12, 2d paragraph: sentence beginning with: Consistent with this hypothesis... "cysts" is used twice

      Corrected

      P15, top of the second paragraph: "nano" and "molar" should be one word

      Corrected

      Reviewer #2 (Recommendations for the authors):

      Major comments (not already covered in the weaknesses section of the public review)

      (1) Figure 2 and the related description of these experiments in the methods section (page 3). The approach for calculating IC50 values for the compounds against tachyzoites is unclear. How did the authors determine the time point for calculating IC50 values? Was this when the DMSO control wells reached maximum fluorescence? This could be described in a clearer manner. A concern with calculating IC50 values on different days is that parasites will have undergone more lytic cycles after 7 days compared to 4 days, which means that the IC50 values for fast- vs slow-acting compounds might be quite different between these days. As a more minor comment on these experiments, the methods section does not describe whether the test compound was removed after 7 days, as the experimental scheme in Figure S1A seems to imply. Please clarify in the methods section.

      This is a very good point, and we clarified this in the Methods section (lines 157–160). In brief, we chose the latest time point at which exponential growth could be observed in the fastest-growing cultures; generally, this was in mock-treated cultures at day 4 post infection. We also clarified that we changed the media and removed the treatment after 7 days.

      Minor Comments

      (2) Page 2. "we employed a recently developed human myotube-based culture system to generate mature T. gondii drug-tolerant bradyzoites". What makes these bradyzoites 'drug-tolerant' or to which drugs are they tolerant? This isn't clear from the description.

      We added these details to the introduction (lines 94 to 96) and state that these cysts develop resistance against anti-folates, bumped kinase inhibitors and HDQ, a coenzyme Q analog.

      (3) Figure 1E. The number of compounds in this pie chart adds up to 384, whereas the methods describe that 371 compounds were tested. What explains this discrepancy in numbers?

      We understand the confusion. We have now updated the pie chart to reflect only compounds that were included in the primary screen (371), as reflected in Supplementary Table S1. We separately analysed 29 compounds that were previously tested against tachyzoites by Spalenka et al. and found an additional 13 compounds, which were originally included in the pie chart. In a secondary test, the activity of 10 of these 13 compounds could be confirmed. All in all, we found the 16 compounds shown in Fig. 2 E-G.

      (4) Page 3. The resazurin assays for measuring host cell viability could be explained in a clearer manner. What host cells were used? Were the host cells confluent when the drug was added (and the assay conducted) or was the drug added when the host cells were first seeded? How long were the host cells cultured in the candidate inhibitors before the assays were performed? At what concentration (or concentration range) were the compounds tested? The host inhibition data are not easily accessible to the reader - the authors might consider including these data as part of Table S2D.

      The necessary information was added to the methods section (lines 145 to 153). We tested for host toxicity in both HFF and KD3 myotubes during the primary screen at 10 µM in triplicate. The colorimetric assay was performed after tachyzoite growth assays in HFFs 7 days post infection and after completion of the 4-week re-growth phase of bradyzoites in myotubes. The resulting data are already part of Supplementary File 1. In addition, we performed concentration-dependent resazurin assays after the secondary concentration-dependent growth inhibition assays and also included these data in Supplementary File 1. For the bradyzoite growth assay, we visually inspected the cultures after one week of drug exposure and before tachyzoite re-growth to detect missing or damaged monolayers. These data are also included in Supplementary File 1. We also added the cytotoxicity data to Table S2D, as suggested.

      (5) Page 7. "Except for four compounds (MMV021013, MMV022478, MMV658988, MMV659004), minimal lethal concentrations were higher in bradyzoites". The variation in these data seems quite large to be making this claim. Consider a statistical analysis of these data to compare potencies in tachyzoites vs bradyzoites.

      With this sentence we aimed to describe the results and not to make a statement. We toned down the sentence to “… minimal lethal concentrations appear generally higher in bradyzoites …” (lines 344 to 347). We also added a line at 1 µM to the charts to facilitate comparison of compound efficacies.

      (6) It would be helpful to readers to include the structures of hit compounds in the figures (perhaps as part of Figure 3).

      This is a good idea and would improve the manuscript. To not overburden figure 3 we added structures to Fig S3.

      (7) Page 8. "Infected monolayers were treated for three hours with a 3-fold of respective IC50 concentrations". 3-fold higher than IC50 concentrations? This isn't clear.

      Thank you for noticing this: We clarified the sentence and also corrected the concentration, corresponding to five times their IC50s as stated in the methods section: “Infected monolayers were treated for three hours with compound concentrations five times their respective IC<sub>50</sub> values or the solvent DMSO.” Line 374 - 376

      (8) Page 9. "buparvaquone, which we found to be dually active against T. gondii tachyzoites and bradyzoites, targets the bc1-complex in Theileria annulata (McHardy et al. 1985) and Neospora caninum (Müller et al. 2015) and was recently found active against T. gondii tachyzoites (Hayward et al. 2023)." The latter paper showed that buparvaquone targets the bc1 complex in T. gondii tachyzoites as well.

      Yes, it was found to inhibit O2 consumption rate in tachyzoites. We changed the sentence accordingly. Line 407 to 411.

      (9) Page 9. "Anaplerotic substrates were also affected by all three treatments, most notably a strong accumulation of aspartic acid." It is interesting that the M+3 isotopologue of aspartate (presumably synthesised from pyruvate) is the predominant form (rather than the M+2 and M+4 isotopologues that would derive from the TCA cycle, and as the diagram in Figure 4A seems to suggest). Given that aspartate is a precursor of pyrimidine biosynthesis that is upstream of the DHODH reaction, it is conceivable that its accumulation is related to the depletion of pyrimidine biosynthesis (so would tie into the point about the accumulation of DHO and CarbAsp noted earlier in the paragraph).

      Yes, we assume the same. We altered the text and summarized the changes in Asp as a result of DHOD inhibition, as we also already do in the next paragraph using <sup>15</sup>N-glutamine labelling. Line: 416 - 418

      (10) Figure 6 and Page 10. Regarding the metabolomic experiments that show increased levels of acyl-carnitines. The authors note that "Since [beta-oxidation] is thought to be absent in T. gondii, we attribute these changes to inhibition of host mitochondria". This is conceivable, although the T. gondii genome does encode homologs of the proteins necessary for beta-oxidation (e.g. see PMID 35298557). If the carnitine is coming from host mitochondria, is host contamination a concern for interpreting the metabolomic data? Or do the authors think that parasites are scavenging carnitine from host cells? It is curious that the carnitine accumulation is observed in parasites treated with buparvaquone (and MMV1028806) but not atovaquone, even though buparvaquone and atovaquone (and possibly MMV1028806) target the same enzyme. Do the authors have any thoughts on why that might be the case?

      Yes, thank you for raising this point. We changed the discussion to elaborate on this and included the debated presence of beta-oxidation (line 640): “We also detect elevated levels of acyl-carnitines in BPQ and MMV1028806 treated bradyzoites. These molecules act as shuttles for the mitochondrial import of fatty acids for β-oxidation. However, this pathway has not been shown to be active and is deemed absent in T. gondii (35298557, 18775675). The presence of acyl-carnitines in bradyzoites might reflect import from the host. It is conceivable that their elevation in response to buparvaquone and MMV1028806 indicates compromised functionality of the host bc1-complex and subsequently accumulating β-oxidation substrates. Indeed, BPQ has a very broad activity across Apicomplexa (Hudson et al. 1985) and kinetoplastids (Croft et al. 1992).” Regarding the existence of beta-oxidation: some potential enzymes might be conserved, but those could in part belong to branched-chain amino acid degradation pathways. On a separate note: we looked extensively at beta-oxidation using stable isotope labelling and became convinced that any activity occurred in the host cell only and not in the parasite (unpublished).

      (11) Page 11. "the mitochondrial [electron] transport chain in bradyzoites".

      Corrected.

      (12) Figure S6B. Were these optimization experiments performed in tachyzoites or bradyzoites? If the former, and given that bradyzoites have apparently smaller amounts of ATP per parasite (Figure 7C), are these values in the linear range for 10^5 bradyzoites?

      Yes, we do think that the assay remains linear at these lower concentrations. Tachyzoites give a linear response starting from 10^3 parasites per sample. In the actual experiment we used 10^5 parasites, both tachyzoites and bradyzoites. Under the tested conditions bradyzoites maintain 10% of the ATP pools of tachyzoites, which should be well within the linear range of the assay. Even in atovaquone-treated bradyzoites, the ATP concentration could fall to 10% of this level and still remain in the linear range of the assay. For practical reasons, we simply acknowledge this limitation and consider it acceptable within the scope of this study.
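
      The linear-range argument in this response comes down to a short piece of arithmetic, sketched below. This is an illustration only, not part of the authors' analysis. It assumes that the ATP signal scales with parasite number times relative ATP per parasite, so that each sample can be expressed as a tachyzoite-equivalent count. Treating the atovaquone case as a further tenfold drop is an interpretation of the response, and all function and variable names are hypothetical.

```python
from fractions import Fraction

# Illustrative check of the linear-range argument (all names are hypothetical).
# Assumption: signal scales with (parasite count) x (relative ATP per parasite),
# so each sample can be expressed as a tachyzoite-equivalent count.

LINEAR_RANGE_START = 10**3    # tachyzoites per sample where linearity begins
PARASITES_PER_SAMPLE = 10**5  # parasites per sample used in the experiment


def tachyzoite_equivalents(parasites, relative_atp):
    """Number of tachyzoites that would give the same ATP signal as the sample."""
    return parasites * relative_atp


def in_linear_range(parasites, relative_atp, lower=LINEAR_RANGE_START):
    """True if the expected signal is at or above the lower linear bound."""
    return tachyzoite_equivalents(parasites, relative_atp) >= lower


# Untreated bradyzoites: about 10% of the tachyzoite ATP pool.
untreated = tachyzoite_equivalents(PARASITES_PER_SAMPLE, Fraction(1, 10))

# Treated bradyzoites: assumed further drop to 10% of the untreated level.
treated = tachyzoite_equivalents(PARASITES_PER_SAMPLE, Fraction(1, 100))

print(untreated, treated, in_linear_range(PARASITES_PER_SAMPLE, Fraction(1, 100)))
```

      Exact fractions are used instead of floats so that the comparison at the boundary of the linear range is not affected by rounding.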

      Reviewer #3 (Recommendations for the authors):

      Major comments

      (1) The authors should provide a negative control for the experiment on Figure 5. I would suggest doing the same experiment with an inhibitor that has no effect on mitochondrial potential.

      We addressed this criticism by repeating the assay on tachyzoites and additionally including inhibitors that do not have the mitochondrial electron transport chain as their primary target (pyrimethamine, clindamycin, 6-diazo-5-oxo-L-norleucine). The results are summarized in supplementary Fig. S5 (lines 445–449) and show that these inhibitors have no effect on the mitochondrial membrane potential. This supports the specificity of the assay and suggests that MMV1028806 and BPQ indeed target a mitochondrial process in this stage. Also, in this repetition, ATQ, BPQ and MMV1028806 significantly depleted the Mitotracker signal.

      (2) Figure 5 - Did the authors perform this experiment in 3 biological replicates? This requires clarification of the figure legend.

      No, we did not perform the experiment in 3 biological replicates. After establishing the assay thoroughly, we performed it once on tachyzoites and bradyzoites. The sampling was done on every vacuole we encountered during microscopy going through the slide from left to right. That is the reason the sample size varies from treatment to treatment. The sample size is mentioned in the caption of figure 5. However, we repeated the experiment with additional controls (see Fig. S5), which showed that the Mitotracker signals were significantly depleted in a very similar manner in ATQ, BPQ and MMV1028806 treated parasites.

      (3) The authors identify that MMV1028806 has bc1-complex as the main target. I suggest that they should perform a complex III activity assay to affirm this. Also, it would be good to test if other mETC complexes are affected by this compound to prove its specificity. There is only one paper showing complex III activity in tachyzoites (PMID:37471441) and no papers in bradyzoites. So if the authors cannot do this assay, I suggest that they should change the text indicating that bc-1 complex could be the main target of the compound but more experimental validation is needed.

      We hope to have satisfied the reviewer’s request by performing an oxygen consumption assay on tachyzoites. Together with metabolic profiling and labelling data, this shows that both upstream and downstream processes are impacted by MMV1028806 and strongly suggests the bc1-complex as a target (Fig 5E).

      (4) Figure S5 - Are the differences shown in the EM experiment statistically supported?

      We analyzed 28 images and measured the areas in 12 to 26 images. We replaced the table of means in Fig S6B with a graph showing individual values. These areas are indeed statistically different between DMSO and ATQ / MMV treated parasites. We changed the wording in the results section accordingly: “Analysis by thin section electron microscopy revealed a largely unaffected sub-mitochondrial ultrastructure but the areas of mitochondrial profiles were changed in comparison to control after exposure with ATQ and MMV1028806 but not with BPQ (Fig. S6)”. The description of Fig S6B was changed to “(B) Measured areas of mitochondrial profiles from 21, 12, 15 and 26 images showing DMSO, ATQ, BPQ and MMV1028806 treated parasites (* denotes p < 0.05 in Mann-Whitney tests)”.
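
      For readers unfamiliar with the Mann-Whitney test named in this legend, a minimal standard-library sketch follows. It is not the authors' analysis code. It computes the U statistics from average ranks and an approximate two-sided p-value from the normal approximation, omitting the tie and continuity corrections that statistical packages usually apply. The area values are hypothetical toy numbers in arbitrary units, not data from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U_x, U_y, approximate two-sided p) for two non-empty samples."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u_x = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u_y = n1 * n2 - u_x
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (min(u_x, u_y) - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return u_x, u_y, p


# Hypothetical mitochondrial profile areas (arbitrary units), for illustration only.
control_areas = [0.21, 0.25, 0.19, 0.30, 0.27, 0.23]
treated_areas = [0.35, 0.41, 0.29, 0.38, 0.44, 0.33]
print(mann_whitney_u(control_areas, treated_areas))
```

      With samples as small as 12 images, an exact p-value would normally be preferred over the normal approximation used here.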

      Minor comments:

      (1) What were the criteria for choosing the example compounds in Figure 1B and 1D? The authors should clarify this in the text.

      These graphs are shown for illustrative purposes and were chosen based on their display of different drug efficacies. We considered this helpful for interpreting the screening data.

      (2) Figure 2G - add statistical analysis.

      We added Mann-Whitney tests and updated the figure legend and results text accordingly in line 344 – 347.

      (3) The authors should provide more insights in the discussion about why this new compound is the next step in drug discovery compared to atovaquone or buparvaquone - for example, do you expect better availability in the brain, etc.

      We used MMV1028806 and the other hits ATQ and BPQ to make the point that the bc1-complex is a good target in bradyzoites that allows curative treatment. We do not suggest that the compound itself is a good starting point. We point to other actively developed candidates, such as the ELQ series, in the discussion (line 719).

      (4) Scale bars in Figure 5 should be aligned and have equal thickness.

      We re-formatted the scale bars and aligned them when not obscuring parasites.

      (5) The authors should be consistent with font sizes and styles in all the figures.

      We adjusted the font styles to match each other.

    1. eLife Assessment

      This study provides valuable data regarding gene expression and molecular changes that occur in the mouse spinal cord from exercise and motor activity. Overall, the findings and methods are solid, although additional independent validation experiments would improve the rigor of the study. The work provides resources for neuroscientists who investigate communication between neurons and non-neurons and both basic and translational scientists with interests in how physical activity impacts the nervous system function, with potential therapeutic outcomes.

    2. Reviewer #1 (Public review):

      Summary:

      The authors integrated bulk proteomics, single-nucleus RNA sequencing, and cellular communication pipelines to map molecular changes in the mouse lumbar spinal cord following endurance training versus acute exhaustive exercise. This kind of data is currently missing in the literature for the healthy spinal cord; therefore, this work represents a useful resource for the community for the investigation of cellular mechanisms of exercise-induced neuroplasticity. The authors found that endurance training elicited robust plastic transcriptional changes in the glia, in genes involved in synaptic modulation, axon development, and intercellular signaling, with cell-specific differences. Acute exhaustive exercise triggered a more nuanced biphasic temporal response in metabolic and synaptic genes, which was different in trained versus sedentary mice. Although cholinergic neurons did not show robust gene expression changes, they were found to be central hubs for communication with glia, suggesting that their cues may act as upstream regulators of glial plasticity.

      Strengths:

      Nuclei fixation minimized unwanted RNA degradation and tissue processing-driven expression changes. However, in the text, it needs to be acknowledged that the fixation step was performed only after nuclei isolation, and not at the stage of spinal cord tissue collection. The time course study design allowed for the distinction of different temporal gene expression trajectories.

      Weaknesses:

      No clear indication of the number of biological replicates is given. No validation of the key findings with alternative methods is presented.

      Some aspects of data analysis need to be clarified:

      (1) Methods

      a) Voluntary exercise: the authors should indicate whether the mice were singly housed, and, if not, clarify that the indicated mean km/day is an average of the mice in the cage.

      b) The authors should indicate more precisely which lumbar level of the spinal cord was used and the number of biological replicates.

      c) The authors should indicate the number of highly variable features and PCs (dims) used for Seurat and provide a QC metric table.

      (2) Results and Figures

      a) Bulk proteomic analysis: The authors used p-values, not FDR, to assess differentially abundant proteins. Can the authors indicate how many proteins passed a more stringent FDR cutoff? For GO analysis: the authors should indicate what background/reference was used.

      b) Figure 1B and Figure S1B-C: the differences in total mass and relative lean mass are very subtle, even if statistically significant. This needs to be acknowledged in the relevant sentences.

      c) Figure 2 and Figure S2E panels G and H are inverted.

      d) Heatmaps in Figures 1F and 2E-F: some of the proteins and genes listed in the text are not present in the heatmaps (TIM22 and FABP4; Smap25 and Slc4a4). Please check the correspondence of the text with the heatmap, and indicate with an arrow the listed proteins and genes.

      e) Page 9 "trained mice displayed a modest increase of oligodendrocytes 24h": from the plot, it looks to me like a decrease rather than an increase.

      f) Figure 4 depicts expression changes in selected metabolism and synaptic activity-related genes: it would be useful to add a table as a supplementary file with expression data of all the synaptic and metabolic genes in addition to the ones that were selected.
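
      Point (2a) above contrasts raw p-values with an FDR cutoff. As a minimal sketch of what that comparison involves, the block below applies the Benjamini-Hochberg adjustment to a list of p-values and counts how many pass a threshold before and after adjustment. Whether the authors would use this particular FDR procedure is an assumption; the function names and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def count_significant(p_values, alpha=0.05):
    """Return (count with raw p < alpha, count with BH-adjusted p < alpha)."""
    adjusted = benjamini_hochberg(p_values)
    raw_hits = sum(p < alpha for p in p_values)
    fdr_hits = sum(q < alpha for q in adjusted)
    return raw_hits, fdr_hits


# Hypothetical per-protein p-values, for illustration only.
example_p = [0.001, 0.03, 0.04, 0.3, 0.9]
print(count_significant(example_p))
```

      The gap between the two counts is the information the reviewer asks for: how many proteins called differentially abundant at a raw p-value threshold survive multiple-testing correction.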

    3. Reviewer #2 (Public review):

      Mansingh et al. investigate the impact of voluntary wheel training and acute physical exercise on the transcriptomic and proteomic profile of spinal cord tissues from young adult mice. They first describe the proteomic and transcriptomic differences between sedentary mice and mice provided with running wheels for voluntary exercise. They show that voluntary physical exercise induces changes at a transcriptional level as well as at a proteomic level, with most of these effects restricted to glial cells. They further analyze the putative cell interactions that are induced in the context of physical training and describe the specificity of transcriptional changes in the different cell populations. Using the same multi-omics pipeline, the authors assess dynamic changes in sedentary and trained mice 6 and 24 hours following a bout of physical exercise until exhaustion. Importantly, they demonstrate that the impact of this single bout to exhaustion is modified in mice that have access to running wheels compared with sedentary mice, with a reduced amplitude of the reaction and a faster resolution of changes caused by exercise until exhaustion.

      Altogether, this study provides a useful description of the transcriptional changes at play following voluntary physical training and, importantly, uncovers the role of this training in shaping future transcriptomic reactions to a stressful bout of exercise until exhaustion. However, the conclusions of the manuscript would be strengthened by the clarification of the methods, a better use of the proteomic data regarding the transcriptomic datasets, and a cross-validation of the main claims currently based solely on transcriptomic datasets.

      (1) In this study, the housing strategy used is key as it will impact both the proteome and transcriptome of cells in the central nervous system. It can be difficult to measure the running activity of individual mice if they are not housed individually. Yet, individual housing has a major impact on the nervous system and notably on glial cells. Therefore, a better description of the housing strategy for the sedentary and trained group during the 6 weeks of training is required.

      (2) In the first part of the paper that uses the results from the first set of multi-omics data, the protocol used is not clear. From Figure 1A, it seems that the mice went through a max performance test before and after the 6-week period in which the two groups had different life experiences (voluntary running versus sedentary). Since in the methods the maximal test protocol is effectively an exercise until exhaustion, it is difficult to understand why the authors defined this first experiment as the one allowing them to test "molecular remodeling in the spinal cord at rest". Also, it is not clear how long after the max performance test the tissues were collected. If indeed the mice went through the max endurance test before tissue collection, it is not a condition at rest, and this first protocol in some way looks like a duplication of a subpart of the second experiment. If mice did not go through this max performance test, it needs to be clarified both in the text and in the figure.

      (3) One of the strengths of this study is its multi-omics approach assessing changes at both transcriptomic and proteomic levels. Yet, the use by authors of the proteomic datasets is minimal, and there are no comments on how the proteomic and transcriptomic datasets support each other. Changes at the transcriptional level do not necessarily translate into changes at the protein level. Therefore, it would improve the quality of the study if authors could use the bulk proteomic data in relation to the transcriptomic dataset. The fact that the proteomic datasets do not provide the identity of the cells from which the changes originate should not prevent authors from putting them in perspective with transcriptomic results.

      (4) None of the results from the single-nucleus RNA sequencing are cross-validated with, for instance, in situ hybridizations. It would improve the strength of the claim if some findings, in particular regarding the dynamics of the changes 6 vs 24h after exhaustion bout, were cross-validated.

      (5) Although the authors note as a limitation that cholinergic neurons were underrepresented in their dataset, since one of the main claims of the manuscript relates to them, it calls for some additional comments on the identity of the cholinergic neurons present in their dataset. There are different populations of spinal cholinergic neurons with very different functions. It would greatly improve the strength of this result if the authors could identify which cholinergic neurons show these changes (or at least which proportion of the different cholinergic population is present in their datasets). For instance, which proportion of cholinergic neurons are expressing classical markers of motor neurons versus markers of cholinergic interneurons (for instance, from the V0c population).

    4. Reviewer #3 (Public review):

      Summary:

      Mansingh et al. used single-nucleus transcriptional and bulk proteomic profiling to characterize how gene expression changes in the lumbar spinal cord of adult, healthy mice after training (voluntary wheel-running exercise) and acutely after forced treadmill exercise. They found (1) that training was associated with a number of differentially expressed proteins, (2) training was associated with cell-type specific changes in transcription, notably glial cells had the highest numbers of differentially expressed genes, and (3) that trained mice had blunted transcriptional response to an acute exercise bout compared to sedentary mice.

      Strengths:

      The characterization of the changes to the proteome and the transcriptome associated with exercise will undoubtedly be a useful resource for scientists interested in the effects of exercise on central nervous system gene expression and may inspire mechanistic follow-up studies. The authors nicely use pathway and intercellular communication analyses to distill the complex dataset into key trends.

      Weaknesses:

      Weaknesses of this paper include two aspects of the analyses that underexplored the rich dataset. The analysis fails to explicitly compare the proteome and transcriptome results. Do the differentially expressed proteins correspond to the differentially expressed genes? If so, in which cell types? If not, why not? Comparison of the GO terms from the proteome dataset and the GSEA terms from the single-nucleus transcriptome dataset suggests that the same gene families were not identified in both data sets. I expect that integrating analyses across these datasets would help make the study truly multi-omic and highlight which expression changes are the most abundant and consistent across approaches. Second, the authors emphasize that related studies do not account for inter-individual variability in both the introduction and discussion. This aspect of the authors' dataset is also underexplored - the transcriptomic data appear to be pooled across animals, and only a single panel shows protein expression from individual animals (Fig. 1F). Is the variability in Figure 1F explainable by the amount of running on the wheel?

    1. eLife Assessment

      This study provides important insights into how working memory shapes perceptual decisions, using a dual-task design, continuous mouse tracking, and hierarchical Bayesian modeling. By dissociating fast attentional capture effects from slower, sustained perceptual biases within single trials, the authors provide compelling evidence that working memory-perception interactions unfold through distinct dynamic processes rather than a single mechanism. This work will be of interest to researchers studying working memory, perception, decision-making, and mouse-tracking methodology.

    2. Reviewer #1 (Public review):

      Summary

      This study examines how working memory (WM) influences perceptual decisions, with the aim of distinguishing fast attentional capture-like effects from slower, sustained perceptual biases. The authors use a dual-task design in which a perceptual estimation task is embedded within a WM delay, combined with a time-resolved analysis of mouse tracking reports and hierarchical Bayesian modeling. This approach reveals two temporally distinct signatures of WM-perception interactions within single trials, arguing against a unitary account of WM-driven perceptual bias and instead supporting multiple processes that operate over different timescales.

      Strengths

      A major strength of the study is its innovative use of a time-resolved mouse trajectory analysis to move beyond endpoint measures and reveal the dynamic evolution of decision biases. By decomposing trajectories into components that are and are not explained by the final response, the authors provide compelling evidence for an early transient deviation and a slower, endpoint-consistent drift. The combination of rigorous experimental design, hierarchical Bayesian modeling, and converging analyses yields compelling support for the central claims and offers a valuable framework for studying top-down influences on perception.

      Weaknesses/points requiring clarification:

      (1) The primary weakness concerns the clarity of the theoretical framing linking the identified trajectory components specifically to attentional capture and representational (or perceptual) shift. While the manuscript reviews prior work on attentional and perceptual biases, the conceptual transition to the proposed distinction between capture and representational shift would benefit from a stronger connection to the existing literature. Clarifying this relationship would strengthen the interpretation of the results.

      (2) The use of the term "continuous" to describe the trajectory analyses may be confusing for readers, as it could be interpreted as referring to a continuous task rather than a time-resolved analysis of movements performed to make a discrete response.

      (3) Figures 2 and 7 present posterior distributions of hierarchical Bayesian parameter estimates for endpoint responses in Experiments 1 and 2. However, they do not show how these model estimates relate to the raw behavioral data. Including model fits alongside the observed data would help readers assess the quality of the fits and better evaluate how well the modeling captures the underlying behavioral responses. Similarly, it would be helpful to see individual means in Figure 3a, panel 2, as is done in Figure 4.

    3. Reviewer #2 (Public review):

      Summary:

      This manuscript investigates the mechanisms by which visual working memory (WM) interacts with perceptual judgements, using continuous mouse-tracking to dissociate putative attentional capture from representational shift. Across two experiments, participants maintained a color in WM while performing an intervening perceptual matching task. Analyses of mouse trajectories revealed bidirectional influences with distinct dynamics of attentional capture and representational shift components. For WM's influence on perceptual judgments, trajectories showed a fast and endpoint-inconsistent deviation (interpreted as attentional capture by WM-matching features), followed by a slower and sustained drift that closely matched the final perceptual bias. In contrast, when perceptual judgments influenced subsequent WM recall, trajectory dynamics were dominated by the sustained drift component, with minimal capture-like deviation. Together, these findings are interpreted as evidence that WM shapes perceptual decisions through at least two temporally distinct processes.

      Strengths:

      I find the paradigm to be cleverly designed and the analyses rigorous. A major strength of this work is the use of continuous mouse-tracking and time-resolved analyses to dissociate transient influences from sustained biases within single trials. The trajectory decomposition provides an elegant way to separate early deviations from later drift, which would be difficult to achieve using traditional measures that only measure the final recall. I find the observation particularly compelling that trajectories initially deviate toward WM-matching information and then correct back toward the task-relevant target, highlighting the dynamic interplay between transient priority signals and the final decision.

      Weaknesses:

      (1) The early curvature in the mouse trajectory, inconsistent with the endpoint, is interpreted as fast attentional capture. However, this signal may also reflect competition among multiple responses driven simultaneously by the WM representation and the perceptual matching item. While the current interpretation is plausible, it would be helpful if the authors could more clearly articulate why this component should be solely interpreted as attentional capture rather than early response competition.

      (2) The mouse trajectories show a clear correction back toward the target later in the movement, particularly when the cursor enters the color wheel (Figure 3a), where the correction appears most pronounced. I wonder how this corrective phase should be interpreted. For example, does this correction reflect disengagement from an initial WM-driven priority signal, increasing influence of task demands and sensory evidence, or some other control process?

      Relatedly, movement onset latency modulated the overall AUC but did not influence the final perceptual error. I wonder whether the time courses of the capture and shift components (as revealed by the destination-vector transformation) differ between early-onset and late-onset trials, and if so, when those differences emerge. Explicitly showing these comparisons would help further clarify how early capture is corrected while the endpoint bias remains stable. It may also be informative to include representative raw trajectory paths for early- and late-onset trials, as Figure 3a is currently the only figure showing raw trajectories, whereas most subsequent results are derived measures.

      (3) The contrast in destination-vector dynamics between the perceptual matching response and the WM recall response (Figure 8) is interesting. For the representational shift component, the effect appears to increase sharply after movement onset. Conceptually, one might expect the shift in WM representation to have already occurred following perceptual judgment, rather than emerging during the response itself. It would be helpful if the authors could clarify why the shift is expressed primarily during the movement phase. Additionally, although weak, there appears to be a small capture-like deviation in the WM recall trajectories. Was this effect statistically significant? It may be informative to apply the same cluster-based permutation analysis directly comparing the capture effects against zero, in addition to the paired comparisons currently reported.
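
      The suggestion in point (3) to test the capture effect against zero with a cluster-based permutation analysis can be sketched as follows. This is a simplified, one-sided illustration under stated assumptions, not the authors' pipeline. Each participant contributes one effect time course, one-sample t-statistics are computed per time point, contiguous points above an arbitrary threshold form clusters, and the null distribution of the largest cluster mass is obtained by flipping the sign of whole participants. All names and the threshold value are hypothetical, and exhaustive enumeration of sign flips is only practical for small samples.

```python
import itertools
import math


def t_statistics(data):
    """One-sample t-statistic against zero at each time point.

    data: list of per-participant lists of equal length (one value per time point).
    A time point with zero variance is given t = 0.0 as a simplification.
    """
    n = len(data)
    stats = []
    for t in range(len(data[0])):
        column = [row[t] for row in data]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / (n - 1)
        stats.append(mean / math.sqrt(var / n) if var > 0 else 0.0)
    return stats


def max_cluster_mass(stats, threshold):
    """Largest summed t-value over runs of contiguous points above threshold."""
    best = current = 0.0
    for value in stats:
        current = current + value if value > threshold else 0.0
        best = max(best, current)
    return best


def cluster_permutation_p(data, threshold=2.0):
    """Return (observed max cluster mass, exact sign-flip permutation p-value)."""
    observed = max_cluster_mass(t_statistics(data), threshold)
    count = total = 0
    # Enumerate every assignment of +1/-1 to participants (2**n in total).
    for signs in itertools.product((1, -1), repeat=len(data)):
        flipped = [[s * v for v in row] for s, row in zip(signs, data)]
        if max_cluster_mass(t_statistics(flipped), threshold) >= observed:
            count += 1
        total += 1
    return observed, count / total
```

      Calling `cluster_permutation_p(effects)` on a participants-by-time array of capture effects returns the observed cluster mass and its permutation p-value. With many participants, a random subset of sign flips would replace the full enumeration.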

    1. eLife Assessment

      This study uses a Bayesian framework to characterize latent brain state dynamics associated with memory encoding and performance in children, as measured with functional magnetic resonance imaging. The novelty of the approach offers valuable insights into memory-related brain activity, but the consideration of developmental changes in memory and brain dynamics, and the evidence to support the proposed mapping between specific states and distinct aspects of memory, are incomplete. This work will be of interest to researchers interested in cognitive neuroscience and the development of memory.

    2. Reviewer #1 (Public review):

      Summary:

      Zeng et al. characterized the dynamic brain states that emerged during episodic encoding and the reactivation of these states during the offline rest period in children aged 8-13. In the study, participants encoded scene images during fMRI and later performed a memory recognition test. The authors adopted the BSDS approach and identified four states during encoding, including an "active-encoding" state. The occupancy rate of, and the state transition rates towards, this active-encoding state positively predicted memory accuracy across participants. The authors then decoded the brain states during pre- and post-encoding rests with the model trained on the encoding data to examine state reactivation. They found that the state temporal profile and transition structure shifted from encoding to post-encoding rest. They also showed that the mean lifetime and stability (measured with self-transition probability) of the "default-mode" state during post-encoding rest predict memory performance.
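      For readers unfamiliar with the state metrics named in this summary, the sketch below shows one common way to compute occupancy rate, mean lifetime, and self-transition probability from a sequence of per-TR state labels. It is illustrative only: the function names and the example sequence are assumptions, and the BSDS implementation may define these quantities differently (for example, from posterior probabilities rather than hard assignments).

```python
from collections import Counter
from itertools import groupby

def occupancy_rate(states, state):
    """Fraction of time points (TRs) assigned to `state`."""
    return sum(s == state for s in states) / len(states)

def mean_lifetime(states, state):
    """Average length, in TRs, of uninterrupted runs of `state`."""
    runs = [len(list(group)) for key, group in groupby(states) if key == state]
    return sum(runs) / len(runs) if runs else 0.0

def transition_probabilities(states, labels):
    """Row-normalised counts of transitions between consecutive TRs."""
    counts = Counter(zip(states[:-1], states[1:]))
    probs = {}
    for a in labels:
        total = sum(counts[(a, b)] for b in labels)
        probs[a] = {b: (counts[(a, b)] / total if total else 0.0) for b in labels}
    return probs

# Hypothetical sequence of state labels, one per TR.
sequence = [1, 1, 1, 2, 2, 1, 1, 3, 3, 3]
labels = [1, 2, 3]
probs = transition_probabilities(sequence, labels)
for s in labels:
    # probs[s][s] is the self-transition probability used as a stability index.
    print(s, occupancy_rate(sequence, s), mean_lifetime(sequence, s), probs[s][s])
```

      Occupancy rate and mean lifetime can dissociate: a state visited often but briefly can have the same occupancy as one visited rarely but for long stretches.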

      Strengths:

      How brain dynamics during encoding and offline rest support long-term memory remains understudied, particularly in children. Thus, this study addresses an important question in the field. The authors implemented an advanced computational framework to identify latent brain states during encoding and carefully characterized their spatiotemporal features. The study also showed evidence for the behavioral relevance of these states, providing valuable insights into the link between state dynamics and successful encoding and consolidation.

      Weaknesses:

      (1) If applicable, please provide information on the decoding performance of states during pre- and post-encoding rests. The Methods noted that the authors applied a threshold of 0.1 z-scored likelihood, and based on Figure S2, it seems like most TRs were assigned a reinstated state during post-encoding rest. It would be useful to know, for the decodable TRs, how strong the evidence was in favor of one state over others. Further, was decoding performance better during post- vs. pre-encoding rest? This is critical for establishing that these states were indeed "reinstated" during rest. The authors showed individual-specific correlations between encoding and post-encoding state distribution, which is an important validation of the method, but this result alone is not sufficient to suggest that the states during encoding were the ones that occurred during rest. The authors found that the state dynamics vary substantially between encoding and rest, and it would be helpful to clarify whether these differences might be related to decoding performance. I am also curious whether, if the authors apply the BSDS approach to independently identify brain states during rest periods (instead of using the trained model from encoding), they find similar states during rest as those that emerged during encoding?

      (2) During post-encoding rest, the intermediate activation state (S1) became the dominant state. Overall, the paper did not focus much on this state. For example, when examining the relationship between state transitions and memory performance, the authors did not include this state in the analyses presented in the paper (lines 203-211). Could the authors report more information about this state and/or discuss how this state might be relevant to memory formation and consolidation?

      (3) Two outcome measures from the BSDS model were the occupancy rate and the mean lifetime. The authors found a significant association with behavior and occupancy rate in some analyses, and mean lifetime in others. The paper would benefit from a stronger theoretical framing explaining how and why these two different measures provide distinct information about the brain dynamics, which will help clarify the interpretation of results when association with behavior was specific to one measure.

      (4) For performance on a memory recognition test, d' is a more common metric in the literature as it isolates the memory signal for the old items from response bias. According to Methods (line 451), the authors have computed a different metric as their primary behavioral measure (hits + correct rejections - misses - false alarms). Please provide a rationale for choosing this measure instead. Have the authors considered computing d' as well and examining brain-behavior relationships using d'?

      (5) While this study examined brain state dynamics in children, there was no adult sample to compare with. Therefore, it is hard to conclude whether the findings are specific to children (or developing brains). It would be helpful to discuss this point in the paper.
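      To illustrate point (4) above, the sketch below computes d' alongside the count-based score described in the Methods. It is a hedged example: the function names, the log-linear correction (adding 0.5 to each count to keep rates away from 0 and 1), and the two hypothetical participants are assumptions for illustration, not details from the manuscript.

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index: z(hit rate) - z(false-alarm rate), with a log-linear correction."""
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)

def count_score(hits, misses, false_alarms, correct_rejections):
    """Count-based accuracy: hits + correct rejections - misses - false alarms."""
    return hits + correct_rejections - misses - false_alarms

# Two hypothetical participants, each tested on 50 old and 50 new items.
# Both obtain the same count score, but B responds "old" more liberally.
participant_a = dict(hits=40, misses=10, false_alarms=10, correct_rejections=40)
participant_b = dict(hits=48, misses=2, false_alarms=18, correct_rejections=32)

for name, counts in [("A", participant_a), ("B", participant_b)]:
    print(name, count_score(**counts), round(d_prime(**counts), 2))
```

      Because the two measures can rank participants differently when response bias varies, reporting brain-behavior correlations for both would show whether the findings depend on the choice of metric.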

    3. Reviewer #2 (Public review):

      This paper investigates the latent dynamic brain states that emerge during memory encoding and predict later memory performance in children (N = 24, ages 8-13 years). A novel computational approach (Bayesian Switching Dynamic Systems, BSDS) discovers latent brain states from fMRI data in an unsupervised and parameter-free manner that is agnostic to external stimuli, resulting in 4 states: an active-encoding state, a default-mode state, an inactive state, and an intermediate state. The key finding is that the percentage of time occupied in the active-encoding state (characterized by greater activity in hippocampal, visual, and frontoparietal regions), as well as greater transitions to this state, predicts memory accuracy. Memory accuracy was also predicted by the mean lifetime and transitions to the default-mode state (characterized by greater activity in medial prefrontal cortex and posterior cingulate cortex) during post-encoding rest. Together, the results provide insights into dynamic interactions between brain regions that may be optimal for encoding novel information and consolidating memories for long-term retention.

      The approach is interesting and important for our understanding of neural mechanisms of memory during development, as we know less about dynamic interactions between memory systems in development.

      Moreover, the novel methodology may be broadly useful beyond the questions addressed in this study. The manuscript is well-written and concise. Nonetheless, there are several areas for improvement:

      (1) The study focuses on middle childhood, but there is a lack of engagement in the Introduction or Discussion about what is known about memory development and the brain during this period. Many of the brain regions examined in this study, particularly frontoparietal regions, undergo developmental changes that could influence their involvement in memory encoding and consolidation. The paper would be strengthened by more directly linking the findings to what is already known about episodic memory development and the brain.

      (2) A more thorough overview of the BSDS algorithm is needed, since this is likely a novel method for most readers. Although many of the nitty-gritty details can be referenced in prior work, it was unclear from the main text if the BSDS algorithm discovered latent states based on activation patterns, functional connectivity, or both. Figure 1F is not very informative (and is missing labels).

      (3) A further confusion about the BSDS algorithm was whether it necessarily had to work on the rest data. Figure 4A suggests that each TR was assigned one of the four states based on the maximum win from the log-likelihood estimation. Without more details about how this algorithm was applied to the rest data, it is difficult to evaluate the claim on page 14 about the spontaneous emergence of the states at rest.

      (4) Although the BSDS algorithm was validated in prior simulations and task-based fMRI using sustained block designs in adults, it is unclear whether it is appropriate for the kind of event-related design used in the current study. Figure 1G shows very rapid state changes, which are quantified by the low mean lifetime of the states (between 1 and 3 TRs on average) in Figure 4C. On the one hand, it is a strength of the algorithm that it is not necessarily tied to external stimuli. On the other hand, it would be helpful to see simulations validating that rapid transitions between states in fMRI data are meaningful and not due to noise.

      (5) The Methods section mentions that participants actively imagined themselves within the encoded scenes and were instructed to memorize the images for a later test during the post-encoding rest scan. This detail needs to be included in the main text and incorporated into the interpretation of the findings, as there are likely mechanistic differences between spontaneous memory replay/reinstatement vs. active rehearsal.

      (6) Information about the general linear model used to discover the 16 ROIs that showed a subsequent memory effect is missing, such as: covariates in the model (motion, etc.), group analysis approach (parametric or nonparametric), whether and how multiple-comparisons correction was performed, whether clusters were overlapping or distinct, and whether the total number of clusters was 16 or this was only a subset of regions that showed the effect.

    4. Reviewer #3 (Public review):

      Summary:

      This paper uses a novel method to look at how stable brain states and the transitions between them promote memory formation during encoding and post-encoding rest in children. I think the paper has some weaknesses (detailed below) that mean that the authors fall short of achieving their aims. Although the paper has an interesting methodological approach, the authors need better logic, and are potentially "double dipping" in their results - meaning their logic is circular. I think the method that they are using could be useful to the broader neuroimaging community, although they need to make this argument clearer in the paper.

      Strengths:

      The paper is interesting in that they use a novel method to look at brain state dynamics and how they might support memory.

      Weaknesses:

      The paper has several weaknesses:

      (1) The authors use children as their study subjects but fail to explain why children are used, whether the same phenomena are expected to be seen in adults (or only in children), and whether and how their findings change with age across an age range spanning middle childhood to early adolescence. They need to give more consideration to the development of their subject population. The authors should make it clear why and how memory was tested in children and not adults. Are children expected to encode and consolidate in a similar manner to adults? Do the findings here also apply to adults? How was the age range of 8-13-year-old children selected? Why didn't the authors look at change with age? Does memory performance change with age? Do the BSDS dynamics change with age in the authors' sample?

      (2) The authors look for brain state dynamics within a preselected set of ROIs that are selected because they display a subsequent memory effect. This is problematic because the state that is most associated with subsequent memory (S3, or State 3) is also the one that shows most activity in these regions (that have already been a priori selected due to displaying a subsequent memory effect). This logic is circular. It would be helpful if they could look at brain state dynamics in a more ROI agnostic whole brain approach so that we can learn something beyond what a subsequent memory analysis tells us. I think the authors are "double dipping" in that they selected regions for further analysis based on a subsequent memory association (remembered > forgotten contrast) and then found states within those regions showing a subsequent memory effect to further analyze for being associated with subsequent memory. Would it be possible instead to do a whole-brain analysis (something a bit more agnostic to findings) using the BSDS framework, and then, from a whole-brain perspective, look for particular brain states associated with subsequent memory? As it stands, it looks like S3 (state 3) has greater overall activation in all brain regions associated with subsequent memory, so it makes sense that this brain state is also most associated with subsequent memory. The BSDS analysis is therefore not adding anything new beyond what the authors find with the simple subsequent memory contrast that they show in Figure 1C. 
      This particularly affects the following findings: (a) active-encoding state occupancy rate correlated positively with memory accuracy, (b) transitions to the active-encoding state were beneficial / Conversely, transitions toward the inactive state (S4) were detrimental, with incoming transitions showing negative correlations with memory accuracy / The active-encoding state serves as a "hub" configuration that facilitates memory formation, while pathways leading to this state enhance performance and transitions away from it impair encoding.

      (3) The task used to test memory in children seems strange. Why should children remember arbitrary scenes? How this was chosen for encoding needs to be made clear. There needs to be more description of the memory task and why it was chosen. Why was scene encoding chosen? What does scene encoding have to do with the stated goal of (a) "Understanding how children's brains form lasting memories", (b) "optimizing education" and (c) "identifying learning disabilities"? What was the design of the recognition memory test? How many novel scenes were included in the test, and how were they chosen? How close were the "new" images to previously seen "old" images? Was this varied parametrically (i.e., was the similarity between new and old images assessed and quantified?)

      (4) They ultimately found four brain states during encoding. It would be helpful if they could make the logic and foundation for arriving at this number clear.

      (5) There is already extant work on whether brain states during post-encoding rest predict memory outcomes. This work needs to be cited and referred to. The present manuscript needs to be better situated within prior work. The authors should look at the work by Alexa Tompary and Lila Davachi. They have already addressed many of the questions that the authors seek to answer. The authors should read their papers (and the papers they cite and that cite them) and then situate their work within the prior literature.

      More minor weaknesses:

      (1) The authors should back up the claim that "successful episodic memory formation critically depends on the temporal coordination between these systems. Brain regions must coordinate their activity through dynamic functional interactions, rapidly reconfiguring their activity and connectivity patterns in response to changing cognitive demands and stimulus characteristics." Do they have any specific evidence supporting this claim?

      (2) These claims seem overstated: "this work has broad implications for understanding memory function in children, for developing educational interventions that enhance memory formation, and enabling early identification of children at risk for learning disabilities." Can the authors add citations that would support these claims, or if not, remove them?

    1. eLife Assessment

      This important study investigates the self-assembly activity of all 109 human death-fold domains. The data collected using advanced microscopy and distributed amphifluoric FRET-based flow cytometry methods are compelling to support the "phase change battery" model that explains how signal amplification can occur without ATP consumption. This paper provides new insight into the thermodynamic control of protein phase behaviors within cells and will be of interest to those studying a variety of biological pathways involved in inflammatory responses and various forms of cell death.

    2. Reviewer #1 (Public review):

      This is a high-quality and extensive study that reveals differences in the self-assembly properties of the full set of 109 human death fold domains (DFDs). Distributed amphifluoric FRET (DAmFRET) is a powerful tool that is applied here for a comprehensive examination of the self-assembly behaviour of the DFDs, in non-seeded and seeded contexts, and allows comparison of the nature and extent of self-assembly. The work reveals the nature of the barriers to nucleation in the transition from low to high AmFRET. Alongside analysis of the saturation concentration and protein concentration in the absence of seed, the work demonstrates that the subset of proteins that exhibit discontinuous transitions to higher-order assemblies are expressed more abundantly than DFDs that exhibit continuous transitions. The experiments probing the ~20% of DFDs that exhibit discontinuous transition to polymeric form suggest that they populate a metastable, supersaturated form, in the absence of cognate signal. This is suggestive of a high intrinsic barrier to nucleation.

      The differences in self-assembly behaviour are significant and highlight mechanistic differences across this large family of signalling adapter domains, with identification of a small number of key supersaturated adapters, which exhibit higher centrality within networks, and can amplify signals and transduce them to effectors as required. The description of some supersaturated DFD adaptors as long-term, high-energy storage forms or phase change adaptors is attractive and is a framework that addresses many of the requirements for on-demand signaling and amplification in innate immunity. The identification of only a small number of key adaptors and high specificity suggests a mechanism for insulation of pathways from each other and minimisation of aberrant lethal consequences.

      An optogenetic approach is applied to initiate self-assembly of CASP1 and CASP9 DFDs, as a model for apoptosome initiation in these two DFDs with differing continuous or discontinuous assembly properties. This comparison reveals clear differences in the stability and reversibility of the assemblies, supporting the authors' hypothesis that supersaturation-mediated DFD assembly underlies signal amplification in at least some of the DFDs. The study also reveals interesting correlations between supersaturation of DFD adapters in short- and long-lived cells, suggestive of a relationship between mechanism of assembly and cellular context. Additionally, the interactions are almost all homomeric or limited to members of the same DFD subfamily or interaction network and examination of bacterial proteins from innate immunity operons suggest that their polymerisation could be driven by similar mechanisms. Future detailed studies that probe the roles and activities of DFDs identified with continuous or discontinuous barriers to nucleation, through mutational analysis, in chimeric proteins and with high resolution studies of the assemblies, can build on this methodology and database.

      The Discussion effectively places this work in the context of innate immunity effectors and adapters, explains and provides a justification of the phase change material analogy, and contrasts this mechanism with phase separation. The breadth and depth of the experimental investigations allow a new view of the role of nucleation barriers and supersaturation in DFD assembly and innate immunity pathways.

    3. Reviewer #2 (Public review):

      This work studies the self-association behavior of 109 human Death Fold Domains (DFD) in eukaryotic cells and its connection to their function in innate immune signalosomes.

      Using the distributed amphifluoric FRET (DAmFRET) method previously developed by the authors, self-association is monitored as a function of protein concentration by Förster Resonance Energy Transfer in the cell.

      Several DFDs are found to be in a supersaturable state and are considered energy reservoirs necessary for signal amplification.

      The revised manuscript addresses most of the original concerns, resulting in a significant improvement.

      The following observations are made:

      (1) A group of DFDs shows a bimodal FRET distribution of no FRET and high FRET values at low and high protein concentration, which indicates a nucleation barrier. This conclusion is corroborated by the modification from a discontinuous to a continuous FRET transition by expressing a structural template or seed. The authors find that DFDs displaying discontinuous FRET behavior are supersaturated, and those that retain their discontinuous behavior in the context of the full-length protein correspond to protein adaptors of innate immune signalosomes.

      (2) The authors indicate that the adaptors of inflammatory signalosomes act as energy reservoirs for signal amplification. This is not demonstrated, but it is assumed that the energy stored in the supersaturated state is released upon polymerization.

      (3) This work also includes evidence showing that nonsupersaturable and supersaturable constructs of caspase-9 form puncta that dissolve or persist, respectively, upon apoptosome stimulation. The supersaturable construct also induces massive cell death, in contrast to the nonsupersaturable form. Although not demonstrated, these results could be related to the level of signal amplification.

      (4) The cell's lifespan depends on the supersaturation levels of certain DFDs.

      (5) Polymerization nucleated by interaction between DFDs from different pathways (different signalosomes) is rare.

      (6) The study demonstrates the presence of nucleation barriers, inferred from supersaturable conditions, in the adaptor orthologs of zebrafish (Danio rerio) and the model sponge Amphimedon queenslandica, which indicates that this characteristic is conserved.

    4. Author response:

      The following is the authors’ response to the current reviews.

      Both reviewers indicated broad approval of the revised work, for which we are grateful.

      Reviewer #1 requested no further changes.

      Reviewer #2’s Public review states:

      The authors indicate that the adaptors of inflammatory signalosomes act as energy reservoirs for signal amplification. This is not demonstrated, but it is assumed that the energy stored in the supersaturated state is released upon polymerization.

      The “assumed” link between supersaturation and energy release is in fact a thermodynamic necessity. Supersaturation is, by definition, a high free energy state. Our data shows that triggering nucleation via optogenetics results in an immediate avalanche of polymerization and cell death. This is not an assumption; it is a direct observation of work performed by the system when the kinetic barrier is removed.
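      The thermodynamic argument in this reply can be made concrete with the textbook expression for the driving force of assembly. The following is a sketch under an ideal dilute-solution approximation (activity taken as concentration), where $c$ is the concentration of soluble protein and $c_{\mathrm{sat}}$ is the saturating concentration; these symbols are introduced here for illustration and are not taken from the manuscript.

```latex
\Delta\mu \;=\; \mu_{\text{soluble}} - \mu_{\text{assembled}}
\;=\; RT \,\ln\!\left(\frac{c}{c_{\mathrm{sat}}}\right) \;>\; 0
\qquad \text{for } c > c_{\mathrm{sat}}
```

      Under this approximation, while $c$ remains above $c_{\mathrm{sat}}$, each subunit that joins the assembled phase lowers the system's free energy. The nucleation barrier controls how long the supersaturated state persists, not whether that free energy is available.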

      Reviewer #2 recommended:

      Ideally, signal amplification could be tested by determining the levels of the final product, e.g., cytokines, activated caspases...

      We did measure CASP3/7 activation, demonstrating a correlation with supersaturation of upstream adaptors. We do agree however that measuring the levels of other signaling products, including for each of the supersaturated pathways, would strengthen our claims. This will be the subject of future work.

      The authors indicate a significant anticorrelation between the saturating concentrations and the transcript abundances (Figure 2B), reporting an R = -0.285.

      This is correct… no change appears to be requested or warranted.


      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This is a high-quality and extensive study that reveals differences in the self-assembly properties of the full set of 109 human death fold domains (DFDs). Distributed amphifluoric FRET (DAmFRET) is a powerful tool that reveals the self-assembly behaviour of the DFDs, in non-seeded and seeded contexts, and allows comparison of the nature and extent of self-assembly. The nature of the barriers to nucleation is revealed in the transition from low to high AmFRET. Alongside analysis of the saturation concentration and protein concentration in the absence of seed, the subset of proteins that exhibited discontinuous transitions to higher-order assemblies was observed to have higher concentrations than DFDs that exhibited continuous transitions. The experiments probing the ~20% of DFDs that exhibit discontinuous transition to polymeric form suggest that they populate a metastable, supersaturated form in the absence of cognate signal. This is suggestive of a high intrinsic barrier to nucleation.

      Strengths:

      The differences in self-assembly behaviour are significant and likely identify mechanistic differences across this large family of signalling adapter domains. The work is of high quality, and the evidence for a range of behaviours is strong. This is an important and useful starting point since the different assembly mechanisms point towards specific cellular roles. However, understanding the molecular basis for these differences will require further analysis.

      An impressive optogenetic approach was engineered and applied to initiate self-assembly of CASP1 and CASP9 DFDs, as a model for apoptosome initiation in these two DFDs with differing continuous or discontinuous assembly properties. This comparison revealed clear differences in the stability and reversibility of the assemblies, supporting the hypothesis that supersaturation-mediated DFD assembly underlies signal amplification in at least some of the DFDs.

      The study reveals interesting correlations between supersaturation of DFD adapters in short- and long-lived cells, suggestive of a relationship between the mechanism of assembly and cellular context. Additionally, the comprehensive nature of the study provides strong evidence that the interactions are almost all homomeric or limited to members of the same DFD subfamily or interaction network. Similar approaches with bacterial proteins from innate immunity operons suggest that their polymerisation may be driven by similar mechanisms.

      Weaknesses:

      Only a limited investigation of assembly morphology was conducted by microscopy. There was a tendency for discontinuous structures to form fibrillar structures and continuous to populate diffuse or punctate structures, but there was overlap across all categories, which is not fully explored.

      We agree that an in-depth exploration of aggregate morphology would be interesting, but we feel it has limited relevance to the central findings of the manuscript. Our analysis established a relationship between discontinuous transitions and ordering based on the assumption that ordered assembly by DFDs involves polymerization, for which there is much precedent in the literature. Nevertheless, polymers of similar structure can form with different kinetics and hence, polymerization does not by itself imply an ability to supersaturate. We see this empirically in the “fibrillar” column in Fig. 1B. We have now elaborated this important point more fully in the relevant results section and in the discussion. Only five of the 108 DFDs in Fig. 1B warrant additional explanation. CASP4<sup>CARD</sup> and IFIH1<sup>tCARD</sup> lacked AmFRET but formed puncta; this could result from interactions with endogenous structures or condensates. DAPK1<sup>DD</sup> and UNC5A<sup>DD</sup> were classified as continuous (low) and fibrillar, but their AmFRET values are in fact higher than monomer control revealing that the fibrils simply comprise a small fraction of the protein. The puncta of UNC5A<sup>DD</sup> additionally do not resemble the fibrillar puncta of other DFDs; we suspect it may be a false-positive resulting from localization to mitochondrial or other intracellular membranes. Finally, CASP2<sup>CARD</sup> was inadvertently classified as punctate; this turns out to have been a technical artifact that has now been corrected (the fibrils wrapped around the cell perimeter to form ring-like puncta with anomalously low aspect ratios). We have now updated the methods section describing manual validation of our automated classification procedure, including which samples required reclassification. We have also now included all microscopy data in the public repository accompanying this manuscript.

      The methodology used to probe oligomeric assembly and stability (SDD-AGE) does not justify the conclusions drawn regarding stability and native structure within the assemblies.

      The reviewer is correct that SDD-AGE does not provide evidence against non-amyloid misfolding. It merely provides evidence that the DFDs are not forming amyloid (which are characteristically sarkosyl resistant). We have revised the sentence and further clarified that the distinction with amyloid specifically is important because amyloid is the only known form of ordered assembly (other than DFD polymers) with a nucleation barrier large enough to support deep supersaturation. Together with the series of interfacial mutants tested (and shown to impede assembly in all cases), the lack of sarkosyl-resistance provides evidence that the discontinuous DFDs are assembling through canonical DFD subunit interfaces.

      The work identifies important differences between DFDs and clearly different patterns of association. However, most of the detailed analysis is of the DFDs that exhibit a discontinuous transition, and important questions remain about the majority of other DFDs and why some assemblies should be reversible and others not, and about the nature of signalling arising from a continuous transition to polymeric form.

      We focused on discontinuous DFDs because this property allows for executive control over their respective pathways. They make signaling switch-like, which we argue is essential for innate immune responses. By contrast, and as illustrated in Figure 6D, supersaturation is required for a DFD to drive its own polymerization -- hence activation for a continuous DFD must be stoichiometrically coupled either with D/PAMP binding or positive feedback from downstream or orthogonal processes. We consider the principles underlying such regulation of signaling to be better established and understood than supersaturation, and hence built our narrative for this manuscript around the latter. Our original text addresses the fact that only a small fraction of DFDs are discontinuous. Specifically, this is expected in light of the fact that a) only one supersaturated DFD is needed to make a signaling pathway switch-like, and b) every supersaturated DFD renders the cell susceptible to spontaneous death. Evolution should therefore limit supersaturation to only the highly connected DFDs (i.e. adaptors), which is what is seen. In this view, the many nonsupersaturable DFDs have evolved to accessorize the central supersaturable DFDs with various sensor and effector modules. Our revised text attempts to further clarify this perspective.

      Some key examples of well-studied DFDs, such as MyD88 and RIPK1, deserve more discussion, since they display somewhat surprising results. More detailed exploration of these candidates, where much is known about their structures and the nature of the assemblies from other work, could substantiate the conclusions here and transform some of the conclusions from speculative to convincing.

      We were likewise initially surprised about the inability of MyD88 and RIPK1 to supersaturate. We have now elaborated in the Discussion how our findings can be rationalized by the apparent supersaturability of other adaptors in MyD88 and RIPK1 signaling pathways. We additionally discuss prior evidence that MyD88 may indeed be supersaturable, and how our experimental system could have led to a false positive in the unique case of MyD88.

      The study concludes with general statements about the relationship between stochastic nucleation and mortality, which provide food for thought and discussion but which, as they concede, are highly speculative. The analogies that are drawn with batteries and privatisation will likely not be clearly understood by all readers. The authors do not discuss limitations of the study or elaborate on further experiments that could interrogate the model.

      We have now added to the discussion a section on the limitations of our study. We appreciate that our use of “privatisation” was confusing and have omitted it. However, we consider the battery analogy to accurately convey the newfound function of DFDs and anticipate that this analogy will ultimately prove valuable for biologists. To facilitate comprehension, we have now broadened our description of phase change batteries in the introduction.

      Reviewer #2 (Public review):

      Summary:

      The manuscript from Rodriguez Gama et al. proposes several interesting conclusions based on different oligomerization properties of Death-Fold Domains (DFDs) in cells, their natural abundance, and supersaturation properties. These ideas are:

      (1) DFDs broadly store the cell's energy by remaining in a supersaturated state;

      (2) Cells are constantly in a vulnerable state that could lead to cell death;

      (3) The cell's lifespan depends on the supersaturation levels of certain DFDs.

      Overall, the evidence supporting these claims is not completely solid. Some concerns were noted.

      Strengths:

      Systematic analysis of DFD self-assembly and its relationship with protein abundance, supersaturation, cell longevity, and evolution.

      Weaknesses:

      (1) On page 2, it is stated, "Nucleation barriers increase with the entropic cost of assembly. Assemblies with large barriers, therefore, tend to be more ordered than those without. Ordered assembly often manifests as long filaments in cells," as a way to explain the observed results that DFD assemblies that transitioned discontinuously form fibrils, whereas those that transitioned continuously (low-to-high) formed spherical or amorphous puncta. It is unlikely that conventional confocal microscopy can differentiate between amorphous and structured puncta. Some DFDs self-assemble into structured puncta formed by intertwined fibrils. Such fibril nets are more structured and thus should be associated with a higher entropic cost. Therefore, the results in Figure 1B do not seem to agree with the reasoning described.

      The formation of microscopically visible elongated structures necessitates ordering on the length scale of 100s of nanometers. Otherwise surface tension would favor rounded aggregates. Conventional confocal microscopy is in fact well-suited and widely used to distinguish ordered from disordered assemblies in cells based on this principle.[1,2] We are unaware of any examples of isolated DFDs forming regular polymers that manifest as round puncta or nets. The reviewer may be referring to full-length ASC, which forms a roughly spherical mesh of filaments because it has two DFDs joined by a flexible linker. This is not applicable to our analysis with single DFDs. Single DFDs polymerize in effectively one dimension; hence a spherical punctum formed by a single DFD can only happen through noncanonical interactions or clustering of small filaments, both of which reduce order relative to long filaments.

      (2) Errors for the data shown in Figure 1B would have been very useful to determine whether the population differences between diffuse, punctate, and fibrillar for the continuous (low-to-high) transition are meaningful.

      We have now performed two statistical analyses to address this. First, using Fisher’s exact test, we observe a highly significant association between the DAmFRET and morphology classifications (p-value: 0.0001). Second, to specifically address whether the continuous (low to high) category has a preferred morphology, we applied an Exact Multinomial Test using the total frequencies of each morphology. This test revealed that all categories are significantly enriched for particular morphologies, as now indicated in the figure and legend.

      (3) A main concern in the data shown in Figure 1B and F is that the number of counts for discontinuous compared to continuous is small. Thus, the significance of the results is difficult to evaluate in the context of the broad function of DFDs as batteries, as stated at the beginning of the manuscript.

      Fig. 1B simply reports the numerical intersections between fluorescence distribution classifications and DAmFRET classifications. In Fig. 1F, our use of the chi-square test is justified by a sufficiently large sample size. Nevertheless, we obtain similar results with Fisher's exact test that accounts for smaller sample size (Odds Ratio: 75.0, P-value: < 0.0001). See also our response to the related critique by Reviewer 1 regarding the small number of discontinuous DFDs.
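
      For readers who want to see how a Fisher's exact test and an odds ratio are obtained from a 2x2 table of classifications (for example, discontinuous versus continuous against self-seeded versus not), the following is a minimal sketch using only the Python standard library. The function names and the counts in the usage lines are hypothetical illustrations and are not the counts from Figure 1F; the two-sided p-value is computed by the common convention of summing all tables that are no more probable than the observed one.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table (with the same
    margins) that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)  # tolerance for float ties
    )
    return min(1.0, p_value)

def odds_ratio(a, b, c, d):
    """Sample odds ratio (a*d)/(b*c); infinite if b*c is zero."""
    return (a * d) / (b * c) if b * c else float("inf")

# Hypothetical counts, chosen only to illustrate the calculation.
table = (8, 2, 1, 5)
print(odds_ratio(*table), fisher_exact_two_sided(*table))
```

      Because the test conditions on the row and column totals, it remains valid when some cells are small, which is the situation in which a chi-square approximation is usually questioned.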

      (4) The proteins or domains that are self-seeded (Figure 1F) should be listed such that the reader has a better understanding of whether domains or full-length proteins are considered, whether other domains have an effect on self-seeding (which is not discussed), and whether there is repetition.

      We define and consistently use “DFDs” to refer to domains, and “FL” or “DFD-containing protein” to refer to FL proteins. The Figure 1 title and corresponding section title both indicate the data refer to “DFDs”. The text callout for Figure 1F also directs readers to Table S1 where we believe the self-seeding results and details of constructs are clearly presented. There is no repetition. We have modified the legend to clarify that “Each DFD was co-expressed with an orthogonally fluorescent μNS-fused version of the same DFD.” We did not systematically evaluate seeding of FL proteins. We did however previously test self-seeding on seven representative FL proteins, and have now included those data in a new supplemental figure (S5). In short, only FL proteins with discontinuous distributions are self-seedable. These are limited to adaptors that had discontinuous seedable DFDs, revealing no adverse effect of FL protein context on seedability of adaptors (unlike receptors and effectors).

      (5) The authors indicate an anticorrelation between transcript abundance and Csat based on the data shown in Figure 2B; however, the data are scattered. It is not clear why an anticorrelation is inferred.

      An anticorrelation is indicated by the clearly placed negative R value at the top of the graph and the figure legend describing the statistical analysis.

      (6) It would be useful to indicate the expected range of degree centrality. The differences observed are very small. This is specifically the case for the BC values. The lack of context and the small differences cast doubts on their significance. It would be beneficial to describe these data in the context of the centrality values of other proteins.

      The possible range of centrality scores is 0 to 1, where 1 represents a protein that interacts with every other protein in the network (degree centrality) or lies on the shortest path between every other pair of proteins in the network (betweenness centrality). The expected range is difficult to address, as centrality values strongly depend on the size and function of the network. We considered that the SAM domain network could provide the most relevant comparison to the DFD network, as SAM domains resemble DFDs in size and structure, function heavily in signaling, are comparably numerous (76 in humans), and many of them form homopolymers (but importantly of a geometry that does not support nucleation barriers). We found that SAM domains have much lower betweenness centrality in their physical interaction network as compared to discontinuous DFDs (p = 0.0003) while their degree centrality is not significantly different (Figure S3F). Nevertheless, we stress that what matters for our conclusion is that the continuous and discontinuous values are significantly different among DFDs. Since there is a large overlap in the distributions of centrality scores between the two classes of DFDs, we performed a more robust permutation test with the Mann-Whitney U statistic and n = 10000. These tests reiterated that continuous and discontinuous DFDs have significantly different centrality scores (Degree centrality p = 0.008; Betweenness centrality p = 0.028) (Figure S3E).
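
      As an illustration of the kind of permutation test described above, the sketch below shuffles group labels and recomputes the Mann-Whitney U statistic each time. It is a minimal standard-library example under stated assumptions, not the authors' analysis code: the function names, the random seed, the add-one p-value correction, and the centrality values in the usage lines are all hypothetical.

```python
import random

def mann_whitney_u(x, y):
    """U statistic for x versus y: pairs with x > y count 1, ties count 0.5."""
    return sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x
        for yj in y
    )

def permutation_test_u(x, y, n_perm=10000, seed=0):
    """Two-sided permutation p-value using the Mann-Whitney U statistic.

    The test statistic is the distance of U from its null expectation
    len(x) * len(y) / 2. Group labels are shuffled n_perm times.
    """
    rng = random.Random(seed)
    center = len(x) * len(y) / 2
    observed = abs(mann_whitney_u(x, y) - center)
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        u = mann_whitney_u(pooled[:len(x)], pooled[len(x):])
        if abs(u - center) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical centrality scores in [0, 1] for two classes of domains.
discontinuous = [0.21, 0.18, 0.30, 0.25, 0.27]
continuous = [0.05, 0.09, 0.02, 0.11, 0.07]
print(permutation_test_u(discontinuous, continuous))
```

      A rank-based statistic is a reasonable choice here because centrality scores are bounded and often skewed, and the permutation scheme avoids relying on a large-sample approximation for the null distribution.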

      (7) Page 3 section title: "Nucleation barriers are a characteristic feature of inflammatory signalosome adaptors." This title seems to contradict the results shown in Figure 2D, where full-length CARD9 and CARD11 are classified as sensors, but it has been reported that they are adaptor proteins with key roles in the inflammatory response. Please see the following references as examples: The adaptor protein CARD9 is essential for the activation of myeloid cells through ITAM-associated and Toll-like receptors. Nat Immunol 8, 619-629 (2007), and Mechanisms of Regulated and Dysregulated CARD11 Signaling in Adaptive Immunity and Disease. Front Immunol. 2018 Sep 19;9:2105. However, both CARD9 and CARD11 show discontinuous to continuous behavior for the individual DFDs versus full-length proteins, respectively, in contrast to the results obtained for ASC, FADD, etc.

      We rigorously counter the inconsistent usage of the term “adaptor” in the signalosome literature by quantifying the centrality of each protein in the physical interaction network of DFD proteins. Such analysis shows that BCL10, which is also described as an adaptor, is the more central member of the CARD9 and CARD11 (CBM signalosome) pathways, and is therefore more “adaptor-like”. We have now elaborated this view in the text.

      FADD plays a key role in apoptosis but shows the same behavior as BCL10 and ASC. However, the manuscript indicates that this behavior is characteristic of inflammatory signalosomes. What is the explanation for adaptor proteins behaving in different ways? This casts doubts about the possibility of deriving general conclusions on the significance of these observations, or the subtitles in the results section seem to be oversimplifications.

      We agree that our initial presentation of these results and brief description of each protein’s function was insufficient to fully justify our conclusions. We have now elaborated that while FADD was historically considered an adaptor of extrinsic apoptosis, it is now appreciated as a pleiotropic molecule with both anti- and pro-inflammatory signaling functions. FADD’s pro-inflammatory roles include inflammasome activation and activating NF-kB through the FADDosome. We have now revised our section headings to avoid oversimplification.

      (8) IFI16-PYD displays discontinuous behavior according to Figure S1H; however, it is not included in Figure 2D, but AIM2 is.

      We only tested a subset of FL proteins spanning different functions within diverse signalosomes. IFI16 was not included. Hence it could not be meaningfully included in Fig. 2D.

      (9) To demonstrate that "Nucleation barriers facilitate signal amplification in human cells," constructs using APAF1 CARD, NLRC4 CARD, caspase-9 CARD, and a chimera of the latter are used to create what the authors refer to as apoptosomes. Even though puncta are observed, referring to these assemblies as apoptosomes seems somewhat misleading. In addition, it is not clear why the activity of caspase-9 was not measured directly, instead of that of caspase-3 and -7, which could be activated by other means.

      We agree that describing our chimeric assemblies as “apoptosomes” could be misleading, and have now refrained from doing so. We measured caspase-3/7 instead of caspase-9 for purely technical reasons -- we were unable to find any reliable caspase-9 activity assays that were also compatible with our optogenetic and imaging wavelengths. In any case, our data with the widely used caspase-3/7 reporter dyes confirm comparably effective signal propagation from the CASP9 versions to their relevant endogenous substrate for apoptotic signaling (pro-caspase-3/7). The subsequent differences in cell death efficiency between the two versions of CASP9 (Fig. 3E) cannot be attributed to indirect effects of blue light stimulation, because both versions received the same treatment. Note that our stated justification for using these DFDs in the HEK293T background is that these cells lack NLRC4 and CASP1 proteins, and therefore the activity we measure is due to the direct optogenetic activation.

      The polymerization of caspase-1 CARD with NLRC4 CARD, leading to irreversible puncta, could just mean that the polymers are more stable. In fact, not all DFDs form equally stable or identical complexes, which does not necessarily imply that a nucleation barrier facilitates signal amplification. Could this conclusion be an overstatement?

      Figure 3C shows that the polymers don’t simply persist following the transient stimulus -- they continue to grow. That is, the soluble protein continues to join the polymers for a net increase even though there is no longer a stimulus directing them to do so. This means the drive to polymerize is independent of the stimulus, i.e. the protein is supersaturated. In the absence of supersaturation, a difference in stability would simply change the rates at which the polymers shrink. That we see continued growth instead of shrinkage therefore cannot be explained just by a difference in stability. Nevertheless, the reviewer’s critique caused us to realize that increased persistence of the CASP1CARD polymers could contribute to signal amplification independently of supersaturation if they act catalytically (i.e. where each polymerized CASP9 subunit sequentially activates multiple CASP3/7 molecules), and we had not adequately considered this. Unfortunately, the relevant experimentalist has now moved on from the lab leaving us unable to conduct the necessary experiments to resolve these two effects in a timely fashion. Consequently, we have now tempered our interpretation of these data. 

      (10) To demonstrate that "Innate immune adaptors are endogenously supersaturated," it is stated on page 5 that ASC clusters continue to grow for the full duration of the time course and that AIM2-PYD stops growing after 5 min. The data shown in Figure 4F indicate that AIM2-PYD grows after 5 mins, although slowly, and ASC starts to slow down at ~ 13 min. Because ASC has two DFDs, assemblies can grow faster and become bigger. How is this related to supersaturation?

      That AIM2-PYD assemblies appear to grow somewhat (although not statistically significantly) would be consistent with AIM2-PYD’s sequestration into the growing ASC clusters. All that matters for our conclusion regarding ASC is that ASC assemblies grow following cessation of the stimulus, which we now describe quantitatively. Supersaturation is defined as the ratio of total concentration to saturating concentration, which is an equilibrium property. For a given protein concentration, the presence of two DFDs, each contributing their own interactions to overall stability of the assembly, will increase supersaturation relative to the individual DFDs. Importantly, growth will not occur if the protein concentration lies below its C<sub>sat</sub>, no matter how many DFDs it has.
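
      The definition given above can be written down in a few lines. The sketch below is only an illustration of that definition: the function names and the concentration values are hypothetical, and the units are arbitrary as long as both concentrations share them.

```python
def supersaturation(c_total, c_sat):
    """Ratio of total concentration to saturating concentration (C_sat)."""
    if c_sat <= 0:
        raise ValueError("c_sat must be positive")
    return c_total / c_sat

def can_grow(c_total, c_sat):
    """Net assembly growth requires the concentration to exceed C_sat."""
    return supersaturation(c_total, c_sat) > 1.0

# Hypothetical values: lowering C_sat at a fixed total concentration
# raises the supersaturation ratio.
print(supersaturation(10.0, 5.0), supersaturation(10.0, 2.0))
print(can_grow(1.0, 2.0))  # below C_sat, so no net growth
```

      In this picture, extra stabilizing interactions act by lowering C<sub>sat</sub>, which increases the ratio at a fixed total concentration without changing the requirement that the ratio exceed 1 for growth.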

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      It isn't clear what is implied by the final sentence of the Abstract. Some of the conclusions have a speculative tone and would be better described in less certain terms. The final sentence of the abstract should be omitted.

      We have revised the abstract to add appropriate nuance but consider the final sentence to be both justified by our data and important to convey our findings to a broad audience.

      How does the size and nature of the seed influence the outcome of these DFD interactions? Although some non-seeded experiments are described, the majority of the results are derived from seeded experiments. Further details about the seeds should be included. How is the size of the nucleus controlled, and will seeds of smaller or larger size generate the same pattern of results?

      This is a very important question! The seeds comprised genetic fusions of each DFD to a condensate-forming domain, as described. While this system is insufficient to explore the size-dependence of nucleation, we are developing tools to do exactly that, for example our recently published multivalent nanobody against mEos3,[3] wherein we piloted its use to compare the size-dependence of ASC versus amyloid nucleation. Much further work will be needed to fully utilize this approach for the question of interest, and that is the subject of ongoing but open-ended work in the lab.

      What is the implication of the observation that only ~20% of the DFDs exhibited a discontinuous transition from no to high AmFRET signal? Further discussion of the DFDs that exhibit a continuous transition would enrich the manuscript.

      We consider the relationship to mortality important for understanding this observation. In the discussion we now explain that each supersaturated protein in a death-inducing pathway imposes a risk of unintentional death. We speculate that evolution therefore minimizes the number of supersaturated DFDs by restricting them to central nodes in the network. That way, a small number of supersaturable DFDs can be continuously “repurposed” with new receptor proteins for each D/PAMP. Additionally, as stated in our response to the related critique, we felt it was important to focus this manuscript on the novel concept of functional supersaturation necessarily at the expense of signaling regulation through better understood mechanisms.

      Were the initial experiments with DFDs unseeded (Figure S1, F-G)? Clarify this in the text. The morphologies of all the subcellular assemblies appear similar. It is not possible to distinguish between long filaments and spherical or amorphous puncta (Figure S1F-G). Higher magnification images that allow evaluation and comparison of morphology should be provided.

      The initial experiments were unseeded, as now clarified in the legend. We believe there was a misinterpretation resulting from both panels (S1F and G) showing fibrillar examples. To clarify, we have now added panel S1H showing representative DFDs classified as “punctate”, which we hope the reviewer agrees are clearly distinct from fibrillar.

      The ASC and CARD14 assemblies in Figure S1G show very distinct fibrillar structures emerging from the μNS-DFD seeds. Please provide further explanation of the nature of these. Do these resemble ASC and CARD assemblies generated as a result of native stimuli rather than μNS-DFD seeds?

      The μNS-DFD puncta contain numerous seeding competent sites, which presumably causes multiple fibrils to initiate and emanate from them. This and potential bundling of these fibrils produces the star-like shape. We have no reason to believe the internal structure of these fibers differs from native signalosome assemblies. For example, point mutations at native subunit interfaces that were previously shown to disrupt fibrilization and signaling likewise disrupt assembly in our DAmFRET experiments (Figure S2A). To our knowledge there exist no examples of high-resolution DFD fibril structures that were induced by native stimuli. However, recent work using super-resolution imaging confirmed that nigericin-triggered endogenous ASC specks comprise a network of filaments that superficially resembles our star-like assemblies.[4]

      Figure S2B is presented as evidence that assembly is mediated by native-like interfaces rather than amyloid-like misfolding. These SDD-AGE gels cannot be used to infer a native-like structure for the protein within the assemblies, only that the assemblies are (mostly) solubilised by incubation with sarkosyl. Many misfolded but non-amyloid assemblies could be consistent with these results. Additionally, several of the samples appear to show insoluble aggregates within the wells, which could also be consistent with amyloid-type structures. What is the nature of these aggregates? Why is the NLRP3PYD sample so much more intense than the others? Why was FL-ZBP1 included when it does not contain a DFD? Why were no sarkosyl-resistant assemblies observed with RIPK3-RHIM when this is known to be highly amyloidogenic?

      ZBP1 and RIPK3<sup>RHIM</sup> were among multiple proteins inadvertently included on the complete gel shown in the original figure that are not relevant to the manuscript; we have now spliced out these unnecessary lanes (indicated with dashed lines) to avoid confusion. We have found that the specific fragment of RIPK3<sup>RHIM</sup> used in this experiment -- residues 446-464 -- does not allow for robust amyloid formation. We believe this is a steric artifact due to its small size (19 residues) relative to the fused mEos3, because a longer fragment (446-518) forms amyloid robustly. However, the latter construct was not available at the time this experiment was done. Nevertheless, another known amyloid protein, RIPK1<sup>RHIM</sup>, does show the expected smears on this gel and suffices as the positive control for amyloid. We do not understand why the NLRP3<sup>PYD</sup> sample is more intense than the others. However, this anomaly does not impact our conclusion that DFDs do not form sarkosyl-resistant smears that would be indicative of amyloid.

      Expand on the concept of autoinhibited oligomerisation. Is this due to structural features? What might be the advantage of autoinhibited oligomerisation for these DFDs?

      We have elaborated on this section in the results.

      End of page 3, which "former set of adaptors" are referred to here? This is ambiguous.

      We have replaced “former” with “innate immune”.

      Page 5, the authors state that a kinetic barrier governs the activity of inflammatory signalosomes. While under the circumstances generated in this particular system, there is a kinetic barrier to the formation of large fibrillar complexes, can the same be said to be true in cells that respond to signals? They experience a specific triggering event. This should be redrafted to distinguish between the specific trigger in cells (downstream of a binding-driven event) and the kinetic barrier to self-association observed in this model system.

      Yes, our findings establish that a kinetic barrier governs signalosome activation. By engineering a triggering event that is more specific than natural triggering events (see Figure 3), we exclude the possibility that the cell first responds to the signal to create conditions that stabilize inflammasome formation. This means that regardless of what may happen with a natural trigger, the driving force for assembly clearly pre-exists and is therefore held in check by a kinetic barrier.

      On page 6, the statement "...lifespan may be limited by the thermodynamic drive for inflammatory signal amplification" is not clear. While this is strictly true following the initial triggering event, isn't lifespan limited by the stochastic activation? These very general statements stray beyond what can be substantiated on the basis of the data presented here.

      We believe the source of confusion here was our misuse of the term “lifespan”. We have now replaced it with “life expectancy”, which we believe is substantiated by our statements as written.

      Overall, the work presents a compelling, comprehensive analysis of the seeded self-assembly of DFDs. It identifies distinct properties for assembly of these domains that may underlie their particular physiological roles. However, some of the statements are quite general and not substantiated.

      Page 6. Is "end cell fate" the intended phrase?

      We have revised the phrase.

      The data regarding conservation of DFD-like modules and activity is interesting and probably deserves inclusion. However, without substantial evidence of expression levels (i.e., results) and a more complete understanding of these other systems, the statement "These results suggest that the function of DFDs as energy reservoirs preceded the evolution of animals" appears as an over-reach.

      We demonstrated that sequence-encoded nucleation barriers of DFDs are shared across animal signalosomes (human, zebrafish, sponge). This is not trivial as such nucleation barriers are uncommon even among targeted screens of prion-like proteins.[5] Therefore, they appear to have existed in the basal animal. We have now omitted the data concerning bacterial DFDs as these systems are indeed much less understood, and the concerned pathways lack the tripartite architecture of animal signalosomes. We therefore revised the sentence in question by replacing “evolution” with “radiation”.

      Only a small number of DFDs exhibit this behaviour, so why is the conclusion drawn that energy storage for on-demand signalling may be the principal ancestral function of DFDs?

      The totality of the data supports this conclusion. Briefly (but elaborated in the text), 1) intrinsic nucleation barriers are unusual even among self-associating proteins, the vast majority of which (e.g. condensates) would suffice for the only other major function ascribed to DFDs -- bringing effectors close enough for proximity-dependent activation (which has been repeatedly demonstrated in DFD-replacement experiments), 2) nucleation barriers are nevertheless conserved in innate immune signaling pathways, and 3) that they are limited to approximately one DFD in each pathway is consistent with evolutionary selection to minimize accidental death.

      Are there any other adaptors like MyD88 that are inconsistent with this hypothesis? Are any others known to be controlled by oligomer formation? How strong is the evidence for hexameric oligomers? If there is a threshold size for oligomers, how does this differ from a stable seed/nucleus that triggers assembly, as in the discontinuous transition?

      These are all good questions related to critiques that we have now addressed.

      The use of the term "privatisation" is likely not consistently understood across the community and should be explained. Is it simply meant to imply independent operation? How is it actually different from other forms of deployment of DFDs that exhibit continuous assembly? Are they not also independent? What is implied by the opposite of privatisation here? The term may introduce ambiguity in this context.

      We have now omitted this term.

      Is there strong evidence that well-validated physiologically relevant LLPS systems exhibit supersaturation at concentrations that are very different from those of the DFDs examined in this study?

      No, and this is a major point. As discussed in the text (with references), LLPS is incompatible with cell-wide supersaturation to a comparable magnitude as crystalline transitions, which precludes them from driving signal amplification. This helps to explain why the active state of DFD assemblies is ordered, when it has been repeatedly demonstrated that signal propagation itself does not require ordering.

      The paragraph discussing TIR domains and functional amyloids would be enhanced with a comparison of amyloid systems where seeded nucleation results in assembly of a polymer with significant conformational change in the constituent monomers.

      We do not yet understand how DFDs (and TIR domains) in some cases exhibit amyloid-like nucleation barriers without overt conformational differences between monomers and polymers. Work is underway in the lab to test specific hypotheses, but such discussion would be too speculative for the present paper.

      The statement "High specificity also insulates pathways from each other" should be elaborated to discuss the issue of highly similar monomers that apparently assemble into filamentous forms with minimal structural rearrangement. How is the specificity generated?

      We have elaborated the paragraph.

      The final paragraph is speculative and utilises language that detracts from the quality and rigour of the study. While important principles have been revealed, more discussion of the limitations of the work would allow readers to evaluate the significance of the study and could be used to effectively stimulate further efforts to study the multiple different mechanisms that underpin critical signalling pathways in innate immunity and control cell fate.

      We have now revised the final paragraph and included an extensive discussion of the limitations of the work.

      Reviewer #2 (Recommendations for the authors):

      (1) For clarity, it would be useful to include the names of the proteins in the bottom table of Table S1, so that the information in the top and bottom tables can be connected.

      We are unable to determine what is meant by this suggestion. Table S1 does not have a “top” and “bottom table”. Every entry in Table S1 and S2 contains the protein name, its most frequently used alias in the literature (when not the official name), and the corresponding Uniprot protein ID.

      (2) The language used in the abstract makes analogies between scientific and mundane terms, which compromises clarity. For example, what is meant by the terms shown below?

      (a) "......specifically templated by other DFDs....."

      We have revised this phrase.

      (b) "...function like batteries, storing and converting energy for life-or-death decisions."

      Batteries convert chemical energy into electrical energy or thermal energy. What is the electrical energy produced by DFDs? Is there any evidence that DFDs change the temperature of the cells or transfer heat?

      We have now included a familiar example of a thermal battery that operates analogously to the manner we show for DFDs. As now elaborated extensively, such batteries operate via a physical rather than chemical process -- a change in the state of matter (solute to crystalline) of a supersaturated “phase change material” (this is an established term). This is exactly what we show is happening for DFDs. While it would be illustrative to measure the heat released upon DFD polymerization in cells, the much faster rate of heat transfer relative to molecular diffusion makes that impossible with present methods. Nevertheless, such measurements are unnecessary because disorder-to-order phase transitions are fundamentally exothermic.

      (c) "....privatizing..."

      We now avoid this term.

      Using appropriate scientific terms to explain the scientific results presented in this manuscript will increase clarity. Analogously, it is difficult to understand what the title of the manuscript means, "Protein phase change batteries..."

      We appreciate this critique and have removed “batteries” from the title to make the work more accessible to biologists. However, we reject the implication that such terminology is inappropriate. We presume the reviewer meant “unfamiliar” instead of “inappropriate”. The well-reasoned application of terms from other fields is standard practice and arguably essential to convey new concepts in biology. The modern biology lexicon is built on this. For example, Robert Hooke co-opted “cell” from the architecture of monasteries. More recently cell biologists appropriated “condensates” from soft matter physics. In both cases, the term while initially foreign to biologists usefully introduced a concept that lacked recognized precedent in biology. Similarly, “phase change battery” provides an accurate analogy for the central finding of our work, and we have now elaborated this analogy in the text.

      Bibliography

      (1) Garcia-Seisdedos, H., Empereur-Mot, C., Elad, N. & Levy, E. D. Proteins evolve on the edge of supramolecular self-assembly. Nature 548, 244–247 (2017).

      (2) Alberti, S., Halfmann, R., King, O., Kapila, A. & Lindquist, S. A systematic survey identifies prions and illuminates sequence features of prionogenic proteins. Cell 137, 146–158 (2009).

      (3) Kimbrough, H. et al. A tool to dissect heterotypic determinants of homotypic protein phase behavior. Protein Sci. 34, e70194 (2025).

      (4) Glück, I. M. et al. Nanoscale organization of the endogenous ASC speck. iScience 26, 108382 (2023).

      (5) Posey, A. E. et al. Mechanistic inferences from analysis of measurements of protein phase transitions in live cells. J. Mol. Biol. 433, 166848 (2021).

    1. eLife Assessment

      The authors present useful findings demonstrating that the RNA modification enzyme Mettl5 regulates sleep in Drosophila. Through transcriptome- and proteome-wide analyses, the authors identified downstream targets affected in heterozygous mutants and proposed that Mettl5 regulates the translation and degradation of clock genes to maintain normal sleep function. Through additional analyses, the authors provided solid evidence that Mettl5 regulates translation and degradation of clock genes to maintain a normal sleep cycle. The mechanistic details of Mettl5 function are unclear and require further support.

    2. Reviewer #1 (Public review):

      Here, the authors attempted to test whether the function of Mettl5 in sleep regulation was conserved in Drosophila, and if so, by which molecular mechanisms. To do so they performed sleep analysis, as well as RNA-seq and ribo-seq, in order to identify the downstream targets. They found that the loss of one copy of Mettl5 affects sleep, and that its catalytic activity is important for this function. Transcriptional and proteomic analyses show that multiple pathways were altered, including the clock signaling pathway and the proteasome. Based on these changes, the authors propose that Mettl5 modulates sleep through regulation of the clock genes, both at the level of their production and degradation, possibly by altering the usage of aspartate codons.

      Comments on revised version:

      The authors satisfactorily addressed my comments, even though the precise mechanism by which Mettl5 regulates translation of clock genes remains to be firmly demonstrated.

    3. Reviewer #3 (Public review):

      Xiaoyu Wu and colleagues examined a potential role in sleep of a Drosophila ribosomal RNA methyltransferase, mettl5. Based on sleep defects reported in CRISPR-generated mutants, the authors performed both RNA-seq and Ribo-seq analyses of head tissue from mutants and compared them to control animals collected at the same time point. A major conclusion was that the mutant showed altered expression of circadian clock genes, and that the altered expression of the period gene in particular accounted for the sleep defect reported in the mettl5 mutant. In this revision, the authors have added a more thorough analysis of clock gene expression and show that PER protein levels are increased relative to wild-type animals at specific times of day, indicating increased stability of the protein. Given that PER inhibits its own transcription, the per RNA is low in the mutants. Efforts toward a more detailed understanding of how clock gene expression was altered in the mutants, as well as other clarifications of sleep phenotypes throughout, are appreciated. As noted above, a strength of this work is its relevance to a human developmental disorder as well as the transcriptomic and ribosomal profiling of the mutant. However, there still remain some minor weaknesses in the manuscript. This reviewer is not in agreement with the interpretation of the epistasis experiments. Specifically, co-expression of Clk[jrk] or per[01] with the mettl5 mutant recovered the nighttime sleep phenotype, but was additive to the daytime sleep phenotype such that double mutants showed higher sleep. This effect should be acknowledged and discussed. Overall, this is an interesting paper that indicates a molecular link between mettl5 and the circadian clock in regulation of sleep.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Here the authors attempted to test whether the function of Mettl5 in sleep regulation was conserved in Drosophila, and if so, by which molecular mechanisms. To do so they performed sleep analysis, as well as RNA-seq and ribo-seq, in order to identify the downstream targets. They found that the loss of one copy of Mettl5 affects sleep and that its catalytic activity is important for this function. Transcriptional and proteomic analyses show that multiple pathways were altered, including the clock signaling pathway and the proteasome. Based on these changes, the authors propose that Mettl5 modulates sleep through regulation of the clock genes, both at the level of their production and degradation.

      Strengths:

      The phenotypical consequence of the loss of one copy of Mettl5 on sleep function is clear and well-documented.

      Weaknesses:

      The imaging and molecular parts are less convincing.

      - The colocalization of Mettl5 with glial and neuronal cells is not very clear

      We truly appreciate your suggestion. We repeated the staining experiments. To obtain better results, we tried a different anti-ELAV antibody (mouse) and optimized the experimental conditions. This result has been included in Figure S1 of the revised version.

      - The section on gene ontology analysis is long and confusing

      The section has been revised for clarity. To improve the logical flow, we deleted the paragraph describing the details of Figure S6.

      - Among all the pathways affected, the focus on the proteasome sounds like cherry picking. And there is no experiment demonstrating its impact on the Mettl5 phenotype

      Thank you for the comments. The opposite changes in period at the transcriptional versus translational levels puzzled us for a while, until we found the changes in ubiquitin pathway components. The regulation of Period protein degradation by the ubiquitin-proteasome pathway has been well documented (Grima et al., 2002; Ko et al., 2002; Chiu et al., 2008). In addition, previous reports indicated that N6-methyladenosine (m6A) regulates the ubiquitin-proteasome pathway in skeletal muscle physiology (Sun et al., 2023). This information has been included in the revised manuscript in the last paragraph under the title: Mettl5 regulates the clock gene regulatory loop.

      Indeed, we have not found a suitable way to manipulate proteasome levels in genetic tests. The proteasome is a large protein complex composed of many subunits, so enhancing its activity by overexpressing individual components was not feasible. Moreover, the proteasome has important functions in many biological processes. Disrupting its function simply by MG132 treatment, which we tried, results in many side effects.

      In this study, we also noticed the alteration in codon usage caused by the mettl5 mutant. Please refer to our answer to the following question for details. Previous reports also found that mettl5 regulates translation in other systems (Rong et al., 2020; Peng et al., 2022). Based on these analyses, it is possible that both translational regulation and protein degradation contributed to the Period protein upregulation found in the mettl5 mutant. This idea has been included in the Discussion section of the revised manuscript.

      References

      Sun J, Zhou H, Chen Z, et al. Altered m6A RNA methylation governs denervation-induced muscle atrophy by regulating ubiquitin proteasome pathway. J Transl Med. 2023;21(1):845. Published 2023 Nov 23. doi:10.1186/s12967-023-04694-3

      Grima, B. et al. The F-box protein slimb controls the levels of clock proteins period and timeless. Nature 420, 178–182 (2002).

      Ko, H. W., Jiang, J. & Edery, I. Role for Slimb in the degradation of Drosophila period protein phosphorylated by doubletime. Nature 420, 673–678 (2002).

      Chiu, J. C., Vanselow, J. T., Kramer, A. & Edery, I. The phosphooccupancy of an atypical SLIMB-binding site on PERIOD that is phosphorylated by DOUBLETIME controls the pace of the clock. Genes Dev. 22, 1758–1772 (2008).

      - The ribo-seq shows some changes at the level of translation efficiency, but there is no connection with the Mettl5 phenotypes. In other words, how does the increased usage of some codons impact clock signalling? Are the genes enriched for these codons?

      Thank you for raising this point. In our analysis, we observed an increased usage of the codons for Asp in the Mettl5 mutant. Prior work has reported a possible connection between codon usage and PER protein activity. In that report, a codon-optimized version of per could not rescue the loss of circadian rhythmicity in per mutants, in contrast to the WT version (Fu J et al. 2016). Further study indicated that dPER protein levels were also elevated in the mutant flies, suggesting a role for codon optimization in enhancing dPER expression (Figure 2B in Fu J et al. 2016). Consistent with this, we analyzed the codon-optimized regions in Fu J et al. 2016. The result indicated that GAC has a relatively high usage rate in these regions (indicated by the red arrow in the following two Author response images), suggesting that the Mettl5 mutation may influence PER protein accumulation through altered GAC usage. Further experiments are needed to confirm this possibility. We included these details in the second-to-last paragraph of the Discussion section.

      Author response image 1.

      15-21

      SDSAYSN

      Author response image 2.

      43-316

      SSGSSGYGGKPSTQASSSDMIIKRNKEKSRKKKKPKCIALATATTVSLEGTEESPLPANGGCEKVLQELQDTQQLGEPLVVTETQLSEQLLETEQNEDQNKSEQLAQFPLPTPIVTTLSPGIGPGHDCVGGASGGAVAGGCSVVGAGTDKTSELIPGKLESAGTKPSQERPKEESFCCVISMHDGIVLYTTPSISDVLGFPRDMWLGRSFIDFVHHKDRATFASQITTGIPIAESRGCMPKDARSTFCVMLRRYRGLNSGGFGVIGRAVNYEPF

      Fu J, Murphy KA, Zhou M, Li YH, Lam VH, Tabuloc CA, Chiu JC, Liu Y. Codon usage affects the structure and function of the Drosophila circadian clock protein PERIOD. Genes Dev. 2016 Aug 1;30(15):1761-75.
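
      As an illustration only, the kind of codon tally discussed in this response can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the analysis pipeline used in the manuscript: the function names (codon_counts, gac_fraction) and the toy coding sequence are hypothetical, and a real analysis would use the actual per coding sequence and the codon-optimized regions reported by Fu et al. (2016).

```python
from collections import Counter

# The two synonymous codons for aspartate (Asp) in the standard genetic code.
ASP_CODONS = ("GAC", "GAT")


def codon_counts(cds):
    """Count in-frame codons of a coding sequence (DNA or RNA letters)."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))


def gac_fraction(cds):
    """Fraction of Asp codons that are GAC; None if the sequence has no Asp."""
    counts = codon_counts(cds)
    asp_total = sum(counts[codon] for codon in ASP_CODONS)
    if asp_total == 0:
        return None
    return counts["GAC"] / asp_total


# Hypothetical toy sequence (NOT from per): ATG GAC GAT GAC TCC GAC TAA
toy_cds = "ATGGACGATGACTCCGACTAA"
print(codon_counts(toy_cds)["GAC"], gac_fraction(toy_cds))
```

      Comparing such a fraction between a region of interest and the rest of the coding sequence gives a simple measure of GAC enrichment; it says nothing by itself about translation efficiency.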

      - A few papers have already demonstrated the role of Mettl5 in translation, even at the structural level (Rong et al, Cell Reports 2020), and this was not commented on by the authors. In Peng et al, 2022 the authors show that the m6A bridges the 18S rRNA with RPL24. Is this conserved in Drosophila?

      Thanks for the reminder. We discussed and cited these papers in the revised version.

      Rong B, Zhang Q, Wan J, et al. Ribosome 18S m<sup>6</sup>A Methyltransferase METTL5 Promotes Translation Initiation and Breast Cancer Cell Growth. Cell Rep. 2020;33(12):108544. doi:10.1016/j.celrep.2020.108544

      Peng H, Chen B, Wei W, et al. N<sup>6</sup>-methyladenosine (m<sup>6</sup>A) in 18S rRNA promotes fatty acid metabolism and oncogenic transformation. Nat Metab. 2022;4(8):1041-1054. doi:10.1038/s42255-022-00622-9

      - The text will require strong editing and the authors should check and review extensively for improvements to the use of English.

      Thanks. The text of the paper has been thoroughly revised.

      Conclusion

      Despite the effort to identify the underlying molecular defects following the loss of Mettl5, the authors fell short in doing so. Some of the results are over-interpreted, and more experiments will be needed to understand how Mettl5 controls the translation of its targets. References to previous work were poorly discussed.

      Thanks for your suggestion. We have incorporated the references mentioned above. However, our efforts have thus far fallen short of elucidating a precise picture of METTL5's functional mechanism. To address this, the limitations of the current study have been discussed more thoroughly in the revised main text.

      Reviewer #2 (Public review):

      Summary:

      The authors define the m6A methyltransferase Mettl5 as a novel sleep-regulatory gene that contributes to specific aspects of Drosophila sleep behaviors (i.e., sleep drive and arousal at early night; sleep homeostasis) and propose the possible implication of Mettl5-dependent clocks in this process. The model was primarily based on the assessment of sleep changes upon genetic/transgenic manipulations of Mettl5 expression (including CRISPR-deletion allele); differentially expressed genes between wild-type vs. Mettl5 mutant; and interaction effects of Mettl5 and clock genes on sleep. These findings exemplify how a subclass of m6A modifications (i.e., Mettl5-dependent m6A) and possible epi-transcriptomic control of gene expression could impact animal behaviors.

      Strengths:

      Comprehensive DEG analyses between control and Mettl5 mutant flies reveal the landscape of Mettl5-dependent gene regulation at both transcriptome and translatome levels. The molecular/genetic features underlying Mettl5-dependent gene expression may provide important clues to molecular substrates for circadian clocks, sleep, and other physiology relevant to Mettl5 function in Drosophila.

      Weaknesses:

      While these findings indicate the potential implication of Mettl5-dependent gene regulation in circadian clocks and sleep, several key data require substantial improvement and rigor of experimental design and data interpretation for fair conclusions. Weaknesses of this study and possible complications in the original observations include but are not limited to:

      (1) Genetic backgrounds in Mettl5 mutants: the heterozygosity of Mettl5 deletion causes sleep suppression at early night and long-period rhythms in circadian behaviors. The transgenic rescue using Gal4/UAS may support the specificity of the Mettl5 effects on sleep. However, it does not necessarily exclude the possibility that the Mettl5 deletion stocks somehow acquired long-period mutation allelic to other clock genes. Additional genetic/transgenic models of Mettl5 (e.g., homozygous or trans-heterozygous mutants of independent Mettl5 alleles; Mettl5 RNAi etc.) can address the background issue and determine 1) whether sleep suppression tightly correlates with long-period rhythms in Mettl5 mutants; and 2) whether Mettl5 effects are actually mapped to circadian pacemaker neurons (e.g., PDF- or tim-positive neurons) to affect circadian behaviors, clock gene expression, and synaptic plasticity in a cell-autonomous manner and thereby regulate sleep. Unfortunately, most experiments in the current study rely on a single genetic model (i.e., Mettl5 heterozygous mutant).

      We believe that the multiple rescue experiments presented in Figure 1H-L and Figure 2H-L have effectively addressed the background concern. To further confirm this, we repeated the sleep and circadian rhythm assays using RNAi lines, aiming to eliminate any remaining concerns in this regard. The RNAi knockdown appears to replicate the reduced sleep phenotype seen at night. This result has been included in Figure S1. It is true that we have not specifically addressed whether the effects of Mettl5 are mapped to circadian pacemaker neurons in this study. We acknowledge this as a limitation and appreciate the importance of this question. Further investigations focusing on circadian pacemaker neurons, such as PDF- or tim-positive neurons, would be necessary to clarify the precise role of Mettl5 in regulating circadian behaviors and related molecular mechanisms.

      (2) Gene expression and synaptic plasticity: gene expression profiles and the synaptic plasticity should be assessed by multiple time-point analyses since 1) they display high-amplitude oscillations over the 24-h window and 2) any phase-delaying mutation (e.g., Mettl5 deletion) could significantly affect their circadian changes. The current study performed a single time-point assessment of circadian clock/synaptic gene expression, misleading the conclusion for Mettl5 effects. Considering long-period rhythms in Mettl5 mutant clocks, transcriptome/translatome profiles in Mettl5 cannot distinguish between direct vs. indirect targets of Mettl5 (i.e., gene regulation by the loss of Mettl5-dependent m6A vs. by the delayed circadian phase in Mettl5 mutants).

      In the revised version, we provided data collected at multiple time points. Specifically, we reexamined per expression at both transcriptional and translational levels at different time points. The corresponding results were incorporated in Figure 4D-F. We also dissected fly brains from UAS-DenMark, UAS-syt.eGFP/+; pdf-GAL4/+ and UAS-DenMark, UAS-syt.eGFP/+; pdf-GAL4/Mettl5<sup>1bp</sup> at four time points to quantify the synaptic structures of PDF neurons. The result has been included in revised Figure 6.

      (3) The text description for gene expression profiling and Mettl5-dependent gene regulation was very detailed, yet there is a huge gap between gene expression profiling and sleep/behavioral analyses. The model in Figure 5 should be better addressed and validated.

      Thank you for your suggestion. We added data to better confirm the expression changes of PER protein at different time points. Indeed, the point you raise is the weak point of this paper. We carried out a thorough analysis during the revision process.

      The opposing changes in Period at the transcriptional versus translational levels puzzled us for some time until we identified alterations in the ubiquitin pathway components. The regulation of Period protein degradation by the ubiquitin-proteasome pathway is well-documented (Grima et al., 2002; Ko et al., 2002; Chiu et al., 2008). Additionally, previous studies have shown that N6-methyladenosine (m6A) modulates the ubiquitin-proteasome pathway in skeletal muscle physiology (Sun et al., 2023). We have incorporated this information into the revised manuscript in the last paragraph under the section titled: Clock gene regulatory loop regulating circadian rhythm was affected by Mettl5<sup>1bp</sup>.

      Indeed, we have not yet identified an effective method to manipulate proteasome levels in genetic tests. The proteasome is a large protein complex composed of numerous subunits, making it impractical to enhance its activity simply by overexpressing individual components. Furthermore, the proteasome plays a critical role in many biological processes. Disrupting its function—such as through MG132 treatment, which we attempted—leads to significant off-target effects.

      Sun J, Zhou H, Chen Z, et al. Altered m6A RNA methylation governs denervation-induced muscle atrophy by regulating ubiquitin proteasome pathway. J Transl Med. 2023;21(1):845. Published 2023 Nov 23. doi:10.1186/s12967-023-04694-3

      Grima, B. et al. The F-box protein slimb controls the levels of clock proteins period and timeless. Nature 420, 178–182 (2002).

      Ko, H. W., Jiang, J. & Edery, I. Role for Slimb in the degradation of Drosophila period protein phosphorylated by doubletime. Nature 420, 673–678 (2002).

      Chiu, J. C., Vanselow, J. T., Kramer, A. & Edery, I. The phosphooccupancy of an atypical SLIMB-binding site on PERIOD that is phosphorylated by DOUBLETIME controls the pace of the clock. Genes Dev. 22, 1758–1772 (2008).

      Reviewer #3 (Public review):

      Xiaoyu Wu and colleagues examined the potential role in sleep of a Drosophila ribosomal RNA methyltransferase, mettl5. Based on sleep defects reported in CRISPR-generated mutants, the authors performed both RNA-seq and Ribo-seq analyses of head tissue from mutants and compared them to control animals collected at the same time point. While these data were subjected to a thorough analysis, it was difficult to understand the relative direction of differential expression between the two genotypes. In any case, a major conclusion was that the mutant showed altered expression of circadian clock genes, and that the altered expression of the period gene in particular accounted for the sleep defect reported in the mettl5 mutant. As noted above, a strength of this work is its relevance to a human developmental disorder as well as the transcriptomic and ribosomal profiling of the mutant. However, there are numerous weaknesses in the manuscript, most of which stem from misinterpretation of the findings, some methodological approaches, and also a lack of method detail provided. The authors seem to have missed a major phenotype associated with the mettl5 mutant, which is that it caused a significant increase in period length, which was apparent even in a light:dark cycle. Thus the effect of the mutant on clock gene expression more likely contributed to this phenotype than to any associated with changes in sleep behavior.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Some of the questions that the authors should address are the following ones:

      How does Mettl5 control the translation of the clock genes? Why are the levels of some genes specifically increased or decreased? What is the relation with the effect on uORFs and dORFs, overlapping and non-overlapping ones? The observation of these defects is interesting, but how they occur and how they impact clock signaling is missing.

      Thank you for your suggestion. This is the weak point of this paper. We carried out a thorough analysis during the revision process.

      The opposing changes in Period at the transcriptional versus translational levels puzzled us for some time until we identified alterations in the ubiquitin pathway components. The regulation of Period protein degradation by the ubiquitin-proteasome pathway is well-documented (Grima et al., 2002; Ko et al., 2002; Chiu et al., 2008). Additionally, previous studies have shown that N6-methyladenosine (m6A) modulates the ubiquitin-proteasome pathway in skeletal muscle physiology (Sun et al., 2023). We have incorporated this information into the revised manuscript in the last paragraph under the section titled: Clock gene regulatory loop regulating circadian rhythm was affected by Mettl5<sup>1bp</sup>.

      Indeed, we have not yet identified an effective method to manipulate proteasome levels in genetic tests. The proteasome is a large protein complex composed of numerous subunits, making it impractical to enhance its activity simply by overexpressing individual components. Furthermore, the proteasome plays a critical role in many biological processes. Disrupting its function—such as through MG132 treatment, which we attempted—leads to significant off-target effects.

      In this study, we also observed codon usage alterations caused by the mettl5 mutant. For details, please refer to our response to the fourth point in the Weaknesses section above. Previous studies have reported mettl5's role in translational regulation in other systems (Rong et al., 2020; Peng et al., 2022). Based on these findings, we propose that both translational regulation and protein degradation may contribute to the upregulation of Period protein in the mettl5 mutant. This hypothesis has been included in the Discussion section of the revised manuscript.

      “The mechanism by which METTL5 regulates translation warrants further investigation. Previous studies have demonstrated that METTL5 influences translation (Rong et al., 2020; Peng et al., 2022), but whether the mechanisms identified here are conserved across other systems remains an intriguing question. In our analysis, we observed increased usage of aspartate (Asp) codons in Mettl5 mutants. Notably, prior work has linked codon usage to PER protein function—specifically, a codon-optimized version of PER failed to rescue circadian rhythmicity in per mutant flies, unlike the wild-type version (Fu et al., 2016). Further analysis revealed that PER protein levels were elevated in these mutants, suggesting that codon optimization enhances PER expression (Figure 2B in Fu et al., 2016). Strikingly, when we examined the codon-optimized region from Fu et al. (2016), we found that GAC (Asp) was highly enriched, raising the possibility that Mettl5 mutation affects PER protein accumulation by altering GAC codon usage. Additional experiments will be needed to validate this hypothesis. Furthermore, we detected changes in upstream open reading frames (uORFs) in Mettl5 mutants, but their relationship to translational regulation requires further exploration.”

      References

      Sun J, Zhou H, Chen Z, et al. Altered m6A RNA methylation governs denervation-induced muscle atrophy by regulating ubiquitin proteasome pathway. J Transl Med. 2023;21(1):845. Published 2023 Nov 23. doi:10.1186/s12967-023-04694-3

      Grima, B. et al. The F-box protein slimb controls the levels of clock proteins period and timeless. Nature 420, 178–182 (2002).

      Ko, H. W., Jiang, J. & Edery, I. Role for Slimb in the degradation of Drosophila period protein phosphorylated by doubletime. Nature 420, 673–678 (2002).

      Chiu, J. C., Vanselow, J. T., Kramer, A. & Edery, I. The phosphooccupancy of an atypical SLIMB-binding site on PERIOD that is phosphorylated by DOUBLETIME controls the pace of the clock. Genes Dev. 22, 1758–1772 (2008).

      Rong B, Zhang Q, Wan J, et al. Ribosome 18S m<sup>6</sup>A Methyltransferase METTL5 Promotes Translation Initiation and Breast Cancer Cell Growth. Cell Rep. 2020;33(12):108544. doi:10.1016/j.celrep.2020.108544

      Peng H, Chen B, Wei W, et al. N<sup>6</sup>-methyladenosine (m<sup>6</sup>A) in 18S rRNA promotes fatty acid metabolism and oncogenic transformation. Nat Metab. 2022;4(8):1041-1054. doi:10.1038/s42255-022-00622-9

      Fu J, Murphy KA, Zhou M, Li YH, Lam VH, Tabuloc CA, Chiu JC, Liu Y. Codon usage affects the structure and function of the Drosophila circadian clock protein PERIOD. Genes Dev. 2016 Aug 1;30(15):1761-75.

      Reviewer #2 (Recommendations for the authors):

      Please find my comments to improve the quality of your manuscript.

      Major comments

      (1) The quality of text writing in English needs to be at publishable levels. It is not a trivial problem, but it literally impairs the readability of your work. So please have professionals edit your manuscript text appropriately.

      We have carefully revised the language throughout the manuscript during the revision process.

      (2) Fig 1O: please include the total sleep profile and other analyses for rebound sleep phenotypes in control vs. Mettl5 to better validate that both genotypes were comparably sleep-deprived, but the latter shows less sleep rebound.

      Thank you for your suggestion. The other reviewer also suggested reanalyzing the sleep rebound data. We performed the analysis according to the reference below. We had included the sleep profiles of both genotypes in the original Fig 1O. The total sleep profile and other analyses of the rebound sleep phenotypes are included in the revised panel. As shown in this revised panel (now Figure 1K, L), both genotypes were comparably sleep-deprived.

      Cirelli C, Bushey D, Hill S, Huber R, Kreber R, Ganetzky B, Tononi G. 2005. Reduced sleep in Drosophila Shaker mutants. Nature 434:1087-92.

      (3) Line 90: the authors did not actually address this critical question. Additional Gal4 mapping (e.g., Mettl5 rescue or Mettl5 RNAi) will determine which cells/neural circuits are important for Mettl5-dependent sleep.

      This sentence has been revised to “The observed expression pattern of Mettl5 further supports its sleep regulatory function.”

      (4) Fig 1H-L; Fig 2H-L: the authors should check if overexpression of wild-type or mutant Mettl5 in control backgrounds could affect nighttime sleep to better define the transgenic effects among overexpression, rescue, and dominant-negative.

      Thank you for the comment. We added the overexpression phenotypes in the revised version.

      (5) Lines 225-226. Fig S11: The neural projections from PDF-expressing neurons should be better imaged and quantified. Current images can visualize PDF projections onto the optic lobe but not others (e.g., dorsal, POT), so the conclusion is not validated.

      Thank you for the suggestion. We acknowledge the limitation in the current images of PDF-expressing neuronal projections. We included new, higher-resolution images to better visualize and quantify the neural projections, including the dorsal and POT regions, to ensure the conclusion is well-supported.

      (6) Lines 230-232: per RNA/PER protein expression oscillates daily, so the authors should perform time-point experiments to conclude Mettl5 effects on clock gene expression, including per.

      Thank you for the insightful comment. We performed experiments in the Mettl5 mutant background at four time points to analyze per expression using both RT-PCR and Western blot (anti-PER). The updated results have been included in Figure 4D-F.

      (7) Lines 235-238: the authors should note that Mettl5 effects on sleep in Clk or per mutant backgrounds are actually opposite to those in w1118/control one. Mettl5 deletion promotes daytime or nighttime sleep in Clk or per mutants, respectively. Any explanation? 

      We used epistasis analysis to determine which gene acts upstream here. Epistasis (or epistatic effect) in genetics refers to the interaction between different genes where the expression of one gene (the epistatic gene) masks or modifies the expression of another gene (the hypostatic gene). The epistatic gene (masking gene) usually functions downstream in the pathway because its effect overrides the output of the hypostatic gene. The double mutants showed a phenotype similar to that of the downstream gene mutants. Thus, Clk and per function downstream of Mettl5.

      (8) Fig 6: The dorsal PDF projections actually show time-dependent plasticity. Results from the single time-point are not conclusive.

      Thank you for the insightful comment. We further dissected fly brains from UAS-DenMark, UAS-syt.eGFP/+; pdf-GAL4/+ and UAS-DenMark, UAS-syt.eGFP/+; pdf-GAL4/Mettl5<sup>1bp</sup> at four time points to analyze the morphology of PDF neurons. The results have been included in Figure 6.

      Minor comments

      (1) Please avoid simple bar graphs in the data presentation; include individual data points or use a different graph showing the distribution of raw data (e.g., violin plot, box plot, etc.).

      Thank you for the suggestion. In the revised version of the manuscript, we have included individual data points, violin plots, and box plots to present the data, effectively showing both the distribution and differences in the raw data.

      (2) Line 19: "Clock" indicates the gene name or general terminology such as "circadian clock". Please clarify it and revise the font accordingly.

      This has been revised to “clock”.

      (3) The overall flow in the Abstract/Summary is somewhat challenging for a general audience to follow.

      We have revised the text, especially the overall flow in the Abstract/Summary.

      (4) Fonts for the names of genes and gene products (i.e., mRNA, protein) should be appropriately corrected throughout the manuscript.

      We have checked the text and made changes where necessary.

      (5) Methods: the authors should provide detailed information on the methods. For instance, there is little description of how they generate Mettl5 deletions (e.g., sgRNA/target sequence). Also, they should clarify whether they test heterozygous vs. homozygous mutants of Mettl5 deletions in each experiment since the genotype description in the figure appears mixed-up (e.g., Fig 1B vs. Fig 1I-L).

      Thank you for pointing this out. In the updated version, we provided detailed information about the strains used, including the sgRNA/target sequences for generating Mettl5 deletions. Regarding the genotypes, Figure 1B represents homozygous mutants, while Figures 1I-L represent heterozygous mutants. This distinction has been clarified in the figure legends, and the genotype notation for Figures 1I-L will be revised for consistency and clarity.

      (6) Fig 1: the figure panels should be re-arranged based on the order of their text description (i.e., Fig 1H-L should go after Fig 1M-O).

      Thank you for the suggestion. In the revised version, we rearranged the figure panels so that Figures 1H-L appear after Figures 1M-O, following the order of their description in the text.

      (7) Sleep reduction in Trmt112 RNAi looks different from that in the Mettl5 heterozygous mutant. Any explanation?

      The functional divergence between Trmt112 and Mettl5 may also contribute to the observed sleep phenotype. While Trmt112 and Mettl5 share some downstream targets, they each regulate many unique genes, some of which could influence sleep. Sleep is a highly sensitive trait that can be modulated by numerous genetic factors. Previous studies have also suggested that sleep behaves more like a quantitative trait, reflecting the combined effects of multiple genes (Mackay and Huang, 2018).

      Mackay TFC, Huang W. Charting the genotype-phenotype map: lessons from the Drosophila melanogaster Genetic Reference Panel. Wiley Interdiscip Rev Dev Biol. 2018;7(1):10.1002/wdev.289. doi:10.1002/wdev.289

      Reviewer #3 (Recommendations for the authors):

      A detailed critique is provided below. Generally, the authors can greatly improve this manuscript if they focus more rigorously on the circadian phenotype associated with the Mettl5 mutant, which could be the basis for the apparent sleep phenotype.

      (1) Please provide more information as to how each of the mettl5 mutants were generated. This information should include, specifically, the gRNA sequences, plasmids generated for the 5' and 3' arms, and anything related to the CRISPR approach for generating the mutants. Was any sequencing done to verify the CRISPR alleles, or was this limited to the analysis of mettl5 expression and behavior? Please indicate where the qPCR primers (used in Fig 1B) are located relative to the mutant loci. The figure legend is also incomplete in that there is no reference to the boxed area in Fig 1A.

      In the updated version, we have provided detailed information about how each of the mettl5 mutants was generated. The alleles were verified by PCR followed by sequencing. The following references to the boxed area were added in the revised version.

      Reference

      Iyer LM, Zhang D, Aravind L. Adenine methylation in eukaryotes: Apprehending the complex evolutionary history and functional potential of an epigenetic modification. Bioessays. 2016 Jan;38(1):27-40. doi: 10.1002/bies.201500104.

      (2) As noted, I am not in agreement with the interpretation of findings for the sleep defect reported in the mettl5[1b]/+ mutants. There is a clear increase in morning sleep in the mutants that may not have reached significance because the data were lumped in 12h increments (Fig1C-E). Were the overall 24h sleep values between the mutants and controls the same? The sleep profile appears to be shifted, such that nighttime sleep onset in the mutants occurs much later than in wild type, and daytime waking is also much later, all pointing to a long period phenotype, which is very strongly supported by the data in Table 1, as well as the RNA- and ribo-seq data. The implications of this for sleep disturbances in humans are very exciting. An additional suggestion to the authors here is to report the nighttime sleep latency values (time to onset of the first sleep bout after lights off).

      We appreciate your insightful observation. As shown in Table 1, the Mettl5<sup>1bp</sup>/+ mutant exhibits a robust long-period phenotype, with the circadian period significantly extended to 28.3 ± 0.4 hours compared to the wild-type's 23.9 ± 0.05 hours. This prolonged period aligns with the observed behavioral phenotypes, including delayed nighttime sleep onset, later daytime waking, and the overall shift in sleep profile. This is indeed quite similar to a previous report on a Period3 variant (Zhang et al., 2016). We agree that the prolonged circadian period contributes to the observed sleep phenotype. However, since total sleep time was significantly reduced in the mutant, we cannot attribute the phenotype solely to period lengthening. Furthermore, our 24-hour PER expression analysis in mettl5 mutants revealed elevated PER protein levels at ZT1 and ZT18, while ZT6 and ZT12 showed no significant changes, with no apparent phase shift. These findings collectively suggest that the phenotype primarily results from PER protein stabilization and accumulation.

      Importantly, genetic rescue experiments restoring wild-type Mettl5 function (UAS-Mettl5/Mettl5-Gal4; Figure 1 and Table 1) completely normalized the circadian period to 24 ± 0.02 hours, providing compelling evidence that these phenotypes specifically result from loss of Mettl5 function. Together with the sleep architecture data, these findings establish Mettl5 as a crucial regulator of circadian rhythms, with important implications for understanding human sleep disorders. To further substantiate these observations, we have now included quantitative nighttime sleep latency measurements in the revised manuscript to better document the delayed sleep onset in mutants (Figure S1G).

      We have discussed this in the third paragraph of the Discussion section and included the reference in the revised manuscript.

      Zhang L, Hirano A, Hsu PK, et al. A PERIOD3 variant causes a circadian phenotype and is associated with a seasonal mood trait. Proc Natl Acad Sci U S A. 2016;113(11):E1536-E1544. doi:10.1073/pnas.1600039113.
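
      The nighttime sleep latency discussed in point (2), defined by the reviewer as the time to onset of the first sleep bout after lights off, can be illustrated with a minimal sketch. It assumes per-minute activity counts and the conventional Drosophila criterion of at least 5 consecutive minutes of inactivity for a sleep bout; the function name, the bin size, and the choice to start counting inactivity at lights-off are illustrative assumptions, not details taken from the manuscript.

```python
def sleep_latency_minutes(activity_counts, lights_off_index, min_bout=5):
    """Minutes from lights-off to the start of the first sleep bout.

    activity_counts: per-minute activity counts for one fly (assumed 1-min bins).
    lights_off_index: index of the first bin after lights off.
    min_bout: consecutive inactive minutes required to call a sleep bout
              (5 min is the conventional criterion; assumed here).
    Returns None if no bout occurs in the recording.
    """
    run_start = None
    for i in range(lights_off_index, len(activity_counts)):
        if activity_counts[i] == 0:
            if run_start is None:
                run_start = i          # first inactive minute of a candidate bout
            if i - run_start + 1 >= min_bout:
                return run_start - lights_off_index
        else:
            run_start = None           # any movement resets the candidate bout
    return None


# Hypothetical recording: a 2-min pause (too short), then a 5-min bout.
counts = [3, 1, 0, 0, 2, 0, 0, 0, 0, 0, 1]
latency = sleep_latency_minutes(counts, lights_off_index=0)
```

      In this hypothetical recording the short pause at minutes 2 to 3 is ignored, and the latency is measured to the start of the first run that reaches the bout criterion.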

      (3) The description for how circadian behavior was measured and analyzed (Table 1) is missing from the methods section.

      We have included a detailed description of the methods used to measure and analyze circadian behavior, as presented in Table 1, in the revised methods “Sleep behavior assays” section.

      (4) Please explain what the "awake %" values reported in Figs 1G, 1L, Fig 2G, and 2L, Fig 4G and 4M are. Is this simply the number of flies that are awake at a given time point? This does not provide useful information beyond what is already reported for the sleep profiling in other parts of these figures. If it is an arousal threshold assay, as shown in supplementary Fig 1H, please indicate this. The description for "sleep arousal" in the methods (lines 368-371) is also concerning. If most of the mutant flies are already awake at ZT 14, then I would expect that this assay would not work at this time of day. A more suitable time point would be ZT 19, or later, when the mutants are falling asleep. Moreover, calculating the number of flies awakened as long as 5 minutes after a stimulus pulse cannot be distinguished from a spontaneous awakening, and so is not really a metric of arousal threshold. The number of sleeping flies awakened by the stimulus should be calculated within, at most, one minute afterward.

      Thank you for your suggestion. The 'awake %' metric is the percentage of the fly population that is awake at a specific time point (e.g., ZT14). In the revised version, we further clarify the definition and significance of 'awake %'. Additionally, we have reevaluated the time points for the arousal threshold assay, selecting a more appropriate time (e.g., ZT19) to better reflect the sleep state of the mutants. Based on your suggestion, we now calculate the number of flies awakened within one minute after the stimulus to ensure a more accurate measurement of arousal threshold. This has been included in the revised Figure 1M.
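
      The one-minute scoring window requested by the reviewer can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis: activity is assumed to be binned per minute, a fly is treated as asleep if it showed no activity in the 5 minutes before the stimulus, and the function and parameter names are hypothetical.

```python
def fraction_awakened(flies, stim_index, sleep_window=5, response_window=1):
    """Fraction of sleeping flies that move within response_window minutes
    of a stimulus delivered at stim_index.

    flies: list of per-minute activity-count lists, one per fly (assumed 1-min bins).
    sleep_window: inactive minutes before the stimulus required to count
                  a fly as asleep (assumed 5-min criterion).
    Returns None if no fly was asleep at the time of the stimulus.
    """
    asleep = [f for f in flies
              if sum(f[stim_index - sleep_window:stim_index]) == 0]
    if not asleep:
        return None
    awakened = [f for f in asleep
                if sum(f[stim_index:stim_index + response_window]) > 0]
    return len(awakened) / len(asleep)


# Hypothetical flies; the stimulus is delivered at the start of minute 5.
flies = [
    [0, 0, 0, 0, 0, 3, 0],  # asleep, moves in the first minute after the stimulus
    [0, 0, 0, 0, 0, 0, 2],  # asleep, moves only in the second minute
    [1, 0, 2, 0, 0, 4, 0],  # already awake before the stimulus, so excluded
]
strict = fraction_awakened(flies, stim_index=5, response_window=1)
loose = fraction_awakened(flies, stim_index=5, response_window=5)
```

      With these hypothetical data the longer window also counts the second fly, whose later movement cannot be distinguished from a spontaneous awakening, which is the concern raised in point (4).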

      (5) Fig1M-O is problematic. First, is it possible that expression of Mettl5 mRNA fluctuates with time-of-day and is not affected by sleep loss? There are no undisturbed controls collected at equivalent time points. The method used for quantifying sleep rebound in Fig 1O (lines 365-367) does not make sense, as negative values would be expected. Moreover, since the Mettl5 mutants show high sleep amounts in the morning and very low sleep amounts from ZT 12-18, this analysis would be severely confounded. Also, the sleep deprivation applied would not produce equivalent amounts of sleep loss as compared to wild type controls, so this also needs to be corrected. The authors should consider consulting Cirelli et al (2005, DOI: 10.1038/nature03486 ) as an approach for quantifying sleep homeostasis in a short-sleeping mutant. Please also show the sleep profiling in the mutants for these experiments.

      Thank you for your valuable suggestions. Regarding the possibility that Mettl5 mRNA expression fluctuates with circadian rhythms rather than being affected by sleep deprivation, we acknowledge that collecting undisturbed control samples at equivalent time points would provide critical insights. In the revised version, we included undisturbed controls to distinguish between circadian-driven fluctuations and the effects of sleep deprivation on Mettl5 expression.

      For the quantification of sleep rebound in Figure 1O, we agree that the original method may not fully capture the dynamics of sleep recovery, especially in Mettl5 mutants, where sleep patterns differ significantly from wild type. We have followed the method proposed by Cirelli et al. (2005) for quantifying sleep homeostasis in short-sleeping mutants, ensuring a more accurate evaluation of sleep rebound. The results have been included in Figure 1K-L of the revised version.

      (6) Fig 3B and C (minor) - while the volcano plots are clear, it is not clear whether "down" or "up" means for the mutant relative to wild type or the other way around? Please clarify. In Fig 3P, the legend indicates a depiction of the "top 5 pathway associated genes", but it seems there are 10 pathways depicted. Which of these are the "top 5"?

      In the volcano plots (Fig. 3B and 3C), “up” and “down” refer to genes that are up- or downregulated in the mutant relative to the wild-type strain. In Fig. 3P, the legend was mislabeled as “top 5” pathway-associated genes. In fact, we displayed the top 10 pathway-associated genes. We apologize for the confusion and will correct both the figure legend and the corresponding text in our revised manuscript.

      (7) Fig 4 D-E, and F,G do not have sufficient information to draw the conclusion that Per mRNA/protein expression is increased in the Mettl5 mutant. Since both the mRNA and protein of this gene oscillate significantly throughout the day, it is still possible that the single time point shown in this figure might indicate a disruption in cycling rather than overall expression level. Please first indicate what time of day the tissue was collected, and second, consider adding more time points to both assays. For the first part of this figure, A and B, per and Clock gene expression are expected to be in different phases, and so this aspect is not unexpected. However, it is notable that it is reversed in the mutant vs wild type. Again, an alternate interpretation of this finding that the authors have not considered is a change in period duration of gene cycling.

      Thank you for your suggestion. For the PER WB experiments, we have included multiple time points in the revised version to more comprehensively evaluate PER expression in the Mettl5 mutant and better understand its circadian rhythm changes. We appreciate your observation regarding the potential changes in the period duration of gene cycling. This has been discussed in the 3<sup>rd</sup> paragraph of the Discussion section of the revised version.

      (8) The data shown in Figs 4H-M do not support the conclusion that "Clock and Per genes were downstream of Mettl5" (line 236-237). The daytime sleep phenotype, in particular, appears additive between both circadian genes and mutant because the morning sleep of the double mutant is much higher than either mutant by itself. Statistical comparisons between the double mutant and each clock mutant are also noticeably missing. These data are difficult to interpret. One potential explanation is that Mettl5 alters gene expression of non-circadian genes, and that the phenotypes become additive when both clock and Mettl5 genes are missing. A full molecular analysis of clock gene cycling in the Mettl5 mutant may help improve understanding of the relationship between the circadian clock and Mettl5 gene expression. It may also be worthwhile checking whether Mettl5 gene expression itself shows a daily oscillation.

      Thank you for your suggestion. In the revised version, we have included four additional time points to analyze the oscillatory expression of Per and Clock in the Mettl5 mutant, providing a more comprehensive understanding of their circadian rhythm changes. In Figs 4H-M, we used epistasis analysis to determine which gene acts upstream. Epistasis in genetics refers to an interaction between genes in which the effect of one gene (the epistatic gene) masks or modifies the effect of another (the hypostatic gene). The epistatic gene usually functions downstream in the pathway because its effect overrides the output of the hypostatic gene. The double mutant showed a phenotype similar to that of the clock gene mutants. Thus, Clk or per functions downstream of Mettl5. Statistical comparisons between the double mutant and each clock mutant have been added.

      (9) In Fig 6, what time of day were the flies collected? PDF terminal morphology is known to change throughout the day; this is another piece of data that could indicate a defect in circadian function rather than a chronic change in synaptic morphology.

      The flies were collected around ZT14. We have since included additional dissection time points. Differences between the control and Mettl5 mutants are observed consistently across multiple time points, suggesting that Mettl5 has an impact on synaptic plasticity.

      Minor:

      There are letter indicators, presumably for statistical comparisons, depicted in Figs 1 and 2 (panels I-L), but no explanation as to what these mean in the figure legends.

      We have added notes in the revised version.

      What is the purpose of the boxed regions shown in Fig S1A-F? There is no explanation of these in the figure legend nor in the text.

      The boxed regions highlight the significant co-localization of two proteins. We have included this explanation in the figure legend in the revised version.

      The statement (lines 310-311) that per and clock genes "exhibit more pronounced sleep rebound after sleep deprivation" is inaccurate. The article cited for this (Shaw et al 2002) showed that it was female mutants of the cycle gene which showed prolonged sleep rebound; other clock mutants were normal.

      Thank you for pointing this out. We revised the statement accordingly.

      Overall, the manuscript may benefit from editing or writing assistance to improve the language. There were many incomplete sentences, grammatical errors, etc.

      We have carefully refined the language throughout the manuscript during the revision process.

    1. eLife Assessment

      This fundamental work advances our understanding of the role of human hippocampal theta oscillations in memory encoding and retrieval. The evidence supporting the conclusions is convincing, using both scopolamine administration and intracranial EEG recordings. This work will be of broad interest to neuroscientists and has translational implications.

    2. Reviewer #1 (Public review):

      Summary:

      The authors report intracranial EEG findings from 12 epilepsy patients performing an associative recognition memory task under the influence of scopolamine. They show that scopolamine administered before encoding disrupts hippocampal theta phenomena and reduces memory performance, and that scopolamine administered after encoding but before retrieval impairs hippocampal theta phenomena (theta power, theta phase reset) and neural reinstatement but does not impair memory performance. This is an important study with exciting, novel results and translational implications. The manuscript is well written, the analyses are thorough and comprehensive, and the results seem robust.

      Strengths:

      - Very rare experimental design (intracranial neural recordings in humans coupled with pharmacological intervention);

      - Extensive analysis of different theta phenomena;

      - Well-established task with different conditions for familiarity versus recollection;

      - Clear presentation of findings;

      - Translational implications for diseases with cholinergic dysfunction (e.g., AD);

      - Findings challenge existing memory models and the discussion presents interesting novel ideas.

    3. Reviewer #2 (Public review):

      Summary:

      In this study, performed in human patients, the authors aimed at dissecting out the role of cholinergic modulation in different types of memory (recollection-based vs familiarity and novelty-based) and during different memory phases (encoding and retrieval). Moreover, their goal was to obtain the electrophysiological signature of cholinergic modulation on network activity of the hippocampus and the entorhinal cortex.

      Strengths:

      The authors combined cognitive tasks and intracranial EEG recordings in neurosurgical epilepsy patients. The study confirms previous evidence regarding the deleterious effects of scopolamine, a muscarinic acetylcholine receptor antagonist, on memory performance when administered prior to the encoding phase of the task. During both encoding and retrieval phases, scopolamine disrupts the power of theta oscillations in terms of amplitude and phase synchronization. These results raise the question of the role of theta oscillations during retrieval and the meaning of scopolamine's effect on retrieval-associated theta rhythm without cognitive changes. The authors clearly discussed this issue in the Discussion section.

      A major point is the finding that scopolamine-mediated effect is selective for recollection-based memory and not for familiarity- and novelty-based memory.

      The methodology used is powerful and the data underwent a detailed and rigorous analysis.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors report intracranial EEG findings from 12 epilepsy patients performing an associative recognition memory task under the influence of scopolamine. They show that scopolamine administered before encoding disrupts hippocampal theta phenomena and reduces memory performance, and that scopolamine administered after encoding but before retrieval impairs hippocampal theta phenomena (theta power, theta phase reset) and neural reinstatement but does not impair memory performance. This is an important study with exciting, novel results and translational implications. The manuscript is well-written, the analyses are thorough and comprehensive, and the results seem robust.

      Strengths:

      (1) Very rare experimental design (intracranial neural recordings in humans coupled with pharmacological intervention).

      (2) Extensive analysis of different theta phenomena.

      (3) Well-established task with different conditions for familiarity versus recollection.

      (4) Clear presentation of findings and excellent figures.

      (5) Translational implications for diseases with cholinergic dysfunction (e.g., AD).

      (6) Findings challenge existing memory models, and the discussion presents interesting novel ideas.

      Weaknesses:

      (1) One of the most important results is the lack of memory impairment when scopolamine is administered after encoding but before retrieval (scopolamine block 2). The effect goes in the same direction as for scopolamine during encoding (p = 0.15). Could it be that this null effect is simply due to reduced statistical power (12 subjects with only one block per subject, while there are two blocks per subject for the condition with scopolamine during encoding), which may become significant with more patients? Is there actually an interaction effect indicating that memory impairment is significantly stronger when scopolamine is applied before encoding (Figure 1d)? Similar questions apply to familiarity versus recollection (lines 78-80). This is a very critical point that could alter major conclusions from this study, so more discussion/analysis of these aspects is needed. If there are no interaction effects, then the statements in lines 84-86 (and elsewhere) should be toned down.

      The reviewer highlights important concerns regarding the statistical power of the behavioral effects. We address these concerns in the revised manuscript in two ways: (1) we provide a supplemental analysis using a matched number of blocks between the placebo and scopolamine conditions to avoid statistical bias related to differing trial counts, and (2) we include a supplemental figure illustrating paired comparisons between blocks.

      (2) Further, could it simply be that scopolamine hadn't reached its major impact during retrieval after administration in block 2? Figure 2e speaks in favor of this possibility. I believe this is a critical limitation of the experimental design that should be discussed.

      The reviewer raises an important methodological concern regarding the time required for scopolamine's effect to manifest and the subsequent impact on the study outcomes. Previous studies report that the average time to maximum serum concentration after intravenous (IV) scopolamine administration is approximately 5 minutes (Renner et al., 2005), with the corresponding clinical onset estimated at 10 minutes. In our study, the retrieval period in Block 2 commenced at 15 ± 0.2 minutes post-injection across all subjects. Given this timing, there is sufficient reason to conclude that scopolamine had reached its major impact during the Block 2 retrieval phase. Furthermore, the observation of significant disruptions to theta oscillations during this same retrieval phase provides strong evidence that the drug was in full effect at that time.

      (3) It is not totally clear to me why slow theta was excluded from the reinstatement analysis. For example, despite an overall reduction in theta power, relative patterns may have been retained between encoding and recall. What are the results when using 1-128 Hz as input frequencies?

      Slow theta (2–4 Hz) was excluded from the reinstatement analysis to avoid potential confounding effects. Given the observed disruption to slow theta power following scopolamine administration, any subsequent changes in slow theta reinstatement would be causally ambiguous, potentially arising directly from the power effects. Therefore, we would be unable to determine whether changes in slow theta reinstatement were genuinely independent of changes in power.

      (4) In what way are the results affected by epileptic artifacts occurring during the task (in particular, IEDs)?

      To exclude abnormal events and interictal activity, a kurtosis threshold of 4 was applied to each trial, effectively filtering out segments exhibiting significant epileptic artifacts.
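
      A per-trial kurtosis criterion of this kind can be sketched as below. This is only an illustration: the response does not state whether raw or excess kurtosis was used or which signal it was computed on, so the excess (Fisher) definition, the simulated data, and all names here are assumptions.

```python
import numpy as np


def excess_kurtosis(x):
    """Excess (Fisher) kurtosis of a 1-D signal; approximately 0 for Gaussian noise."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    variance = np.mean(centered ** 2)
    if variance == 0:
        return 0.0  # flat signal: nothing spike-like to flag (assumed convention)
    return float(np.mean(centered ** 4) / variance ** 2 - 3.0)


def trials_to_keep(trials, threshold=4.0):
    """Boolean mask that is True for trials whose kurtosis is at or below threshold."""
    return np.array([excess_kurtosis(t) <= threshold for t in trials])


rng = np.random.default_rng(0)
trials = rng.standard_normal((20, 1000))  # 20 simulated trials, 1000 samples each
trials[3, 500:505] += 40.0                # inject a sharp, spike-like transient
keep = trials_to_keep(trials)
clean = trials[keep]
```

      Kurtosis is sensitive to brief, high-amplitude transients, which is why a trial containing a sharp deflection stands out against background activity that is closer to Gaussian.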

      Reviewer #2 (Public review):

      Summary:

      In this study, performed in human patients, the authors aimed at dissecting out the role of cholinergic modulation in different types of memory (recollection-based vs familiarity and novelty-based) and during different memory phases (encoding and retrieval). Moreover, their goal was to obtain the electrophysiological signature of cholinergic modulation on network activity of the hippocampus and the entorhinal cortex.

      Strengths:

      The authors combined cognitive tasks and intracranial EEG recordings in neurosurgical epilepsy patients. The study confirms previous evidence regarding the deleterious effects of scopolamine, a muscarinic acetylcholine receptor antagonist, on memory performance when administered prior to the encoding phase of the task. During both encoding and retrieval phases, scopolamine disrupts the power of theta oscillations in terms of amplitude and phase synchronization. These results raise the question of the role of theta oscillations during retrieval and the meaning of scopolamine's effect on retrieval-associated theta rhythm without cognitive changes. The authors clearly discussed this issue in the Discussion section. A major point is the finding that the scopolamine-mediated effect is selective for recollection-based memory and not for familiarity- and novelty-based memory.

      The methodology used is powerful, and the data underwent a detailed and rigorous analysis.

      Weaknesses:

      A limited cohort of patients; the age of the patients is not specified in the table.

      To comply with human subject privacy protection policies, age was not reported; however, we did not find any significant effects of age on the behavioral or neural measures.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Regarding dosage, did you take the patients' body weight into account? Do the effects hold when controlling for it?

      We controlled for participant weight, yet the observed effects were more strongly correlated with the absolute scopolamine dosage, irrespective of weight. This outcome indicates that scopolamine likely rapidly crosses the blood-brain barrier, producing swift effects that are not initially influenced by metabolic variability.

      (2) Line 96: Corrected for what kind of multiple comparisons?

      We apologize for this confusion. The statistical analysis presented in this line does not require multiple-comparison correction, and we will therefore remove the annotation.

      (3) Line 165: These are very interesting results. How do they relate to Rizzuto et al., NeuroImage, 2006?

      Our findings show that successful retrieval is tied to an encoding-retrieval phase match, which is a refinement and application of the Rizzuto et al. (2006) work. Rizzuto et al. showed that memory events are phase-locked; we show that maintaining a specific, matched phase relationship between encoding and retrieval events is critical for memory success, and that this process is dependent on the cholinergic system.

      Reviewer #2 (Recommendations for the authors):

      Figure 1b: It would be useful for clarity to have the cartoon of the treatment paradigm for the encoding phase (blocks 3 and 4).

      The treatment paradigm only involved a single intravenous (IV) injection of scopolamine (or saline, for the placebo condition). The injections were administered by the participant's attending nurse, with a board-certified anesthesiologist present at the time of injection and available throughout the experiment. These details are fully documented in the Methods section.

    1. eLife Assessment

      This valuable manuscript investigates the localisation of nutrient receptors in bloodstream stage trypanosomes, with implications for both nutrient uptake and immune evasion. Results after direct fixation of the cells in culture medium (as opposed to fixation after centrifugation) provide compelling evidence that the amounts of receptors on the surface of the cell, as opposed to the flagellar pocket, have previously been severely underestimated.

    2. Reviewer #1 (Public review):

      Summary:

      An interesting manuscript from the Carrington lab is presented investigating the behavior of single vs double GPI-anchored nutrient receptors in bloodstream form (BSF) T. brucei. These include the transferrin receptor (TfR), the HpHb receptor (HpHbR), and the factor H receptor (FHR). The central question is why these critical proteins are not targeted by host acquired immunity. It has generally been thought that they are sequestered in the flagellar pocket (FP), where they are subject to rapid endocytosis - any Ab:receptor complexes would be rapidly removed from the cell surface. This manuscript challenges that assumption by showing that these receptors can be found all over the outer cell body and flagella surfaces - if one looks in an appropriate manner (rapid direct fixation in culture media).

      Strengths and weaknesses:

      (1) The presence of a second ESAG6 gene in the BES7 expression site was noted in the previous review. This is now noted and discussed appropriately in the current version.

      (2) Surface binding studies: The ability of cells to bind tagged-Tf while in complete media was challenged and it was suggested that classic competition studies be performed to validate saturable ligand binding. This has been done now and the results confirm that this is so. A reasonable discussion of the results is presented.

      (3) Variable TfR expression in different BESs: The claim that the specific ES environment is the dominant factor controlling TfR expression levels was challenged on the grounds that the presented results could be due to technical issues. RNA-seq has now been performed, confirming that the differences in TfR abundance are indeed directly related to mRNA levels.

      (4) Surface immuno-localization of receptors: With regard to the novel immunofluorescence (direct fixation) methodology used to demonstrate TfR on the cell surface, the authors were asked if they had attempted more traditional methods that involve centrifugation/washing. These data are now provided (Fig S5) and do indicate that centrifugation reduces signal, likely due to shedding and/or internalization during the procedure. Nevertheless, significant signal is present after centrifugation, leaving open the question of why others have never detected significant surface TfR.

      These responses address all the major concerns with the original submission and a greatly improved manuscript is now submitted.

    3. Reviewer #2 (Public review):

      The revised data support the conclusion that methodological differences can influence apparent receptor localization. However, key claims regarding functional surface engagement of TfR and hydrodynamic clearance remain based largely on indirect evidence and model-based interpretation. These conclusions should therefore be phrased more cautiously.

      I thank the authors for their careful rebuttal and the additional experiments included in the revised manuscript. The new fixation comparisons and transferrin competition assays substantially strengthen the technical basis of the study and address several of the original concerns.

      However, some conclusions remain more inferential than directly supported by the data. While the fixation and washing controls demonstrate that methodology influences apparent TfR localisation, they do not directly establish that previous protocols quantitatively redistribute surface TfR into the flagellar pocket. Statements implying such redistribution should therefore be phrased more cautiously.

      Similarly, the added transferrin binding controls argue against non-specific interactions, but functional engagement of surface-exposed TfR in intact bloodstream-form parasites remains supported mainly by indirect evidence. The proposed explanation involving rapid on/off rates and newly arriving receptors is plausible but should be more clearly identified as an inference.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      An interesting manuscript from the Carrington lab is presented investigating the behavior of single vs double GPI-anchored nutrient receptors in bloodstream form (BSF) T. brucei. These include the transferrin receptor (TfR), the HpHb receptor (HpHbR), and the factor H receptor (FHR). The central question is why these critical proteins are not targeted by host-acquired immunity. It has generally been thought that they are sequestered in the flagellar pocket (FP), where they are subject to rapid endocytosis - any Ab:receptor complexes would be rapidly removed from the cell surface. This manuscript challenges that assumption by showing that these receptors can be found all over the outer cell body and flagella surfaces, if one looks in an appropriate manner (rapid direct fixation in culture media).

      The main part of the manuscript focuses on TfR, typically a GPI1 heterodimer of very similar E6 (GPI anchored) and E7 (truncated, no GPI) subunits. These are expressed coordinately from 15 telomeric expression sites (BES), of which only one can be transcribed at a time. The authors identify a native E6:E7 pair in BES7 in which E7 is not truncated and therefore forms a GPI2 heterodimer. By in situ genetic manipulation, they generate two different sets of GPI1:GPI2 TfR combinations expressed from two different BESs (BES1 and BES7). Comparative analyses of these receptors form the bulk of the data.

      The main findings are:

      (1) Both GPI1 and GPI2 TfR can be found on the cell body/flagellar surface.

      (2) Both are functional for Tf binding and uptake.

      (3) GPI2 TfR is expressed at ~1.5x relative to GPI1 TfR

      (4) Ultimate TfR expression level (protein) is dependent on the BES from which it is expressed.

      Most of these results are quite reasonably explained in light of the hydrodynamic flow model of the Engstler lab and the GPI valence model of the Bangs lab. Additional experiments, again by rapid fixation, with HpHbR and FHR, show that these GPI1 receptors can also be seen on the cell surface, in contrast to published localizations.

      It is quite interesting that the authors have identified a native GPI2 TfR. However, essentially all of the data with GPI2 TfR are confirmatory for the prior, more detailed studies of Tiengwe et al. (2017). That said, the suggestion that GPI2 was the ancestral state makes good evolutionary sense, and begs the question of why trypanosomes prefer GPI1 TfR in 14 of 15 ESs (i.e., what is the selection pressure?)

      Strengths and weaknesses:

      (1) BES7 TfR subunit genes (BES7_Tb427v10): There are actually three (in order 5' to 3'): E7gpi, E6.1 and E6.2. E6.1 and E6.2 have a single nucleotide difference. This raises the issue of coordinate expression. If overall levels of E6 (2 genes) are not down-regulated to match E7 (1 gene), this will result in a 2x excess of E6 subunits. The most likely fate of these is the formation of non-functional GPI2 homodimers on the cell surface, as shown in Tiengwe et al. (2017), which will contribute to the elevated TfR expression seen in BES7.

      We would like to thank the reviewer for pointing out that there are two ESAG6 genes in BES7; we had relied on the publicly available annotation and should have known better.

      For transferrin expression levels, see the discussion in response to reviewer 1 point 3 below

      (2) Surface binding studies: This is the most puzzling aspect of the entire manuscript. That surface GPI2 TfR should be functional for Tf binding and uptake is not surprising, as this has already been shown by Tiengwe et al. (2017), but the methodology for this assay raises important questions. First, labeled Tf is added at 500 nM to live cells in complete media containing 2.5 uM unlabeled Tf - a 5x excess. It is difficult to see how significant binding of labeled TfR could occur in as little as 15 seconds under these conditions.

      The k<sub>on</sub> for transferrin is very rapid (BES1 TfR / bovine transferrin at pH 7.4 = 4.5 x 10<sup>5</sup> M<sup>-1</sup>s<sup>-1</sup> (Trevor et al., 2019)) and binding would occur to unoccupied receptors within 15 sec. The k<sub>off</sub> is also fast (BES1 TfR / bovine transferrin at pH 7.4 = 3.6 x 10<sup>-2</sup> s<sup>-1</sup> (Trevor et al., 2019)) and there would be exchange of transferrin within the time taken for endocytosis. These values are in vitro with purified proteins; the in vivo values may be affected by the VSG coat.
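
      As a rough check of this argument, the quoted rate constants can be put through a simple 1:1 binding model. The sketch below is illustrative only and is not part of the authors' analysis: it assumes pseudo-first-order kinetics with transferrin in large excess over receptor, and a total transferrin concentration of 3 µM (2.5 µM unlabelled plus 0.5 µM labelled, the figures given in the reviewer's comment). The function names are hypothetical, and the in vivo values may differ from the in vitro constants used here.

```python
import math

# Rate constants quoted in the response (Trevor et al., 2019; in vitro, pH 7.4).
K_ON = 4.5e5    # association rate constant, M^-1 s^-1
K_OFF = 3.6e-2  # dissociation rate constant, s^-1


def equilibrium_kd():
    """Equilibrium dissociation constant K_D = k_off / k_on, in M."""
    return K_OFF / K_ON


def dissociation_half_time():
    """Half-time for a bound transferrin to dissociate, in seconds."""
    return math.log(2) / K_OFF


def fraction_of_equilibrium(ligand_molar, t_seconds):
    """Fraction of the equilibrium occupancy reached after t seconds.

    Assumes 1:1 binding with ligand in excess, so the approach to
    equilibrium is exponential with k_obs = k_on * [L] + k_off.
    """
    k_obs = K_ON * ligand_molar + K_OFF
    return 1.0 - math.exp(-k_obs * t_seconds)


# Assumption: 2.5 uM unlabelled + 0.5 uM labelled transferrin in the medium.
TOTAL_TF_MOLAR = 3.0e-6

print("K_D (nM):", equilibrium_kd() * 1e9)
print("dissociation half-time (s):", dissociation_half_time())
print("fraction of equilibrium at 15 s:",
      fraction_of_equilibrium(TOTAL_TF_MOLAR, 15.0))
```

      Under these assumptions the implied K<sub>D</sub> is about 80 nM, the dissociation half-time is about 19 s, and an unoccupied receptor would be essentially at equilibrium occupancy well within 15 s, which is consistent with the reasoning above.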

      The failure to bind canine transferrin (Supp. Figure 4B) acts as a control for specificity of the interaction.

      We have now performed a competition experiment as an additional control; cells in culture were supplemented with: A, 0.5 µM labelled transferrin; B, 0.5 µM labelled and 2.5 µM unlabelled transferrin; C, 0.5 µM labelled and 5 µM unlabelled transferrin, fixed after 60 s and visualised by fluorescence microscopy (Figure S4C). There was effective competition and greatly reduced binding of transferrin was seen in the presence of a 10-fold excess of unlabelled. We would like to thank the reviewer for suggesting this experiment.

      Second, Tiengwe et al. (2017) found that trypanosomes taken directly from culture could not bind labeled Tf in direct surface labelling experiments. To achieve binding, it was necessary to first culture cells in serum-free media for a sufficient time to allow new unligated TfR to be synthesized and transported to the surface. This result suggests that essentially all surface TfR is normally ligated and unavailable to the added probe.

      As part of the preliminary experiments for this paper we found that centrifugation followed by resuspension in either complete or serum-free (but 1% BSA) medium resulted in a reduction in total cellular TfR as determined by western blotting. We have now included this experiment (Figure S4D). The inference from this experiment is that centrifugation and subsequent incubation will have an effect on receptor detection and endocytosis rates for a discrete time period.

      The amount of binding of labelled transferrin to cells in culture will depend on the specific activity of the labelled transferrin. This reasoning was behind the use of 0.5 µM labelled transferrin when roughly 1 in 6 molecules in the culture medium are labelled and there was only a small effect on the overall concentration of transferrin.
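
      The "roughly 1 in 6" figure follows directly from the concentrations. The short sketch below makes the arithmetic explicit; it assumes the culture medium contains 2.5 µM unlabelled transferrin (the value given in the reviewer's comment), and the function names are hypothetical.

```python
def labelled_fraction(labelled_um, unlabelled_um):
    """Fraction of all transferrin molecules that carry the label."""
    return labelled_um / (labelled_um + unlabelled_um)


def relative_increase_in_total(labelled_um, unlabelled_um):
    """Fractional rise in total transferrin caused by adding the label."""
    return labelled_um / unlabelled_um


# Assumption: 0.5 uM labelled transferrin added to medium already
# containing 2.5 uM unlabelled transferrin.
fraction = labelled_fraction(0.5, 2.5)
increase = relative_increase_in_total(0.5, 2.5)

print("labelled fraction:", fraction)
print("relative increase in total transferrin:", increase)
```

      With these inputs one molecule in six is labelled and the total transferrin concentration rises by one fifth.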

      Third, the authors have themselves argued previously, based on binding affinities, that all surface-exposed TfR is likely ligated in a natural setting (DOI:10.1002/bies.202400053). Could the observed binding actually be non-specific due to the high levels of fixative used?

      The absence of binding/uptake of canine transferrin argues against a non-specific interaction. In our previous publication, we did not pay enough attention to the on and off rates which allow for a degree of exchange and, here, TfR newly appearing on the cell surface has a 1 in 6 chance of binding a labelled transferrin.

      (3) Variable TfR expression in different BESs: It appears that native TfR is expressed at higher levels from BES7 compared to BES1, and even more so when compared to BES3. This raises the possibility that the anti-TfR used in these experiments has differential reactivity with the three sets of TfRs. The authors discount this possibility due to the overall high sequence similarities of E6s and E7s from the various ESs. However, their own analyses show that the BES1, BES3, and BES7 TfRs are relatively distal to each other in the phylogenetic trees, and this Reviewer strongly suspects that the apparent difference in expression is due to differential reactivity with the anti-TfR used in this work. In the grand scheme, this is a minor issue that does not impact the other major conclusions concerning TfR localization and function, nor the behavior of HpHbR and FHR. However, the authors make very strong conclusions about the role of BESs in TfR expression levels, even claiming that it is the 'dominant determinant' (line 189).

      This point is valid but exceptionally difficult to address at the protein level. As an orthogonal approach, we performed RNAseq analysis of the ‘wild type’ BES1, BES3, and BES7 cell lines to determine whether differences in receptor mRNA levels were consistent with the proposed difference in protein levels (Table S1). The analysis showed total ESAG6/7 mRNA levels to vary in a similar manner to the protein estimates with BES3 < BES1 < BES7 providing support for the differences in protein levels.

      The strongest evidence for the expression site determining the TfR level is the comparison of the cell lines in which the VSG were exchanged. This had no effect on TfR levels and so there is no evidence that the identity of the VSG alters TfR expression.

      (4) Surface immuno-localization of receptors: These experiments are compelling and useful to the field. To explain the difference with essentially all prior studies, the authors suggest that typical fixation procedures allow for clearance of receptor:ligand complexes by hydrodynamic flow due to extended manipulation prior to fixation (washing steps). Despite the fact that these protocols typically involve ice-cold physiological buffers that minimize membrane mobility, this is a reasonable possibility. Have the authors challenged their hypothesis by testing more typical protocols themselves? Other contributing factors that could play a role are the use of deconvolution, which tends to minimize weak signals, and also the fact that investigators tend to discount weak surface signals as background relative to stronger internal signals.

      We have added preliminary experiments that compared fixation protocols in two parts. First, the effect on TfR levels of washing and resuspending cells discussed above (Figure S4D), and second, how different fixation protocols alter apparent TfR immunolocalisation (Supp Figure S5A-B). The comparison shows that both the absence of glutaraldehyde and the use of washing alter the outcome.

      (5) Shedding: A central aspect of the GPI valence model (Schwartz et al., 2005, Tiengwe et al., 2017) is that GPI1 reporters that reach the cell body surface are shed into the media because a single dimyristoylglycerol-containing GPI anchor does not stably associate with biological membranes. As the authors point out, this is a major factor contributing to higher steady-state levels of cell-associated GPI2 TfR relative to GPI1 TfR. Those studies also found that the size/complexity of the attached protein correlated inversely with shedding, suggesting exit from the flagellar pocket as a restricting factor in cell body surface localization. The amount of newly synthesized TfR shed into the media was ~5%, indicating that very little actually exits the FP to the outer surface. In this regard, is it possible to know the overall ratio of cell surface:FP:endosomal localized receptors? Could these data not be 'harvested' from the 3D structured illumination imaging?

      A ratio could be determined but we did not do this as it would only be valid if the antibody has equal access to the internal TfR in a diluted VSG environment and the external VSG embedded in a densely packed and cross-linked VSG layer. As such, we would have no confidence in the accuracy of any estimate.

      Reviewer #2 (Public review):

      The work has significant implications for understanding immune evasion and nutrient uptake mechanisms in trypanosomes.

      While the experimental rigor is commendable, revisions are needed to clarify methodological limitations and to broaden the discussion of functional consequences.

      The authors argue that prior studies missed surface-localized TfR due to harsh washing/fixation (e.g., methanol). While this is plausible, additional evidence would strengthen the claim.

      Preliminary experiments that compared fixation protocols are now included to show that method affects outcome.

      It remains unclear how centrifugation steps of various lengths (as in previous publications) can equally and quantitatively redistribute TfR into the flagellar pocket. If this were the case, it should be straightforward for the authors to test this experimentally.

      We are not aware of previous studies that demonstrate equal and quantitative redistribution to the flagellar pocket. In previous reports, there is variation in cell surface/flagellar pocket localisation depending on expression levels (for example, Mussmann et al., 2003; Mussmann et al., 2004); it is worth noting that the increase in TfR expression in these papers is similar to the difference in the cell lines used here. In addition, most report the presence of TfR in endosomal compartments. In the experiments here, there are cells where the majority of signal from labelled transferrin is present in the flagellar pocket, and the argument is that this is a stage of a continuous process in which the receptor picks up a transferrin on the cell surface and is swept towards the pocket.

      If TfR is distributed over the cell surface, live-cell imaging with fluorescent transferrin should be performed as a control. Modern detection limits now reach the single-molecule level, and transient immobilization of live trypanosomes has been established, which would exclude hydrodynamic surface clearance as a confounding factor.

      This is non-trivial and is a longer-term aim. The immobilisation involves significant manipulation of the cells prior to restraining.

      In most images, TfR is not evenly distributed on the surface but rather appears punctate. Could this reflect localization to membrane domains? Immuno-EM with high-pressure frozen parasites could resolve this question and is relatively straightforward.

      There is a non-uniform appearance in the super-resolution images for both TfR and FHR. We cannot distinguish whether this represents random variation in receptor density over the cell surface or results from a biological phenomenon. Whatever the cause, the experiments showed unambiguous cell surface localisation.

      The authors might consider discussing whether differences in parasite life cycle stages (procyclic versus bloodstream forms) or culture conditions (e.g., cell density) affect localization. The developmentally regulated retention of GPI-anchored procyclin in the flagellar pocket might be worth mentioning.

      The aim of this paper was to determine the localisation of receptors in proliferating bloodstream form trypanosomes in culture. TfR and HpHbR are not expressed in insect stages in culture. FHR is expressed in insect stages and is present all over the cell surface (Macleod et al., 2020). A procyclin-based reporter was distributed over the whole cell surface in one report (Schwartz et al. 2005). In other reports, the retention of procyclin in the flagellar pocket of proliferating bloodstream forms is probably dependent on structure/sequence as other single GPI-anchored proteins, such as FHR (Macleod et al., 2020) and GPI-anchored sfGFP (Martos-Esteban et al., 2022) can access the surface.

      References:

      MacGregor, P., Gonzalez-Munoz, A. L., Jobe, F., Taylor, M. C., Rust, S., Sandercock, A. M., Macleod, O. J. S., Van Bocxlaer, K., Francisco, A. F., D’Hooge, F., Tiberghien, A., Barry, C. S., Howard, P., Higgins, M. K., Vaughan, T. J., Minter, R., & Carrington, M. (2019). A single dose of antibody-drug conjugate cures a stage 1 model of African trypanosomiasis. PLoS Neglected Tropical Diseases, 13(5), e0007373. https://doi.org/10.1371/journal.pntd.0007373

      Macleod, O. J. S., Bart, J.-M., MacGregor, P., Peacock, L., Savill, N. J., Hester, S., Ravel, S., Sunter, J. D., Trevor, C., Rust, S., Vaughan, T. J., Minter, R., Mohammed, S., Gibson, W., Taylor, M. C., Higgins, M. K., & Carrington, M. (2020). A receptor for the complement regulator factor H increases transmission of trypanosomes to tsetse flies. Nature Communications, 11(1), 1326. https://doi.org/10.1038/s41467-020-15125-y

      Martos-Esteban, A., Macleod, O. J. S., Maudlin, I., Kalogeropoulos, K., Jürgensen, J. A., Carrington, M., & Laustsen, A. H. (2022). Black-necked spitting cobra (Naja nigricollis) phospholipases A2 may cause Trypanosoma brucei death by blocking endocytosis through the flagellar pocket. Scientific Reports, 12(1), 6394. https://doi.org/10.1038/s41598-022-10091-5

      Mussmann, R., Engstler, M., Gerrits, H., Kieft, R., Toaldo, C. B., Onderwater, J., Koerten, H., van Luenen, H. G. A. M., & Borst, P. (2004). Factors affecting the level and localization of the transferrin receptor in Trypanosoma brucei. The Journal of Biological Chemistry, 279(39), 40690–40698. https://doi.org/10.1074/jbc.M404697200

      Mussmann, R., Janssen, H., Calafat, J., Engstler, M., Ansorge, I., Clayton, C., & Borst, P. (2003). The expression level determines the surface distribution of the transferrin receptor in Trypanosoma brucei. Molecular Microbiology, 47(1), 23–35. https://doi.org/10.1046/j.1365-2958.2003.03245.x

      Schwartz, K. J., Peck, R. F., Tazeh, N. N., & Bangs, J. D. (2005). GPI valence and the fate of secretory membrane proteins in African trypanosomes. Journal of Cell Science, 118(Pt 23), 5499–5511. https://doi.org/10.1242/jcs.02667

      Trevor, C. E., Gonzalez-Munoz, A. L., Macleod, O. J. S., Woodcock, P. G., Rust, S., Vaughan, T. J., Garman, E. F., Minter, R., Carrington, M., & Higgins, M. K. (2019). Structure of the trypanosome transferrin receptor reveals mechanisms of ligand recognition and immune evasion. Nature Microbiology, 4(12), 2074–2081. https://doi.org/10.1038/s41564-019-0589-0

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Major Recommendations:

      (1) Two E6 genes in BES7: This does not affect the overall conclusions, but the text should be modified to reflect the existence of the second gene, and to discuss the ramifications.

      This has been corrected

      (2) Surface binding studies: To clarify this issue, two experimental approaches are strongly recommended. First: additional excess unlabelled Tf should be added. If binding is truly receptor-mediated, it must by definition be saturable at some experimentally achievable level. Second: TfR expression should be abrogated by RNAi silencing to show that binding is TfR-dependent. Without some validation of specific binding by one or both of these approaches, these counter-intuitive results must be questioned.

      The excess unlabelled transferrin experiment is now included (we would like to thank the reviewer for this suggestion). The absence of binding of canine transferrin provides strong evidence for the specificity.

      (3) Variable TfR expression in different BESs: To make such claims, quantitative RT-PCR should be performed with conserved primers to assess the actual relative expression at the transcriptional level. Absent this, the claims should be eliminated, or at the very least greatly tempered.

      This has been done using an RNAseq analysis.

      (4) Surface immuno-localization of receptors: An example of discounting weak signals as background can be seen in Figure 8 of Duncan et al. (2024). It has also been shown that at least one other GPI1 reporter (procyclin) is readily detected on the outer cell surface under ectopic expression in BSF trypanosomes (Schwartz et al., 2005) using typical fixation procedures. This could be cited, and the authors could discuss the fact that procyclin is not a receptor and may not be susceptible to hydrodynamic drag.

      Yes

      Minor issues:

      (1) Fully appreciating the data presented requires an understanding of the hydrodynamic flow and GPI valence models of the Engstler and Bangs labs, respectively. For the uninitiated, it might perhaps be useful to include brief summaries of each in the Introduction.

      Added to the introduction

      (2) Lines 110-112: ISG65 and ISG75 both have strong localizations in endosomal compartments. This should be noted with citation of any of the work from the Field lab.

      Added

      (3) Lines 121-132: This passage presents the role of GPI anchors (1 vs 2) in a rather digital manner (in or out). Schwartz et al (2005) present a much more nuanced view of what is likely taking place. This is one reason summaries of hydrodynamic flow and GPI valence would be helpful.

      Modified

      (4) Lines 182-184: The increased size of GPI-anchored E7 is in part due to the presence of the GPI itself, as the authors state, but there are also 24 additional amino acid residues in this protein that contribute.

      Modified

      (5) Lines 212-214: Do p>0.95 and p>0.99 indicate statistical significance? This must be a typo.

      Thank you, corrected

      (6) Lines 218-219: The better references documenting GPI number in regard to turnover/shedding are Schwartz et al. 2005 and Tiengwe et al. 2017.

      Changed

      (7) Line 241 and Figures 3, 4, and 6: The transverse sections add little to the presentation. That there is signal variation in all dimensions is readily apparent from the images themselves, and similar profiles would be obtained regardless of the transect. Was there some process/rationale in the selection of the individual transects intended to make a broader point? If so, a description of the process should be provided.

      The point was to show that the signal had a pattern consistent with plasma membrane (two distal peaks) as opposed to cytoplasm (single central peak). As such, we think it is important.

      (8) Lines 582-596: Methodology for quantitation of cellular fluorescent signals should be provided.

      Has been expanded

      Reviewer #2 (Recommendations for the authors):

      (1) As a less critical but still useful control, antibody accessibility assays on live versus fixed parasites could test whether VSG coats limit detection.

      This could only be quantified by using a range of monoclonal antibodies which are not available.

      (2) The rapid transferrin uptake (15-60 seconds) could reflect fast endocytic recycling rather than stable surface residency. A pulse-chase experiment tracking receptor movement would clarify this (though I acknowledge that this is technically challenging).

      We agree that endocytic recycling is probably the main source of unoccupied TfR on the cell surface. It is hard to see how the pulse chase experiment could be performed without centrifugation which will affect the outcome – see above.

      (3) Statistical and quantitative reporting

      Added as Tables S2-S4

      (4) Report confidence intervals (e.g., for fluorescence intensity comparisons in Figure 3B) to contextualize claims of "no significant difference."

      We do not claim ‘no significant difference’, and the SDs overlap due to a high level of variation in the population.

      (5) Specify the number of biological replicates and cells analyzed per condition in the figure legends.

      Added

      (6) The study notes that surface-exposed receptors avoid antibody detection, but does not explore how.

      We don’t claim that receptors avoid detection and have published evidence to the contrary. The cell has evolved mechanisms to reduce/minimise the effect of antibody binding.

      (7) Comparing antibody binding to TfR in VSG221 versus VSG224 coats.

      This is already present in Figure 3D

      (8) Testing whether receptor shedding or conformational masking contributes to immune evasion.

      A lifetime’s work

      (9) Evolutionary trade-offs: Discuss why T. brucei maintains ~15 TfR variants if the GPI-anchor number has minimal impact on function (Figure 3).

      The possible reason for the evolution of ~15 TfR variants was discussed in a previous publication.

      (10) How do their findings align with recent studies on ISG75 surface exposure?

      If this refers to the finding that ISG75 is an Ig Fc receptor, this has been included

      (11) Add scale bars to 3D reconstructions (Figure 5).

      Added

      (12) Include a schematic summarizing key findings in the main text.

      Chosen not to do

      (13) Explicitly state where raw microscopy images, flow cytometry data, and analysis scripts are deposited.

      Microscope images have been deposited in the BioImage Archive repository at EMBL-EBI. No flow cytometry was used.

      (14) Correct inconsistent GPI-anchor terminology (e.g., "glycosylphosphoinositol" to "glycosylphosphatidylinositol").

      Our typo, corrected

      (15) Clarify ambiguous phrases (e.g., "subtle mechanisms" in the Discussion).

      Corrected

    1. eLife Assessment

      This fundamental study measures the functional specialization of distinct subregions within the mouse posterior parietal cortex (PPC) using mesoscopic two-photon calcium imaging during visual discrimination and choice history-dependent tasks. It presents compelling evidence supporting the existence of functional specialized subregions within the PPC. The work will be of interest to system and computational neuroscientists interested in decision-making, working memory, and multisensory integration.

    2. Reviewer #1 (Public review):

      Summary:

      This study examined the functional organization of the mouse posterior parietal cortex (PPC) using meso-scale two-photon calcium imaging during visually-guided and history-guided tasks. The researchers found distinct functional modules within the medial PPC: area A, which integrates somatosensory and choice information, and area AM, which integrates visual and choice information. Area A also showed a robust representation of choice history and posture. The study further revealed distinct patterns of inter-area correlations for A and AM, suggesting different roles in cortical communication. These findings shed light on the functional architecture of the mouse PPC and its involvement in various sensorimotor and cognitive functions.

      Strengths:

      Overall, I find this manuscript excellent. It is very clearly written and built up logically. The subject is important, and the data supports the conclusions without overstating implications. Where the manuscript shines the most is the exceptionally thorough analysis of the data. The authors set a high bar for identifying the boundaries of the PPC subareas, where they combine both somatosensory and visual intrinsic imaging. There are many things to compliment the authors on, but one thing that should be applauded in particular is the analysis of the body movements of the mice in the tube. Anyone working with head-fixed mice knows that mice don't sit still, but this almost invariably remains unanalyzed. Here the authors show that this indeed explained some of the variance in the data.

      Comments on revisions:

      I only had minor comments on the first version of the manuscript and these concerns were fully addressed after revision.

    3. Reviewer #2 (Public review):

      Summary:

      The posterior parietal cortex (PPC) has been identified as an integrator of multiple sensory streams and guides decision making. Hira et al observe that dissection of the functional specialization of PPC subregions requires simultaneous measurement of neuronal activity throughout these areas. To this end, they use widefield calcium imaging to capture the activity of thousands of neurons across the PPC and surrounding areas. They begin by delineating the boundaries between the primary sensory and higher visual areas using intrinsic imaging and validate their mapping using calcium imaging. They then conduct imaging during a visually guided task to identify neurons that respond selectively to visual stimuli or choice. They find that vision and choice neurons intermingle primarily in the anterior medial (AM) area, and that AM uniquely encodes information regarding both the visual stimulus and the previous choice, positioning AM as the main site of integration of behavioral and visual information for this task.

      Strengths:

      There is an enormous amount of data and results reveal very interesting relationships between stimulus and choice coding across areas and how network dynamics relate to task coding.

      Weaknesses:

      The enormity of the data and the complexity of the analysis make the manuscript hard to follow. Sometimes it reads like a laundry list of results as opposed to a cohesive story.

      Comments on revisions:

      The authors have addressed our concerns.

    4. Reviewer #3 (Public review):

      Summary:

      This work from Hira et al leverages mesoscopic 2-photon imaging to study large neural populations in different higher visual areas, in particular areas A and AM of the parietal cortex. The focus of the study is to obtain a better understanding of the representation of different task-related parameters, such as choice formation and short-term history, as well as visual responses in large neural populations across different cortical regions to obtain a better understanding of the functional specialization of neural populations in each region as well as the interaction of neural populations across regions. The authors image a large number of neurons in animals that either perform a visual discrimination or a history-dependent task to test how task demands affect neural responses and population dynamics. Furthermore, by including a behavioral perturbation of animal posture they aim to dissociate the neural representation of history signals from body posture. Lastly, they relate their functional findings to anatomical data from the Allen connectivity atlas and show a strong relation between functional correlations and anatomical connectivity patterns.

      Strengths:

      Overall, the study is very well done and tackles a problem that should be of high interest to the field by aiming to obtain a better understanding of the function and spatial structure of different regions in the parietal cortex. The experimental approach and analyses are sound and of high quality and the main conclusions are well supported by the results. Aside from the detailed analyses, a particular strength is the additional experimental perturbation of posture to isolate history-related activity which supports the conclusion that both posture and history signals are represented in different neurons within the same region.

      Weaknesses:

      The work does not focus on functional overlap at the single-cell level but on the spatial distribution of functional classes across areas. A minor weakness is therefore that it does not explicitly address how the finding of functional clusters relates to established notions of mixed selectivity within PPC.

    5. Author response:

      The following is the authors’ response to the original reviews.

      We sincerely appreciate your constructive feedback. Based on the comments from the three reviewers, we were able to substantially improve the manuscript. Below, we provide our point-by-point responses.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study examined the functional organization of the mouse posterior parietal cortex (PPC) using meso-scale two-photon calcium imaging during visually-guided and history-guided tasks. The researchers found distinct functional modules within the medial PPC: area A, which integrates somatosensory and choice information, and area AM, which integrates visual and choice information. Area A also showed a robust representation of choice history and posture. The study further revealed distinct patterns of inter-area correlations for A and AM, suggesting different roles in cortical communication. These findings shed light on the functional architecture of the mouse PPC and its involvement in various sensorimotor and cognitive functions.

      Strengths:

      Overall, I find this manuscript excellent. It is very clearly written and built up logically. The subject is important, and the data supports the conclusions without overstating implications. Where the manuscript shines the most is the exceptionally thorough analysis of the data. The authors set a high bar for identifying the boundaries of the PPC subareas, where they combine both somatosensory and visual intrinsic imaging. There are many things to compliment the authors on, but one thing that should be applauded in particular is the analysis of the body movements of the mice in the tube. Anyone working with head-fixed mice knows that mice don't sit still, but this almost invariably remains unanalyzed. Here the authors show that this indeed explained some of the variance in the data.

      Weaknesses:

      I see no major weaknesses and I only have minor comments.

      Reviewer #2 (Public review):

      Summary:

      The posterior parietal cortex (PPC) has been identified as an integrator of multiple sensory streams and guides decision-making. Hira et al observe that dissection of the functional specialization of PPC subregions requires simultaneous measurement of neuronal activity throughout these areas. To this end, they use wide-field calcium imaging to capture the activity of thousands of neurons across the PPC and surrounding areas. They begin by delineating the boundaries between the primary sensory and higher visual areas using intrinsic imaging and validate their mapping using calcium imaging. They then conduct imaging during a visually guided task to identify neurons that respond selectively to visual stimuli or choices. They find that vision and choice neurons intermingle primarily in the anterior medial (AM) area, and that AM uniquely encodes information regarding both the visual stimulus and the previous choice, positioning AM as the main site of integration of behavioral and visual information for this task.

      Strengths:

      There is an enormous amount of data and results reveal very interesting relationships between stimulus and choice coding across areas and how network dynamics relate to task coding.

      Weaknesses:

      The enormity of the data and the complexity of the analysis make the manuscript hard to follow. Sometimes it reads like a laundry list of results as opposed to a cohesive story.

      Reviewer #3 (Public review):

      Summary:

      This work from Hira et al leverages mesoscopic 2-photon imaging to study large neural populations in different higher visual areas, in particular areas A and AM of the parietal cortex. The focus of the study is to obtain a better understanding of the representation of different task-related parameters, such as choice formation and short-term history, as well as visual responses in large neural populations across different cortical regions to obtain a better understanding of the functional specialization of neural populations in each region as well as the interaction of neural populations across regions. The authors image a large number of neurons in animals that either perform visual discrimination or a history-dependent task to test how task demands affect neural responses and population dynamics. Furthermore, by including a behavioral perturbation of animal posture they aim to dissociate the neural representation of history signals from body posture. Lastly, they relate their functional findings to anatomical data from the Allen connectivity atlas and show a strong relation between functional correlations and anatomical connectivity patterns.

      Strengths:

      Overall, the study is very well done and tackles a problem that should be of high interest to the field by aiming to obtain a better understanding of the function and spatial structure of different regions in the parietal cortex. The experimental approach and analyses are sound and of high quality and the main conclusions are well supported by the results. Aside from the detailed analyses, a particular strength is the additional experimental perturbation of posture to isolate history-related activity which supports the conclusion that both posture and history signals are represented in different neurons within the same region.

      Weaknesses:

      The main point that I found hard to understand was the fairly strong language on functional clusters of neurons while also stating that neurons encoded combinations of different types of information and leveraging the encoding model to dissociate these contributions. Do the authors find mixed selectivity or rather functional segregation of neural tuning in their data? More details on this and some other points are below.

      We thank the three reviewers for their accurate and expert evaluations.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) It wasn't clear to me why the authors focused on areas A and AM, but not RL. After all, at the beginning of the results, the authors ask: "PPC has been reported to have functions including visually guided decision-making and working memory. Do these functions differ among RL, A, and AM?".

      Thank you for the comment. The manuscript first characterizes AM as a region involved in visually guided decision-making and A as a region related to history and/or working memory. Subsequently, when discussing correlation structure, we stated the following:

      “In particular, based on the critical functional differences between A and AM that we found, A and AM may belong to distinct cortical networks that consist of different sets of densely interacting cortical areas.”

      Thus, the logical flow of our analysis is to first reveal the functional contrast between A and AM through comparative functional analyses across RL, A, and AM, and then to focus on this contrast. We speculate that RL may exhibit more distinctive functional properties in tasks that rely on whisker-based processing or related modalities. We have therefore revised the text as described below to avoid the impression that the manuscript places disproportionate emphasis on RL.

      Line 137: “PPC has been reported to have functions including visually guided decision-making and working memory. Do these functions differ among A, AM, and RL?”

      (2) Figures 2 E, F, and Figure 3A, could the authors indicate the trial structure better on these plots?

      Thank you for the comment. We have added explanations of the bar meanings to the figure legends.

      Figure 2:

      “(E) Representative vision neurons (ROI 1-4 in I). The red bars indicate sampling periods during video presentation, and the brown bars indicate sampling periods without video stimulation. Vertical black lines mark the onset of the sampling period. F. Representative choice neuron (ROI 5-8 in I) and a non-selective neuron (ROI 9). Light blue lines indicate the response periods in trials with left choices, and purple lines indicate the response periods in trials with right choices. Vertical black lines mark the onset of the response period.”

      Figure 3:

      “(A) The representative history neurons. Numbers correspond to that of panel B and C. Light blue lines indicate rewards delivered from the left lick port, and purple lines indicate rewards delivered from the right lick port. Vertical white lines mark the onset of the sampling period.”

      (3) There are several typos that need correcting. Also, small and big capital letters to demark the panel names in the legends have been mixed.

      Thank you for the comment. We have corrected the panel labels as described below.

      Figure 2 legend:

      “Representative choice neuron (ROI 5-8 in I) and a non-selective neuron (ROI 9)”

      Figure 3 legend:

      “..than the next choice. I. The decoding accuracy of the next choice …”

      Figure 3 legend:

      “Error bars, mean ± s.e.m. in I, 95% confidence interval in G, M, and O.”

      Supplementary Figure 6:

      “…neurons with rt ≥ 0.3 (blue) were shown. B. Trial-to-trial activity fluctuation … (rt ≥ 0.3, panel B) was color coded…”

      We thoroughly checked the manuscript for typographical errors and corrected the issues.

      (4) Many in the field still use the Paxinos nomenclature for PPC subfields, could the authors write something short about how these two nomenclatures correspond?

      We have described the relationship between our area definitions and those of Paxinos in the main text as follows.

      Line 702: “In addition to our definition, previous studies have also defined posterior parietal cortex (PPC) to include the higher visual areas A, AM, and RL (Glickfeld and Olsen, 2017; Wang et al., 2011). These areas partially overlap with the parietal association regions defined in the Paxinos atlas, including MPtA, LPtA, PtPD, and PtPR. For a detailed discussion of the correspondence and variability among these regional definitions, see Lyamzin and Benucci (2019).”

      (5) Analyzing choice history may be affected by the long fluorescence Ca transients and will depend on excellent event deconvolution. Could the authors show some more zoomed-in examples of how well their deconvolution works?

      We provide enlarged, trial-by-trial activity traces of the four example neurons shown in Figure 3A in Supplementary Figure 3G. In all neurons, multiple small calcium transients occur repeatedly throughout the delay period, which lasts longer than 10 s. If the sustained activity during the delay were simply due to a long decay time constant, one would expect a large calcium transient in the preceding trial that slowly decays over the delay period. However, such a pattern is not observed in the actual data. Also, since the decay time constant of GCaMP6s is on the order of ~1 s, signals persisting for ~10 s cannot be explained by slow decay alone.
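The decay argument above can be checked with a short calculation. The sketch below assumes a single-exponential decay, which is a simplification introduced here rather than a model stated in the response, and uses the approximate values quoted above (a time constant of about 1 s and a delay of about 10 s). The function name `residual_fraction` is hypothetical.

```python
import math

def residual_fraction(elapsed_s: float, tau_s: float) -> float:
    """Fraction of a transient's peak amplitude left after elapsed_s seconds,
    assuming (as a simplification) a single-exponential decay with
    time constant tau_s."""
    return math.exp(-elapsed_s / tau_s)

# Approximate values quoted in the response: tau ~ 1 s, delay ~ 10 s.
frac = residual_fraction(elapsed_s=10.0, tau_s=1.0)
print(f"{frac:.1e}")  # 4.5e-05
```

Under this assumption, less than 0.01% of a transient's peak would remain after 10 s, which is consistent with the statement that slow indicator decay alone cannot account for signals persisting across the delay.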

      (6) The authors write: "the history neurons exhibited properties of working memory." However, note that this is not a working memory task since the mice don't need to keep evidence in memory, the direction to lick can be made at the very beginning of a trial.

      Behaviorally, demonstrating that an animal maintains working memory requires showing that its behavior changes based on retained information when new information is introduced, as in delayed match-to-sample tasks. In the present task, however, the correct action for the next trial is determined at the moment the action in the previous trial is completed, such that animals can simply switch to motor preparation at that point. Thus, from a strictly behavioral perspective, working memory is not required.

      On the other hand, during the inter-trial interval (ITI), information from the previous trial dominates over information from the upcoming trial (Fig. 3H), which is more consistent with retention of past information than with motor preparation. Moreover, trials in which neural activity maintained information about the previous trial’s action were associated with a higher probability of correct performance in the subsequent trial. In other words, retaining past information contributes to guiding correct behavior in the next trial.

      Based on these neural analyses, we interpret the data as indicating that mice retain information about their previous trial’s action history in working memory and use it to determine behavior in the subsequent trial. Accordingly, we consider ITI activity in PPC to reflect working memory rather than motor preparation. Nevertheless, we acknowledge that your concern is valid, and we have therefore revised the text as follows:

      Line 234: “These results suggest that the history neurons exhibited properties of working memory.”

      (7) In the section about the Choice History Task, the authors write: "Since the visual stimuli were randomly presented during the sampling period, the mice had to ignore the visual stimuli." Why continue to present the visual stimuli?

      Thank you for the suggestion. By designing the vision task and the history task to have identical structures, we can apply the same encoding and decoding models to both tasks, which facilitates direct comparison between them. This design makes it easier to examine how neuronal activity patterns change depending on task demands.

      Reviewer #2 (Recommendations for the authors):

      (1) I don't understand the logic of Figure S7 and the neuropil analysis in general. Neuropil activity is purported to represent input, so it seems unsurprising that nearby neurons would exhibit similar dynamics.

      Thank you for your comment. Your argument is correct, and it is not at all surprising that neuropil signals correlate with the activity of surrounding neurons. Here, we quantitatively examined the relationship between neuropil activity and the average activity of nearby neurons. In addition, in a separate analysis, we clarified the relationship between connectome information and neuropil activity. Taken together, these analyses reveal the relationship between connectome information and the local average of neuronal activity. We describe this point as follows:

      “Indeed, the trial-to-trial variation of a neuropil activity could be approximated by the average of 1,000–10,000 neurons within several hundred micrometers from the center (Figure S7).”

      Although we analyzed this phenomenon in the cases of areas A and AM, this finding should not be considered specific to A and AM but instead has broader, general significance. Accordingly, we added a new Results subsection and revised the manuscript as follows.

      Line 448: “Constraints and limits of anatomical connectivity on neuronal population activity

      Although we have so far focused on the differences between A and AM, our data provide broader insights into the relationship between anatomical connectivity and neuronal population activity. First, based on Figure S7 and the considerations above, anatomical input correlations strongly constrain the correlations between local averages of activity across thousands of neurons. We then asked whether this anatomical constraint extends beyond mean activity, and how anatomical input correlations relate to relationships between neuronal population activities (population vectors).

      The correlation between CC<sub>t</sub> and r<sub>anatomy</sub> was moderate (r = 0.60, Figure 6L). This moderate correlation did not change when the coupling neurons were eliminated (r = 0.61). Interestingly, the largest canonical component was the most unpredictable from the anatomical data (Figure 6M). Thus, while inter-area correlations based on the mean activity of neuronal populations are largely determined by anatomical input correlations, correlations between population vectors contain additional structure that cannot be captured by anatomical input correlations alone.

      One possible source of this additional structure is globally shared activity, which may reflect behavior, brain state, or levels of neuromodulators. To evaluate the contribution of global activity to the canonical correlation between areas, we first compared the canonical coefficient vectors (CCV). We found that the first CCV had a similar orientation, regardless of the paired areas (Figure 6N). This indicates that the largest components of correlated activity in the CCA analysis are globally shared fluctuations. We also directly evaluated the correlated activity components across all 8 areas with generalized canonical correlation analysis. The first CCV also had a similar orientation to the first generalized canonical coefficient vector (GCCV) (Figure 6O). These results indicate that the largest canonical component reflects a global correlation across all cortical areas imaged. Such global correlations may be driven by factors beyond cortico-cortical or thalamo-cortical inputs, such as the animal’s behavioral state as we recently characterized (H. Imamura et al., 2025; F. Imamura et al., 2025). We also confirmed the robustness of these results by repeating analyses using only the 40% highly active neurons after denoising with non-negative deconvolution (36828 out of 91397 neurons; Figure S9).”

      (2) Furthermore, the neuropil signal likely contains signals from out-of-focus neurons that are presumably functioning similarly to the in-focus cells. Wouldn't the interesting question be to what extent the local neuropil signal in, for example, area A resembled that of neuronal activity in S1t?

      Thank you very much for your comment. We agree with your point. Based on the evaluation in Figure S7, the neuropil signal likely contains the average activity of several thousand local neurons, including out-of-focus contributions. The neuropil signal in area A may also partially reflect neuronal activity from the neighboring S1t area. In particular, neurons that show little correlation with the local population average (i.e., the neuropil signal) within the same area are sometimes referred to as “soloists” (M. Okun et al., 2015). If such soloist neurons were found to exhibit strong correlations with the neuropil signal of an adjacent area, this would be a highly interesting result. However, such an analysis would go beyond the scope of the present manuscript and would require a new line of discussion; therefore, we plan to address this issue in future work.

      (3) I generally found the final Results section (Relationship between mesoscale functional correlation and anatomical connections) to be hard to follow. The motivation for this analysis should be better explained.

      We fully incorporated your suggestion and rewrote the final section of the Results accordingly. Please refer to our responses to the two comments above.

      (4) The question of brain state/neuromodulation as a driver of the globally shared activity may be addressable by considering its correlation with pupillometry data.

      We fully agree with your suggestion. In our experiments, visual stimuli change continuously, and thus pupil diameter changes are most likely driven primarily by changes in visual input. Although state-dependent fluctuations of brain activity may also be present, they are likely masked by the larger effects induced by visual stimulation. Therefore, analyzing pupil-linked signals as a factor of globally shared activity would be more appropriately addressed in experiments without visual stimulation. We plan to investigate this issue in future studies. Here, we have added the following description regarding pupil dynamics and their associated relationships.

      Line 292: “We found that the neurons related to the tail and forepaws were similarly distributed around the parietal cortex including S1 and A, while the pupil-size related neurons were mapped around visual areas (Figure 4C). Changes in pupil diameter may influence neuronal activity through multiple mechanisms, including behavioral state or noradrenergic level [REF], nonlinear interactions with visual stimulation, and changes in the amount of light reaching the retina.”

      Minor issues

      (1) The authors deploy sophisticated mathematical techniques with essentially no explanation outside the Methods section. A brief introduction of jPCA and CCA in the main text would help the reader understand the value of these analyses.

      Thank you for the comment. We added the following explanation.

      Line 238: “In this task, left and right selection are alternated, so the activity of the history neuron is a sequence that repeats in two consecutive trials. We used jPCA<sup>49</sup> to visualize and quantify this activity pattern (Figure 3K). jPCA identifies low-dimensional projections of population activity that maximize rotational dynamics across time.”
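As a rough illustration of what "maximizing rotational dynamics" means, the core step of jPCA can be sketched as a least-squares fit of a skew-symmetric matrix M to the relation dX ≈ X M. This is a minimal sketch on synthetic data, not the authors' pipeline: the function name `fit_skew_symmetric`, the synthetic rotation, and the omission of the usual preprocessing (PCA reduction and cross-condition mean subtraction) are all assumptions made here.

```python
import numpy as np

def fit_skew_symmetric(X, dX):
    """Least-squares fit of dX ≈ X @ M with M constrained to be skew-symmetric.

    X, dX: arrays of shape (time, k). M is expressed in a basis of
    elementary skew-symmetric matrices and the coefficients are solved
    by ordinary least squares.
    """
    k = X.shape[1]
    basis, columns = [], []
    for i in range(k):
        for j in range(i + 1, k):
            B = np.zeros((k, k))
            B[i, j], B[j, i] = 1.0, -1.0
            basis.append(B)
            columns.append((X @ B).ravel())
    design = np.stack(columns, axis=1)
    coef, *_ = np.linalg.lstsq(design, dX.ravel(), rcond=None)
    return sum(c * B for c, B in zip(coef, basis))

# Synthetic 2-D state rotating at omega rad/s (illustrative only).
dt, omega = 0.01, 2.0
t = np.arange(0.0, 5.0, dt)
X_full = np.column_stack([np.cos(omega * t), np.sin(omega * t)])

dX = np.diff(X_full, axis=0) / dt          # finite-difference derivative
X_mid = 0.5 * (X_full[:-1] + X_full[1:])   # state at interval midpoints

M = fit_skew_symmetric(X_mid, dX)
# Eigenvalues of a skew-symmetric matrix are purely imaginary; the largest
# magnitude gives the dominant rotation frequency.
rotation_freq = np.abs(np.linalg.eigvals(M).imag).max()
```

In the full method, the eigenvectors associated with the largest imaginary eigenvalue pair define the plane onto which population activity is projected; here the recovered `rotation_freq` should be close to the `omega` used to generate the data.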

      Line 374: “Next, to investigate r<sub>t</sub> of the population activity (r<sub>t_population</sub>), we first reduced the dimensionality of population activity in each area to 10 using PCA (principal component analysis) (Figure S6B,C). Then, “fluctuation activity” was recalculated for each dimension and trial type, analogous to the single-neuron analysis described above, but here representing noise in population-level activation patterns. We applied CCA (canonical correlation analysis) to each pair of areas and obtained an average of 10 canonical correlations (CC<sub>t</sub>) as r<sub>t_population</sub>. CCA identifies pairs of linear combinations of population activity from two areas that maximize their correlation across trials, thereby capturing shared population-level fluctuations. The CC<sub>t</sub> structure between areas was similar across task types (Figure 5H), indicating that this structure reflects the underlying functional connectivity independent of the task. The CC<sub>t</sub> between A and S1t was the largest among all the pairs (Figure 5H), whereas when the CC<sub>t</sub> was averaged across all connections for each area, A and AM had the largest and second largest CC<sub>t</sub>, respectively (Figure 5I). The dominance in CC<sub>t</sub> in A and AM disappeared when the neurons with r<sub>t_single</sub> > 0.3 were removed. Notably, the CC<sub>t</sub> of AM and the other areas was uniform regardless of the paired areas across all 10 canonical components (Figure 5J). Thus, area AM is an integration hub of interareal communication, whereas A is simply coupled with S1t, and such correlation structure at the population level critically depends on this subset of neurons.”
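The PCA-then-CCA procedure described above can be sketched as follows. This is an illustrative reconstruction on synthetic data, not the authors' code: the function names, the synthetic "areas", and the QR-based way of computing canonical correlations are choices made here, and the trial-type mean subtraction is skipped because the synthetic data contain a single trial type.

```python
import numpy as np

def pca_reduce(X, n_components=10):
    """Project mean-centred data (trials x neurons) onto its top principal components."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def canonical_correlations(A, B):
    """Canonical correlations between A and B (trials x dims), in descending order.

    Uses the QR-based formulation: the singular values of Qa.T @ Qb are the
    canonical correlations, provided A and B have full column rank.
    """
    Qa, _ = np.linalg.qr(A - A.mean(axis=0))
    Qb, _ = np.linalg.qr(B - B.mean(axis=0))
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

# Synthetic data: two "areas" of 40 neurons sharing one trial-to-trial
# fluctuation (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n_trials, n_neurons = 200, 40
shared = rng.normal(size=(n_trials, 1))
area_a = shared @ rng.normal(size=(1, n_neurons)) + 0.5 * rng.normal(size=(n_trials, n_neurons))
area_b = shared @ rng.normal(size=(1, n_neurons)) + 0.5 * rng.normal(size=(n_trials, n_neurons))

cc = canonical_correlations(pca_reduce(area_a), pca_reduce(area_b))
cc_t = cc.mean()  # average of the 10 canonical correlations
```

With real data, each input matrix would hold the trial-type-subtracted fluctuation activity for one area, and `cc_t` would be computed for every pair of areas.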

      (2) The manuscript contains numerous typos ("hoice"), spelling errors ("parameters", "costom"), abbreviations that are not defined (ex: RL/rostrolateral), and minor grammatical issues that should be addressed by a round of copy editing.

      We thank the reviewer for pointing this out. We have thoroughly corrected these typographical and grammatical errors, and have described the revisions in detail in our response to Reviewer 1, comment (3). In addition, we have clarified the abbreviations in the manuscript as follows.

      Line 94: “rostrolateral area (RL)”

      Figure 1 legend: “Abbreviations: RL, rostrolateral HVA; PM, posteromedial HVA; RSC, retrosplenial cortex.”

      (3) Figure 3K unlabeled axes.

      Thank you for the comment. We have added the axis labels.

      (4) Figure 3K caption, first "(right)" should be "(left)".

      Thank you very much for your careful attention to detail. We have made the requested correction.

      (5) Figure 6 is hard to read. Panel A is too small, and the interpretation of G is difficult.

      - For panel A, we added an enlarged view with images from a larger number of trials in Figure S7A.

      - G represents the connectivity matrix. The sources correspond to the injection sites, and the targets correspond to voxels in the cerebral cortex. Because the latter may not be immediately clear, we explicitly indicated in the figure that the targets are cortical voxels.

      (6) Figure S4C has a double compass.

      Thank you for the comment. We have revised the manuscript accordingly.

      Reviewer #3 (Recommendations for the authors):

      While I have some questions and additional suggestions to further improve the clarity of the manuscript, I already found it to be highly interesting and well done in its current form.

      Major points:

      (1) The t-SNE comes up rather abruptly and is not well-explained in the main text or the figure caption. It would be good to provide some more information on the rationale of this analysis and how to interpret it. In particular, I don't see clear clusters in Figure 2H although the description of the authors seems to indicate that they observe clear functional classes such as choice, stimulus, and history neurons. Similarly, in Figure 3B, I don't see a clear separation between history and choice neurons in the t-SNE map. The example cells in Figure 3A appear to be delayed or long-tailed choice neurons rather than a dedicated group of 'history neurons'. It would be helpful for the interpretation of the t-SNE plots to show different PSTHs for different regions of the t-SNE map to better illustrate what different regions within the t-SNE projection represent and what distinguishes these cells.

      Thank you for the comment. The absence of clearly defined clusters in the t-SNE map suggests that neuronal activity forms a continuum rather than discrete classes. Importantly, the purpose of the t-SNE map here is not to identify sharp clusters, but to demonstrate that the functional categorization provided by our encoding model broadly and comprehensively spans the major structures present in the unsupervised t-SNE map. We have revised the relevant text in the manuscript accordingly as follows.

      Line 158: “To examine whether the neuron groups labeled by this model broadly capture the diversity of neuronal activity, we performed unsupervised clustering of neuronal activity using t-SNE. The functional labels revealed by this encoding model were consistent with the t-SNE clusters, indicating the validity of the encoding model (Figure 2H; Figure S4B; materials and methods).”

      The issue regarding History neurons was also raised in Reviewer #1’s comment (5). We provide an enlarged view of Figure 3A in Figure S3A. Each History neuron exhibits multiple calcium transients repeatedly and asynchronously following the previous reward acquisition. Therefore, rather than being “choice neurons with a long tail,” these neurons are better interpreted as neurons whose activity is sustained during this delay period.

      (2) Although the authors mention that neurons represent a mixture of features, they then use the encoding model to isolate clusters, such as vision or choice neurons. In general, the language throughout the manuscript suggests that there are various clusters of functionally segregated neurons (vision, choice, history, or coupling neurons). However, it is not clear to me to what extent this is supported by the data. Couldn't a choice neuron also be a vision neuron if both variables make significant contributions to the model? Similarly, are 'history' and 'choice' separate labels from the encoding model, or could a cell be given multiple labels? If a cell could be given multiple labels how did the authors create the colored plots on the right-hand side of Figures 2H and 3B? The example history cells in Figure 3J also appear to be highly selective for the contralateral choice, so again this seems to argue against a clear separation of choice and history neurons.

      Each label is assigned based on whether the corresponding coefficient is significant in the encoding model, and therefore neurons that are both vision- and choice-selective do exist. The presence of mixed selectivity neurons in PPC is well established (e.g., MJ Goard et al., 2016, eLife). In this manuscript, however, we focus not on functional overlap at the single neuron level, but on the spatial distribution of functional classes, and thus do not explicitly address mixed selectivity. Although the colors in Figure 2H and Figure 3B overlap, the underlying data for each are presented separately in Figure S4B and S4D, respectively. As shown there, each color generally occupies distinct regions in the t-SNE map.

      (3) The decoding analysis in Figure 3F also suggests that a potential reason why there are more choice history signals in areas S1 and A is that neural activity is simply larger rather than due to the activity of a dedicated group of history neurons. Are the authors interpreting this differently? Could the duration of stored choice information also be affected by the dynamics of the calcium indicator?

      Thank you for the comment. Simply having larger neural activity in S1t or A would not result in calcium transients with a ~1-s time constant persisting throughout a delay period lasting up to 10 seconds. As also noted in comment (1), History neurons exhibit sustained and repeated calcium transients, and therefore their activity cannot be explained merely by elevated neural activity levels. One could argue that all cortical areas carry history-related information but that the signal-to-noise ratio is higher in S1t or A, which might make such signals more detectable there. If this were the case, however, differences across areas in all forms of selectivity should similarly depend on signal-to-noise ratio. This is not what we observe in our data.

      (4) I'm confused as to why the decoding accuracy is so high for areas A and S1t at time -3 relative to the choice in Figure 3F. Shouldn't this be the same as predicting the next choice in Figure 3H? Why is the decoding accuracy lower in this case?

      Thank you for the comment. The analysis shown in Figure 3F includes only trials in which the choice was correct. This is the reason why the decoding performance in Figure 3H is lower. We have added this clarification to the main text.

      Figure 3F: “Decoding accuracy of choice, outcome, and visual stimuli by the activity of 20 neurons from each area using only correct trials, before and after the choice onset, reward delivery, and the end of the visual stimuli, respectively. Line colors corresponded to the areas shown in panel G.”

      (5) In general, the text is not very detailed about the statistics. While test scores and p-values are mentioned, it would be good to also state what is actually compared and what the n is (e.g. how many neurons, neuron pairs, areas, sessions, or animals) for each case. How do the authors account for the nested experiment design where many neurons are coming from a low number of animals?

      Thank you for the comment. In our decoding analyses, we generally treat the number of animals as the independent variable. In contrast, for the encoding model analyses, we treat the number of neurons as the independent variable. As you correctly pointed out, because we recorded activity from a large number of neurons, statistical tests that treat individual neurons as independent samples can readily yield significant p-values even with a small number of animals. We have therefore confirmed that our conclusions are not driven by a large effect from a single animal. When making qualitative claims, we rely not only on statistical significance (p-values) but also require clear differences in effect size. We have added the following clarification to the Statistics section accordingly.

      Line 1049: “For the decoding analyses, the number of animals was treated as the independent variable, whereas for the encoding model analyses, the number of neurons was treated as the independent variable. To ensure that the results were not driven by a single animal, we repeated the statistical tests while systematically excluding data from one animal at a time and confirmed that statistical significance was preserved in all cases. Furthermore, qualitative interpretations were made only when differences in effect size were clearly observed.”
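The leave-one-animal-out check described above can be sketched as follows. The response does not say which statistical test was repeated, so the exact two-sided sign test used here is a stand-in chosen for simplicity, and the function names, animal labels, and per-neuron values are hypothetical.

```python
from math import comb

def sign_test_p(values):
    """Two-sided exact sign test of the null hypothesis that the median is zero."""
    pos = sum(v > 0 for v in values)
    neg = sum(v < 0 for v in values)
    n = pos + neg
    if n == 0:
        return 1.0
    k = min(pos, neg)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

def robust_to_single_animal(data_by_animal, alpha=0.05):
    """Repeat the test with each animal excluded in turn.

    data_by_animal maps an animal ID to a list of per-neuron values.
    Returns (all_significant, {left_out_animal: p_value}).
    """
    p_values = {}
    for left_out in data_by_animal:
        pooled = [v for animal, vals in data_by_animal.items()
                  if animal != left_out for v in vals]
        p_values[left_out] = sign_test_p(pooled)
    return all(p < alpha for p in p_values.values()), p_values

# Hypothetical per-neuron effect values for three animals.
data = {
    "m1": [0.4, 0.6, 0.5, 0.3, 0.7, 0.2],
    "m2": [0.5, 0.1, 0.6, 0.4, -0.1, 0.3],
    "m3": [0.2, 0.5, 0.3, 0.6, 0.4, 0.1],
}
ok, p_values = robust_to_single_animal(data)
```

If `ok` is true, significance is preserved no matter which single animal is removed, which is the criterion stated in the added Statistics text.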

      (6) How was the grouping in Figure 2O done? Specifically, how were the thresholds for the dashed lines selected to separate PM and V1 from AM and RL as association areas? It seems to me like this grouping was done rather arbitrarily as the difference in choice decoding accuracy is not particularly large between these areas.

      This line does not have a specific quantitative basis, but we consider it useful as an illustrative aid. We have added this clarification to the figure legend.

      Figure 2O: “Decoding accuracies of time in video presentation and choice direction indicate that AM would be the best position for associating these two signals. The background color and dashed lines are provided as visual aids for illustrative purposes.”

      (7) The fact that neurons with high rt_single tend to share the same function might also indicate the approach is insufficient to remove all effects of tuning to trial types from the neural data. Since the authors subtract the average of each trial type, the average trial-type related information is removed but type-specific variations that are not equally presented in the average might remain. For choice neurons for example, attentive vs in-attentive choices could be represented differently and thus remain in the data since the average would be a mixture of both. The same goes for other factors that would drive a particular modulation in the choice - or stimulus - related part of the trial which could still tie these neurons together. One way to circumvent this concern could be to first compute the mean activity for all time points in each trial and then compute the trial-to-trial variability across all trials of the same type. Alternatively, I would be curious how the results play out when using data when the animal is not actively performing the task to compute rt_single.

      Thank you for the comment. The concern raised by the reviewer applies to all noise-correlation analyses and highlights an important limitation of this approach, namely that factors other than the observed variables are treated as noise. By subtracting the trial-averaged activity, information related to sensory input and the direction of the first lick at choice can be removed. However, other factors cannot be eliminated if they are not observed. For example, if right hindlimb movements tend to occur only in trials with visual stimulation combined with left choice, such effects cannot be removed because they are not measured. The same issue remains even when restricting the analysis to a single trial type. Based on these considerations, we have added the following text to the manuscript.

      Line 932: “Correlation of trial-to-trial variance of activity between a pair of single neurons was defined as r<sub>t_single</sub>. To calculate r<sub>t_single</sub>, we averaged the activity of individual neurons over the sampling period, and the average across each trial type was subtracted from this value. The trial types consisted of four sets of pairs of stimuli and responses, that is, the video stimulation and left choice, the video stimulation and right choice, the black screen and left choice, and the black screen and right choice. By this operation, we extracted the fluctuating components of single-neuron activity that are independent of the trial types. The finding that neurons with high r<sub>t_single</sub> tend to share the functional properties we propose is not a trivial consequence of the analysis. At the same time, it remains possible that high r<sub>t_single</sub> reflects the degree to which neurons share unobserved features, and that such features are correlated with our functional classification. Thus, while this analysis suggests that correlated fluctuations across cortical areas may contribute to the determination of functional types, establishing an exclusive conclusion will require more fine-grained behavioral measurements, tighter control of internal states, and causal identification through targeted interventions.”
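The calculation described in the quoted Methods text can be sketched as follows. This is a minimal illustration, not the authors' code: the function names (`residuals`, `pearson`, `rt_single`), the trial-type labels, and the toy activity values are all assumptions made for the example. Each input list holds one value per trial, namely the neuron's activity averaged over the sampling period.

```python
from collections import defaultdict
from math import sqrt


def residuals(activity, trial_types):
    """Subtract the mean of each trial type from the per-trial activity."""
    by_type = defaultdict(list)
    for value, ttype in zip(activity, trial_types):
        by_type[ttype].append(value)
    means = {t: sum(v) / len(v) for t, v in by_type.items()}
    return [value - means[t] for value, t in zip(activity, trial_types)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def rt_single(neuron_a, neuron_b, trial_types):
    """Correlation of trial-to-trial fluctuations between two neurons."""
    return pearson(residuals(neuron_a, trial_types),
                   residuals(neuron_b, trial_types))


# Hypothetical toy data: two trials for each of the four trial types.
trial_types = ["video_left", "video_left", "video_right", "video_right",
               "black_left", "black_left", "black_right", "black_right"]
neuron_a = [1.0, 3.0, 5.0, 7.0, 2.0, 4.0, 6.0, 8.0]
neuron_b = [10.0, 12.0, 0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
print(rt_single(neuron_a, neuron_b, trial_types))
```

The two toy neurons have very different trial-type means but identical residual fluctuations, so the value is high even though their tuning differs. As the response notes, any unobserved factor that varies within a trial type stays in these residuals.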

      Minor points:

      (1) Why did the authors use the activity of 50 neurons for the decoder analysis in Figure 2K? Didn't they have many more neurons available? How were these selected?

      We found that the conclusions were identical when using datasets consisting of either 50 neurons or 20 neurons across all analyses. Because the total number of recorded PM neurons did not reach 100 in at least one mouse, we standardized the analyses to 50 neurons in order to match the number of neurons across all cortical areas and animals.

      (2) The authors mention that some PPC neurons showed complex dynamics rather than encoding a specific feature such as visual or choice information but do not mention actual numbers on this point. It would be good to quantify to what extent neurons in different regions represent such mixed selectivity and whether there are clear differences in selectivity. This would also be interesting to discuss in context to earlier work on mixed selectivity in the parietal cortex, such as Raposo et al 2015.

      Thank you for the comment. Your point is entirely valid. However, as explained in our response to your major comment, our analyses focus not on how individual neurons are classified, but rather on the spatial distribution of these functional categories.

      (3) I have a hard time understanding what the length of the bars in the right panel of Figure 2k indicates. Does this plot show more than the decoder accuracy before and after the choice? Is the bar length related to the standard deviation? The same question for the visualization in panel 2n. It looks nice but I'm confused about what it shows exactly.

      These bars represent confidence intervals. Although this is stated at the end of the Figure 2 legend, we agree that it may not be sufficiently clear, and we have therefore added this information to the Statistics section.

      Line 1046: “In Figure 2K and N, and Figure 3G, L, M, and O, the bars indicate the 95% confidence intervals. All other bars denote s.e.m., unless otherwise noted.”

      (4) Is Figure 3D showing the same association index as in Figure 2j, thus showing the same result as in the vision task or is this meant to show something new? It was not clear to me from the wording, so it would be good to clarify.

      You are correct that the magenta trace in Fig. 3D is the same as in Fig. 2J. This panel was included to explicitly illustrate that, in areas A and AM, the separation between History and Association approximately overlaps. We have added the following clarification to the figure legend accordingly.

      Figure 3D: “The percentage of history neurons and the association index (as defined in Fig. 2J) were overlaid for comparison.”

      (5) When computing the Pseudo R2 for regressor contribution, how was the null model computed? From shuffling all regressors in the model? I think this is fine but it's not fully clear what the intended effect of this procedure is. For the description of Figure 4C it would be good to add a sentence explaining how to interpret the pseudo R^2.

      The null model predicts a fixed value that is independent of the explanatory variables, i.e., it predicts only the intercept. This provides a useful correction term when performing cross-validation, particularly in cases where baseline values differ across folds. In Figure 4C, the analysis shows the contribution of adding body part positions and pupil diameter to the model for predicting neural activity. We have added the following text to the Methods section.

      Line 881: “To estimate the contribution of parameters for the left forelimb, the right forelimb, the tail, and the pupil, we repeated the same analysis with a reduced model where each set of predictors was eliminated from the full model (Figure 4B). Then, the pseudo-R<sup>2</sup> was obtained for each set of predictors by (MSE<sub>reduced</sub> − MSE<sub>full</sub>) / MSE<sub>null</sub>, where MSE is the mean squared error, MSE<sub>reduced</sub> is the MSE of the reduced model, MSE<sub>full</sub> is the MSE of the full model, and MSE<sub>null</sub> is the MSE of the null model. The null model predicts a fixed value that is independent of the explanatory variables; specifically, it simply outputs the mean of the training data. For example, we constructed a regression model without the parameters regarding the left forelimb (green shade of Figure 4B), obtained MSE<sub>reduced</sub> for the left forelimb, and the pseudo-R<sup>2</sup> was calculated as above by comparing the MSE of the full model and the null model. This value reflects the extent to which the position of the left forelimb contributes to the prediction of neuronal activity.”
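A small worked example may help with interpreting the pseudo-R<sup>2</sup>. The sketch below assumes the formula is the difference between the reduced-model and full-model MSE, divided by the null-model MSE. The function names and the numbers are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def mse(observed, predicted):
    """Mean squared error between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def pseudo_r2(observed, pred_full, pred_reduced, train_mean):
    """(MSE_reduced - MSE_full) / MSE_null for one set of predictors.

    The null model ignores all regressors and always outputs the mean
    of the training data (train_mean).
    """
    pred_null = [train_mean] * len(observed)
    gain = mse(observed, pred_reduced) - mse(observed, pred_full)
    return gain / mse(observed, pred_null)


# Hypothetical held-out activity and model predictions.
observed = [1.0, 2.0, 3.0, 4.0]
pred_full = [1.0, 2.0, 3.0, 4.0]      # full model: MSE = 0
pred_reduced = [1.5, 2.5, 3.5, 4.5]   # one predictor set removed: MSE = 0.25
train_mean = 2.5                      # null model: MSE = 1.25

print(pseudo_r2(observed, pred_full, pred_reduced, train_mean))
```

In this toy case, removing the predictor set costs 0.25 of squared error against a null-model error of 1.25, so that set accounts for about 20% of the variance the null model leaves unexplained. A value near zero means the removed predictors added little beyond the remaining regressors.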

      (6) It seems surprising that the pupil-size-related neurons were mapped around visual areas although the pupil should carry clear luminance information. Is this because the luminance-related information in the pupil can also be explained by the stimulus variable in the model?

      Pupil size changed markedly before and after visual stimulus presentation (Figure S5C), dilating during the black stimulus and constricting during the video stimulus. This likely reflects changes relative to the luminance of the gray screen presented in the absence of visual stimuli. In our encoding model, visual stimuli are included as independent regressors for each corresponding time window. Therefore, pupil fluctuations that are temporally locked to visual stimulation are explained by these visual regressors. Neuronal activity that is better explained by pupil size changes not accounted for by the visual regressors is classified as pupil-related. At least three mechanisms may underlie the influence of pupil size on neuronal activity. First, fluctuations in pupil diameter have been linked to behavioral state or noradrenergic level [REF], which can act as variables independent of visual stimulation. Second, pupil fluctuations may be amplified in a stimulus-dependent manner, reflecting nonlinear interactions between visual input and brain state. Third, changes in pupil diameter alter the amount of light reaching the retina, which can modulate activity in visual cortical areas. The latter two mechanisms are therefore expected to predominantly affect visual areas and may explain why pupil-related neurons are more frequently observed there. The first mechanism is likely related to global brain state, and its association with behavior may account for the presence of pupil-related neurons in S1. However, these interpretations require confirmation through more refined causal manipulations. Accordingly, we limited the addition to the manuscript to the following statement.

      Line 292: “We found that the neurons related to the tail and forepaws were similarly distributed around the parietal cortex including S1 and A, while the pupil-size related neurons were mapped around visual areas (Figure 4C). Changes in pupil diameter may influence neuronal activity through multiple mechanisms, including behavioral state or noradrenergic level [REF], nonlinear interactions with visual stimulation, and changes in the amount of light reaching the retina.”

      (7) What is meant by 'external control parameters such as a video frame' when explaining the encoding model?

      Thank you for the comment. We added the following explanation.

      Line 151: “In the encoding model, the activity of each neuron was fitted by a weighted sum of external control parameters, such as video frames, and behavioral parameters, such as choice and reward direction. Because the visual stimulus changes continuously over time, sliding time windows were placed during the visual stimulus period.”

      (8) What does the trace in Figure 2G show? Is this a single-cell example? What are the axes here?

      We added an explanation to the figure legend.

      Figure 2G: “Schematic of our encoding model. The bottom right panel shows an example of single-neuron activity with an overlay of the fitting obtained by the encoding model.”

      (9) There seems to be a word missing in the sentence that describes the results for Figure 3O in the main text.

      Thank you for the comment. We added the following description related to Fig. 3O.

      Line 247: “resulting in the decoding accuracy of time after a specific choice being lower than in A (Figure 3O).”

      (10) The abbreviation RP is used when describing Figure S5A. It should be mentioned that this refers to the response period.

      Thank you for the comment. We added the following description related to Figure S5A.

      Line 283: “We found that the angle of the tail was significantly different from the baseline values several seconds after the response period (RP) (Figure S5A)”

      (11) I can't see the color difference between the traces in Figure 2E. There are probably red and green but this is hard to see for readers with red-green color blindness. Does the black indicate the time of visual stimulation? Is the line in Figure 2F the time when the spouts move in?

      Thank you for the comment. In Fig. 2E, we improved visibility by changing the line opacity. In addition, the vertical line in Fig. 2E indicates the onset of the visual stimulus, and the vertical line in Fig. 2F indicates the onset of the response period. We have added the following explanations to the figure legend.

      Figure 2: E. “Representative vision neurons (ROI 1-4 in I). The red bars indicate sampling periods during video presentation, and the brown bars indicate sampling periods without video stimulation. Vertical black lines mark the onset of the sampling period. F. Representative choice neuron (ROI 5-8 in I) and a non-selective neuron (ROI 9). Light blue lines indicate the response periods in trials with left choices, and purple lines indicate the response periods in trials with right choices. Vertical black lines mark the onset of the response period.”

      (12) It might be useful to provide a short explanation in the results or methods of why the harmonic mean was used for the computation of the association index. I think it makes sense but since it is not commonly used this could be helpful for the reader to understand the approach.

      Thank you for the comment. We added the following explanation to the main text.

      Line 869: “The association index was determined by the harmonic mean of the rates of vision neurons and choice neurons. The harmonic mean approaches the arithmetic mean when the two values are similar, but becomes closer to the smaller value when the two values differ substantially. Therefore, the association index takes a large value when both vision neurons and choice neurons are abundant.”
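The behavior of the harmonic mean described above can be checked numerically. This is a minimal sketch; the function name and the example rates are assumptions, and the rates are taken to be fractions between 0 and 1.

```python
def association_index(vision_rate, choice_rate):
    """Harmonic mean of the rates of vision neurons and choice neurons."""
    total = vision_rate + choice_rate
    if total == 0:
        return 0.0  # neither cell type present
    return 2 * vision_rate * choice_rate / total


# Similar rates: the harmonic mean matches the arithmetic mean (0.4).
print(association_index(0.4, 0.4))
# Dissimilar rates: pulled toward the smaller value (arithmetic mean is 0.4).
print(association_index(0.6, 0.2))
# One type absent: the index is zero however abundant the other type is.
print(association_index(0.8, 0.0))
```

An arithmetic mean would give an area with 80% vision neurons and no choice neurons the same score as an area with 40% of each. The harmonic mean scores only the second area as a candidate site of association.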

      (13) I don't fully understand how coupling diversity is computed. If there are six preference vectors, what is meant by taking the average of angles between all pairs of the two vectors?

      Which two are meant here?

      Thank you for the comment. We revised the explanation as follows.

      Line 950: “To quantify the diversity of coupling patterns across clusters, we computed the angle between every pair of preference vectors. We then averaged these pairwise angles and defined this quantity as the ‘coupling diversity.’”
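The revised definition can be sketched as follows. The code assumes that each preference vector is a plain list of coupling weights and that angles are reported in degrees; neither detail is stated in the quoted text, and the function names and toy vectors are hypothetical.

```python
from itertools import combinations
from math import acos, degrees, sqrt


def angle_deg(u, v):
    """Angle in degrees between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding error
    return degrees(acos(cosine))


def coupling_diversity(preference_vectors):
    """Mean angle over every unordered pair of preference vectors."""
    angles = [angle_deg(u, v) for u, v in combinations(preference_vectors, 2)]
    return sum(angles) / len(angles)


# Hypothetical two-dimensional preference vectors for three clusters.
vectors = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
print(coupling_diversity(vectors))
```

With six preference vectors there are 15 unordered pairs, and the coupling diversity is the mean of those 15 angles. Identical vectors give a diversity of zero, and the value grows as the clusters' coupling patterns point in more different directions.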

      (14) The results text states that the high correlation between r_anatomy and r_neuropil (Figure 6I) is evidence for the functional correlations being driven by cortico-cortical connectivity. However, Figure 6J shows that correlations for either cortico-cortical or thalamo-cortical connectivity are below 0.94 and generally higher for thalamo-cortical connectivity. This doesn't negate the general point of the authors but it would be good to clarify this section so it is easier to understand if r_anatomy includes both cortico-cortical and thalamo-cortical data and how the results in Figure I and J go together with the description in the results section.

      You are correct. We have revised the text to clarify that the analysis reflects the combined effects of both cortico-cortical and thalamo-cortical inputs.

      Line 436: “This correspondence suggests that the mesoscale interarea correlation is determined by the cortico-cortical and thalamo-cortical common input at mesoscale. Figure S8: A. Using Allen connectivity atlas, the axonal density of cortico-cortical and thalamo-cortical projection was analyzed.”

      (15) I'm not very familiar with canonical correlation analysis and found this part hard to follow. Some additional explainer sentences would be helpful here. For example, what does it mean to take the average of the top 10 canonical correlations as rt_population? What exactly are the canonical correlation vectors? It was also not clear to me what exactly the results in Figure 5J signify.

      Thank you for the comment. We have clarified the description in the main text related to CCA and the associated analyses as follows.

      Line 374: “Next, to investigate r<sub>t</sub> of the population activity (r<sub>t_population</sub>), we first reduced the dimension of population activity in each area to 10 by using PCA (principal component analysis) (Figure S6B,C). Then, “fluctuation activity” was recalculated for each dimension and trial type, analogous to the single-neuron analysis described above, but here representing noise in population-level activation patterns. We applied CCA (canonical correlation analysis) to each pair of areas and obtained an average of 10 canonical correlations (CC<sub>t</sub>) as r<sub>t_population</sub>. CCA identifies pairs of linear combinations of population activity from two areas that maximize their correlation across trials, thereby capturing shared population-level fluctuations. The CC<sub>t</sub> structure between areas was similar across task types (Figure 5H) indicating that this structure reflects the underlying functional connectivity independent of the task. The CC<sub>t</sub> between A and S1t was the largest among all the pairs (Figure 5H), whereas when the CC<sub>t</sub> was averaged across all connections for each area, A and AM had the largest and second largest CC<sub>t</sub>, respectively (Figure 5I). The dominance in CC<sub>t</sub> in A and AM disappeared when the neurons with r<sub>t_single</sub> > 0.3 were removed. Notably, the CC<sub>t</sub> of AM and the other areas was uniform regardless of the paired areas across all 10 canonical components (Figure 5J). Thus, area AM is an integration hub of interareal communication, whereas A is simply coupled with S1t, and such a correlation structure at the population level critically depends on this subset of neurons.”

    1. eLife Assessment

      This valuable study proposes a novel rapid-entry mechanism for S. aureus, involving the rapid release of calcium from lysosomes. The paper's strength lies in its very interesting hypothesis. The methods used are solid and adequately support the conclusions.

    2. Reviewer #2 (Public review):

      In the manuscript, Ruhling et al propose a rapid uptake pathway that is dependent on lysosomal exocytosis, lysosomal Ca2+ and acid sphingomyelinase, and further suggest that the intracellular trafficking and fate of the pathogen is dictated by the mode of entry. Overall, this manuscript argues for an important mechanism of a 'rapid' cellular entry pathway of S. aureus that is dependent on lysosomal exocytosis and acid sphingomyelinase and links the intracellular fate of the bacterium, including phagosomal dynamics, cytosolic replication and host cell death, to different modes of uptake.

      Key strength is the nature of the idea proposed, while continued reliance on inhibitor treatment combined with lack of phenotype / conditional phenotype for genetic knock out is a major weakness.

      In the revised version, the authors perform experiments with ASM KO cells to provide genetic evidence of the role for ASM in S. aureus entry through lysosomal modulation. The key additional experiment is the phenotype of reduced bacterial uptake in low serum, but not in high serum conditions. The authors suggest this could be due to the SM from serum itself affecting the entry. While this explanation is plausible, prolonged exposure of cells to low serum is well documented to alter several cellular functions, particularly in the context of this manuscript, lysosomal positioning, exocytosis and Ca2+ signaling. A better control here could be WT cells grown in low serum. If SM in serum can interfere, why do they see such pronounced phenotype on bacterial entry in WT cells upon chemical inhibition?

      While the authors argue a role for undetectable nano-scale Cer platforms on the cell surface caused by ASM activity, results do not rule out a SM independent role in the cellular uptake phenotype of ASM inhibitors.

      The authors have attempted to address many of the points raised in the previous revision. While the new data presented provide partial evidence, the reliance on chemical inhibitors and lack of clear results directly documenting release of lysosomal Ca2+, or single bacterial tracking, or clear distinction between ASM dependent and independent processes dampen the enthusiasm.

      I acknowledge the authors' argument of different ASM inhibitors showing similar phenotypes across different assays as pointing to a role for ASM, but the lack of phenotype in ASM KO cells is concerning. The authors' argument that altered lipid composition in ASM KO cells could be overcoming the ASM-mediated infection effects by other ASM-independent mechanisms is speculative, as they acknowledge, and moderates the importance of the ASM-dependent pathway. The SM accumulation in ASM KO cells does not distinguish between localized alterations within the cells. If this pathway can be compensated, how central is it likely to be?

      The authors allude to lower phagosomal escape rate in ASM KO cells compared to inhibitor treatment, which appears to contradict the notion of uptake and intracellular trafficking phenotype being tightly linked. As they point out, these results might be hard to interpret. Could an inducible KD system recapitulate (some of) the phenotype of inhibitor treatment? If S. aureus does not escape phagosome in macrophages, could it provide a system to potentially decouple the uptake and intracellular trafficking effects by ASM (or its inhibitor treatment)?

      The role of ASM on cell surface remains unclear. The hypothesis proposed by the authors that the localized generation of Cer on the surface by released ASM leads to generation of Cer-enriched platforms could be plausible, but is not backed by data, technical challenges to visualize these platforms notwithstanding. These results do not rule out possible SM independent effects of ASM on the cell surface, if indeed the role of ASM is confirmed by controlled genetic depletion studies.

      The reviewer acknowledges technical challenges in directly visualizing lysosomal Ca2+ using the methods outlined. Genetically encoded lysosomal Ca2+ sensor such as Gcamp3-ML1 might provide better ways to directly visualize this during inhibitor treatment, or S. aureus infection.

    3. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #2 (Public review):

      In the manuscript, Ruhling et al propose a rapid uptake pathway that is dependent on lysosomal exocytosis, lysosomal Ca2+ and acid sphingomyelinase, and further suggest that the intracellular trafficking and fate of the pathogen is dictated by the mode of entry. Overall, this manuscript argues for an important mechanism of a 'rapid' cellular entry pathway of S. aureus that is dependent on lysosomal exocytosis and acid sphingomyelinase and links the intracellular fate of the bacterium, including phagosomal dynamics, cytosolic replication and host cell death, to different modes of uptake.

      Key strength is the nature of the idea proposed, while continued reliance on inhibitor treatment combined with lack of phenotype for genetic knock out is a major weakness.

      We agree with the reviewer that a S. aureus invasion phenotype in ASM K.O. cells would unequivocally demonstrate the importance of ASM for the process. In the revised manuscript, we report an invasion phenotype in ASM K.O. cells. The absence of an invasion phenotype in ASM K.O. cells in our original experiments was likely caused by SM accumulation in ASM-depleted cells originating from FBS (see Figure 2I, in the revised manuscript).

      We thus cultured cells for up to three days in 2% FBS and then reduced the concentration to 1% FBS one day prior to experimentation. Under these conditions reduced S. aureus invasion in ASM K.O.s was observed when compared to wildtype cells.

      This was not detected when we cultured the cells in medium containing the common concentration of 10% FBS. Our new data supports the results we acquired with three different ASM inhibitors.

      The invasion defect in ASM K.O.s cultured in low FBS was more pronounced at 10 min p.i. when compared to the 30 minute time point (Figure 2K), further corroborating that the ASM-dependent invasion pathway is relevant early in infection. This is consistent with the invasion dynamics we observed upon interference with lysosomal Ca<sup>2+</sup> signaling [TPC1 K.O. (Figure 1C), BAPTA-AM (Figure 3D)], lysosomal exocytosis [Syt7 K.O. (Figure 2F), Ionomycin (Figure 3D)] and ASM activity by inhibitor treatment (Figure 3D).

      Originally, we had hypothesized that changes in the sphingolipidome induced by absence of ASM may have caused the lack of an S. aureus invasion phenotype. We thus compared the sphingolipidome of ASM K.O.s cultured in 1% and 10% FBS. Indeed, SM accumulation was less severe when we cultured the cells in 1% FBS (Figure 2M and Supp. Figure 3). Hence, we think that strong SM accumulations in ASM K.O. cells cultured in 10% FBS may facilitate ASM-independent invasion mechanisms and thus, the absence of ASM-dependent invasion could not be detected by analyzing the number of invaded bacteria. This is supported by experiments, where we treated ASM K.O.s with the ASM inhibitor ARC39, which only slightly affected S. aureus invasion, whereas we detected a strong reduction of internalized bacteria by ARC39 treatment of WT cells (Figure 2 J). We think that this experiment and the reduced invasion in ASM K.O.s rule out an ASM/SM-independent effect of the inhibitors.

      - While the authors argue a role for undetectable nano-scale Cer platforms on the cell surface caused by ASM activity, results do not rule out a SM independent role in the cellular uptake phenotype of ASM inhibitors.

      We agree with the reviewer that we do not show formation of ceramide-enriched platforms, and we thus changed the manuscript accordingly (see below).

      - The authors have attempted to address many of the points raised in the previous revision. While the new data presented provide partial evidence, the reliance on chemical inhibitors and lack of clear results directly documenting release of lysosomal Ca2+, or single bacterial tracking, or clear distinction between ASM dependent and independent processes dampen the enthusiasm.

      We shared the reviewer’s desire to discriminate between ASM-dependent and ASM-independent processes, but we are limited by cell biology and the simultaneous occurrence of processes - here the uptake of bacteria by multiple pathways.

      However, we were able to address ASM-dependency of our rapid uptake mechanism by observing a genetic phenotype in SMPD1 knockout-cells.

      Here we make no assumptions about the centrality of the pathway or its importance in vivo. As scientists, we were interested in the fact that such an ASM-dependent pathway exists. In other, as yet unidentified cell lines, such a pathway may be the main entry point for bacteria. Alternatively, it may represent an ASM-dependent mode of receptor uptake that we have identified, with the bacteria piggy-backing into the cells.

      - I acknowledge the authors' argument of different ASM inhibitors showing similar phenotypes across different assays as pointing to a role for ASM, but the lack of phenotype in ASM KO cells is concerning. The authors' argument that altered lipid composition in ASM KO cells could be overcoming the ASM-mediated infection effects by other ASM-independent mechanisms is speculative, as they acknowledge, and moderates the importance of the ASM-dependent pathway. The SM accumulation in ASM KO cells does not distinguish between localized alterations within the cells. If this pathway can be compensated, how central is it likely to be?

      We are convinced that our new genetic evidence of an S. aureus invasion phenotype in ASM K.O.s will eliminate the reviewer’s concerns about the role of ASM during the bacterial invasion.

      The new lipidomics data of ASM K.O.s cultured in 1% and 10% FBS (Figure 2, M, Supp. Figure 3) and inhibitor-treated WT cells (Figure 2L, Supp. Figure 3) show a correlation between SM accumulation and the invasion phenotype.

      We agree with the reviewer, however, that the reason why changes in sphingolipidome increase ASM-independent S. aureus internalization by host cells remains elusive. One possible explanation is a dysfunction of the lipid raft-associated protein caveolin-1 upon strong SM accumulation, which was previously shown to appear in ASM-deficient cells (1, 2). A lack of caveolin-1 results in strongly increased host cell entry of S. aureus (3, 4). Characterization of the mechanism behind these observations requires further experimentation and is beyond the scope of the current manuscript.

      Host cells possess mechanisms to prevent infections, while pathogens have developed strategies to circumvent these defense processes. In the present scenario, a physiological membrane composition of the host cell represents such a pathogen defense mechanism (as shown e.g. for caveolin-1, which restricts invasion of S. aureus in healthy cells). If a defense mechanism is disabled (as we speculate is the case upon strong SM accumulation in ASM K.O.s cultured in 10% FBS), infection is facilitated. In healthy WT cells, these mechanisms (e.g. caveolin-1) are functional and, hence, we would not expect a “compensation” of ASM-dependent invasion. We here analyze invasion events that cannot be prevented by host defense mechanisms, as they occur in untreated WT cells and are absent upon interfering with the ASM-dependent invasion pathway (by inhibitors and genetic K.O.). Thus, we think the ASM-dependent pathway, which mediates 50-70% of bacteria internalized by healthy WT cells 10 min p.i., is central for the infection.

      - The authors allude to lower phagosomal escape rate in ASM KO cells compared to inhibitor treatment, which appears to contradict the notion of uptake and intracellular trafficking phenotype being tightly linked. As they point out, these results might be hard to interpret.

      We measured phagosomal escape of S. aureus JE2 in ASM K.O. cells cultured in 1% FBS. Again, we infected cells for 10 or 30 min and determined the escape rates 3h p.i. However, the results are similar to escape rates determined with 10% FBS (Author response image 1).

      Escape rates of S. aureus were significantly decreased in absence of ASM regardless of the FBS concentration in the medium. We therefore think that prolonged absence of ASM has other side effects. For instance, certain endocytic pathways could be up- or down-regulated to adapt for the absence of ASM or could be affected by other changes in the lipidome (that can be minimized but not completely prevented by culturing cells in 1% FBS). This could, for instance, affect maturation of S. aureus-containing phagosomes and hence phagosomal escape.

      Author response image 1.

      As it is unclear how prolonged absence of ASM can affect cellular processes, we think other experiments investigating the role of ASM-dependent invasion for phagosomal escape are more reliable. Most importantly, bacteria that enter host cells early during infection (and thus, predominantly via the “rapid” ASM-dependent pathway) possess lower phagosomal escape rates than bacteria that entered host cells later during infection (Figure 5, D and E). This is confirmed by higher escape rates upon blocking ASM-dependent invasion with Vacuolin-1 (Figure 4E) and three different ASM inhibitors (Figure 4C and D). We further demonstrate that sphingomyelin on the plasma membrane during invasion influences phagosomal escape, while sphingomyelin levels in the phagosomal membrane did not change phagosomal escape (Figure 5, A and B). This is summarized in Figure 5F.

      - Could an inducible KD system recapitulate (some of) the phenotype of inhibitor treatment? If S. aureus does not escape phagosome in macrophages, could it provide a system to potentially decouple the uptake and intracellular trafficking effects by ASM (or its inhibitor treatment)?

      Inducible knock-downs in our laboratory are based on the vector pLVTHM in cells co-expressing the repressor TetR fused to a KRAB domain. For optimal knock-down, induction requires doxycycline supplementation of the medium for 7 days. During these days of growth the cells can adapt their lipid metabolism, so the resulting situation would resemble the one we encounter for the K.O.s.

      ASM-dependent uptake of S. aureus in macrophages has been demonstrated before (5). However, the course of infection in macrophages differs from that in non-professional phagocytes (6). For example, in macrophages S. aureus replicates within phagosomes, whereas in non-professional phagocytes it replicates in the host cytosol. Absence of ASM may therefore influence the intracellular infection of macrophages with S. aureus in a distinct manner.

      - The role of ASM on cell surface remains unclear. The hypothesis proposed by the authors that the localized generation of Cer on the surface by released ASM leads to generation of Cer-enriched platforms could be plausible, but is not backed by data, technical challenges to visualize these platforms notwithstanding. These results do not rule out possible SM independent effects of ASM on the cell surface, if indeed the role of ASM is confirmed by controlled genetic depletion studies.

      We agree with the reviewer that we do not show generation of ceramide-enriched platforms. We thus changed Figure 6F in the revised manuscript to make clear that it remains elusive whether ceramide-enriched platforms are formed. We also added a sentence to the discussion (line 615) to emphasize that the existence of these microdomains is still debated in lipid research.

      We think that the following observations support SM-dependent effects of ASM during S. aureus invasion:

      (i) reduced invasion upon removing SM from the plasma membrane (Figure 2N, Supp. Figure 2M)

      (ii) increased invasion in TPC1 and Syt7 K.O. (Figure 2, P) in presence of exogenously added SMase.

      However, we agree with the reviewer that we do not directly demonstrate ASM-mediated SM cleavage during S. aureus invasion. Hence, we added a sentence to the discussion that mentions a possible SM-independent role of ASM for invasion (line 556) that reads:

      “Since it remains elusive to which extent ASM processes SM on the plasma membrane during S. aureus invasion, one may speculate that ASM could also have functions other than SM metabolization during host cell entry of the pathogen. However, we did not detect a direct interaction between S. aureus and ASM in an S. aureus-host interactome screen (7).”

      - The reviewer acknowledges technical challenges in directly visualizing lysosomal Ca2+ using the methods outlined. Genetically encoded lysosomal Ca2+ sensor such as Gcamp3-ML1 might provide better ways to directly visualize this during inhibitor treatment, or S. aureus infection.

      We thank the reviewer for this suggestion. We included the following section in our discussion (line 593):

      “Since fluorescent calcium reporters allow to monitor this process microscopically (8, 9), future experiments may visualize this process in more detail and contribute to our understanding of the underlying signaling mechanisms.”

      References

      (1) J. Rappaport, C. Garnacho, S. Muro, Clathrin-mediated endocytosis is impaired in type A-B Niemann-Pick disease model cells and can be restored by ICAM-1-mediated enzyme replacement. Mol Pharm 11, 2887-2895 (2014).

      (2) J. Rappaport, R. L. Manthe, C. Garnacho, S. Muro, Altered Clathrin-Independent Endocytosis in Type A Niemann-Pick Disease Cells and Rescue by ICAM-1-Targeted Enzyme Delivery. Mol Pharm 12, 1366-1376 (2015).

      (3) C. Hoffmann et al., Caveolin limits membrane microdomain mobility and integrin-mediated uptake of fibronectin-binding pathogens. J Cell Sci 123, 4280-4291 (2010).

      (4) L.-P. Tricou et al., Staphylococcus aureus can use an alternative pathway to be internalized by osteoblasts in absence of β1 integrins. Scientific Reports 14, 28643 (2024).

      (5) C. Li et al., Regulation of Staphylococcus aureus Infection of Macrophages by CD44, Reactive Oxygen Species, and Acid Sphingomyelinase. Antioxid Redox Signal 28, 916-934 (2018).

      (6) A. Moldovan, M. J. Fraunholz, In or out: Phagosomal escape of Staphylococcus aureus. Cell Microbiol 21, e12997 (2019).

      (7) M. Rühling, F. Schmelz, A. Kempf, K. Paprotka, M. J. Fraunholz, Identification of the Staphylococcus aureus endothelial cell surface interactome by proximity labeling. mBio 0, e03654-03624 (2025).

      (8) D. Shen et al., Lipid storage disorders block lysosomal trafficking by inhibiting a TRP channel and lysosomal calcium release. Nat Commun 3, 731 (2012).

      (9) L. C. Davis, A. J. Morgan, A. Galione, NAADP-regulated two-pore channels drive phagocytosis through endo-lysosomal Ca(2+) nanodomains, calcineurin and dynamin. EMBO J 39, e104058 (2020).

    1. eLife Assessment

      This paper reports new data on the structure of the human CTF18-RFC clamp loader complex bound to the PCNA clamp. The new and convincing data complement previous reports of CTF-RFC-PCNA structures and as such, represents an important contribution.

    2. Reviewer #1 (Public review):

      Summary:

      The authors report the structure of the human CTF18-RFC complex bound to PCNA. Similar structures (and more) have been reported by the O'Donnell and Li labs. This study should add to our understanding of CTF18-RFC in DNA replication and clamp loaders in general. However, there are numerous major issues that I recommend the authors fix.

      Strengths:

      The structures reported are strong and useful for comparison with other clamp loader structures that have been reported lately.

    3. Reviewer #2 (Public review):

      Summary

      Briola and co-authors have performed a structural analysis of the human CTF18 clamp loader bound to PCNA. The authors purified the complexes and formed a complex in solution. They used cryo-EM to determine the structure to high resolution. The complex assumed an auto-inhibited conformation, where DNA binding is blocked, which is of regulatory importance and suggests that additional factors could be required to support PCNA loading on DNA. The authors carefully analysed the structure and compared it to RFC and related structures.

      Strength & Weakness

      Their overall analysis is of high quality, and they identified, among other things, a human-specific beta-hairpin in Ctf18 that flexibly tethers Ctf18 to Rfc2-5. Indeed, deletion of the beta-hairpin resulted in reduced complex stability and a reduced rate in the primer extension assay with Pol ε. Moreover, the authors identify that the Ctf18 ATP-binding domain assumes a more flexible organisation.

      The data are discussed accurately and relevantly, which provides an important framework for rationalising the results.

      All in all, this is a high-quality manuscript that identifies a key intermediate in CTF18-dependent clamp loading.

    4. Reviewer #3 (Public review):

      Summary:

      CTF18-RFC is an alternative eukaryotic PCNA sliding clamp loader which is thought to specialize in loading PCNA on the leading strand. Eukaryotic clamp loaders (RFC complexes) have an interchangeable large subunit which is responsible for their specialized functions. The authors show that the CTF18 large subunit has several features responsible for its weaker PCNA loading activity, and that the resulting weakened stability of the complex is compensated by a novel beta hairpin backside hook. The authors show this hook is required for the optimal stability and activity of the complex.

      Relevance:

      The structural findings are important for understanding RFC enzymology and novel ways that the widespread class of AAA ATPases can be adapted to specialized functions. A better understanding of CTF18-RFC function will also provide clarity into aspects of DNA replication, cohesion establishment and the DNA damage response.

      Strengths:

      The cryo-EM structures are of high quality enabling accurate modelling of the complex and providing a strong basis for analyzing differences and similarities with other RFC complexes. They use complementary pre-steady state FRET and polymerase primer extension assays to investigate the role of a unique structural element in CTF18.

      Weaknesses:

      The manuscript would have benefited from a more detailed biochemical analysis using mutagenesis and assays to tease apart the functional relevance of the many differences with the canonical RFC complex.

      Overall appraisal:

      Overall, the work presented here is solid and important. The data is sufficient to support the stated conclusions.

    5. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews: 

      Reviewer #1 (Public review): 

      Summary: 

      The authors report the structure of the human CTF18-RFC complex bound to PCNA. Similar structures (and more) have been reported by the O'Donnell and Li labs. This study should add to our understanding of CTF18-RFC in DNA replication and clamp loaders in general. However, there are numerous major issues that I recommend the authors fix. 

      Strengths: 

      The structures reported are strong and useful for comparison with other clamp loader structures that have been reported lately. 

      Comments on revisions: 

      The revised manuscript is greatly improved. The comparison with hRFC and the addition of direct PCNA loading data from the Hedglin group are particular highlights. I think this is a strong addition to the literature.

      We thank the reviewer for their positive comments.  

      I only have minor comments on the revised manuscript. 

      (1) The clamp loading kinetic data in Figure 6 would be more easily interpreted if the three graphs all had the same x axes, and if addition of RFC was t=0 rather than t=60 sec.

      We now analyze and plot EFRET as a function of time after complex addition, effectively setting the loader addition to t = 0 for each trace (Figure 6 and Figs S10-14 in the new manuscript). Baseline (Ymin) and plateau (Ymax) EFRET values were obtained by averaging the stable signal regions immediately before and after clamp-loader addition, respectively. Traces are normalized to their own dynamic range before fitting.
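      The re-zeroing and normalization described in this response can be sketched as follows. This is a minimal Python illustration, not the authors' analysis code: the function name `normalize_trace`, its arguments, and the window boundaries are hypothetical. Only the steps themselves (shift time so that loader addition is t = 0, average stable regions to obtain Ymin and Ymax, scale each trace to its own dynamic range) are taken from the text.

```python
from statistics import mean


def normalize_trace(times, efret, t_add, baseline_window, plateau_window):
    """Return (time, normalized EFRET) pairs for the post-addition part of a trace.

    times, efret     : raw time points (s) and EFRET values of one trace
    t_add            : time at which the clamp loader was added (s)
    baseline_window  : (start, end) in shifted time, stable region before addition
    plateau_window   : (start, end) in shifted time, stable region after addition
    All names and windows are illustrative assumptions.
    """
    # Shift the time axis so that loader addition corresponds to t = 0.
    shifted = [t - t_add for t in times]

    # Ymin and Ymax: mean signal within the chosen stable windows.
    y_min = mean(y for t, y in zip(shifted, efret)
                 if baseline_window[0] <= t <= baseline_window[1])
    y_max = mean(y for t, y in zip(shifted, efret)
                 if plateau_window[0] <= t <= plateau_window[1])

    span = y_max - y_min
    if span == 0:
        raise ValueError("trace has no dynamic range")

    # Keep only points at or after addition, scaled to the trace's own range.
    return [(t, (y - y_min) / span) for t, y in zip(shifted, efret) if t >= 0]
```

      Scaling each trace to its own range puts all complexes on a common 0 to 1 progress axis, which is what makes progress times comparable across traces; it also removes any information about differences in absolute plateau height, so plateaus would need to be compared on the unnormalized data.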

      (2) The authors' statement that "CTF18-RFC displayed a slightly faster rate than RFC" seems to me a bit misleading, even though this is technically correct. The two loaders have indistinguishable rate constants for the fast phase, and RFC is a bit slower than CTF18-RFC in the slow phase. However, the data also show that RFC is overall more efficient than CTF18-RFC at loading PCNA because much more flux goes through the fast phase (relative amplitudes 0.73 vs 0.36). Because the slow phase represents such a reduced fraction of loading events, the slight reduction in rate constant for the slow phase doesn't impact RFC's overall loading. And because the majority of loading events are in the fast phase, RFC has a faster halftime than CTF18-RFC. (Is it known what the different phases correspond to? If it is known, it might be interesting to discuss.)

      We removed the quoted statement. We avoid comparing amplitude partitions (A₁/A_T) for CTF18-RFC because (i) a substantial fraction of the reaction occurs within the <7 s dead time, and (ii) single- vs double-exponential identifiability differs across complexes. Instead, we report model-minimal progress times: RFC t<sub>0.5</sub> ≤ 7 s (faster onset), CTF18-RFC ~ 8 s, CTF18<sup>Δ165–194</sup>-RFC ~ 12 s; completion (t<sub>0.95</sub>): RFC ≈ 77 s, CTF18-RFC ≈ 77 s, mutant ≈ 145 s. This shows RFC has the steeper onset, while CTF18-RFC catches up in completion, and the mutant is slower overall. We briefly note that RFC’s phases have been assigned in prior stopped-flow work and are consistent with a rapid entry step and a slower repositioning/complex release phase; we do not assign phases for CTF18-RFC here and instead rely on model-minimal timing comparisons to avoid over-interpretation. 

      (3) AAA+ is an acronym for "ATPases Associated with diverse cellular Activities" rather than "Adenosine Triphosphatase Associated". 

      Corrected to ATPases Associated with diverse cellular Activities (AAA+).

      Reviewer #2 (Public review): 

      Summary 

      Briola and co-authors have performed a structural analysis of the human CTF18 clamp loader bound to PCNA. The authors purified the complexes and formed a complex in solution. They used cryo-EM to determine the structure to high resolution. The complex assumed an auto-inhibited conformation, where DNA binding is blocked, which is of regulatory importance and suggests that additional factors could be required to support PCNA loading on DNA. The authors carefully analysed the structure and compared it to RFC and related structures. 

      Strength & Weakness 

      Their overall analysis is of high quality, and they identified, among other things, a human-specific beta-hairpin in Ctf18 that flexibly tethers Ctf18 to Rfc2-5. Indeed, deletion of the beta-hairpin resulted in reduced complex stability and a reduction in a primer extension assay with Pol ε. Moreover, the authors identify that the Ctf18 ATP-binding domain assumes a more flexible organisation. 

      The data are discussed accurately and relevantly, which provides an important framework for rationalising the results. 

      All in all, this is a high-quality manuscript that identifies a key intermediate in CTF18-dependent clamp loading. 

      Comments on revisions: 

      The authors have done a nice job with the revision. 

      We thank the reviewer for their very positive comments.

      Reviewer #3 (Public review): 

      Summary: 

      CTF18-RFC is an alternative eukaryotic PCNA sliding clamp loader which is thought to specialize in loading PCNA on the leading strand. Eukaryotic clamp loaders (RFC complexes) have an interchangeable large subunit which is responsible for their specialized functions. The authors show that the CTF18 large subunit has several features responsible for its weaker PCNA loading activity, and that the resulting weakened stability of the complex is compensated by a novel beta hairpin backside hook. The authors show this hook is required for the optimal stability and activity of the complex. 

      Relevance: 

      The structural findings are important for understanding RFC enzymology and novel ways that the widespread class of AAA ATPases can be adapted to specialized functions. A better understanding of CTF18-RFC function will also provide clarity into aspects of DNA replication, cohesion establishment and the DNA damage response. 

      Strengths: 

      The cryo-EM structures are of high quality enabling accurate modelling of the complex and providing a strong basis for analyzing differences and similarities with other RFC complexes. 

      Weaknesses: 

      The manuscript would have benefited from a more detailed biochemical analysis using mutagenesis and assays to tease apart the differences with the canonical RFC complex. Analysis of the FRET assay could be improved. 

      Overall appraisal: 

      Overall, the work presented here is solid and important. The data is mostly sufficient to support the stated conclusions.

      We thank the reviewer for their mainly positive assessment. Following the reviewer's suggestion, we have re-analysed the FRET assay data and amended the manuscript accordingly.

      Comments on revisions: 

      While the authors addressed my previous specific concerns, they have now added a new experiment which raises new concerns. 

      The FRET clamp loading experiments (Fig. 6) appear to be overfitted so that the fitted values are unlikely to be robust and it is difficult to know what they mean, and this is not explained in this manuscript. Specifically, the contribution of two exponentials is floated in each experiment. By eye, CTF18-RFC looks much slower than RFC1-RFC (as also shown previously in the literature) but the kinetic constants and text suggest it is faster. This is because the contribution of the fast exponential is substantially decreased, and the rate constants then compensate for this. There is a similar change in contribution of the slow and fast rates between WT CTF18 and the variant (where the data curves look the same) and this has been balanced out by a change in the rate constants, which is then interpreted as a defect. I doubt the data are strong enough to confidently fit all these co-dependent parameters, especially for CTF18, where a fast initial phase is not visible. I would recommend either removing this figure or doing a more careful and thorough analysis. 

      We appreciate the reviewer’s concern regarding potential overfitting of the kinetic data in Figure 6. To address this, we performed a model-minimal re-analysis designed specifically to avoid parameter covariance and over-interpretation (Figure 6 and Figs S11-14 in the new manuscript). Only data recorded after the instrument’s <7 s dead time were included in the fits, thereby excluding the partially obscured early region of the reaction. For each clamp loader complex, we selected the minimal kinetic model that produced residuals randomly distributed about zero. This approach yielded a single-exponential fit for CTF18-RFC, whereas RFC and CTF18<sup>Δ165–194</sup>-RFC required double-exponential fits; single-exponential models for the latter two complexes left structured residuals, clearly indicating the presence of an additional kinetic phase.

      Rather than relying on co-dependent amplitude and rate parameters, we quantified the reactions by reporting progress times (t<sub>0.5</sub>, t<sub>0.90</sub>, t<sub>0.95</sub>), which provide a model-independent measure of reaction speed. This directly addresses the reviewer’s concern and allows a fair comparison of the relative kinetics among the complexes.
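      To make the notion of a progress time concrete, the sketch below computes t<sub>f</sub>, the time at which a normalized reaction reaches fraction f of its total change, for a single- or multi-exponential progress curve. It is a hedged illustration only: the functions `progress` and `progress_time`, the assumption that amplitudes sum to 1, and any rate constants passed in are hypothetical and are not the fitted values or the code used for Figure 6.

```python
import math


def progress(t, amplitudes, rates):
    """Normalized progress F(t) = 1 - sum_i a_i * exp(-k_i * t), with sum(a_i) = 1."""
    return 1.0 - sum(a * math.exp(-k * t) for a, k in zip(amplitudes, rates))


def progress_time(fraction, amplitudes, rates, t_max=1e4, tol=1e-9):
    """Time at which F(t) first reaches `fraction`, found by bisection.

    F(t) is monotonically increasing for positive amplitudes and rates,
    so a simple bracketing search on [0, t_max] is sufficient.
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    if progress(t_max, amplitudes, rates) < fraction:
        raise ValueError("fraction not reached within t_max")

    lo, hi = 0.0, t_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if progress(mid, amplitudes, rates) < fraction:
            lo = mid  # target lies later in time
        else:
            hi = mid  # target lies at or before mid
    return 0.5 * (lo + hi)
```

      For a single exponential this reduces to t<sub>f</sub> = -ln(1 - f)/k, so t<sub>0.5</sub> = ln 2/k. Because the result depends only on the shape of the curve and not on how it is split into amplitudes and rate constants, two fits with different, mutually compensating parameters but near-identical curves give near-identical progress times. A value that falls inside the instrument dead time can only be reported as a bound.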

      From this analysis, RFC exhibited the fastest onset (t<sub>0.5</sub> ≤ 7 s; lower bound), while CTF18-RFC and CTF18<sup>Δ165–194</sup>-RFC showed progressively slower half-times of approximately 8 s and 12 s, respectively. Completion times further emphasized these differences: both RFC and CTF18-RFC reached 95 % completion at ~77 s, whereas the mutant required ~145 s. Despite these kinetic distinctions, CTF18-RFC and its β-hairpin deletion mutant achieved similar EFRET plateaus, indicating that the mutation slows reaction progression but does not reduce the overall extent of PCNA loading.

      Finally, we emphasize that our interpretation is deliberately conservative. We do not assign distinct kinetic phases to CTF18-RFC, as their molecular basis remains unresolved. RFC’s phases have been characterized in prior stopped-flow studies, but CTF18-RFC likely follows a distinct or simplified pathway. Our conclusions are thus limited to what the data unambiguously support: deletion of the Ctf18 β-hairpin decreases the rate—but not the extent—of PCNA loading, consistent with the reduced stimulation of Pol ε primer extension observed under single-turnover conditions.

    1. eLife Assessment

      This important study reports that EEG recordings of the earliest stage of information processing in human visual cortex can be used to predict subsequent choice responses. The findings provide novel, convincing evidence for integrative processing in low-level sensory cortices at the level of scalp-recorded potentials, with the exact nature of the neural signals at the single cell level to be determined. The paper is likely to be of interest to neuroscientists interested in the contribution of early sensory signals to decision making.

    2. Reviewer #1 (Public review):

      General assessment of the work

      In this manuscript, Mohr and Kelly show that the C1 component of the human VEP is correlated with binary choices in a contrast discrimination task, even when the stimulus is kept constant and confounding variables are considered in the analysis. They interpret this as evidence for the role V1 plays during perceptual decision formation. Choice-related signals in single sensory cells are enlightening because they speak to the spatial (and temporal) scale of the brain computations underlying perceptual decision making. However, similar signals in aggregate measures of neural activity offer a less direct window and thus less insight into these computations. The authors do a good job justifying their focus on the C1 component and illustrating how it may behave under different simulated scenarios. The results are interesting, although it is difficult to specify which reasonable hypothesis is exactly ruled out by these results. One interpretation is that V1 activity directly guides perceptual decisions in this task. Alternatively, higher-level areas may do this, provided that their activity largely reflects their V1-inputs. This certainly seems possible in a simple task like this.

      Summary of substantive concerns

      I have no substantive concerns about the revised version of the paper.

    3. Reviewer #2 (Public review):

      Summary:

      Mohr and Kelly report a high-density EEG study in healthy human volunteers in which they test whether correlations between neural activity in primary visual cortex and choice behavior can be measured non-invasively. Participants performed a contrast discrimination task on large arrays of Gabor gratings presented in the upper left and lower right quadrants of the visual field. The results indicate that single-trial amplitudes of C1, the earliest cortical component of the visual evoked potential in humans, predict forced-choice behavior over and beyond other behavioral and electrophysiological choice-related signals. These results constitute an important advance for our understanding of the nature and flexibility of early visual processing.

      Strengths:

      The findings suggest a previously unsuspected role for aggregate early visual cortex activity in shaping behavioral choices.

      The authors extend well-established methods for assessing covariation between neural signals and behavioral output to non-invasive EEG recordings.

      The effects of initial afferent information in primary visual cortex on choice behavior are carefully assessed by accounting for a wide range of potential behavioral and electrophysiological confounds.

      Caveats and limitations are transparently addressed and discussed.

      Weaknesses:

      Due to the inherent limitations of scalp-recorded visual evoked potentials, the results cannot be directly compared to invasive recordings in animal models.

    4. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      General assessment of the work:

      In this manuscript, Mohr and Kelly show that the C1 component of the human VEP is correlated with binary choices in a contrast discrimination task, even when the stimulus is kept constant and confounding variables are considered in the analysis. They interpret this as evidence for the role V1 plays during perceptual decision formation. Choice-related signals in single sensory cells are enlightening because they speak to the spatial (and temporal) scale of the brain computations underlying perceptual decision-making. However, similar signals in aggregate measures of neural activity offer a less direct window and thus less insight into these computations. For example, although I am not a VEP specialist, it seems doubtful that the measurements are exclusively picking up (an unbiased selection of) V1 spikes. Moreover, although this is not widely known, there is in fact a long history to this line of work. In 1972, Campbell and Kulikowski ("The Visual Evoked Potential as a function of contrast of a grating pattern" - Journal of Physiology) already showed a similar effect in a contrast detection task (this finding inspired the original Choice Probability analyses in the monkey physiology studies conducted in the early 1990's). Finally, it is not clear to me that there is an interesting alternative hypothesis that is somehow ruled out by these results. Should we really consider that simple visual signals such as spatial contrast are *not* mediated by V1? This seems to fly in the face of well-established anatomy and function of visual circuits. Or should we be open to the idea that VEP measurements are almost completely divorced from task-relevant neural signals? Why would this be an interesting technique then? 
In sum, while this work reports results in line with several single-cell and VEP studies and perhaps is technically superior in its domain, I find it hard to see how these findings would meaningfully impact our thinking about the neural and computational basis of spatial contrast discrimination.

      We agree that single cell measurements allow for a spatially more detailed analysis, but they are not feasible in humans. Assuming we value insights into the relationship between neural activity and decision making in the human as well as non-human brain, we are restricted to non-invasive measurements such as EEG, which inevitably showcase the neural underpinnings of decision making at a coarser level of analysis. This was the challenge we met with our paradigm design. For example, we chose contrast as the task-relevant stimulus feature in this study because monotonic contrast response functions exist for sensory neurons throughout the visual system, and the aggregated measures that we could attain with EEG would reflect that contrast-sensitivity and hence provide a window onto the encoding of the main decision-relevant quantity. We were specifically interested in initial afferent, contrast-dependent V1 activity reflected in the C1 component (80-90 ms). As we point out in the Introduction, the C1 is unusual among EEG signals in the extent to which it is dominated by a single visual area, V1 (Jeffreys & Axford, 1972; Clark et al., 1994; Di Russo et al., 2002; Ales et al., 2010; Mohr et al., 2024), and even if other downstream areas also make a minor contribution in the C1 time period, it still represents a very low-level sensory response early in the sensory analysis pipeline, appropriate for addressing our primary question of whether such a low-level signal is used in the formation of perceptual decisions. The alternative hypothesis, that early responses are passed over in decision readout, relates to a fundamental debate about whether early sensory responses are separated from cognition. 
The possibility that late, but not early, representations are correlated with choices does not imply that the later sensory representations are divorced from the earlier ones, only that there is a noise component that is not shared between the two, such as that produced by the ensuing computations that generate the later representations. Instead, a lack of choice probability in early representations would imply that decision readout is selective in where it sources sensory evidence from, with some possible reasons being to maintain high quality standards for sensory evidence or to impose a layer of separation between cognition and sensation.

      As the reviewer points out, the animal literature is highly mixed on the topic of choice probability in V1. Even for orientation discrimination tasks where V1 is ostensibly highly suited given the existence of orientation columns in V1, and even when measurements are taken from V1 neurons with good neurometric performance and/or aggregated across a V1 population (Jasper et al 2019), some studies have reported little to no V1 choice probability. If our alternative hypothesis of no EEG-indexed V1 choice probability flies in the face of well-established anatomy and function of visual circuits, then so also do these empirical findings in the animal neurophysiology literature. 

      Although there are important aspects of choice probability that are accessible in single cell studies but not in EEG (e.g. noise correlations, details of circuit physiology), our EEG measurements tap into the same phenomenon, just at a different level of analysis, i.e. the neural population level. At this level, we have been able to address whether the full body of sensory responses at a particular stage of visual analysis is systematically related to perceptual decision outcomes. Very similar questions are in fact sometimes addressed in the animal neurophysiology literature; for example, Kang and Maunsell (2020) aggregated single-cell choice probability measurements within visual areas to investigate whether choice probability strength at the level of an entire visual area was sensitive to task demands. The global vantage point of EEG comes with the additional benefit of picking up signatures of other potentially mediating processes such as attention and being able to control for them in our analysis. Our human study thus provides a valuable complementary viewpoint alongside animal neurophysiology work in this area.

      Summary of substantive concerns:

      (1) The study of choice probability in V1 cells is more extensive than portrayed in the paper's introduction. In recent years, choice-related activity in V1 has also been studied by Nienborg & Cumming (2014), Goris et al (2017), Jasper et al (2019), Lange et al (2023), and Boundy-Singer et al (2025). These studies paint a complex picture (a mixture of positive, absent, and negative results), but should be mentioned in the paper's introduction.

      We thank the reviewer for highlighting these papers bearing on choice-related activity in V1, only two of which we had cited. The three additional studies do indeed lend further support to our description of the complex picture around V1-CP effects in the literature and we have now included them.

      (2) The very first study to conduct an analysis of stimulus-conditioned neural activity during a perceptual decision-making task was, in fact, a VEP study: Campbell and Kulikowski (1972). This study never gained the fame it perhaps deserves. But it would be appropriate to weave it into the introduction and motivation of this paper.

      We are aware of this paper, and indeed we ourselves have shown steady-state VEP (SSVEP) correlations with timing and selection of decision reports (O'Connell et al 2012; Grogan et al 2023), but SSVEPs do not provide an index of initial afferent V1 activity in the way that the C1 of the transient VEP does. SSVEPs are evoked by a rapid sequence of stimulus onsets, so that activity cannot be attributed to a particular stimulus onset nor its bottom-up latency resolved, and, being a response to an ongoing stimulus, it combines top-down and bottom-up influences from striate and extrastriate areas (Di Russo et al 2007). Indeed, in Campbell and Kulikowski (1972) the SSVEP was almost entirely eliminated when the stimulus was undetected. This is in keeping with robust modulations of the SSVEP by spatial attention (Muller and Hillyard 2000). Cognitive influences of this magnitude are never observed in the C1, and in fact are often not observed at all even when later VEP components show robust modulations (Luck et al 2000), which motivated a recent meta-analysis to address the issue (Qin et al 2022). This highlights the important distinction between the earliest transient VEP activity reflecting mainly the initial afferent response in V1, and steady-state sensory activity reflecting a mix of bottom-up and top-down influences across visual cortex. Because of the importance of this distinction, we have added a reference to the above SSVEP papers to the 3rd paragraph of the introduction along with a statement about the distinction.

      (3) What are interesting alternative hypotheses to be considered here? I don't understand the (somewhat implicit) suggestion here that contrast representations late in the system can somehow be divorced from early representations. If they were, they would not be correlated with stimulus contrast.

      This same conundrum applies to single-cell studies of choice probability. Do studies showing choice probability in V4 but not V1, for example, demonstrate that V4 is divorced from V1? In such studies, measurements are typically taken from large representative samples of neurons from both areas with good neurometric performance in both cases, and the task often (though not always) involves a target stimulus feature that is encoded in V1, such as orientation. Why then should V4 but not V1 show choice probability when we know the vast majority of input to the visual cortex passes through V1? It must be that feature representation and choice formation are different things, with one not implying the other. This is true for an EEG study as much as it is for a single-cell study.

      The alternative hypothesis in our study is that the early sensory responses indexed by the C1 are not directly used in the formation of the perceptual decision at hand. As outlined in our comments above, this does not imply that those early responses are divorced from later responses. Of course, both are correlated with stimulus contrast and so would correlate with each other across changing contrast but this does not necessitate that their noise is correlated when contrast is held constant because new instantiations of noise can be generated by the computations performed at each stage of visual processing. Thus, the interesting alternative hypothesis is that information contained in the sensory representation generated during initial afferent V1 activity is not used directly to form decisions, and instead, decisions are read out from the outputs of computations performed further downstream. Such an outcome, if it had arisen in our data, would have been consistent with a separation between cognition and early visual processing. Instead, our results suggest a certain level of cognitive interfacing at the lowest and earliest cortical levels of visual processing. We have now added text to the Introduction to highlight the distinction between sensory representation and decision readout in order to make the alternative hypothesis clearer.

      (4) I find the arguments about the timing of the VEP signals somewhat complex and not very compelling, to be honest. It might help if you added a simulation of a process model that illustrated the temporal flow of the neural computations involved in the task. When are sensory signals manifested in V1 activity informing the decision-making process, in your view? And how is your measure of neural activity related to this latent variable? Can you show in a simulation that the combination of this process and linking hypothesis gives rise to inverted U-shaped relationships, as is the case for your data?

      We thank the reviewer for this suggestion of a simulation, which we carried out using the Matlab code. We have also included new Figure 1-Figure Supplement 1 in the revised manuscript.

      In our view, sensory signals in V1 are informing the decision-making process in this task from at least as early as the initial afferent response. The main point about C1 latency in relation to the response-time contingency of the choice probability effect is that the more time that elapses without a decision made (and therefore the more additional sensory processing that contributes to the decision), the more diluted is the contribution of the C1 to the decision by contributions from later representations, and thus choice probability reduces. Likewise, when response times are too quick for C1 evidence to contribute, choice probability is also absent, hence the inverted-U-shaped curve. Moreover, if the C1-choice correlation is mediated by a top-down factor such as attention rather than readout, the inverted-U-shaped curve is not expected because in such a case the relative timing of the C1 and choice commitment would not be relevant.

      Reviewer #2 (Public review):

      Summary:

      Mohr and Kelly report a high-density EEG study in healthy human volunteers in which they test whether correlations between neural activity in the primary visual cortex and choice behavior can be measured non-invasively. Participants performed a contrast discrimination task on large arrays of Gabor gratings presented in the upper left and lower right quadrants of the visual field. The results indicate that single-trial amplitudes of C1, the earliest cortical component of the visual evoked potential in humans, predict forced-choice behavior over and beyond other behavioral and electrophysiological choice-related signals. These results constitute an important advance for our understanding of the nature and flexibility of early visual processing.

      Strengths:

      (1) The findings suggest a previously unsuspected role for aggregate early visual cortex activity in shaping behavioral choices.

      (2) The authors extend well-established methods for assessing covariation between neural signals and behavioral output to non-invasive EEG recordings.

      (3) The effects of initial afferent information in the primary visual cortex on choice behavior are carefully assessed by accounting for a wide range of potential behavioral and electrophysiological confounds.

      (4) Caveats and limitations are transparently addressed and discussed.

      We would like to thank the reviewer for these positive remarks.

      Weaknesses:

      (1) It is not clear whether integration of contrast information across relatively large arrays is a good test case for decision-related information in C1. The authors raise this issue in the Discussion, and I agree that it is all the more striking that they do find C1 choice probability. Nevertheless, I think the choice of task and stimuli should be explained in more detail.

      We thank the reviewer for raising this point about the large stimulus arrays. As we said in our Discussion, it would seem that aggregation across a large stimulus region would be better suited to a downstream visual area with larger receptive fields, yet our setting of a strict deadline would put the emphasis back on earlier sensory representations. We now elaborate on this matter in the discussion, to say that although the small receptive fields and short, slow horizontal connections in V1 mean that the aggregation necessary for performing the task is unlikely to happen within V1 during the C1 timeframe, the aggregation would be readily achieved simply by convergence of the outputs of all relevant V1 neurons for a given stimulus array on the same decision process. In this sense, the design of our paradigm was such that the globally-measured C1 component on the scalp reflected the same aggregated evidence input as the summed V1 readout that we suppose would be entering the decision process.  

      We have also added further rationale in the Methods section on the practical benefits of the stimulus design, as the reviewer anticipates in their subsequent point, of yielding robust C1 signals. This concern was paramount in the design of this study because we expected the C1 difference metric that was of interest to be very small. We also needed a robust C1 to be measured in both the upper and lower visual field in as many individuals as possible and, in our experience, this is true less often when using smaller stimuli, even with a pre-mapping procedure.

      It also helped to homogenize C1 topography across individuals and ensure that topographies from the upper and lower visual field had sufficient overlap that there were electrodes with strong loading from both topographies where the C1 difference as a function of which array was brighter would be maximal.

      We have updated the methods section to provide these rationales while we describe the stimulus design.

      (2) In a similar vein, while C1 has canonical topographical properties at the grand-average level, these may differ substantially depending on individual anatomy (which the authors did not assess). This means that task-relevant information will be represented to different degrees in individuals' single-trial data. My guess is that this confound was mitigated precisely by choosing relatively extended stimulus arrays. But given the authors' impressive track record on C1 mapping and modeling, I was surprised that the underlying rationale is only roughly outlined. For example, given the topographies shown and the electrode selection procedure employed, I assume that the differences between upper and lower targets are mainly driven by stimulus arms on the main diagonal. Did the authors run pilot experiments with more restricted stimulus arrays? I do not mean to imply that such additional information needs to be detailed in the main article, but it would be worth mentioning.

      We thank the reviewer for their thoughtful consideration of this issue about individual variability in C1 retinotopy. Indeed, as the reviewer anticipated we expected the large stimulus coverage to mitigate this issue and we think that our response to the point above and the changes we made to the manuscript in response address this point also. Although we did not show this in the manuscript, we did in fact find that C1 topography was much more similar across individuals than it has been in previous C1 experiments we have carried out with smaller stimuli.

      However, we acknowledge the reviewer’s point that the signal measured at a specific electrode likely has a variable loading strength from the various gratings in the stimulus array and that the gratings of maximal loading may indeed vary from subject to subject. Such inter-subject variability cannot confound the choice probability effects because the latter are measured within-subject. Nevertheless, it could be a source of noise. We believe the impact of this is unlikely to be substantial for the following reasons:

      i) We designed the spatial spread of contrasts in such a way as to encourage participants to aggregate across the full array. In essence, to match the property of the C1 as an aggregate measure of V1 activity, we designed a task that involved aggregating across stimulus elements. Therefore, the decision weighting applied to any particular grating should be representative of the weighting applied to all gratings and, as such, the specific gratings that contribute most to the C1 signal for a particular participant should be relatively inconsequential.

      ii) By avoiding the horizontal and vertical meridians we avoided the regions of space where the shifts in C1 topography are largest.

      (3) Also, the stimulus arrangement disregards known differences in conduction velocity between the upper and lower visual fields. While no such differences are evident from the maximal-electrode averages shown in Figure 1B, it is difficult to assess this issue without single-stimulus VEPs and/or a dedicated latency analysis. The authors touch upon this issue when discussing potential pre-C1 signals emanating from the magnocellular pathway.

      Indeed, there are important differences in V1 properties between the upper and lower visual fields, visual acuity being another example in addition to conduction velocity as the reviewer points out. However, these differences appeared to be quite minimal in this case (Figure 1B does in fact include a single-stimulus VEP – the “1-stim” entry in the legend). Perhaps this is also due to the large stimulus array which may include a range of conduction velocities within it and thereby blur overall differences between the upper and lower visual field. The variability of contrast within each array was also quite high (+/-20% from the midpoint), which would have further increased within-array conduction velocity variability and blurred differences between arrays.

      Our staircasing procedure may have also helped in this regard to some extent as it included a bias parameter between the arrays to account for any behavioural response biases. Although the small contrast changes it usually incurred are likely much too small to change conduction velocities, it corrected for any effect on behaviour they may have.

      (4) I suspect that most of these issues are at least partly related to a lack of clarity regarding levels of description: the authors often refer to 'information' contained in C1 or, apparently interchangeably, to 'visual representations' before, during, or following C1. However, if I understand correctly, the signal predicting (or predicted by) behavioral choice is much cruder than what an RSA-primed readership may expect, and also cruder than the other choice-predictive signals entered as control variables: namely, a univariate difference score on single-trial data integrated over a 10 ms window determined on the basis of grand-averaged data. I think it is worth clarifying and emphasizing the nature of this signal as the difference of aggregate contrast responses that *can* only be read out at higher levels of the visual system due to the limited extent of horizontal connectivity in V1. I do not think that this diminishes the importance of the findings - if anything, it makes them more remarkable.

      It is true that a univariate measure may stick out in a field increasingly favouring multivariate analyses with the spread of machine learning, and so we have added a short qualifier in the methods section where we describe the C1 measurement to explicitly state that it is a scalar variable. What we have done in using this univariate measure is leverage the rich prior knowledge about V1 anatomy and neurophysiology, rather than trust in data-driven classifiers; interestingly, we found that such a classifier trained on all electrodes discriminates choices less well than our informed univariate measure during the C1 time-frame.

      We also thank the reviewer for raising an interesting point about the nature of aggregation and readout in the context of our stimulus. We agree that it is not feasible that V1 activity would be aggregated locally in V1 across such large regions of space prior to being read out within the C1 time period. As we say above, the aggregation may instead be carried out through convergent transmission of the parallel, spatially-local V1 information to the decision process.

      (5) Arguably even more remarkable is the finding that C1 amplitudes themselves appear to be influenced by choice history. The authors address this issue in the Discussion; however, I'm afraid I could not follow their argument regarding preparatory (and differential?) weighting of read-outs across the visual hierarchy. I believe this point is worth developing further, as it bears on the issue of whether C1 modulations are present and ecologically relevant when looking (before and) beyond stimulus-locked averages.

      We thank the reviewer for their positive appraisal of this additional finding, which we also found remarkable. We agree that our description of our interpretation was too brief and lacked clarity. We have reworded it and expressed it in terms of the speed-accuracy trade-off, with the new explanation given below. However, it is important to remember that this account is speculative and serves only to explain the response-time contingency of the bias. That the bias was present and constitutes a modulation of the C1 does not rest on this argument:

      […] “to explain the RT contingency for the C1 bias, we speculate that the speed-accuracy trade-off could fluctuate from trial to trial and that the corresponding decision bound fluctuations (Heitz and Schall 2012) could be implemented by pre-determining decision weights across visual areas. For example, to achieve faster decisions, the sensory evidence requirement could be reduced by placing greater emphasis on initial afferent V1 evidence. In such a case, the RT contingency of the above choice history bias could be explained if the C1 bias is exerted in proportion with the planned emphasis of C1 evidence for the upcoming decision.”

      Recommendations to the Authors:

      Reviewer #2 (Recommendations for the authors):

      (1) As someone whose first language is not English, I am somewhat hesitant to bring this up, but I found the use of 'readout' as both noun and verb somewhat confusing. I thought read-out was defined as 'that which is read out'.

      We agree that this dual use of the word readout may cause confusion. To avoid this, we have edited the manuscript to replace verbal forms of the word “readout” with “read out”.

      (2) I found it difficult to follow the reasoning for why intermediate RTs should be the ones most affected by C1-related information. Perhaps this could be described in more detail for the uninitiated reader.

      We appreciate that our reasoning for why intermediate RTs should be the ones most affected by C1-related information was difficult to follow. We have now added a simulation to showcase this rationale more clearly - see response to reviewer 1, and new figure supplement to figure 1. 

      (3) It would be interesting to compare the effect sizes observed here to those seen in single-cell studies and to discuss this comparison with regard to differences in the nature of EEG signals and single-cell firing rates.

      While we agree that such a comparison would be interesting if feasible, it would have to be for the same task settings, which have not been used in a single-cell study, and the very different nature and extent of noise between the two recording modalities would make such a comparison difficult to interpret, e.g. background noise in EEG from ongoing processes unrelated to the task.

      (4) Figure 1: It may be worth mentioning in the legend that only parts of the peripheral stimulus grid are shown for better visibility, as the Methods speak of 9 x 9 grids. Also, in panel B, it should be mentioned that waveshapes are calculated using individually selected maximal-difference electrodes.

      We thank the reviewer for spotting these. We have updated the caption for this figure to reflect these two observations.

      (5) Figure 4: The different shades of green may be difficult to distinguish when printed.

      Although this may be true, we chose shades of green that differ in luminance so they should still be distinguishable. Different colours may in fact be less distinguishable if they had the same luminance and the print was black-and-white. We chose different shades of the same colour to reflect the fact that we were plotting the same signals at different difficulty levels. In our opinion, this takes precedence since eLife is an online journal so the majority of readers will likely read it digitally.

      (6) Methods/Task: While the ITI of 780 ms is substantial, I was wondering why the authors decided against jittering this interval? It would be helpful to briefly discuss whether contrast adaptation for slow periodic stimulation may have affected the findings.

      We opted against jittering the ITI to avoid an additional source of inter-trial variability. While a fixed ITI may allow for adaptation effects, these would be approximately constant across trials and therefore less of a concern for our design. We have added text to the methods section to state this rationale.

      (7) Methods/Stimuli: The authors convincingly argue that focusing on single arms of the stimuli is an unlikely strategy, but did they ask for participants' strategies during debriefing?

      We are glad that the reviewer found our argument about whether or not participants may have focused on a single arm of the stimuli convincing. We did not ask participants about their strategies but even with such a debriefing, there would still remain a possibility that a participant may have used that strategy but was unaware of doing so. In any case, if participants were doing this it would have dampened the strength of our choice probability result.

      (8) Methods/Procedure, Difficulty Titration: Why did the authors opt for manually adapting the difficulty level in a separate session rather than constantly and automatically titrating difficulty?

      We did this because calculating choice probability requires a comparison of trials with different choice outcomes but the same stimulus so continuously staircasing difficulty level during the experiment would have created a confound. Although this could have been corrected for in our regression, this would have entailed greater noise that we could avoid by staircasing in advance.
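
      The comparison described above (same stimulus, different choice outcomes) can be illustrated with a minimal sketch. It assumes the ROC-area definition of choice probability that is common in single-cell studies; it is not the authors' regression-based analysis, and the function name and the single-trial values are hypothetical.

```python
def choice_probability(signal_choice_a, signal_choice_b):
    """ROC area: probability that a randomly drawn trial ending in choice A
    has a larger signal than a randomly drawn trial ending in choice B.
    Ties count as one half. Both lists must come from trials with the
    SAME stimulus, otherwise stimulus differences confound the estimate."""
    wins = 0.0
    for a in signal_choice_a:
        for b in signal_choice_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(signal_choice_a) * len(signal_choice_b))


# Hypothetical single-trial C1 difference scores at one fixed contrast level.
upper_chosen = [0.9, 1.4, 0.3, 1.1]
lower_chosen = [0.2, 1.0, -0.4, 0.5]

cp = choice_probability(upper_chosen, lower_chosen)
print(cp)  # 0.5 would mean the signal carries no choice-related information
```

      If difficulty were staircased continuously, the two lists would mix trials with different stimuli, which is the confound the response refers to.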

    1. eLife Assessment

      This important study links allelic expression imbalance with replication timing, suggesting a stochastic model for haploinsufficiency in dosage-sensitive disease. The integration of allele-specific RNA-seq and replication timing in clonal systems provides solid evidence for an association between asynchronous replication and allelic imbalance, although the scope and generality of some conclusions require more cautious interpretation. This study will interest epigeneticists and genome regulation researchers studying replication timing and monoallelic expression, as well as developmental biologists and human geneticists concerned with clonal heterogeneity, haploinsufficiency, and variable disease penetrance.

      [Editors' note: this paper was reviewed by Review Commons.]

    2. Reviewer #1 (Public review):

      Summary:

      The existence of VERT regions is well supported, but the number of regions called as ISCs may be inflated by permissive thresholds (e.g., AEI ≥ 0.8 or ≤ 0.2 in a single clone). This risks conflating transient stochastic differences with stable ISCs. Similarly, the claim of cell-type specificity is not convincingly demonstrated given the small sample size (n=4) and strong batch confounding between lymphoblastoid and cartilage progenitors. While syntenic VERT regions across mouse and human are intriguing, they complicate interpretation of strong clustering by cell type. Sampling depth may also have exaggerated allelic imbalance calls.

      The proposed role of ISCs in haploinsufficiency is conceptually interesting but remains speculative; developmental stochasticity and founder population size may play larger roles than replication timing. The claim that autosomal inactivation is mechanistically distinct from XCI, however, is reasonable and supported.

      Some conclusions should be more explicitly qualified as preliminary. Cell-type specificity and mitotic stability both require stronger evidence; the latter is inferred indirectly from clonal expansion rather than shown directly, and orthogonal experiments (e.g., allele-specific ChIP-seq, DNA methylation) would be required. Estimated genomic coverage of ISCs should also be re-evaluated, as single-clone observations may inflate counts.

      Replication is limited. Hierarchical clustering is confounded by batch and based on presence/absence calls that lack quantitative resolution. More robust approaches would include using magnitude of imbalance, annotating VERTs by genomic location, applying stricter thresholds for replication timing, and benchmarking AEI distributions against the X chromosome. These are realistic re-analyses requiring no new data and could be completed in ~1 month.

      Methods are generally well described and reproducible. Figures and text would benefit from improved clarity: axis labels are missing in places (e.g., Fig. 1c, Fig. 2g), legends should explain chromosome arm colors, and cluttered figures such as Fig. 1j could be re-visualized for interpretability. Gene set enrichment analysis should be restricted to avoid inflated significance from overly broad categories. A useful citation for XCI timing (pmid=39420003) could be added to strengthen background.

      Significance:

      Conceptually, this work introduces ISC-like phenomena in human and mouse progenitor lines, coupling allelic expression imbalance with replication timing. Technically, it combines allele-specific RNA-seq with Repli-seq in genotyped, clonal, single-cell-derived lines. Clinically, it suggests an alternative model for haploinsufficiency, relevant to dosage-sensitive diseases where stochastic transcriptional delays could shape penetrance.

      The study builds on prior work in allelic exclusion (e.g., HLA, olfactory receptors) and random monoallelic expression, generalizing these phenomena into ISC/VERT frameworks and proposing mitotic stability of allele choice. By extending beyond expression to replication timing, the authors suggest a broader paradigm for epigenetic regulation at autosomal loci.

      The paper will be of interest to epigeneticists studying XCI, allelic exclusion, and monoallelic expression; to developmental biologists examining replication timing and differentiation; and to clinicians concerned with dosage-sensitive and haploinsufficient disorders.

    3. Reviewer #2 (Public review):

      Summary:

      - This is a complicated research topic that touches on a few sub-fields of biology, and thus to make the paper more approachable I would recommend a careful edit of the text for clarity and precision of language.
      - Authors point out that this is a decades-old field; it would make sense to use terminology established within the field rather than inventing their own. Allelic imbalance has been referred to as AI, MAE (monoallelic expression), RMAE (random monoallelic expression) etc. The paper whose mouse data the authors make use of uses Asynchronous Stochastic Replication Timing (ASRT) instead of VERT to refer to the same phenomenon. Creating unnecessary jargon makes the paper more difficult to read and adds needless complexity to an already complex field.
      - Methods do not provide sufficient detail to fully evaluate or reproduce these experiments.
      - It is helpful to show representative loci as the authors do in Fig 1F and G and Fig 2, but these panels are very densely rendered and thus difficult to process visually - even the cartoon version (1D) is thick with overlapping lines. The point that allelic imbalance is enriched in VERTs would be enhanced if the authors could present the allelic ratio for all genes found in all VERTs, demonstrating how replication timing on either chromosome affects the allelic ratio.
      - The authors make the important point that VERTs are unlikely to be shared among different cell types and tissues (Fig 1i) but then find an enrichment for neuronal and immune genes in VERT regions identified in ACPs. It follows that these same genes are unlikely to be in such regions in the tissues where they are relevant. Some of the GO terms presented are too broad to suggest any biological significance to the result, even if there is statistical significance (for example, the top term for LCL clones 'Cytoplasm' is associated with 12,000 genes, and the second term for mouse clones 'Membrane' is associated with 10,000). It would be helpful to focus on GO terms lower in the GO hierarchy.
      - Figure 3 highlights the association of related gene clusters with VERTs but the VERTs are assigned based on variable replication timing in just 1 or 2 clones. This is an interesting observation, but to make the point that "VERT regions frequently coincide with gene clusters in the human genome" there needs to be a systematic assessment of replication timing at all gene clusters across all clones, and a statistical test for significance.
      - It is an interesting hypothesis that VERTs are conserved between species at syntenic loci. If such regions are really conserved, one would expect that replication timing at these sites would be consistently asynchronous. However, the data presented shows that in human clones these VERTs can be specific to an individual donor (as in 5A) or an individual clone (as in 5H).
      - Again, the finding that VERTs coincide with neurodevelopmental disease genes in immune and cartilage cells is at odds with the previous statements and data about the tissue specificity of VERTs. In order to support the claim that neurodevelopmental disease associated genes reside in asynchronously replicating regions, and are thus more prone to allelic imbalance, the authors would need to demonstrate this phenomenon in neuronal cells.

      Significance:

      The authors pair analysis of replication timing and allele-specific expression in clonal populations of primary human cells. They combine these data with previously published data on clones from transformed human cell lines. They identify a number of genomic regions that display asynchronous replication timing in at least one clone and correlate these regions with allele-specific expression of genes within them. They also observe that several interesting gene sets, including genes that are associated with human diseases, map to asynchronously replicating regions. This is a good experimental approach that builds on already published data demonstrating the connection between allelic imbalance and replication timing. However, the authors consistently lean on thin evidence (i.e. a single clone) within a modestly sized dataset (4 clones from 2 donors each) to propose a new model for haploinsufficiency in human disease. The consistent focus on limited elements in the data and perhaps an overreach in the interpretation makes it difficult to appreciate what is in fact a very good experiment.

    4. Author response:

      General Statements

      We thank the reviewers for their thoughtful and constructive comments, which will substantially improve our manuscript. In response, we will revise the text and figures throughout to address the points raised. Specifically, we will:

      i. Refine our definition of Inactivation/Stability Centers (I/SCs): We will limit this designation to loci where both Allelic Expression Imbalance (AEI) and Variable Epigenetic Replication Timing (VERT) are detected, either in the present study or in previously published work.

      ii. Expand methodological clarity: We will provide detailed descriptions of how VERT regions were identified, annotated, and quantified, including thresholds for allelic imbalance, replication timing variability, and sampling depth. We will also justify the ≥80% AEI cutoff, which is based on recent studies showing that modest allelic biases can have biological and clinical significance.

      iii. Enhanced benchmarking and validation: In addition to the analysis of X inactivation in female ACP cells, we will include comparisons between imprinted and non-imprinted regions to benchmark the magnitude of allelic replication timing imbalance, demonstrating that the magnitude of imbalance observed at imprinted loci is comparable to that at the non-imprinted VERT regions.

      iv. Address tissue specificity and sampling limitations: We will discuss the limited number of clones, tissues, and individuals analyzed, emphasizing that while our data identify robust AEI and VERT patterns, additional tissues and individuals will be required to capture the full diversity of I/SC regulation.

      v. Clarify biological relevance: We will expand our discussion to highlight the consistency of AEI findings across cell types, including examples of genes implicated in neurodevelopmental and neurodegenerative disorders, and we will clarify our model of how I/SC regulation may contribute to haploinsufficiency, variable expressivity, and incomplete penetrance in human disease.

      vi. Improved figures and supplemental data: We will update figure legends for clarity, add a new supplementary figure comparing imprinted and non-imprinted regions, and cross-reference all supplemental tables.

      We believe these revisions strengthen the manuscript conceptually and experimentally, and we thank the reviewers and editors for their valuable feedback.

      Description of the planned revisions

      Reviewer #1:

      The existence of VERT regions is well supported, but the number of regions called as ISCs may be inflated by permissive thresholds (e.g., AEI ≥ 0.8 or ≤ 0.2 in a single clone). This risks conflating transient stochastic differences with stable ISCs.

      We selected the >80% (or <20%) allelic imbalance threshold, along with the requirement of at least one biallelic clone, as our criterion for significant AEI. This choice was guided by a recent study demonstrating that allelic imbalance as low as 65%/35% is enough to affect disease penetrance in humans (Nature 2025; 637:1186–1197). For completeness, results obtained using more stringent thresholds (>90% and >95% imbalance) are presented in Supplementary Table 2.
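
      The criterion stated above can be sketched as a simple per-gene check. This is an illustrative sketch only: the response does not say how a "biallelic" clone is defined, so the 0.4 to 0.6 band below is a hypothetical assumption, as are the function name and the example ratios.

```python
def has_significant_aei(allelic_ratios, threshold=0.8, biallelic_band=(0.4, 0.6)):
    """Return True when at least one clone exceeds the imbalance threshold
    (ratio > threshold or ratio < 1 - threshold) AND at least one other
    clone is biallelic.

    allelic_ratios: one value per clone, the fraction of reads from allele A.
    biallelic_band: ASSUMED definition of biallelic expression; the source
    text does not specify one.
    """
    low = 1.0 - threshold
    imbalanced = any(r > threshold or r < low for r in allelic_ratios)
    biallelic = any(biallelic_band[0] <= r <= biallelic_band[1]
                    for r in allelic_ratios)
    return imbalanced and biallelic


# Hypothetical ratios for one gene across four clones.
gene_ratios = [0.92, 0.50, 0.48, 0.10]

for cutoff in (0.8, 0.9, 0.95):
    print(cutoff, has_significant_aei(gene_ratios, threshold=cutoff))
```

      Looping over the cutoffs mirrors the comparison of the >80%, >90% and >95% thresholds mentioned in the response.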

      Furthermore, it is unlikely that transient stochastic differences in allelic expression, such as those detected by single-cell RNA sequencing assays (Nat. Rev. Genet. 2015; 16:653–664), would be captured by our approach. Each clone in our study was expanded from a single cell to over one million cells before both RNA-seq and Repli-seq analysis, effectively averaging out transient transcriptional and/or replication fluctuations, and thus reflecting stable, mitotically heritable epigenetic states.

      More robust approaches would include using magnitude of imbalance, annotating VERTs by genomic location, applying stricter thresholds for replication timing, and benchmarking AEI distributions against the X chromosome.

      All VERT regions identified in this study were annotated according to both the magnitude of allelic imbalance and their genomic coordinates, using 250 kb windows for the human samples and 50 kb windows for the mouse samples (see Supplementary Tables 1 and 6). Figure 1c directly compares the magnitude of imbalance, defined as outliers in the standard deviation, for both allelic replication timing and allelic expression across autosomal and X-linked loci in female ACP cells.

      In addition, we will benchmark the magnitude of replication timing imbalance using autosomal imprinted regions as a second internal control. We detected allelic replication imbalance at 13 known imprinted loci, and the standard deviation of replication timing at these loci, measured in 250 kb windows, is comparable to that observed across the >350 VERT regions detected at non-imprinted sites. To illustrate this comparison, we will include a supplementary figure directly comparing imprinted and non-imprinted regions.
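
      A comparison of this kind could be sketched as follows. The sketch assumes that each 250 kb window has one allelic replication-timing difference per clone and that variability is summarised as a population standard deviation; both are assumptions made for illustration, and the labels and values are hypothetical rather than taken from the study.

```python
from statistics import median, pstdev


def window_sd(timing_by_clone):
    """Population standard deviation of the per-clone timing values
    for a single genomic window (ASSUMED summary statistic)."""
    return pstdev(timing_by_clone)


def median_sd_by_group(windows):
    """windows: list of (group_label, [timing value per clone]).
    Returns {group_label: median of the per-window standard deviations}."""
    sds = {}
    for group, values in windows:
        sds.setdefault(group, []).append(window_sd(values))
    return {group: median(vals) for group, vals in sds.items()}


# Hypothetical windows; values are allelic timing differences in arbitrary units.
windows = [
    ("imprinted", [1.0, -1.0, 1.0, -1.0]),
    ("imprinted", [0.5, -0.5, 0.5, -0.5]),
    ("non_imprinted_vert", [0.9, -0.9, 0.9, -0.9]),
    ("non_imprinted_vert", [0.7, -0.7, 0.7, -0.7]),
]

summary = median_sd_by_group(windows)
print(summary)
```

      Similar medians in the two groups would be in line with the statement that imbalance at imprinted loci is comparable to that at non-imprinted VERT regions; a formal comparison would need the real per-window data.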

      Figures and text would benefit from improved clarity: axis labels are missing in places (e.g., Fig. 1c, Fig. 2g), legends should explain chromosome arm colors, and cluttered figures such as Fig. 1j could be re-visualized for interpretability.

      Figure labels will be added to Figs. 1c and 2g, and legends will be modified for clarity.

      “the claim of cell-type specificity is not convincingly demonstrated given the small sample size (n=4) and strong batch confounding between lymphoblastoid and cartilage progenitors.” And “Hierarchical clustering is confounded by batch and based on presence/absence calls that lack quantitative resolution.”

      We agree that the limited number of individuals and clones, and the comparison of only two distinct tissue types (LCLs and ACPs), limit the quantitative conclusions that can be drawn. Our primary intent was to evaluate whether any I/SCs were shared between independently derived clonal datasets and to determine whether there is evidence of tissue-specific I/SC usage, rather than to make quantitative claims about global cell-type specificity.

      To address this concern, we will replace the hierarchical clustering analysis currently shown in Figure 1i with a Venn diagram that more directly illustrates the overlap and tissue-specific distribution of VERT regions detected in the different clonal sets. This revised representation avoids assumptions about clustering relationships and removes batch-driven bias, while still conveying the key observation that many VERT regions are shared across tissues and others appear tissue-restricted.

      While syntenic VERT regions across mouse and human are intriguing, they complicate interpretation of strong clustering by cell type. Sampling depth may also have exaggerated allelic imbalance calls.

      We note that the human LCLs used in our study are B cells, and immunoglobulin gene rearrangements were used to confirm the clonal uniqueness of each line. Similarly, the mouse replication timing data analyzed here was generated from pre-B cells, which also undergo immunoglobulin gene rearrangement. Thus, both the human LCL and mouse pre-B cell datasets were derived from B-cell lineages, providing a consistent cellular context for comparative analysis.

      Sequencing depth is an important consideration for all variant base calls. Without fully haplotype-resolved genomes, previous studies relied on calculating per-SNP calls of allelic imbalance based on reads covering a single nucleotide locus. To improve the sequencing depth supporting the identification of VERT and AEI regions, we utilized fully haplotype-resolved genomes that allowed all informative allele-specific reads to be pooled across all heterozygous SNPs within genomic windows or expressed genes. For AEI, we required a minimum of 20 informative allele-specific reads per gene, an FDR-corrected p-value of ≤0.05, and an allelic imbalance of at least 80% vs 20%. Importantly, a recent study has shown that allelic imbalance as low as 65%/35% is enough to affect disease penetrance in humans (Nature 2025; 637:1186–1197). We reiterate that results obtained with more stringent thresholds (>90% and >95% imbalance) are presented in Supplementary Table 2.
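
      The three AEI criteria stated above (read depth, FDR, and imbalance magnitude) can be sketched as a simple per-gene filter. This is only an illustration under stated assumptions: the response does not name the statistical test or the FDR procedure, so the exact two-sided binomial test against a 50/50 null and the Benjamini-Hochberg correction used here are assumptions, as are the function names, the gene labels, and the read counts.

```python
from math import comb


def binom_two_sided_p(k, n):
    """Exact two-sided binomial p-value against a 50/50 null (assumed test)."""
    # The null distribution is symmetric at p = 0.5, so double the upper tail
    # that starts at the more extreme of the two allele counts.
    extreme = max(k, n - k)
    tail = sum(comb(n, i) for i in range(extreme, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_aei(counts, min_reads=20, max_fdr=0.05, min_fraction=0.80):
    """counts maps gene -> (haplotype-1 reads, haplotype-2 reads), pooled
    across all heterozygous SNPs in the gene. Returns gene -> bool for every
    gene that passes the read-depth filter."""
    tested = {g: ab for g, ab in counts.items() if sum(ab) >= min_reads}
    genes = list(tested)
    pvals = [binom_two_sided_p(tested[g][0], sum(tested[g])) for g in genes]
    qvals = benjamini_hochberg(pvals)
    calls = {}
    for g, q in zip(genes, qvals):
        a, b = tested[g]
        major_fraction = max(a, b) / (a + b)
        calls[g] = q <= max_fdr and major_fraction >= min_fraction
    return calls


# Hypothetical pooled read counts, not taken from the study.
example_counts = {
    "geneA": (45, 5),   # 90/10 split, well covered
    "geneB": (26, 24),  # close to biallelic
    "geneC": (9, 1),    # strongly skewed but below the 20-read minimum
    "geneD": (30, 20),  # 60/40 split, below the 80% cut-off
}
example_calls = call_aei(example_counts)
print(example_calls)
```

      In this sketch a gene with too few informative reads is left out of the result entirely rather than being called balanced, which keeps "not testable" distinct from "tested and biallelic".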

      Gene set enrichment analysis should be restricted to avoid inflated significance from overly broad categories.

      Reviewer #2:

      Some of the GO terms presented are too broad to suggest any biological significance to the result, even if there is statistical significance (for example, the top term for LCL clones 'Cytoplasm' is associated with 12,000 genes, and the second term for mouse clones 'Membrane' is associated with 10,000). It would be helpful to focus on GO terms lower in the GO hierarchy.

      We will include our complete Gene Ontology analysis, with more specific biological categories, in Supplemental Table 5.

      Allelic imbalance has been referred to as AI, MAE (monoallelic expression), RMAE (random monoallelic expression) etc. The paper whose mouse data the authors make use of uses Asynchronous Stochastic Replication Timing (ASRT) instead of VERT to refer to the same phenomenon. Creating unnecessary jargon makes the paper more difficult to read and adds needless complexity to an already complex field.

      While we agree that allelic expression imbalance has been described by different investigators using many different phrases, we believe that MAE, RMAE and AI do not accurately describe the phenomenon. In our study (and our previous study; Nat. Commun. 2022; 13:6301) we used clonal analysis of allele-specific expression and found that while some clones display equivalent levels of expression between alleles of a given gene (i.e. bi-allelic expression), other clones express only one allele (i.e. mono-allelic expression), and yet other clones have undetectable expression (i.e. silent on both alleles). This pattern of allele-restricted expression indicates that each allele independently adopts either an expressed or silent state. Importantly, because these expression states are mitotically stable, allele-autonomous, and independent of parental origin, we refer to the choice of the expressed allele as stochastic. Given this variability, we believe that the phrase “Allelic Expression Imbalance” (AEI) represents a more accurate descriptor for this phenomenon. We also point out that “Allelic Expression Imbalance” has been used >120 times in the PubMed database.

      In addition, the replication asynchrony that exists at these loci is not consistent with purely ASynchronous Replication Timing (ASRT) between alleles. We found that each allele can independently adopt either earlier or later replication timing in different clones. This variability results in some clones exhibiting pronounced asynchrony between alleles, while in others, the two alleles replicate synchronously, with both adopting either the earlier or later timing state. As reported in our previous study (Nat. Commun. 2022; 13:6301), this behavior reflects a stochastic and allele-autonomous process, leading us to describe these loci as exhibiting Variable Epigenetic Replication Timing (VERT), which we believe is a more accurate descriptor of this phenomenon.

      The point that allelic imbalance is enriched in VERTs would be enhanced if the authors could present the allelic ratio for all genes found in all VERTs, demonstrating how replication timing on either chromosome affects the allelic ratio.

      The stochastic nature of allelic expression and replication timing observed at VERT loci indicates that each allele independently acquires its epigenetic state. Specifically, the expressed or silent status of one allele does not predict the replication timing or expression status of the opposite allele. Accordingly, the Early/Late pattern of replication timing that we detect, both in this study and in our previous work (Nat. Commun. 2022; 13:6301), is not correlated with which allele is transcriptionally active. This supports our conclusion that asynchronous replication timing is not a downstream consequence of monoallelic transcription, but rather an independent epigenetic feature of I/SCs. Regardless, we will provide the combined expression ratios for all transcripts that are located within the VERT regions in a Supplemental Table.

      In addition, our analysis of imprinted loci reveals that even at genomic regions with parent-of-origin–specific expression, replication timing does not align with allelic activity: both early- and late-replicating alleles can be transcriptionally active, depending on the gene. This observation is consistent with the complex organization of many imprinted domains, where genes on opposite alleles exhibit reciprocal expression patterns. To illustrate this point, we will include a new supplemental figure demonstrating that imprinted loci harbor genes expressed from both the earlier- and later-replicating alleles.

      Figure 3 highlights the association of related gene clusters with VERTs but the VERTs are assigned based on variable replication timing in just 1 or 2 clones. This is an interesting observation, but to make the point that "VERT regions frequently coincide with gene clusters in the human genome" there needs to be a systematic assessment of replication timing at all gene clusters across all clones, and a statistical test for significance.

      Our intent in Figure 3 was not to suggest that all gene clusters are subject to VERT and AEI, but rather to highlight that several well-characterized multigene families that are known to exhibit random AEI, such as olfactory receptor and HLA gene clusters, coincide with VERT regions at their genomic locations. These examples serve as representative illustrations demonstrating that I/SC-associated regulation occurs at established AEI loci organized in gene clusters.

      To clarify this point, we will revise the text to explicitly state that Figure 3 presents illustrative examples of known AEI-associated gene clusters overlapping with VERT regions, rather than a comprehensive or statistically exhaustive analysis of all gene clusters across the genome.

      It is an interesting hypothesis that VERTs are conserved between species at syntenic loci. If such regions are really conserved, one would expect that replication timing at these sites would be consistently asynchronous. However, the data presented show that in human clones these VERTs can be specific to an individual donor (as in 5A) or an individual clone (as in 5H).

      As discussed in our Limitations section, our analysis was restricted to a limited number of cell types, clones, and individuals, which may not capture the full diversity of I/SC usage across tissues and populations. While our dataset was sufficient to identify robust patterns of AEI and VERT, it likely represents only a subset of the broader landscape of I/SC regulation in both humans and mice. We anticipate that future studies incorporating a wider range of tissues, individuals, and clonal analyses will uncover an even greater degree of conservation and diversity in I/SC usage across genomes.

      In order to support the claim that neurodevelopmental disease associated genes reside in asynchronously replicating regions, and are thus more prone to allelic imbalance, the authors would need to demonstrate this phenomenon in neuronal cells.

      We make two points that address this critique: First, many of the neurodevelopmental disease genes located within or adjacent to VERT regions are not exclusively expressed in neuronal cells and have already been shown to exhibit AEI in non-neuronal contexts. For example, Gimelbrant and Chess (Science, 2007; 318:1136–1140) demonstrated AEI of the Parkinson disease genes SNCA and LRRK2 in lymphoblastoid cell lines (LCLs), and in our previous study, we detected AEI of DNAJC6, another Parkinson disease gene, in LCL cells (Nat. Commun. 2022; 13:6301). In the present study that used ACP cells, we identified VERT and AEI of several epilepsy-associated genes, including SCN1A, SCN2A (Fig. 6b), GABRA1 (Fig. 6e), and SAMD12 (Fig. 6j), as well as a gene implicated in autism and neurodevelopmental disorders, SEMA5A (Fig. 5c).

      Second, independent studies from the E. Heard laboratory have provided further evidence that AEI occurs in neuronal lineages. Using mouse neural progenitor cells (NPCs), they identified genes subject to AEI (Dev. Cell, 2014; 28:366–380) and they later evaluated AEI of syntenic human neurodevelopmental disease genes, including Snca, App, Eya4, and Grik2 (Nat. Commun. 2021; 12:5330). In addition, they used the phrase “Allelic Expression Imbalance” to describe the epigenetic expression biases at these genes.

      Together, these findings reinforce that AEI, and by extension I/SC regulation, is not restricted to specific cell types, but rather represents a generalizable mechanism of stochastic epigenetic regulation that includes genes relevant to neurodevelopment and disease.

      However, the authors consistently lean on thin evidence (i.e. a single clone) within a modestly sized dataset (4 clones from 2 donors each) to propose a new model for haploinsufficiency in human disease. The consistent focus on limited elements in the data and perhaps an overreach in the interpretation makes it difficult to appreciate what is in fact a very good experiment.

      We agree that our analysis was conducted on a modest number of clones and individuals, which we explicitly acknowledge as a limitation of the present study. However, several key points support the robustness and broader relevance of our conclusions:

      i. Clonal Design and Replication: The strength of our approach lies in its clonal resolution. Each clone represents a single-cell–derived population expanded to over a million cells, enabling direct detection of stable, mitotically heritable allele-specific epigenetic states that would not be apparent in population-averaged data. Importantly, many of the VERT regions we identified are shared between independent clones from different donors and across distinct cell types (ACP and LCL), demonstrating reproducibility and biological consistency.

      ii. Cross-Species Validation: We further identified syntenic VERT regions in mouse pre-B cell clones, including at loci known to exhibit AEI in prior studies, providing independent validation and evolutionary conservation of the phenomenon.

      iii. Integration with Published Evidence: Our findings extend prior observations of AEI and variable replication timing (e.g. Gimelbrant et al. Science 2007; Heskett et al. Nat. Commun. 2022) and are fully consistent with known stochastic allelic expression imbalance of autosomal genes. We also draw parallels with the absence of cellular selection mechanisms that dictate dominant inheritance patterns for loss-of-function alleles of X-linked disease genes (reviewed in: J Clin Invest, 2008, 20-23; and Nat Rev Genet. 2025, 26, 571–580). Our proposed model linking I/SC regulation to haploinsufficiency is therefore a synthesis of our results with an extensive body of published data, not an inference drawn from isolated observations.

      iv. Scope and Framing: We will revise the manuscript to clarify that our proposed model represents a mechanistic framework, not a definitive or exclusive explanation, for how stochastic allelic regulation could contribute to dosage-sensitive disease phenotypes. We will also explicitly discuss the need for larger datasets and additional tissues to refine and test this model.

      In summary, while we recognize the limited sampling inherent to clonal analyses, the consistency of our observations across donors, cell types, and species, together with prior corroborating studies, supports the validity of the conclusions and justifies the broader conceptual implications.

      Description of analyses that authors prefer not to carry out

      Reviewer #1:

      Cell-type specificity and mitotic stability both require stronger evidence; the latter is inferred indirectly from clonal expansion rather than shown directly, and orthogonal experiments (e.g., allele-specific ChIP-seq, DNA methylation) would be required.

      We disagree with this reviewer that the mitotic stability of the epigenetic states is “inferred indirectly from clonal expansion rather than shown directly”. Our experimental design inherently captures mitotically stable, allele-specific states because each clonal line is derived from a single progenitor cell and expanded to millions of cells before analysis. The allele-specific replication timing and expression profiles observed in these clones therefore reflect epigenetic states that are stably inherited across many cell divisions, rather than transient or stochastic fluctuations. This approach was also validated in our previous study (Nat. Commun. 2022; 13:6301), where the same clonal strategy demonstrated stable allele-restricted replication and expression patterns over extended passages.

      We agree that orthogonal assays such as allele-specific ChIP-seq or DNA methylation analyses would provide additional mechanistic detail on the nature of I/SC-associated regulation. However, these experiments fall outside the scope of the present study, which was designed specifically to identify and map autosomal loci that exhibit coordinated AEI and VERT, the defining epigenetic features of I/SCs. While we fully acknowledge that defining the precise molecular marks (e.g., histone modifications, DNA methylation, chromatin accessibility) that underlie I/SC regulation will be an important future direction, our current data provide a genome-wide, allele-resolved foundation upon which such mechanistic studies can build.

      In summary, the current dataset achieves the central goal of defining the genomic distribution and conservation of I/SCs based on functional readouts of replication timing and expression. Future work will extend these findings using allele-specific epigenomic profiling to characterize the epigenetic modifications associated with I/SC stability and cell-type specificity.

    1. eLife Assessment

      This manuscript reports the development and characterization of iGABASnFR2, a genetically encoded GABA sensor that demonstrates substantially improved performance compared to its predecessor, iGABASnFR1. The work is comprehensive and methodologically rigorous, combining high-throughput mutagenesis, functional screening, structural analysis, biophysical characterization, and in vivo validation. The significance of the findings is fundamental, and the supporting evidence is compelling. iGABASnFR2 represents a notable advance in GABA sensor engineering, enabling enhanced imaging of GABA transmission both in brain slices and in vivo, and constitutes a timely, technically robust addition to the molecular toolkit for neuroscience research.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript by Kolb and Hasseman et al. introduces a significantly improved GABA sensor, building on the pioneering work of the Janelia team. Given GABA's role as the main inhibitory neurotransmitter and the historical lack of effective optical tools for real-time in vivo GABA dynamics, this development is particularly impactful. The new sensor boasts an enhanced signal-to-noise ratio (SNR) and appropriate kinetics for detecting GABA dynamics in both in vitro and in vivo settings. The study is well-presented, with convincing and high-quality data, making this tool a valuable asset for future research into GABAergic signaling.

      Strengths:

      The core strength of this work lies in its significant advancement of GABA sensing technology. The authors have successfully developed a sensor with higher SNR and suitable kinetics, enabling the detection of GABA dynamics both in vitro and in vivo. This addresses a critical gap in neuroscience research, offering a much-needed optical tool for understanding the most important inhibitory neurotransmitter. The clear representation of the work and the convincing, high-quality data further bolster the manuscript's strengths, indicating the sensor's reliability and potential utility. We anticipate this tool will be invaluable for further investigation of GABAergic signaling.

      Weaknesses:

      Despite the notable progress, a key limitation is that the current generation of GABA sensors, including the one presented here, still exhibits inferior performance compared to state-of-the-art glutamate sensors. While this work is a substantial leap forward, it highlights that further improvements in GABA sensors would still be highly beneficial for the field to match the capabilities seen with glutamate sensors.

    3. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript by Kolb and Hasseman et al. introduces a significantly improved GABA sensor, building on the pioneering work of the Janelia team. Given GABA's role as the main inhibitory neurotransmitter and the historical lack of effective optical tools for real-time in vivo GABA dynamics, this development is particularly impactful. The new sensor boasts an enhanced signal-to-noise ratio (SNR) and appropriate kinetics for detecting GABA dynamics in both in vitro and in vivo settings. The study is well-presented, with convincing and high-quality data, making this tool a valuable asset for future research into GABAergic signaling.

      Strengths:

      The core strength of this work lies in its significant advancement of GABA sensing technology. The authors have successfully developed a sensor with higher SNR and suitable kinetics, enabling the detection of GABA dynamics both in vitro and in vivo.

      This addresses a critical gap in neuroscience research, offering a much-needed optical tool for understanding the most important inhibitory neurotransmitter. The clear representation of the work and the convincing, high-quality data further bolster the manuscript's strengths, indicating the sensor's reliability and potential utility. We anticipate this tool will be invaluable for further investigation of GABAergic signaling.

      Weaknesses:

      Despite the notable progress, a key limitation is that the current generation of GABA sensors, including the one presented here, still exhibits inferior performance compared to state-of-the-art glutamate sensors. While this work is a substantial leap forward, it highlights that further improvements in GABA sensors would still be highly beneficial for the field to match the capabilities seen with glutamate sensors.

      We thank Reviewer 1 for the positive assessment. We agree that further improvements in GABA sensor performance remain desirable. We acknowledge this limitation and outline directions for future development in the Discussion paragraph beginning "There are several promising avenues that could be taken to further optimize iGABASnFR."

      Reviewer #2 (Public review):

      Summary:

      This manuscript presents the development and characterization of iGABASnFR2, a genetically encoded GABA sensor with markedly improved performance over its predecessor, iGABASnFR1. The study is comprehensive and methodologically rigorous, integrating high-throughput mutagenesis, functional screening, structural analysis, biophysical characterization, and in vivo validation. iGABASnFR2 represents a significant advancement in GABA sensor engineering and application in imaging GABA transmission in slice and in vivo. This is a timely and technically strong contribution to the molecular toolkit for neuroscience.

      Strengths:

      The authors apply a well-established sensor optimization pipeline and iterative engineering strategy from single-site to combinatorial mutants to engineer iGABASnFR2. The development of both positive and negative going variants (iGABASnFR2 and iGABASnFR2n) offers experimental flexibility. The structure and interpretation of the key mutations provide insights into the working mechanism of the sensor, which also suggest optimization strategies. Although individual improvements in intrinsic properties are incremental, their combined effect yields clear functional gains, enabling detection of direction-selective GABA release in the retina and volume-transmitted GABA signaling in somatosensory cortex, which were challenging or missed using iGABASnFR1.

      Weaknesses:

      With minor revisions and clarifications, especially regarding membrane trafficking, this manuscript will be a valuable resource for probing inhibitory transmission.

      We thank Reviewer 2 for the positive assessment. Regarding membrane trafficking, we appreciate the suggestion to test different trafficking motifs. While such optimization represents a valuable direction for future development, it was beyond the scope of the present study and not feasible with the available time and resources. A different imaging modality would be needed to assess membrane trafficking efficiency or membrane-restricted expression, as the images presented in the manuscript (Figure 2a) are wide-field epifluorescence images, which lack the axial resolution required to distinguish membrane-localized signal from cytosolic fluorescence.

      We expect that the current characterization of iGABASnFR2 will nevertheless provide a strong foundation for future efforts to optimize membrane targeting and expression using alternative trafficking strategies.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) We noted an interesting inconsistency in the response of iGABASnFR1 and iGABASnFR2 when expressed as purified protein versus in mammalian cells. Such discrepancies are not uncommon for proteins exhibiting different behaviors in E. coli versus mammalian expression systems. We appreciate the authors' diligent effort in performing screening within a neuronal context. Similarly, the stark difference between the absolute affinity in purified form (∼0.778 μM) and on-cell measurements (6.4 μM) warrants further discussion. The authors may consider commenting on these observations in the discussion section.

      We have revised the Discussion (lines 401-410 in the ‘Tracked Changes’ document) to address the discrepancy between measurements obtained with purified protein and those from expression on the neuronal surface. As noted by the reviewer, such discrepancies are common, and our revision is intended to convey our empirical experience with this phenomenon rather than to offer a definitive mechanistic explanation.

      One factor to appreciate is that, when on the surface of neurons, the sensor is tethered to the membrane by an additional 60 amino acids. In addition to altering the local chemical environment, membrane tethering could impose entropic or mechanical constraints on the sensor. These constraints may damp conformational motions that underlie ligand binding and fluorescence changes. Beyond this, the local environment experienced by a membrane-anchored sensor differs substantially from that of soluble protein. There are potential electrostatic and steric effects arising from the plasma membrane and extracellular matrix, as well as post-translational modifications associated with mammalian expression. These effects on sensor performance are not readily predictable in either magnitude or direction, as illustrated by iGluSnFR, which exhibits a higher apparent affinity when membrane-tethered than in soluble form (Aggarwal et al 2023). For these reasons, we place greater emphasis on neuronal measurements as the most informative indicator of in vivo sensor performance.

      (2) Although iGABASnFR2 fluorescence exhibits pH dependence, its response appears less pH-dependent compared to the first-generation sensor. To enhance clarity, we suggest plotting the normalized response of both sensors across different pH values. This visual representation would be highly informative for readers.

      Thank you - we have implemented this, now showing the (F_sat - F_apo)/F_apo response as a function of pH for all three sensors in Fig 4 fig. supp 3b. This visualization nicely illustrates that the apo-to-sat response of iGABASnFR1 is much more influenced by pH than either iGABASnFR2 or iGABASnFR2n, which we note on lines 252-253 of the ‘Tracked Changes’ document.
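
      For readers who want to reproduce this kind of plot from their own titrations, the normalized response named above, (F_sat - F_apo)/F_apo, can be computed per pH value as in the following minimal sketch. The function names and the fluorescence readings are hypothetical and purely illustrative; they are not values from the manuscript.

```python
def normalized_response(f_apo, f_sat):
    """Return (F_sat - F_apo) / F_apo for one apo/saturated fluorescence pair."""
    if f_apo <= 0:
        raise ValueError("F_apo must be positive to normalize against it")
    return (f_sat - f_apo) / f_apo


def response_by_ph(readings):
    """readings maps pH -> (F_apo, F_sat); returns pH -> normalized response,
    ordered by increasing pH so it can be plotted directly."""
    return {ph: normalized_response(f_apo, f_sat)
            for ph, (f_apo, f_sat) in sorted(readings.items())}


# Hypothetical fluorescence readings (arbitrary units) for one sensor.
example_readings = {
    7.5: (110.0, 275.0),
    6.5: (80.0, 180.0),
    7.0: (100.0, 250.0),
}
for ph, response in response_by_ph(example_readings).items():
    print(f"pH {ph}: normalized response = {response:.2f}")
```

      A negative-going sensor simply yields negative values with the same formula, so all three sensors can share one axis.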

      (3) To provide a more comprehensive characterization of the sensors, we recommend including a quantification of the decay times for all three versions of the sensors in Figure 2, specifically after panel 2c.

      Thank you - we now provide this in Fig 2d.

      (4) For improved readability of Figure 3a, we suggest adding distinct labels for iGABASnFR1 and iGABASnFR2 with corresponding colors.

      Good suggestion - we matched the color of the backbones to the rest of the manuscript (orange and green). We also added labels on the figure to ensure clarity.

      (5) The GABA released by SAC cells in Figure 5 looks amazing! We propose a minor modification to the cartoon in Figure 5b: mirroring the image horizontally (left to right). Given that the subsequent panels (e, h, and k) set the preferred direction of SAC movement as rightward, the current cartoon in Figure 5b inadvertently suggests stronger inhibition by SAC-released GABA when the spot moves left. Mirroring the image would align the cartoon more accurately with the subsequent data representations.

      Thanks - this is a nice streamlining. We have implemented the change.

      Reviewer #2 (Recommendations for the authors):

      (1) As sensor performance differs substantially between purified protein and neurons, a summary table comparing key properties (e.g., EC50, ∆F/F_max, response amplitude to # of AP) across purified protein and neurons would be highly informative.

      We discuss differences in sensor performance between purified protein and neurons in the Discussion (lines 401-410 in the ‘Tracked Changes’ document) and, for the reasons outlined there, consider neuronal measurements to be far more predictive of in vivo performance. We therefore chose not to include a summary table directly comparing purified protein and neuronal data, as this would risk over-emphasizing in vitro measurements that we view primarily as qualitative signposts rather than more directly informative indicators of functional performance.

      (2) The authors should comment on the observed differences in performance between purified protein and neuronal expression. Would HEK293 cell measurements serve as a better predictor of in vivo performance than in vitro titrations? Insights here would benefit future sensor development pipelines.

      We have revised the Discussion to address this point (lines 401-410 in the ‘Tracked Changes’ document). We often observe differences in sensor performance between purified protein measurements and cellular or in vivo contexts. In our experience, titrations in primary neurons provide a better predictor of in vivo performance than in vitro protein titrations, as they more closely reflect relevant cellular factors. We do not have direct evidence that expression in heterologous systems such as HEK293 cells is generally more predictive, although this seems plausible; however, predictions inevitably become less reliable as sensors are translated to fully in vivo conditions.

      (3) Improved membrane localization likely contributes to the enhanced sensitivity of iGABASnFR2 in neurons beyond changes in EC50. In Figure 2a, membrane trafficking appears suboptimal. The authors should explore alternative trafficking motifs (e.g., ER2, Kv2.1, or motifs from other sensors) to further improve the membrane expression and consider adding a second fluorescent protein for quantifying membrane-localized brightness.

      Figure 2a presents wide-field epifluorescence images, which lack the axial resolution required to distinguish membrane-localized signal from cytosolic fluorescence. We therefore do not consider this imaging modality suitable for assessing membrane trafficking efficiency or membrane-restricted expression.

      We appreciate the suggestion to test different trafficking motifs to attempt to better capture biological signals. While such optimization represents a valuable direction for future development, it was beyond the scope of the present study and not feasible with the available time and resources. We expect that the current characterization of iGABASnFR2 will nevertheless provide a strong foundation for future efforts to further optimize membrane targeting and expression using alternative trafficking strategies.

      (4) Figure 4 - Supplement 2: The apparent EC50 of iGABASnFR2 seems affected by buffer composition and the presence of high concentrations of unrelated compounds. The authors should comment on this.

      We thank the reviewer for raising this point. Upon closer inspection, the EC50 of iGABASnFR2 in Fig 4 Supp 2 is measured at 1.4 μM, while in Fig 4a it is 1.1 μM - these mean values are quite close to one another, and within the range of experimental variability we expect for experiments done weeks or months apart. What differs most noticeably in this dataset is the shape of the dose–response curve rather than the EC50 itself; the origin of this difference is currently unclear. We have revised the Results text (lines 226-231 in the ‘Tracked Changes’ document) to clarify this point and to emphasize that the key observation of Fig. 4–figure supplement 2 is that none of the additional compounds tested substantially impair GABA binding, indicating that they do not act as strong non-competitive allosteric antagonists or inhibitors.

      (5) The negative-going variant, iGABASnFR2n, is introduced but only briefly characterized. Including additional data or even a conceptual use case would clarify its potential utility.

      We have modified the discussion to provide more examples of conceptual use cases, clarifying how such a sensor could indeed be highly impactful. The full passage is lines 372-387 in the ‘Tracked Changes’ document; to summarize: a key application of the negative-going sensor is detecting decreases in ‘GABA tone’, which plays a key role in setting the excitation-inhibition balance across brain circuits. Reductions in extrasynaptic GABA are a well-documented feature of several biologically important brain-state transitions, including arousal, experience-dependent plasticity, and stress-related modulation of inhibition, and iGABASnFR2n could be an important tool for investigating these processes.

    1. eLife Assessment

      In this important contribution, Yan and colleagues describe a powerful and compelling strategy to generate concatamers of the BK channel and their fusion constructs with the auxiliary gamma subunits, which allows exploring contributions of individual subunits of the tetrameric channel to its gating and the study of heteromeric channel complexes of defined composition. Distinct examples are presented, which illustrate great diversity in the stoichiometric control of BK channel gating, depending on the site and nature of molecular perturbations. The molecular approaches could be extended to other membrane proteins whose N and C termini face opposite sides of the membrane.

    2. Reviewer #1 (Public review):

      Summary:

      BK channels are widely distributed and involved in many physiological functions. They have also proven a highly useful tool for studying general allosteric mechanisms for gating and modulation by auxiliary subunits. Tetrameric BK channels are assembled from four separate alpha subunits, which would be identical for homozygous alleles and of potentially five different combinations for heterozygous alleles (Geng et al., 2023, https://doi.org/10.1085/jgp.202213302). Construction of BK channels with concatenated subunits in order to strictly control heteromeric subunit composition had not yet been used because the N-terminus in BK channels is extracellular whereas the C-terminus is intracellular. In this new work, Chen, Li, and Yan devise clever methods to construct and assemble BK channels of known subunit composition, as well as to fix the number of γ1 auxiliary subunits per channel. With their novel molecular approaches, Chen, Li and Yan report that a single γ1 auxiliary subunit is sufficient to fully modulate a BK channel, that the deep conducting pore mutation L312A exhibited a graded effect on gating, with each additional mutated subunit replacing a WT subunit in the channel adding a further incremental left shift in activation, and that the V288A mutation at the selectivity filter must be present on all four alpha subunits in order to induce channel inactivation. Chen, Li, and Yan have been successful in introducing new molecular tools to generate BK channels of known stoichiometry and subunit composition. They validate their methods and provide three different examples of stoichiometric modulation by LRRC26, the selectivity filter, and the pore.

      Strengths:

      Powerful new molecular tools for study of channel gating are developed and validated in the study.

      Weaknesses:

      One example each of auxiliary, deep pore, and selectivity filter allosteric actions is presented, but this is sufficient for the purposes of the paper to establish their methods and present specific examples of applicability.

    3. Reviewer #2 (Public review):

      Summary:

      This manuscript describes novel BK channel concatemers as a tool to study the stoichiometry of the gamma subunit and mutations in the modulation of the channel. Taking advantage of the modular design of the BK channel alpha subunit, the authors connected S1-S6/1st RCK as two- and four-subunit concatemers and coexpressed with S0-RCK2 to form normally functioning channels. These concatemers avoided the difficulty that the extracellular N-terminus of S0 was unable to connect with the cytosolic C-terminus of the alpha or gamma subunit, allowing a single gamma subunit to be connected to the concatemers. The concatemers also helped reveal the required stoichiometry of mutant BK subunits in modulating channel function. These include L312A in the deep pore region that altered channel function additively with each additional subunit harboring the mutation, and V288A at the selectivity filter that altered channel function cooperatively only when all four subunits were mutated. These results demonstrate that the concatemers are robust and effective in studying BK channel function and molecular mechanisms related to stoichiometry. The different stoichiometric requirements of the gamma subunit and the mutations for altering channel function are interesting, revealing fundamental mechanisms of how different motifs of the channel protein control function.

      Strengths:

      The manuscript presents well designed experiments with high quality data, which convincingly demonstrate the BK channel concatemers and their utility. The results are clearly written.

      Weaknesses:

      This reviewer did not identify any major concerns with the manuscript.

      Editors' note: We thank you for addressing some of the concerns, adding clarifications and more complete discussions, including further details about experimental protocols. The revised version is significantly improved. Some concerns linger that the biophysical/structural mechanisms underlying the observed phenotypes remain unclear and are in some ways phenomenological. However, the current study is more about the methodology, and the mechanisms underlying the stoichiometry-dependent effects are perhaps left for a separate study, with more detailed exploration. Congratulations on the excellent work.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      BK channels are widely distributed and involved in many physiological functions. They have also proven a highly useful tool for studying general allosteric mechanisms for gating and modulation by auxiliary subunits. Tetrameric BK channels are assembled from four separate alpha subunits, which would be identical for homozygous alleles and potentially of five different combinations for heterozygous alleles (Geng et al., 2023, https://doi.org/10.1085/jgp.202213302). Construction of BK channels with concatenated subunits in order to strictly control heteromeric subunit composition had not yet been used because the N-terminus in BK channels is extracellular, whereas the C-terminus is intracellular. In this new work, Chen, Li, and Yan devise clever methods to construct and assemble BK channels of known subunit composition, as well as to fix the number of γ1 auxiliary subunits per channel. With their novel molecular approaches, Chen, Li and Yan report that a single γ1 auxiliary subunit is sufficient to fully modulate a BK channel, that the deep conducting pore mutation L312A exhibited a graded effect on gating, with each additional mutated subunit replacing a WT subunit in the channel adding a further incremental left shift in activation, and that the V288A mutation at the selectivity filter must be present on all four alpha subunits in order to induce channel inactivation. Chen, Li, and Yan have been successful in introducing new molecular tools to generate BK channels of known stoichiometry and subunit composition. They validate their methods and provide three examples of their use with useful observations.

      Strengths:

      Powerful new molecular tools for the study of channel gating have been developed and validated in the study.

      Weaknesses:

      (1) One example each of auxiliary, deep pore, and selectivity filter allosteric actions is presented, but this is sufficient for the purposes of the paper to establish their methods and present specific examples of applicability.

      We sincerely thank Reviewer #1 for the thoughtful and supportive evaluation of our work. We greatly appreciate the reviewer’s clear summary of the study and the recognition of the novelty and utility of our molecular concatemer strategy for controlling BK channel subunit composition and stoichiometry.

      We also appreciate the reviewer’s positive assessment that the three examples (auxiliary subunit modulation, deep pore mutation, and selectivity filter mutation) are sufficient to establish the method and demonstrate its applicability. We are encouraged that the reviewer found the new molecular tools to be powerful and well validated.

      We have no further changes to make in response to this review, but we are grateful for the reviewer’s constructive and encouraging comments.

      Reviewer #2 (Public review):

      Summary:

      This manuscript describes novel BK channel concatemers as a tool to study the stoichiometry of the gamma subunit and mutations in the modulation of the channel. Taking advantage of the modular design of the BK channel alpha subunit, the authors connected S1-S6/1st RCK as two- and four-subunit concatemers and coexpressed with S0-RCK2 to form normally functioning channels. These concatemers avoided the difficulty that the extracellular N-terminus of S0 was unable to connect with the cytosolic C-terminus of the gamma subunit, allowing a single gamma subunit to be connected to the concatemers. The concatemers also helped reveal the required stoichiometry of mutant BK subunits in modulating channel function. These include L312A in the deep pore region that altered channel function additively with each additional subunit harboring the mutation, and V288A at the selectivity filter that altered channel function cooperatively only when all four subunits were mutated. These results demonstrate that the concatemers are robust and effective in studying BK channel function and molecular mechanisms related to stoichiometry. The different stoichiometric requirements of the gamma subunit and the mutations for altering channel function are interesting and may relate to the fundamental mechanism of how different motifs of the channel protein control function.

      Strengths:

      The manuscript presents well-designed experiments with high-quality data, which convincingly demonstrate the BK channel concatemers and their utility. The results are clearly presented.

      Weaknesses:

      This reviewer did not identify any major concerns with the manuscript.

      We sincerely thank Reviewer #2 for the careful reading of our manuscript and for the highly positive and supportive comments. We appreciate the reviewer’s detailed summary of our concatemer design strategy and its use in studying gamma subunit stoichiometry and mutation-dependent modulation of BK channel function.

      We are especially grateful for the reviewer’s recognition that the experiments are well designed, the data are of high quality, and the results demonstrate the robustness and utility of the concatemer approach. We also appreciate the reviewer’s thoughtful note on the mechanistic implications of the distinct stoichiometric requirements observed for the gamma subunit, L312A, and V288A.

      We are pleased that the reviewer identified no major concerns. We have no further changes to make in response to this review, and we thank the reviewer again for the positive evaluation.

      Recommendations for the authors:

      Reviewing Editor Comments:

      While the study presents a great methodological advancement, the phenomenological examples described could perhaps benefit from a little more mechanistic description/discussion. In particular, the functional effect of the V288A mutant is very novel. It could be useful to discuss whether this mutant impacts channel selectivity/conductance. It could be beneficial to also contrast the subunit dependence of V288A with that of the W434F mutant of the Shaker channel. In the latter, C-type inactivation gating is accelerated even when the mutant is present in a single subunit, which contrasts with the effect in V288A.

      We greatly appreciate the editor’s and reviewers’ thorough and constructive evaluation, and we have revised the manuscript accordingly.

      We added a discussion, with a citation, of the potential effect of V288A on selectivity (lines 348-349). We also added to the Discussion the reported stoichiometric effects of mutations in Shaker and hERG1 channels on C-inactivation (lines 336-351). From these studies and our findings with V288A in BK channels, it is interesting to note that the stoichiometric effects of these mutations vary, and that those located near or within the selectivity filter signature exhibited an all-or-none effect in both hERG1 and BK channels.

      The authors might also want to consider performing and showing immunoblots with the alpha_deltaM fragment co-expressed with the other channel fragments. Together with the GFP tag, this alpha_deltaM would perhaps be a ~90 kDa protein. It should be captured by anti-V5 IP and resolved on an SDS-PAGE gel (at least with the quad construct).

      We added supplemental data (Fig. 1–figure supplement 1) to show co-expression and co-IP of the α<sup>ΔM</sup>-GFP construct and a FLAG-tagged α<sub>M</sub> construct. The α<sup>ΔM</sup>-GFP construct ran at the expected size on SDS-PAGE. Notably, the single-unit α<sub>M</sub> construct tended to oligomerize even under denaturing conditions on SDS-PAGE.

      For Figure 4, providing details about the inter-pulse intervals and interpulse holding voltage would be helpful. I was not able to find this information in the methods or text.

      The inter-pulse intervals and holding voltage have now been added to the Fig. 4 legend (line 638).

      Reviewer #1 (Recommendations for the authors):

      (1) Submitted papers should have page numbers to facilitate reviewing.

      Both page and line numbers are added.

      (2) The designation of the various channel types, such as BKα and BKαM should be identical in the text and figures, so either drop BK in the text or add BK in the figures. Maybe drop BK in the text, as it is known that BK channels are the topic of this study.

      We appreciate the suggestion to be consistent in text and figures. We have dropped “BK” for “BKα<sub>M</sub>” throughout the text.

      (3) "Single Boltzmann fits of G-V curves" would be consistent with a homogenous channel population but do not necessarily suggest a single homogenous channel population of BK channels, as was shown by Geng et al. (2023) (https://doi.org/10.1085/jgp.202213302) where the G-V curve for simultaneous expression of five BK channel types with different V1/2s for each channel type was well approximated by a single Boltzmann function. The dogma that a single Boltzmann fit suggests one channel type needs to be reset. So wave a red flag here: whereas a single Boltzmann fit is consistent with a single channel type, it does not establish a single channel type nor even suggest a single channel type.

      We fully agree that a good single Boltzmann fit does not by itself indicate a homogeneous channel population. We have changed “suggesting” to “consistent with” (line 203) and “reflecting” to “agreeing with” (line 205).
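
      The point made in this exchange can be illustrated numerically. The sketch below is not taken from the manuscript or from Geng et al. (2023): the five V1/2 values, the shared slope factor, and the binomial weights are hypothetical numbers chosen only for illustration, and the fit is a simple grid search rather than the authors' fitting procedure. It sums five Boltzmann curves and then fits a single Boltzmann function to the sum.

```python
import math

def boltzmann(v, v_half, k):
    """Single Boltzmann: fraction of maximal conductance at voltage v (mV)."""
    return 1.0 / (1.0 + math.exp(-(v - v_half) / k))

def mixture_gv(v, v_halfs, weights, k):
    """Weighted sum of Boltzmann curves, one per channel type."""
    return sum(w * boltzmann(v, vh, k) for vh, w in zip(v_halfs, weights))

def fit_single_boltzmann(voltages, g_values):
    """Least-squares fit of one Boltzmann by brute-force grid search."""
    best = None
    for i in range(81):                  # v_half: 60.0 to 100.0 mV, 0.5 mV steps
        v_half = 60.0 + 0.5 * i
        for j in range(301):             # k: 10.0 to 40.0 mV, 0.1 mV steps
            k = 10.0 + 0.1 * j
            sse = sum((boltzmann(v, v_half, k) - g) ** 2
                      for v, g in zip(voltages, g_values))
            if best is None or sse < best[0]:
                best = (sse, v_half, k)
    return best[1], best[2]

# Hypothetical illustration values (assumptions, not data from any paper):
V_HALFS = [60.0, 70.0, 80.0, 90.0, 100.0]        # one V1/2 per channel type
WEIGHTS = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]  # assumed binomial mix
SLOPE = 20.0                                      # shared slope factor (mV)
VOLTAGES = [float(v) for v in range(-100, 261, 10)]

g_mix = [mixture_gv(v, V_HALFS, WEIGHTS, SLOPE) for v in VOLTAGES]
v_half_fit, k_fit = fit_single_boltzmann(VOLTAGES, g_mix)
max_dev = max(abs(boltzmann(v, v_half_fit, k_fit) - g)
              for v, g in zip(VOLTAGES, g_mix))

print(f"fitted V1/2 = {v_half_fit:.1f} mV, slope factor = {k_fit:.1f} mV")
print(f"largest deviation from the five-component curve = {max_dev:.4f}")
```

      Under these assumed values the single-Boltzmann fit tracks the five-component curve closely, with only a slightly shallower slope than any individual component, which is why a good fit is consistent with, but does not establish, a single channel type.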

      (4) Geng et al. (2023) demonstrated that the pore mutation G375R in BK channels gave a left shift in activation linearly related to the number of WT subunits replaced with mutant subunits. This incremental shift in activation for G375R should be mentioned, as it is consistent with the incremental effects of the L312A deep pore mutation on activation as reported by the authors in their Figure 3D.

      We thank the reviewer for pointing out this highly relevant publication. We have now included this reference and discussed it together with the L312A mutation (lines 309-313).

      (5) I went back and looked at the Lingle laboratory papers on the gamma subunit. An additional sentence or two on what the Lingle lab found and didn't find would be useful here for readers.

      In the Introduction, we have listed the Lingle lab’s findings and the limitations of their experimental methods that warrant the development of the concatenated construct method proposed in this study (lines 84-88). We prefer not to discuss this further in the Discussion, as it would be redundant.

      (6) For the two examined mutations L312A and V288A, include in the Methods a 21 amino acid sequence for each mutation with the amino acid to be mutated (L or V) in the center, with beginning and end numbering at the beginning and end of each list. This will allow the reader/experimenter to readily locate the mutated residue on their BK amino acid sequences, which may have different numbering than U11058. Interestingly, for the so-called canonical sequence Q12791 · KCMA1_HUMAN that I found in UniProt starting with U11058, there is an L312, but I found no V288, but an F288. Am I doing this correctly? Do I have the correct sequence/isoform? The only sure way to identify an AA is with an extensive pre and post-sequence so that the chance of misidentification approaches zero.

      We verified that the listed GenBank IDs, U11058 for cDNA and AAB65837 for protein, point to the correct sequences. In the Results section, we have now included the peptide sequences of the selectivity filter signature motif and the part of the S6 TM where V288 and L312 are located, respectively (lines 179 and 220).
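
      The reviewer's request for a 21-residue context window can be sketched in a few lines. This is a hypothetical helper, not the authors' procedure; the function names are invented for illustration, and the sequence below is a made-up toy string, not the BK channel sequence. A real check would load the U11058/AAB65837 protein sequence instead.

```python
def residue_window(sequence, position, flank=10):
    """Return (start, window, end) around a 1-based residue position.

    The window holds up to 2 * flank + 1 residues and is truncated
    where it would run past either end of the sequence.
    """
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    start = max(1, position - flank)
    end = min(len(sequence), position + flank)
    return start, sequence[start - 1:end], end

def check_residue(sequence, position, expected):
    """True if the residue at a 1-based position matches the expected letter."""
    return sequence[position - 1] == expected

# Made-up toy sequence for illustration only (NOT a BK channel sequence).
TOY = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"

start, window, end = residue_window(TOY, 15)
print(f"{start}-{window}-{end}")          # residue 15 sits at the window centre
print(check_residue(TOY, 15, "V"))        # confirms the expected residue
```

      Reporting the window with its start and end numbers lets a reader locate the same residue by sequence context even when their reference isoform uses different numbering.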

      Reviewer #2 (Recommendations for the authors):

      The different stoichiometry of the gamma subunit and the mutations in regulating channel function raise important questions. For instance, what are the structural and energetic bases for their different stoichiometric requirements? Does the structure motif, such as the selectivity filter or deep pore, act as a unit? Or does a specific residue, such as V288 or L312, act individually to determine the different stoichiometric requirements? What molecular interactions are involved for these residues and subunit to influence the cooperativity among the four alpha subunits in channel function? Some of these questions are discussed in the manuscript, but it may help the readers to clarify what aspects of the mechanistic bases for the findings in this manuscript are known and what aspects remain to be studied.

      We agree that these are all important questions. We have now cited more previous studies on C-inactivation in other K<sup>+</sup> channels and on deep pore mutations in BK channels in terms of subunit stoichiometry (lines 336-351). The results appear to be consistent, suggesting shared properties among residues within the selectivity filter motif or among residues in the deep pore region.

      Some minor comments are as follows.

      (1) Page 7, 2nd paragraph: "Page 2B" change to "Page 3B"? Also, "delay in deactivation" is not precise. The term "Delay" in channel kinetics has a specific meaning, and the use of this word here causes some confusion. The authors may want to delete "substantial delay in deactivation evident as a".

      Corrected by changing Fig. 2B to Fig. 3B and deleting “a substantial delay in deactivation evident as” (line 191).

      (2) Page 9, 1st paragraph: "used in the voltage protocol used". Drop one of the instances of "used".

      Corrected by deleting the first “used” (line 246).

      (3) Page 12, 1st paragraph: "Nonetheless, the tight inter-subunit cooperativity observed at the selectivity filter makes it a plausible candidate for serving as the activation gate, a property not yet demonstrated for the lower S6 segment." This seems to be an interesting idea. However, it is not clearly explained. The authors may want to clarify how the cooperativity is related to the activation gate.

      We have now added a sentence with citations to discuss the requirement of intersubunit cooperativity for an activation gate to function (lines 354-357).

      Other major changes: We updated immunoblot figures Fig. 1C and Fig. 2C for better presentation.

    1. eLife Assessment

      This fundamental study presents experimental evidence on how geomagnetic and visual cues are integrated in a nocturnally migrating insect. The evidence supporting the conclusions is compelling. The work will be of broad interest to researchers studying animal migration and navigation.

    2. Reviewer #1 (Public review):

      Summary

      The manuscript by Ma et al. provides robust and novel evidence that the noctuid moth Spodoptera frugiperda (Fall Armyworm) possesses a complex compass mechanism for seasonal migration that integrates visual horizon cues with Earth's magnetic field (likely its horizontal component). This is an important and timely study: apart from the Bogong moth, no other nocturnal Lepidoptera has yet been shown to rely on such a dual-compass system. The research therefore expands our understanding of magnetic orientation in insects with both theoretical (evolution and sensory biology) and applied (agricultural pest management, a new model of magnetoreception) significance.

      The study uses state-of-the-art methods and presents convincing behavioural evidence for a multimodal compass. It also establishes the Fall Armyworm as a tractable new insect model for exploring the sensory mechanisms of magnetoreception, given the experimental challenges of working with migratory birds. Overall, the experiments are well designed, the analyses are appropriate, and the conclusions are generally well supported by the data.

      Strengths

      • Novelty and significance: First strong demonstration of a magnetic-visual compass in a globally relevant migratory moth species, extending previous findings from the Bogong moth and opening new research avenues in comparative magnetoreception.

      • Methodological robustness: Use of validated and sophisticated behavioural paradigms and magnetic manipulations consistent with best practices in the field. The use of 5-minute bins to study the dynamic nature of the magnetic compass, which is anchored to a visual cue but updated with a latency of several minutes, is an important finding and a new methodological aspect in insect orientation studies.

      • Clarity of experimental logic: The cue-conflict and visual cue manipulations are conceptually sound and capable of addressing clear mechanistic questions.

      • Ecological and applied relevance: Results have implications for understanding migration in an invasive agricultural pest with expanding global range.

      • Potential model system: Provides a new, experimentally accessible species for dissecting the sensory and neural bases of magnetic orientation.

      Weaknesses

      Overall, this is a strong study, and the authors have completed an excellent major revision that has undoubtedly addressed most major and minor issues. The remaining points below are minor recommendations, and I acknowledge that differences in opinion are always possible:

      (1) Structure and Presentation of Results

      • I recommend reordering the visual-cue experiments to progress from simpler conditions (no cues) to more complex ones (cue-conflict). This would improve narrative logic and accessibility for non-specialist readers. The authors have chosen not to implement this suggestion, which I respect, but my recommendation stands.

      (2) Ecological Interpretation

      • The authors should expand their discussion on how the highly simplified, static cue setup translates to natural migratory conditions, where landmarks are dynamic, transient, or absent. Specifically, further consideration is needed on how the compass might function when landmarks shift position, become obscured, or are replaced by celestial cues. Additionally, the discussion would benefit from a more consolidated section with concrete suggestions for future experiments involving transient, multiple, or more naturalistic visual cues.

      This point was addressed partially in one paragraph of the Discussion, which reads as follows:

      "In nature, they are likely to encounter a range of luminance-gradient visual cues, including relatively stable celestial cues as well as transient or shifting local features encountered en route. Although such natural cues differ from our simplified laboratory stimulus, they may represent intermittently sampled visual inputs that can be optimally integrated with magnetic information, with the congruency between visual and magnetic cues likely playing a key role in maintaining a stable compass response. Whether the cues are static or changing, brief periods without them may still allow the subsequent recovery of a stable long-distance orientation strategy. Determining which types of natural visual cues support the magnetic-visual compass, and how they interact with magnetic information, including how their momentary alignment or angular relationship is integrated and how such visual cue-magnetic field interactions may require time to influence orientation, together with elucidating the genetic and ecological bases of multimodal orientation, will be important objectives for future research."

      While this paragraph is informative, the wording remains lengthy, somewhat unclear, and vague. Shorter, clearer statements would improve readability and impact. For example:

      • How could moths maintain direction during periods when only the magnetic field is present and visual landmarks are absent?

      • Could celestial cues (e.g., stars) compensate, and what happens if these are also obscured?

      • What role does saliency play when multiple visual landmarks are present simultaneously?

      • How might a complex skyline without salient landmarks affect orientation?

      Including simple, concise sentences that pose concrete open questions and suggest experimental designs would strengthen the discussion without creating space issues. In my view, a comprehensive discussion of how the simplified, static cue setup relates to natural migratory conditions, where landmarks are dynamic, transient, or absent, would add significant value to the paper.

      (3) Methodological Details and Reproducibility

      • The lack of luminance level measurements should be explicitly highlighted.

      • The authors chose not to adjust figure legends by replacing "magnetic South" with "magnetic North." While I believe this would be more conventional and preferable, this is ultimately a minor stylistic issue.

      (4) Conceptual Framing and Discussion

      • Although the authors made a good attempt to explain the limitations of using an artificial visual cue, I believe there is room for a more explicit argument. For example, it could be stated clearly that this species is unlikely to encounter a situation in nature where a single, highly salient landmark coincides with its migratory direction. Therefore, how these findings translate to real migratory contexts remains an open question. A sentence or two making this point directly would strengthen the discussion.

      (5) Technical and Open-Science Points

      • Sharing the R code openly (e.g., via GitHub) should be seriously considered. The code does not need to be perfectly formatted, but making it available would be highly beneficial from an open-science perspective.

    3. Reviewer #2 (Public review):

      Summary:

      The work titled "Geomagnetic and visual cues guide seasonal migratory orientation in the nocturnal fall armyworm, the world's most invasive insect" provides experimental evidence on how geomagnetic and visual cues are integrated, and shows that visual cues are indispensable for magnetic orientation in the nocturnal fall armyworm.

      Strengths:

      It has been demonstrated that the Australian Bogong moth can integrate global stellar cues with the geomagnetic field for long-distance navigation. However, data are lacking for other insects. This study suggests that the integration of geomagnetic and visual cues may represent a conserved navigational mechanism broadly employed across migratory insects.

      Weaknesses:

      The visual cues used in the indoor experimental system designed by the authors may have some limitations in ecological relevance. The authors may need to explain this experimental system further.

      In the revised manuscript, the authors have added explanations in the discussion section. I am fine with the revision.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary

      The manuscript by Ma et al. provides robust and novel evidence that the noctuid moth Spodoptera frugiperda (Fall Armyworm) possesses a complex compass mechanism for seasonal migration that integrates visual horizon cues with Earth's magnetic field (likely its horizontal component). This is an important and timely study: apart from the Bogong moth, no other nocturnal Lepidoptera has yet been shown to rely on such a dual-compass system. The research therefore expands our understanding of magnetic orientation in insects with both theoretical (evolution and sensory biology) and applied (agricultural pest management, a new model of magnetoreception) significance.

      The study uses state-of-the-art methods and presents convincing behavioural evidence for a multimodal compass. It also establishes the Fall Armyworm as a tractable new insect model for exploring the sensory mechanisms of magnetoreception, given the experimental challenges of working with migratory birds. Overall, the experiments are well-designed, the analyses are appropriate, and the conclusions are generally well supported by the data.

      Strengths

      (1) Novelty and significance: First strong demonstration of a magnetic-visual compass in a globally relevant migratory moth species, extending previous findings from the Bogong moth and opening new research avenues in comparative magnetoreception.

      (2) Methodological robustness: Use of validated and sophisticated behavioural paradigms and magnetic manipulations consistent with best practices in the field. The use of 5-minute bins to study the dynamic nature of the magnetic compass, which is anchored to a visual cue but updated with a latency of several minutes, is an important finding and a new methodological aspect in insect orientation studies.

      (3) Clarity of experimental logic: The cue-conflict and visual cue manipulations are conceptually sound and capable of addressing clear mechanistic questions.

      (4) Ecological and applied relevance: Results have implications for understanding migration in an invasive agricultural pest with an expanding global range.

      (5) Potential model system: Provides a new, experimentally accessible species for dissecting the sensory and neural bases of magnetic orientation.

      Weaknesses

      While the study is strong overall, several recommendations should be addressed to improve clarity, contextualisation, and reproducibility:

      We thank Reviewer #1 for the positive and encouraging evaluation of our study. We appreciate the recognition of our work’s strengths and are grateful for the constructive feedback on the remaining weaknesses, which will guide and strengthen our revisions.

      Structure and presentation of results

      Requires reordering the visual-cue experiments to move from simpler (no cues) to more complex (cue-conflict) conditions, improving narrative logic and accessibility for non-specialists.

      Thank you for this thoughtful suggestion. While we appreciate the rationale for presenting results from simpler to more complex conditions, we kept the original sequence because it aligns with the logic of our study. Our initial aim was to determine whether fall armyworms use a magnetic compass integrated with visual cues, as shown in the Bogong moth. After establishing this phenotype, we then examined whether visual cues are required for maintaining magnetic orientation. We have also clarified in the Introduction that magnetic orientation in the Bogong moth relies on integration with visual cues, which provides readers with clearer context and improves the overall narrative flow.

      Ecological interpretation

      (a) The authors should discuss how their highly simplified, static cue setup translates to natural migratory conditions where landmarks are dynamic, transient or absent.

      Thank you for raising this important point. We agree that natural migratory environments provide visual information that is often dynamic, transient, or intermittently absent, in contrast to the simplified and static cue used in our indoor experiments. Our intention in using a minimal, static cue was to isolate and test the fundamental presence of magnetic–visual integration in fall armyworms under fully controlled conditions. To address the reviewer’s concern, we have added a brief note in the Discussion indicating that fall armyworms may encounter both static and dynamic luminance-based visual cues in nature, such as light–dark gradients created by terrain features or more stable celestial patterns. Although these natural cues differ from our simplified laboratory stimulus, they may similarly provide asymmetric visual structure that can be integrated with magnetic information. We also note that determining which natural visual cues support the magnetic–visual compass will be an important direction for future work.

      (b) Further consideration is required regarding how the compass might function when landmarks shift position, are obscured, or are replaced by celestial cues. Also, more consolidated (one section) and concrete suggestions for future experiments are needed, with transient, multiple, or more naturalistic visual cues to address this.

      Thank you for this constructive suggestion. We appreciate the reviewer’s point that additional consideration of how the compass might function under shifting, obscured, or celestial visual cues would strengthen the manuscript. Given the limited evidence currently available for this species, we have incorporated a concise and appropriately cautious discussion addressing these possibilities.

      Methodological details and reproducibility

      (a) It would be better to move critical information (e.g., electromagnetic noise measurements) from the supplementary material into the main Methods.

      Thank you for this helpful suggestion. In the revised manuscript, we have added the key electromagnetic noise measurement information to the main Methods section.

      (b) Specifying luminance levels and spectral composition at the moth's eye is required for all visual treatments.

      Thank you for this helpful comment. We have clarified in the Methods as well as the legend of Fig. S3 that both luminance levels and spectral composition were measured at the position corresponding to the moth’s head.

      (c) Details are needed on the sex ratio/reproductive status of tested moths, and a map of the experimental site and migratory routes (spring vs. fall) should be included.

      Thanks. We have added the reproductive status of the tested moths in the Methods, specifying that all individuals used were unmated 2-day-old adults.

      (d) Expanding on activity-level analyses is required, replacing "fatigue" with "reduced flight activity," and clarifying if such analyses were performed.

      Thank you for this comment. In this context, the term “fatigue” referred to the possibility that moths might gradually lose motivation or attention to orient when flying for an extended period in a simplified, artificial environment with limited sensory cues. Such a decrease in orientation motivation over time could, in theory, lead to a loss of individual orientation and consequently to the observed loss of group orientation. To test this possibility, we analyzed the orientation performance of each individual moth across different phases using the Rayleigh test. The r-value was used as a measure of individual directedness (higher r-values indicate stronger orientation). Our results showed that mean r-values did not differ significantly among the experimental phases (multiple comparisons, Table S2). This indicates that the 25-min measurement itself was not responsible for the loss of orientation. We did not perform a quantitative activity-level analysis in this study. However, as mentioned in the Methods, flight activity was continuously monitored during the experiments by observing fluctuations in the pointer values on the experimental software, which corresponded to the moth’s rotational movements. If the pointer values remained unchanged for more than 10 seconds, the experimenter checked for wing vibrations by sound; if the moth had stopped flying, gentle tapping on the arena wall was used to stimulate renewed flight. Only individuals that maintained active flight throughout the experiment, with fewer than four instances of wingbeat cessation, were included in the analysis. We also state in the revised manuscript that an activity-level analysis was not performed due to technical difficulties.
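
      As an illustration of the r-value described above, the following is a minimal sketch and not the authors' analysis code. It assumes each moth's flight track is available as a list of headings in degrees; the function names and the sample data are hypothetical. The p-value uses a standard closed-form approximation for the Rayleigh test, which may differ from the routine used in the study.

```python
import math

def mean_vector_length(headings_deg):
    """Return (mean direction in degrees, mean vector length r) for a list of headings."""
    n = len(headings_deg)
    # Average the unit vectors that correspond to each heading.
    c = sum(math.cos(math.radians(h)) for h in headings_deg) / n
    s = sum(math.sin(math.radians(h)) for h in headings_deg) / n
    r = math.hypot(c, s)  # 0 = no directedness, 1 = all headings identical
    mean_dir = math.degrees(math.atan2(s, c)) % 360.0
    return mean_dir, r

def rayleigh_p(r, n):
    """Approximate Rayleigh-test p-value from r and the sample size n."""
    big_r = n * r  # resultant length
    return math.exp(math.sqrt(1 + 4 * n + 4 * (n * n - big_r * big_r)) - (1 + 2 * n))

# Hypothetical headings for one moth within one 5-min phase.
phase_headings = [170.0, 185.0, 178.0, 192.0, 165.0, 181.0]
direction, r = mean_vector_length(phase_headings)
p = rayleigh_p(r, len(phase_headings))
```

      Computing r separately for each phase and each individual gives the per-phase directedness values that can then be compared across phases.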

      Figures and data presentation

      (a) The font sizes on circular plots should be increased; compass labels (magnetic North), sample sizes, and p-values should be included.

      Thank you for this helpful suggestion. Regarding the compass labels and statistical reporting, our analysis provides significance levels as ranges rather than exact p-values; therefore, we clarified in the figure legends that the two dashed circles correspond to thresholds for statistical significance p = 0.05 and p = 0.01, respectively. Sample sizes are already indicated within each panel. To avoid visual clutter caused by displaying both magnetic North and South, we show only the magnetic South direction (mS) consistently across panels, which can improve readability.

      (b) More clarity is required on what "no visual cue" conditions entail, and schematics or photos should be provided.

      Thank you for this comment. In our study, the “no visual cue” condition refers to the absence of the black triangular landmark inside the flight simulator. To improve clarity, we have updated the legend of Fig. 4 to explicitly state this and have referred readers to the schematic in Fig. 1, which illustrates the structure of the flight simulator. These additions clarify what the “no visual cue” condition entails without requiring additional schematics.

      (c) The figure legends should be adjusted for readability and consistency (e.g., replace "magnetic South" with magnetic North, and for box plots better to use asterisks for significance, report confidence intervals).

      Thank you. Regarding the choice of compass labeling, we intentionally used magnetic South (mS) rather than magnetic North (mN) because the main population tested in our experiments represents the autumn migratory generation. During autumn, fall armyworms orient southward when visual and magnetic cues are aligned. Using magnetic South in the plots therefore provides a clearer representation of cue alignment in this season and avoids potential confusion when interpreting the combined visual–magnetic information.

      Conceptual framing and discussion

      (a) Generalisations across species should be toned down, given the small number of systems tested by overlapping author groups.

      Thank you for this valuable comment. In the revised manuscript, we have softened such statements in both the Abstract and the main text.

      (b) It requires highlighting that, unlike some vertebrates, moths require both magnetic and visual cues for orientation.

      Thank you for this helpful suggestion. We have added a sentence to the Discussion explicitly highlighting that, unlike some vertebrates capable of using magnetic information in the absence of visual cues, moths require the integration of both magnetic and visual cues for accurate orientation. This clarification emphasizes the distinct multimodal nature of compass use in migratory moths.

      (c) It should be emphasised that this study addresses direction finding rather than full navigation.

      Thank you for this important clarification. We have now made it explicit in the manuscript that our experiments address direction finding (i.e., orientation) rather than full navigation. This distinction is stated in both the Introduction and Discussion to clearly define the scope of the study.

      (d) Future Directions should be integrated and consolidated into one coherent subsection proposing realistic next steps (e.g., more complex visual environments, temporal adaptation to cue-field relationships).

      Thank you for this constructive suggestion. We agree that outlining realistic next steps is valuable. However, given the limited scope of the current data, we have only slightly expanded the existing forward-looking statements in the Discussion.

      (e) The limitations should be better discussed, due to the artificiality of the visual cue earlier in the Discussion.

      Thank you for this comment. We agree that the artificiality of the visual cue is an important limitation of the present study. Rather than extending speculative discussion, we have clarified this limitation in the revised Discussion and highlighted the key questions that future work must address.

      Technical and open-science points

      Appropriate circular statistics should be used instead of t-tests for angular data shown in the supplementary material.

      Thank you for this comment. We have addressed this point (Fig. S1) in the revised supplementary material.

      Details should be provided on light intensities, power supplies, and improvements to the apparatus.

      Thank you. Light intensities are reported as spectral irradiance measurements in Supplementary Materials, which provide full wavelength-resolved information for the illumination used, although a separate measurement of total illuminance (lux) was not performed. We have also added the requested information on the power supplies.

      The derivation of individual r-values should be clarified.

      Thanks. We have clarified this in the revised manuscript.

      Share R code openly (e.g., GitHub).

      Thanks. We are in the process of organizing the relevant R code, but have not been able to upload it to GitHub before the current revision deadline. The code is available from the corresponding author upon request.

      Some highly relevant recent citations are missing and should be added, and some less relevant ones removed.

      Thanks. We added one recent relevant reference to the revised manuscript.

      Reviewer #2 (Public review):

      Summary:

      This work provides experimental evidence on how geomagnetic and visual cues are integrated, and shows that visual cues are indispensable for magnetic orientation in the nocturnal fall armyworm.

      Strengths:

      Although it has been demonstrated previously that the Australian Bogong moth could integrate global stellar cues with the geomagnetic field for long-distance navigation, the study presented in this manuscript is still fundamentally important to the field of magnetoreception and sensory biology. It clearly shows that the integration of geomagnetic and visual cues may represent a conserved navigational mechanism broadly employed across migratory insects. I find the research very important, and the results are presented very well.

      We thank Reviewer #2 for the positive and encouraging evaluation of our study. We appreciate the recognition of our work’s strengths.

      Weaknesses:

      The authors developed an indoor experimental system to study the influence of magnetic fields and visual cues on insect orientation, which is certainly a valuable approach for this field. However, the ecological relevance of the visual cue may be limited or unclear based on the current version. The visual cues were provided "by a black isosceles triangle (10 cm high, 10 cm base) made from black wallpaper and fixed to the horizon at the bottom of the arena". It is difficult to conceive how such a stimulus (intended to represent a landmark like a mountain) could provide directional information for LONG-DISTANCE navigation in nocturnal fall armyworms, particularly given that these insects would have no prior memory of this specific landmark. It might be a good idea to provide a more detailed explanation of this point.

      We appreciate the constructive feedback on the weaknesses, which will guide and strengthen our revisions. To address the reviewer’s concern, we have added a brief note in the Discussion indicating that fall armyworms may encounter both static and dynamic luminance-based visual cues in nature, such as light–dark gradients created by terrain features or more stable celestial patterns. Although such natural cues differ from our simplified laboratory stimulus, they may represent intermittently sampled visual inputs that can be optimally integrated with magnetic information, whether the cues are static or changing, and brief periods without them may still allow the subsequent recovery of a stable long-distance orientation strategy.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Major to Medium Suggestions

      (a) Reordering of Visual Cue Tests

      The manuscript currently presents cue-conflict experiments before the simpler "no visual cue" tests. For non-specialist readers, it would be more logical to start with the basic condition (no visual cues) and then move to progressively more complex ones. This provides a clearer and more logically sound narrative.

      For example, the results could first demonstrate that without visual cues, the moths fail to orient (both in darkness and uniform light), and then show that introducing a single salient cue (a triangle on the horizon) restores directed behaviour. This would help readers understand the logic of the progression and should be better integrated throughout the Results and Discussion.

      Thanks. We have responded to this comment in the Public Reviews.

      (b) Translating Key Findings to Realistic Scenarios (LL 333-344 or where suitable in Discussion, and mentioning that we utilised a reductionist principle first in Intro, but clearly articulated that it is very simplified)

      The main text (eg Discussion) should address how these findings translate to real-world conditions. The experimental design used a single, highly salient, and static cue, always aligned with the migratory direction. In nature, such a consistent landmark is unlikely-mountains or other features would shift position relative to the moth's trajectory as it flies.

      Key questions arise which need to be addressed:

      - How would the compass system adapt to changing landmark positions as the moth moves?

      - What happens when no landmarks are visible (e.g. over flat plains or cloudy nights)?

      - Would stellar or other cues take over in such cases? Your hypotheses, please.

      Addressing these points - and proposing specific future experiments (e.g. with transient or multiple visual cues)-would strengthen the ecological relevance of the findings and show a clear way forward.

      Thanks for your kind comments. We now explicitly state in the Introduction that our study employs a reductionist approach using a simplified visual environment to isolate magnetic-visual interactions. As the ecological questions raised by the reviewer cannot be addressed with the current dataset, we avoid extended speculation but have added brief clarification in the Discussion and addressed these points in the Public Reviews response. We also indicate that future work will need to examine the types of visual cues that can support magnetic orientation and how such cues couple with geomagnetic information.

      Technical and Methodological Points

      (a) Incomplete Methods Section

      Critical technical information (e.g. electromagnetic noise measurements) currently appears only in supplementary figure legends. All such details should be included in the main Methods section if the word count allows (or include a short section in the main text with reference to more details in the supplementary material).

      Thanks for your kind comments. We have addressed this as suggested in the Public Reviews.

      (b) Lighting Conditions

      Specify luminance levels (the amount of light emitted and passing through in quanta per unit of surface, eg m2) at the moth's eye and indicate whether spectral composition was consistent between treatments (with and without the visual cue).

      Thanks for your comments. We have responded to this point in the Public Reviews.

      (c) Figures

      - Increase font sizes on circular histograms.

      - Add compass labels (ideally magnetic North, mN, not south, etc, as it is usual in pertinent literature), sample sizes, and p-values on each panel.

      - Replace "magnetic South" (mS) indicators with magnetic North (mN) to align with convention.

      Thanks for your comments. We have responded to this point in the Public Reviews.

      (d) Migratory Expectations

      Include expected compass bearings for spring and autumn migrations (with citations) to relevant figures (Figure 2, 4, S2).

      Thanks for your comments. We have added the following information to the Introduction: “We recently found that fall armyworms from the year-round range in Southwest China (Yunnan) exhibit seasonally appropriate migratory headings when flown outdoors in virtual flight simulators, heading northward in the spring and southward in the fall, and this seasonal reversal is controlled by photoperiod (Chen et al., 2023).” We therefore did not add expected seasonal compass bearings to the Results section.

      (e) Add a map showing the experimental site and known migratory routes, clearly labelling spring vs fall routes. It would help justify expected headings.

      Thank you for this suggestion. At present, there are no experimentally validated migratory routes (e.g., through mark-release-recapture or tracking approaches) for the specific fall armyworm population used in our study. Because these routes have not been biologically confirmed, we didn’t offer a presumed migratory map that may imply unwarranted certainty.

      (f) Composition of Test Groups

      Indicate sex ratios and reproductive status (mated/unmated) of tested moths, if known or comment if unknown, as both can affect migratory motivation and behaviour.

      Thank you for this suggestion. We have responded to this point in the Public Reviews.

      (g) Role and Nature of Visual Cues

      While the results clearly show that orientation disappears without visual cues, the triangle cue is highly artificial. Well-studied Bogong moths are known to rely on views of Australian mountain ranges during their nocturnal migrations, but there is no evidence that armyworms use a similar strategy. Even for bogongs, it is not just one salient mountain always in front of them on migration. Discuss whether Fall Armyworm would encounter comparable natural cues in the field along their migratory route, or whether the triangle might simply provide a frame of reference rather than a true landmark.

      Thank you for this comment. We have responded to this point in the Public Reviews.

      (h) Future work could test:

      - More naturalistic sky cues (moonlight, star fields).

      - Varying the landmark's position relative to the magnetic field - slowly moving along - transient landmarks. Also, less salient landmarks and a more complex skyline, as it is usually more complex than just a single salient peak.

      Thank you for these comments. We have responded to this point in the Public Reviews. A brief discussion, as suggested, has been added to the revised manuscript.

      Minor Comments and Line-by-Line Suggestions

      L70 - Check citation (possibly Mouritsen 2018). Missing in the list of references.

      Thanks. This point has been addressed.

      L75 - Consider citing the new and highly relevant preprint:

      Pakhomov, A., Shapoval, A., Shapoval, N., & Kishkinev, D. (2025). Not All Butterflies Are Monarchs: Compass Systems in the Red Admiral (Vanessa atalanta). bioRxiv.

      Thanks. We have cited this reference.

      LL81-82 - Clarify vague phrasing; specify criteria for "good" vs "poor" orientation ability. Or reword/leave out.

      Thanks for your comments.

      L85 - "but one," not "bar one." 

      Thanks. Corrected.

      L124 - The 2 genetic citations are weakly linked to magnetoreception. We do not have a clear understanding of the insect magnetoreceptor and its underlying mechanism, so we simply cannot interpret genetic associations well enough to tie them to magnetoreception. For example, does the noctuid magnetic sense require a magnetite-based receptor and genes involved in biomineralization? Consider removing or softening claims.

      Thanks. Addressed.

      LL123-126 - Define what for YOU constitutes "strong evidence" for magnetoreception (e.g. adaptive directional behaviour consistent with migratory orientation?). Is there such a thing as strong evidence at all?

      Thanks for your comments. We agree that terms such as “confirmed” or “strong evidence” can overstate the certainty of magnetoreception findings, given the ongoing debates in the field. In the revised manuscript, we have toned down these statements.

      L153 - Indicate whether coils in NMF condition were powered or inactive.

      Thanks for your comments. Addressed.

      L163 - Justify use of multiple 5-min phases (e.g. temporal resolution of behaviour). It is confusing at the start, where first mentioned, and becomes clearer only towards the end, but it should be clearer at the start.

      Thanks for your comments. The assay was divided into these 5-min segments to provide the temporal resolution needed to detect changes in flight orientation as the relative alignment of magnetic and visual cues was systematically altered. We now clarify this earlier in the Results.

      LL167-171 - This is a good place where you can provide a map (main or supplementary with referencing) showing the study site and migration routes.

      Thanks for your suggestion. We have responded to this point in the Public Reviews.

      L174 - Avoid repetition of "expected."

      Thanks. Addressed.

      LL176-177 - Report 95% confidence intervals or equivalent and clarify which test (e.g. Moore's paired test) each p-value refers to.

      Thanks for your suggestion.

      LL189-191 - explain what fatigue means. I would remove fatigue and substitute it with "lowered flight activity". Also, the same statement comes later, so avoid repetitiveness and remove it in one place. The analysis of directedness is good throughout, but what about the analysis of activity level? Could you explain whether you did it or not, and if not, why, or if angular changes can serve as an activity proxy? Replace "fatigue" with "reduced flight activity." Avoid repetition. Clarify if activity level analysis was performed or if it was not, e.g. due to technical difficulties.

      Thanks for your comments. We have responded to this point in the Public Reviews.

      L196 - Note whether 95% CI overlaps with the expected direction. This is a crucial outcome.

      Thanks for your comments.

      LL203-205 - unclear, better to stick to "congruency", especially "initial congruency for the relationship between mN and visual cue" throughout.

      Thanks for your suggestions.

      L206 - Better to introduce a new subheading: "Laboratory-Reared Animals.".

      Thanks for your suggestion. A new subheading has been added in the revised manuscript.

      LL207-208 - Clarify which cues were available in Chen et al. (2023) and how they differ here.

      Thanks for your comments. In Chen et al. (2023), the moths oriented under an artificial starry sky together with optic flow cues. In contrast, our experiments intentionally removed both the starry-sky pattern and optic flow to avoid introducing additional visual information when testing magnetic-visual integration for orientation. We have added further clarification regarding the conditions used in Chen et al. (2023) in the revised manuscript.

      L228 - Use "lab-reared" consistently throughout the entire MS. Do not mix with lab-raised.

      Thanks. Addressed by consistently using “lab-raised”.

      Figure 2 - Confusing in parts, especially for people coming from birds and other vertebrates orientation background. At 12 o'clock, you usually expect either mN / gN (magnetic or geographic North) or the animal's own initial directional response used as control to compare the same animal's direction post-treatment. Here, your 6 o'clock is magnetic South in the first place - non-conventional. At 12 o'clock, better use mN or gN. Avoid using non-conventional references such as magnetic south. Remind readers of seasonally appropriate headings and refer to the map.

      Thanks. We have responded to this point in the Public Reviews.

      LL232-234 - Emphasize that cue-magnetic congruency is key. Highlight the most important point that the congruency between the seasonal migratory direction and visual cues is key, not that in spring/fall, visual cues must be towards or opposite to the migratory goal. But the visual cue could be in the migratory direction or opposite, or at an angle - this is for future direction.

      Thanks. We have responded to this point in the Public Reviews.

      Figure 2 and associated main text - highlight that you only tested the designs when in all seasons the salient and single visual cue was in the migratory direction (in spring it coincided with mN but in fall it was towards the magnetic south). Other directions of visual cues have not been tested, but for simplicity and consistency, you chose to do these ones as the first step, perhaps.

      Thank you for this insightful comment. Yes, our experiments tested only the conditions in which the salient and single visual cue was aligned with the migratory direction. Other angular relationships between visual cues and the magnetic field were not examined in this study. For simplicity and consistency, we focused on this alignment as a first step toward understanding magnetic-visual cue integration in migratory orientation. We now highlight this in the Fig. 2 legend.

      Figure captions/legends - hard to tell apart from the main text now; better to italicize figure caption text and visually space it from the main text.

      Thanks for your suggestions.

      LL 250-251 - mention to people more familiar with r - lowercase - what is the expected range for R uppercase. It is not bound 0-1 as r. Could it be negative? How large can it be?

      Thanks for the comment. After revisiting Moore (1980), we think that R* cannot take negative values. However, since R* = R/N^(3/2), it is not bounded between 0 and 1. We did not find any mention of an upper bound in the paper (https://doi.org/10.2307/2335330).
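
      To make the scale of this statistic concrete, the following is a minimal sketch, not the authors' code. It assumes the usual rank-weighted form of Moore's statistic: individuals are ranked by vector length (1 = shortest), R is the length of the rank-weighted sum of unit vectors, and R* = R/N^(3/2). The function name, the tie handling (ties broken by input order), and the example data are hypothetical simplifications.

```python
import math

def moore_r_star(directions_deg, lengths):
    """Rank-weighted statistic R* = R / N**1.5 (assumed form of Moore's test)."""
    n = len(directions_deg)
    # Rank individuals by vector length; ties are broken by input order here,
    # which is a simplification made for this sketch.
    order = sorted(range(n), key=lambda i: lengths[i])
    x = y = 0.0
    for rank, i in enumerate(order, start=1):
        a = math.radians(directions_deg[i])
        x += rank * math.cos(a)
        y += rank * math.sin(a)
    big_r = math.hypot(x, y)  # a vector length, so never negative
    return big_r / n ** 1.5

# Hypothetical group of 16 moths that all head due south with different r-values.
n_moths = 16
r_star = moore_r_star([180.0] * n_moths, [0.1 + 0.05 * k for k in range(n_moths)])
```

      Under this assumed definition the statistic is a scaled vector length, so it cannot be negative. When all individuals share one heading it equals (N + 1)/(2·sqrt(N)), which exceeds 1 for N ≥ 2 and grows with N, consistent with the statement that it is not bounded between 0 and 1.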

      Figure 3 - Consider adding a horizontal line indicating the 5% significance threshold.

      Thanks for your suggestions.

      L 261 - need to have some narrative after the subheading before you insert Figure 3.

      Thanks. Addressed.

      LL274-275 - highlight that the timeline of this congruency between mN and a landmark and the effect of this on directedness is not explored here, but worth doing in future. How long does a new congruency or a relationship between mN and a visual cue need to be exposed to the animal to regain its directional response? Clearly, it is just a question of time of exposure so that a new association is established. Suggest future work on time-dependent adaptation to new cue-field relationships.

      Thanks for your suggestion. We have now included this point as a future direction in the revised Discussion.

      Figure 4 & S4 - Replace letters with asterisks/brackets for significance. The use of the letter is confusing and unconventional.

      Thanks for your suggestion.

      Figure 4 caption - Clarify the main takeaway.

      Thanks for your suggestion.

      Figure 4 - bare minimum is confusing. I understand that you wanted to avoid "no visual cues" because, as long as the animal sees things, there are things to be used as visual cues, even if this is not the intention of the experimenter. However, it needs clarification and rewording. Better to be more specific, like "no black triangle and horizon were used, just the uniformly white cylinder", or something like that.

      Thanks for your comments. In our setup it accurately describes the intentional removal of both the black triangle and the horizon, leaving only the uniformly white cylinder as the visual environment. This wording was chosen to reflect the practical limitations of producing a perfectly symmetrical flight simulator under laboratory conditions, and we therefore prefer to retain the original phrasing.

      L328 - Remove Xu et al. (2021) citation (not relevant). This is an in vitro study with a protein which may not work exactly as it is claimed in the paper in vivo.

      Thanks. Citation removed.

      L349-350 - Clarify what "no visual cue" means (e.g., uniformly white cylinder, no horizon line). Include a photo or a schematic of the inner surface of the cylinder for this condition in the Supplementary Materials.

      Thanks. We have responded to this point in the Public Reviews.

      L380 & throughout - Replace "barely minimum visual cues" (BMVC) with "no visual cues", clarifying limitations in Methods, meaning that you can explain that absolutely no visual cues is practically impossible because, as long as there is light, animals can use some asymmetries as cues even if this is not the intention of the experimenter.

      Thank you for this comment. We have decided to retain the term “barely minimum visual cues (BMVC)” because it accurately describes our experimental condition, which is distinct from a true “no visual cues” environment. In the revised Figure legend, we now clarify that BMVC refers to conditions in which obvious visual cues (i.e., features such as the black triangle in Fig. 1) were removed, while acknowledging that complete elimination of all visual information is not possible under illuminated conditions.

      L396 - Be cautious when generalizing from two species tested by a research group that is not absolutely independent (some authors in bogong and armyworm works overlap). We saw examples in diurnal migratory butterflies (Monarchs), a more studied species than the armyworm, that the findings do not entirely translate to Red Admirals (Pakhomov et al. 2025 preprint mentioned). Suggestion to tone down any claims of broad generalisation throughout the manuscript.

      Thank you for this comment. We have responded to this point in the Public Reviews.

      LL402-407 - Note that, unlike birds (e.g. European robins), moths appear to require both magnetic and visual cues for orientation, whereas birds, mole rats and some other animals can use magnetic cues alone.

      Thank you for this comment. We have responded to this point in the Public Reviews.

      L410 - Specify that this is correct only in the Northern Hemisphere.

      Thank you for this comment. Addressed.

      LL415-416 - Acknowledge artificiality of single-cue setup (see the major comments above); integrate earlier in the Discussion.

      Thank you for this comment. We have responded to this point in the Public Reviews.

      LL420-425 - Consolidate Future Directions into a single subsection; include more concrete experimental ideas, for example, using more naturalistic, numerous transient landmarks (this could be done in a virtual maze with LEDs on the wall of the cylinder and cues moving over time), using multiple visual cues, and manipulating the salience of cues (less simplistic, less salient).

      Thank you for this comment. We have responded to this point in the Public Reviews.

      L431 - Does this paper support this statement? I think it only tested the use of stellar cues in a zero magnetic field. It also dealt with direction finding, not navigation, which is a position-finding ability - a much more complex feat that might be beyond the ability of moths (establishing it would require further studies, for example with geographic and magnetic displacements). Reword and check this. Show the distinction between direction finding and navigation.

      Thank you for this comment. We have reworded the relevant sentence to use “orientation” instead of “navigation”.

      L436-437 - Specify "global visual cues" (stellar, lunar, etc.) and merge all future directions into one coherent section.

      Thank you for this comment. Addressed.

      LL443-446 - It is a bit early to plan such studies because migratory direction could well be a complex multigenic trait, so you cannot approach it simply with the knockout of a single gene. The genetic basis of magnetic direction needs to be demonstrated first, which leads you to the Future Directions section.

      Thank you for this helpful comment. We fully agree that migratory direction is likely a complex multigenic trait, and our intention was not to imply that knocking out a single gene would be sufficient to explain magnetic or migratory orientation. Our statement aimed only to highlight that identifying candidate genes is an important first step toward understanding the genetic basis of magnetic orientation.

      Line 496 - Clarify whether optic flow was used (unlike previous studies).

      Thank you for pointing this out. Clarified.

      LL499-511 - Clarify the improvements done in Chen's system and their relevance.

      Thank you for pointing this out. We have reworded this sentence to: “The Flash flight simulator system was developed based on the early design of the Mouritsen-Frost flight simulator and adapted for our experiments in Yuanjiang”.

      Line 531 - Report and compare light intensities between indoor and outdoor experiments.

      Thanks for this comment. Unfortunately, due to the sensitivity limits of our current equipment, we were unable to reliably measure outdoor light intensities at night. However, we did not perform any open-top outdoor flight-simulator experiments; instead, we used field-captured moths but conducted all behavioral tests indoors.

      L549 - Add make/model of power supplies.

      Thanks. Addressed.

      LL582-585 - Specify whether R code will be shared; recommend open access (e.g., GitHub, other open repositories). Reiterate the importance of open science and sharing all scripts. Also here, add citations to some studies where MMRT has been used recently.

      Thank you for this comment. We have responded to this point in the Public Reviews.

      Line 592 - Explain how individual r-values were derived from optical encoder data.

      Thank you for this comment. Addressed.

      L842-843 - t-tests are inappropriate for angular data; use circular tests (Watson-Williams, Mardia-Watson-Wheeler, etc.).

      Thank you for this comment. Addressed.

      L865 - Reword to avoid repetition of "fall." Example: "In field captured armyworms during fall migration".

      Thank you for this comment. Addressed.

      LL882-885 - Improve phrasing and language here. Confirming that - no colon after. "Both the acrylic plate and diffusion paper." Confirm relevance of spectra to moth visual sensitivity - add relevant citation to original studies showing that.

      Thank you for this comment. Addressed.

      L886 - Reword "uniform" - does not look uniform to me.

      Thank you for this comment. Addressed.

      Reviewer #2 (Recommendations for the authors):

      The first two sentences of the abstract ("The navigational mechanisms employed by nocturnal insect migrants remain to be elucidated in most species. Nocturnal insect migrants are often considered to use the Earth's geomagnetic field for navigation, yet the underlying mechanisms of magnetoreception in insects remain elusive") are somewhat redundant. The authors may consider rewriting them.

      Thank you for pointing this out. We have rewritten this opening to provide a more concise and non-repetitive introduction.

    1. eLife Assessment

      Laaker et al. make an important finding that the cribriform plate acts as a unique neuroimmune interface that shapes local myeloid cell states during EAE-induced neuroinflammation. Using immunohistochemistry, flow cytometry, and single-cell RNAseq of doublets to interrogate cell-cell interactions, the authors provide solid evidence that macrophages, migratory dendritic cells (DCs), and fibroblasts interact at the site of CSF outflow, with DCs showing characteristics of immune tolerance. While the functional consequences of these cell states remain to be established, the work shows that the cribriform plate can play a key role in influencing immune cell composition and interactions with stromal cells.

    2. Reviewer #1 (Public review):

      Summary:

      Laaker et al. investigate the immunological role of the cribriform plate during neuroinflammation using the EAE model. The authors combine immunohistochemistry, flow cytometry, and single-cell RNA sequencing to characterize CD11b+CD11c+ myeloid cells that accumulate at podoplanin (PDPN)-rich meningeal-lymphatic niches surrounding olfactory nerve bundles. They identified distinct populations of migratory dendritic cells (DCs) and macrophages retained at the cribriform plate that exhibit transcriptional signatures consistent with immune tolerance, reduced interferon signaling, and programmed cell death, including Pdcd1 (PD-1) expression. In parallel, CCR2+ monocytes and alternatively activated (M2-like) Arg1+/CHI3L3+ macrophages integrate into this niche, suggesting the establishment of a locally immunosuppressive myeloid network.

      Strengths:

      (1) Overall, the study postulates a novel model in which the cribriform plate functions as a specialized perineural immune interface that reshapes myeloid phenotypes during neuroinflammation.

      (2) Suggests broader relevance for shaping peripheral immunity and therapeutic targeting. If DCs are being "tuned" at this exit site, it could influence what reaches cervical lymph nodes and how peripheral responses are set during CNS autoimmunity; the authors explicitly position this as relevant to CNS autoimmunity and possibly other CNS diseases (while acknowledging the need for human validation).

      (3) Technically sound and highly original work. Convergent multi-method support: the central narrative is backed by immunohistochemistry + flow cytometry + scRNA-seq, rather than a single assay. The headline conclusion (tolerogenic/suppressive skew at the cribriform plate during EAE) is explicitly built from these combined modalities.

      Weaknesses:

      (1) In Figure 1, the manuscript would be strengthened by quantification of CSF1R-GFP+ and CD11c-eYFP+ cells in PDPN+LYVE1- versus PDPN+LYVE1+ regions in both control and EAE mice. This would demonstrate selective accumulation or retention of myeloid cells at the cribriform plate niche.

      (2) While the PostContact-seq strategy is innovative (Figure 3), additional justification is needed to demonstrate that tissue dissociation did not artificially disrupt PDPN-myeloid contacts. The relatively small proportion of live PDPN-rich doublets (~2.5% total aggregates and ~18% PDPN+ within total aggregates) raises questions about representativeness compared with in situ observations. The authors should also more explicitly elaborate on why PostContact-seq was favored over alternative approaches such as PIC-seq.

      (3) The authors stated that results regarding cell-cell interactions were integrated across four intercellular communication methodologies (Figure 4B), but this integration is not clearly described in either the Results or Methods sections. This needs clarification. Moreover, the interaction analysis in Figure 4B seems to rely on TALKIEN, which does not incorporate prior ligand-receptor knowledge. Given the availability of widely used tools, such as CellChat and NicheNet, the authors may consider cross-referencing their findings.

      (4) Given the increase in CCR2+ cells in PDPN+ regions (Figure S4), a pseudotime trajectory analysis may be valuable to test whether CCR2+ monocytes preferentially differentiate into CHI3L3+ immunosuppressive macrophages, PD-1+ DCs, or other myeloid subsets in post-contact versus no contact.

      (5) Validation of immunosuppressive signatures in macrophages (Fig. 4G-H) using the same FACS-based post-contact versus no-contact sorting strategy (as in Figure 3A) would strengthen the conclusions.

      (6) The identity of CD45IV+ cells in contact with PDPN+ cells is unclear (Figure 6B-C). The authors should provide a gating strategy demonstrating that these cells are CD11b+CD11c+ DCs within the PDPN+ doublet population, and ideally show whether these dying cells are PD-1+. Furthermore, co-labeling in tissue sections for PD-1, cleaved caspase-3, and CD11c-eYFP would provide important spatial validation of flow cytometry findings (Figure 6E).

      (7) In Figures 1F-H, the authors should comment on the morphological differences of CD11c+ cells in the olfactory bulb versus those infiltrating the cribriform plate.

    3. Reviewer #2 (Public review):

      Summary:

      In this article, Laaker et al. describe diverse populations of macrophages and dendritic cells found in and around the cribriform plate in the context of neuroinflammation caused by an autoimmune disease (EAE). The authors utilize elegant histochemical staining and a nifty approach to sort doublets to interrogate cells that are in contact with one another, presumably in vivo. Notably, they uncover a population of CD11c+CD11b+ cells interacting with M2 macrophages and PDPN+ fibroblasts and lymphatics. These cells are heterogeneous, but some of these DCs express PD-1, and transcriptional profiling suggests they may have immunosuppressive behavior. Altogether, this article explains well the complexity of cell populations found around the cribriform plate during inflammation, and is suggestive of different interactions that trigger these different phenotypes in immune cells.

      Strengths:

      Beautiful images of a unique CNS-peripheral interface that support a novel scRNA approach to understanding how different cell populations engage in functional interactions in vivo.

      Weaknesses:

      It's currently unclear whether the sorted populations reflect in vivo interactions or a propensity to form aggregates during ex vivo processing. The authors address podoplanin-expressing cells both as stromal cells and as lymphatic endothelial cells, but at times it's unclear which of these two populations is being analyzed and which is the most relevant. Although the observations are novel, most of these findings are descriptive and lack functional correlates, and in places, the potential implications could use further discussion.

    4. Author response:

      We would like to thank the reviewers for their supportive comments, which largely agree with our main finding that a heterogeneous population of dendritic cells and Th2-skewed macrophages interact with the PDPN+ niche at the cribriform plate during EAE neuroinflammation. Additionally, they have provided several meaningful critiques of our study, which we are now working to address in a newly revised manuscript.

    1. eLife Assessment

      In this valuable study, DNA and RNA are co-imaged in single cells to show that the proximity of topologically associating domain (TAD) boundaries is uncoupled from the transcriptional activity of nearby genes. The evidence supporting these conclusions is convincing for the regions examined, with high-throughput imaging providing robust statistics. This work will be of interest to researchers studying genome architecture and its relationship to gene regulation.

    2. Reviewer #1 (Public review):

      Summary:

      This is an important study that employs high-throughput single-cell imaging to directly investigate the relationship between topologically associating domain (TAD) boundaries and gene regulation. The authors rigorously test the prevailing model that TAD boundaries functionally regulate gene activity by modulating chromatin interactions. Their core finding is that, under their specific experimental conditions, the physical distance between TAD boundaries shows no consistent correlation with the transcriptional bursting activity of a gene within the TAD. However, the authors' leap from this specific observation to the broad conclusion that "TAD boundary architecture and gene activity are uncoupled" risks conceptual overgeneralization and may lead to misinterpretation, as it seemingly contradicts substantial prior evidence supporting the regulatory role of TAD structures.

      Strengths:

      The major strength of this work lies in its innovative high-throughput, multi-colour imaging platform, which enables the simultaneous detection of spatial distances between specific DNA elements (TAD boundaries) and transcriptional activity at the same genomic locus in single cells and single alleles. The high-throughput nature makes the results convincing. A second key strength is the incorporation of perturbations, including global transcriptional inhibition, cell-type comparison, and degradation of key architectural proteins (CTCF, cohesin). This provides a comprehensive methodological framework to examine the relationship between boundary proximity and gene activity from multiple angles under defined conditions.

      Weaknesses:

      (1) Conceptual framing and interpretation:

      The central conclusion may require more precise framing to avoid potential overreach. The authors' interpretation equating "physical distance between TAD boundaries" with overall "TAD boundary architecture," and "transcriptional bursting events" with broader "gene activity," could benefit from clarification. This framing may not fully capture the temporal dynamics of transcription or the regulatory complexity within TADs. Furthermore, the broad conclusion of an uncoupled relationship appears to challenge extensive prior evidence from perturbation studies showing that disrupting TAD boundaries can alter gene expression. The authors' own observation of reduced gene activity upon RAD21 degradation suggests that global TAD disruption can affect transcription. A more precise and limited conclusion, acknowledging that their data demonstrate a lack of detectable correlation between boundary distance and bursting activity in their system, would be more accurate and help reconcile these findings with the existing literature.

      (2) Technical methods and data presentation:

      (2.1) Accuracy and dimensionality of distance measurements: The manuscript does not clearly state whether distances are measured in 2D or 3D, nor does it sufficiently address precision limits. The stated Z-step size (1 µm) may be inadequate for accurately measuring sub-micron chromatin distances in 3D.

      (2.2) Probe design and systematic error: The genomic coverage size of the BAC probes used for DNA FISH is not explicitly stated. Large probe coverage could inherently blur the precise spatial location of adjacent DNA loci. The reported average distance (~300 nm) may be influenced by the physical size of the probes, as well as systematic expansion or distortion introduced by sample fixation and FISH processing. Although such technical limitations are currently unavoidable, the authors should clarify how these factors might affect their ability to detect subtle distance changes.

      (2.3) Data Visualization: The manuscript would benefit from including representative, zoomed-in regions of interest from the raw imaging data. This would allow readers to visually assess measured distance differences against background noise.

      (2.4) Potential impact of resolution limits: In Figure 5, the micro-C data reveal a clear difference in interaction patterns inside versus outside the VARS2 locus TAD, yet the imaging data show no corresponding distance difference. This strongly suggests that the current imaging system, limited by optical resolution, probe size, and localisation accuracy, may be unable to resolve finer-scale spatial reorganizations associated with specific chromatin conformations (e.g., enhancer-promoter loops). The authors should explicitly discuss that their conclusion of "no coupling observed" may be constrained by the resolution and sensitivity of their method and does not preclude the possibility of detecting such associations with higher-precision measurements or in live-cell dynamics.

      In summary, this study provides a valuable single-cell perspective. However, the authors should more cautiously define the scope of their findings in the manuscript and provide a more balanced discussion situating their work within the broader field.

    3. Reviewer #2 (Public review):

      Summary:

      Almansour et al. investigate whether the proximity of TAD boundaries is directly linked to gene activity. The authors use high-throughput imaging to simultaneously measure the gene activity and physical distances between boundary regions in an allele-specific manner. Using transcriptional inhibitors, expression induction, and acute depletion of CTCF and cohesin, they test whether proximity of boundaries affects, or is affected by, gene activity.

      Strengths:

      The combined use of DNA and RNA imaging enabled simultaneous measurement of boundary proximity and transcriptional status at individual alleles. This allows single-allele correlation between boundary proximity and gene activity at multiple loci across thousands of alleles.

      The use of both transcription inhibitors and transcription stimulation provides compelling and consistent evidence that boundary proximity can be disconnected from a gene's activity. The data convincingly support the conclusion that stable proximity between boundary regions is not required for ongoing transcription at the loci and timescales examined.

      This work strengthens the emerging view that genome organization at the level of domain boundaries does not impose a deterministic control over transcription.

      Weaknesses:

      In untreated cells, the distribution of distance measurements between boundary probes is exceptionally narrow. While depletion of RAD21 clearly demonstrates an ability to detect changes in this distribution, this tight baseline distribution may limit sensitivity to more subtle changes (like those one might expect from transcriptional influences). In addition, the correlation analysis is asymmetric, primarily stratifying by transcriptional status and then comparing boundary distances. Given the central claim that boundary architecture does not influence gene activity, the analysis should be done from the opposite perspective (stratifying by boundary distance).

      Strong disruption of boundary distances is only observed upon depletion of cohesin. Notably, this corresponds with the largest changes in gene activity. In contrast, depletion of CTCF actually had minimal impact on boundary distances and also had minimal impact on gene activity. This makes sense in light of previous work, where live cell imaging demonstrated that cohesin is more important for domain-structure, whereas CTCF is only important for blocking cohesin from continuing on, such that the fully formed loop occurs in a very small percentage of cells. Therefore, the fact that disruption of cohesin (more important for internal domain structure) affects gene activity while disruption of CTCF does not is exceptionally interesting but is lacking from the discussion.

      On a related note, this approach primarily tests the role of boundary interactions rather than domain organization as a whole, and it should be acknowledged that internal domain structures are not directly assessed.

      The comparison to work in other organisms (particularly the comparisons made to Drosophila) should be handled with care. The mechanisms underlying domain formation differ substantially across these systems, particularly regarding the differences in CTCF's role.

    1. eLife Assessment

      This study presents a high-quality cryo-EM structure of the human kinase PINK1 in complex with the HSP90-CDC37 chaperone complex, capturing a partially folded intermediate in which the C-lobe and C-terminal extension are structured while the N-lobe remains unfolded and engaged by the HSP90 clamp. The structural data are broadly consistent with a recently published structure of the same complex, providing useful insight into early steps of PINK1 maturation and highlighting residues linked to familial Parkinson's disease. However, the mechanistic conclusions remain incomplete because the manuscript does not experimentally validate key hypotheses raised by the structure, including the functional roles of the C-lobe interface, the HPNI motif, the C-terminal extension, or the proposed competition between HSP90 and TOM20.

    2. Reviewer #1 (Public review):

      Summary:

      The ubiquitin kinase PINK1 accumulates on damaged mitochondria to signal the initiation of mitophagy. While we know what PINK1 looks like when it is stabilised on damaged mitochondria, not much is known about how it gets there. In this study, Okatsu et al. solve a cryoEM structure of partially folded PINK1 in complex with its chaperones HSP90 and CDC37 to a resolution of 3.08 Å. This structure captures PINK1 in a state whereby the C-lobe of its kinase domain is folded, while the N-lobe remains unfolded and stabilised by an HSP90 dimer. According to the authors' model, their structure represents cytosolic PINK1 on its way to the mitochondria. This structure also demonstrates how PINK1 is folded in a step-wise mechanism and proposes a role for residues that are mutated in Parkinson's disease.

      Strengths:

      PINK1 is known to be a client of the HSP90 chaperone system. Here, Okatsu et al. present a solid structural dataset showing how PINK1 interacts with HSP90 and CDC37, and they describe key residues and motifs predicted to facilitate the interactions between PINK1 and the chaperones. Notably, two key residues within interacting regions on PINK1 are also mutated in Parkinson's disease. The structure by Okatsu et al. is in line with another recently published structure of the same complex (Tian et al. Nat Comms, 2025), which appears very similar, further supporting the findings. Together, these two studies represent the first observations of cytosolic PINK1 in a semi-folded state, which provides a novel insight into PINK1 at an earlier stage within the signalling cascade.

      Weaknesses:

      This paper is not the first to describe the structure of the PINK1-HSP90-CDC37 complex. A study by Tian et al. was published in early December 2025 in Nature Communications, reporting a 2.84 Å structure of PINK1-HSP90-CDC37, as well as a structure of PINK1 with HSP90 and another HSP90 co-chaperone, FKBP51. It would be important to acknowledge this comparable study and to discuss how the structure in this study compares with the Tian et al. structures and whether it reveals any additional information.

      Although they make claims about the functional relevance of PINK1-interacting residues, the study by Okatsu et al. does not include any biochemical or functional validation of the structure. To support their claims, the authors should test the PINK1-HSP90-CDC37 interaction using their recombinant proteins for mutants of the conserved hydrophobic PINK1 residues in the PINK1 C-lobe, H352, L353, H360, I382, D384, as well as the PINK1 HPNI motif, especially the PD mutation H271Q. The PINK1 PD mutation L347P, which interacts with the CDC37 HPNI motif, is also worth testing.

      A major question that arises from this work is whether the PINK1-HSP90-CDC37 complex is newly translated PINK1 on its way to mitochondria (as suggested by the authors) or PINK1 that has already entered mitochondria, been cleaved and then retrotranslocated. This latter scenario is the favoured model proposed by Tian et al. (Nat Comms) based on their biochemical experiments. The discrepancies between the two models should at least be discussed, and the authors should also attempt to demonstrate experimentally whether their model is correct. This question is important to address because it would allow this structural information to be placed in the greater context of PINK1 signalling.

      It is also unclear what the consequences are of disrupted PINK1-HSP90-CDC37 interactions on the PINK1 signalling process more broadly - does PINK1 accumulate in the cytosol? Is there less of it? Can it still be degraded via the N-end rule? What happens during mitophagy? Perhaps some of these questions can be answered with cell-based studies using a selection of the PINK1 mutants mentioned above that disrupt the PINK1-HSP90-CDC37 complex formation.

    3. Reviewer #2 (Public review):

      Summary:

      Okatsu et al. report the cryoEM structure of the PINK1-HSP90-CDC37 complex at 3.08 Å. To do so, they mutated the PARL cleavage site (F104M) and removed the N-terminal 103 a.a. The construct was co-expressed with HSP90beta and CDC37 in insect cells, as performed previously for other kinase-HSP90-CDC37 complexes (e.g. Raf1). Molybdate was added to prevent cycling between open and closed HSP90 conformations. The initial characterization by single particle cryoEM reveals two HSP90 conformations: closed with CDC37 dissociated, and open with the CTD of HSP90 separated. Thus, the authors crosslinked the complex, which yielded a more homogeneous closed structure with clearly visible density for HSP90, CDC37, and PINK1. The structure shows an immature or partially folded kinase domain conformation for PINK1, with the C-lobe bound to HSP90 and the N-lobe unfolded. The C-lobe binds to HSP90 via the HPNI motif in CDC37, which mimics the HPNI motif found in the N-lobe of kinases, and which is conserved across kinases. The main novelty here is the interaction between HSP90 and the C-terminal extension (CTE) of PINK1, which must adopt a conformation different from that in the folded state, as the folded conformation would otherwise clash with HSP90. The interaction with the CTE is notably mediated by the flexible charged linker (FCL) of HSP90, which is partially disordered. In this conformation, HSP90 would clash with TOM20 binding.

      Strengths:

      Overall, this is well-executed structural biology work, which brings insight into the elements required to fold PINK1. The protein engineering used in this study is of great value and will help others in the field explore the function of PINK1 folding. Understanding the mode of activation of PINK1 is important, and this work brings forward hypotheses that are worthy of testing.

      Weaknesses:

      In the absence of functional assays, the study does not bring much novelty or biological insight. Furthermore, there are already several structures of HSP90-CDC37 bound to partially folded kinases, and a simple superposition of these structures on the model of HsPINK1 allows similar conclusions to be drawn, i.e. that it would bind a folded C-lobe and unfolded N-lobe. Furthermore, a very similar structure of PINK1 bound to HSP90-CDC37 (and FKBP5) was published in Nature Communications in December 2025 by another group. The main novelty from this work (and the paper published in December) is that the CTE adopts a different conformation compared to the mature form, but the implications of this are not explored. Furthermore, the authors propose that HSP90 would compete with TOM20, but what dictates the outcome of this competition? More importantly, how do these results help understand how PINK1 becomes active? Again, this is not explored.

    1. eLife Assessment

      This valuable study demonstrates molecular changes associated with age-related impairment in oligodendrocyte differentiation and the ability to myelinate. The identification of particular genes that are associated with this decline will provide potential future targets for therapeutic interventions. The reviewers felt that the quality of the evidence was solid while identifying some minor weaknesses that, if addressed, would enhance the rigor of the study.

    2. Reviewer #1 (Public review):

      Summary:

      This manuscript by Ghosh and colleagues investigates the transcriptional changes within the oligodendrocyte lineage that contribute to age-related declines in oligodendrocyte differentiation and myelination. Combining bulk RNA-Seq on acutely purified oligodendrocyte lineage cells with bioinformatic approaches, the authors identify groups of genes that show different patterns of dynamic regulation during differentiation (which they term "switch" genes, or "switches"). A subset of these switch genes is differentially regulated with age. The authors identify two transcription factors, Bcl11a and Foxm1, that are downregulated during differentiation, have predicted binding site enrichment at other switch genes, and are downregulated in aged OPCs. Functionally testing Bcl11a, the authors show that Bcl11a knockdown inhibits the differentiation of young OPCs in culture, whereas overexpression promotes the differentiation of aged OPCs. Viral expression of Bcl11a in Sox10-expressing cells accelerates the formation of Plp1+ oligodendrocytes in aged rodents following lysolecithin-induced demyelination.

      Strengths:

      The work is clearly presented and addresses an important biological problem. The bioinformatic approaches used in the manuscript are powerful, and the identification of Bcl11a as a modulator of oligodendrocyte differentiation is a novel finding. The combined in vitro and in vivo approaches to assess the function of Bcl11a in oligodendrocyte differentiation are a substantial strength of the work.

      Weaknesses:

      Although the PCA plots show distinct and reproducible global gene expression differences between the different isolated cell populations, the authors do not present a figure showing expression levels of typical stage-specific markers (e.g., Pdgfra, Pcdh15, C1ql1 for OPCs, Bcas1, Enpp6, Gpr17 for preOLs, Mobp, Mog, etc. for OLs) or confirm the absence of markers of other lineages (astrocytes, neurons, microglia, etc.). This makes it difficult to evaluate the success of their cell isolation strategy at different ages without reanalyzing the raw data. In addition, other publicly available datasets (e.g., the Barres lab bulk RNA-Seq datasets from PMID 25186741 or the Castelo-Branco lab single cell datasets from PMID 27284195) do not show downregulation of Bcl11a during OL differentiation as is described here - this apparent discrepancy is not discussed.

    3. Reviewer #2 (Public review):

      Aging poses a significant challenge to the regenerative capacity of oligodendrocyte precursor cells (OPCs) to differentiate and myelinate neuronal axons. Myelin abnormalities accumulate with age, and it is likely that the ability of OPCs to differentiate into myelinating oligodendrocytes becomes progressively impaired during aging, leading to inefficient turnover of damaged myelin and oligodendrocytes, as well as reduced adaptive myelination. Understanding the molecular mechanisms underlying the compromised capacity of aged OPCs is therefore critical for addressing age-related white matter decline.

      This study aims to decipher the intrinsic molecular changes that occur in aged OPCs. By profiling differentially expressed transcription factors (TFs) between young and aged OPCs, and by employing a novel bioinformatic tool to identify key TFs that undergo dynamic changes across distinct stages of OPC differentiation, the authors identify Bcl11a as a potential regulator. Bcl11a is highly expressed in young OPCs but markedly reduced in aged cells. Functional experiments further demonstrate that while Bcl11a does not affect OPC proliferation, it significantly promotes the differentiation of aged OPCs. Importantly, this effect is also observed in vivo following demyelinating injury in aged mice.

      While the study provides compelling evidence that BCL11A represents a limiting factor for OPC differentiation during ageing, the downstream targets and molecular mechanisms through which BCL11A exerts its effects are not directly addressed. As such, the work should be interpreted primarily as identifying a key regulatory node rather than a fully defined molecular pathway.

      Overall, this study offers valuable insights into the age-related loss of regenerative capacity in the central nervous system and introduces a computational framework that may be broadly useful for investigating dynamic gene regulation in other biological contexts.

      Major Points:

      (1) MACS mouse anti-A2B5 microbeads are not OPC-specific and may also label astrocyte precursor cells or immature astrocytes. How do the authors justify this caveat? Could some of the claimed "OPC-specific" switch genes in fact be enriched in astrocyte lineage cells?

      (2) Overall, Figures 1 and 2 are not very informative in terms of biological insight. The authors should provide more detail in the main figures regarding the enriched gene sets associated with each of the Type 1-4 switch categories. For example, summarizing the top Gene Ontology terms for each switch type would greatly enhance interpretability.

      (3) A similar issue applies to Figure 3. The authors should explicitly specify the transcription factors in the main figure, particularly the 27 TFs identified through the ENCODE/ReMap2 analysis.

      (4) Have the authors validated Bcl11a expression across different CNS cell types and between young and aged conditions using independent methods such as qPCR, immunofluorescence, or western blotting?

      (5) Regarding OPC aging, an open question is whether the reduced differentiation capacity of aged OPCs is an intrinsic property of the cells themselves or whether it results from prolonged exposure to an aging environment that induces non-cell-autonomous epigenetic or genetic changes, thereby rendering OPCs less efficient at differentiating. It would be helpful if the authors could expand on this point in the Discussion, with reference to relevant previous studies and experimental evidence.

      (6) Do the authors observe a change in the number or density of OPCs between young and aged mice?

      (7) The in vivo characterization of Bcl11a overexpression using the AAV-based approach appears incomplete. Do aged mice overexpressing Bcl11a in Sox10⁺ cells exhibit reduced age-related myelin degeneration under baseline conditions? In the LPC model, do the authors observe differences in lesion size and/or remyelination efficiency?

      (8) Are the authors presenting gSWITCH for the first time in this manuscript? Given that the gSWITCH framework is novel and central to the study, its conceptual contribution could be emphasized more strongly. A brief comparison with existing trajectory- or pattern-based methods, ideally in the main text around Figure 1, would help readers better appreciate its novelty.

      (9) The evolutionary analysis also appears somewhat disconnected from the rest of the study. Could the authors leverage available public datasets to test whether a similar Bcl11a expression trajectory is observed in human oligodendrocyte lineage cells?

    1. eLife Assessment

      This useful study develops an individual-based model to investigate the evolution of division of labor in vertebrates, comparing the contributions of group augmentation and kin selection. The findings are solid in showing that, within the specific structure of the model and the parameter space explored, group augmentation can robustly favor the evolution of differentiated helper roles, particularly when age-dependent task switching and dominance dynamics are allowed to evolve. However, the evidence only partially supports the authors' broader claim that group augmentation is the primary driver of vertebrate division of labor. Several modelling assumptions, including the limited scope for synergistic task benefits, the restriction of helper effects to group-size-mediated benefits, and the relatively narrow exploration of cost and benefit parameters, constrain the potential for kin selection to generate division of labor and limit the generality of the conclusions.

    2. Reviewer #2 (Public review):

      Summary:

      This paper formulates an individual-based model to understand the evolution of division of labor in vertebrates. The model considers a population subdivided in groups, each group has a single asexually-reproducing breeder, other group members (subordinates) can perform two types of tasks called "work" or "defense", individuals have different ages, individuals can disperse between groups, each individual has a dominance rank that increases with age, and upon death of the breeder a new breeder is chosen among group members depending on their dominance. "Workers" pay a reproduction cost by having their dominance decreased, and "defenders" pay a survival cost. Every group member receives a survival benefit with increasing group size. There are 6 genetic traits, each controlled by a single locus, that control propensities to help and disperse, and how task choice and dispersal relate to dominance. To study the effect of group augmentation without kin selection, the authors cross-foster individuals to eliminate relatedness. The paper allows for the evolution of the 6 genetic traits under some different parameter values to study the conditions under which division of labour evolves, defined as the occurrence of different subordinates performing "work" and "defense" tasks. The authors envision the model as one of vertebrate division of labor.

      The main conclusion of the paper is that group augmentation is the primary factor causing the evolution of vertebrate division of labor, rather than kin selection. This conclusion is drawn because, for the parameter values considered, when the benefit of group augmentation is set to zero, no division of labor evolves and all subordinates perform "work" tasks but no "defense" tasks.

      Strengths:

      The model incorporates various biologically realistic details, including the possibility to evolve age polyethism where individuals switch from "work" to "defence" tasks as they age or vice versa, as well as the possibility of comparing the action of group augmentation alone with that of kin selection alone.

      Weaknesses from the previous round of review:

      The model and its analysis are limited, which in my view makes the results insufficient to reach the main conclusion that group augmentation and not kin selection is the primary cause of the evolution of vertebrate division of labour. There are several reasons.

      First, although the main claim is that group augmentation drives the evolution of division of labour in vertebrates, the model is rather conceptual in that it doesn't use quantitative empirical data that applies to all/most vertebrates and vertebrates only. So, I think the approach has a conceptual reach rather than being able to achieve such a conclusion about a real taxon.

      Second, I think that the model strongly restricts the possibility that kin selection is relevant. The two tasks considered essentially differ only by whether they are costly for reproduction or survival. "Work" tasks are those costly for reproduction and "defense" tasks are those costly for survival. The two tasks provide the same benefits for reproduction (eqs. 4, 5) and survival (through group augmentation, eq. 3.1). So, whether one, the other, or both helper types evolve presumably only depends on which task is less costly, not really on which benefits it provides. As the two tasks give the same benefits, there is no possibility that the two tasks act synergistically, where performing one task increases a benefit (e.g., increasing someone's survival) that is going to be compounded by someone else performing the other task (e.g., increasing that someone's reproduction). So, there is very little scope for kin selection to cause the evolution of labour in this model. Note synergy between tasks is not something unusual in division of labour models, but is in fact a basic element in them, so excluding it from the start in the model and then making general claims about division of labour is unwarranted. In their reply, the authors point out that they only consider fertility benefits as this, according to them, is what happens in cooperative breeders with alloparental care; however, alloparental care entails that workers can increase others' survival *without group augmentation*, such as via workers feeding young or defenders reducing predator-caused mortality, as I mentioned in my previous review, but these potentially kin-selected benefits are not allowed here.

      Third, the parameter space is understandably little explored. This is necessarily an issue when trying to make general claims from an individual-based model where only a very narrow parameter region of a necessarily particular model can be feasibly explored. As in this model the two tasks ultimately only differ by their costs, the parameter values specifying their costs should be varied to determine their effects. In the main results, the model sets a very low survival cost for work (yh=0.1) and a very high survival cost for defense (xh=3), the latter of which can be compensated by the benefit of group augmentation (xn=3). Some limited variation of xh and xn is explored, always for very high values, effectively making defense unevolvable except if there is group augmentation. In this revision, additional runs have been included varying yh and keeping xh and xn constant (Fig. S6), so without addressing my comment as xn remains very high. Consequently, the main conclusion that "division of labor" needs group augmentation seems essentially enforced by the limited parameter exploration, in addition to the second reason above.

      Fourth, my view is that what is called "division of labor" here is an overinterpretation. When the two helper types evolve, what exists in the model is some individuals that do reproduction-costly tasks (so-called "work") and survival-costly tasks (so-called "defense"). However, there are really no two tasks that are being completed, in the sense that completing both tasks (e.g., work and defense) is not necessary to achieve a goal (e.g., reproduction). In this model there is only one task (reproduction, equation 4,5) to which both helper types contribute equally and so one task doesn't need to be completed if completing the other task compensates for it; instead, it seems more fitting to say that there are two types of helpers, one that pays a fertility cost and another one a survival cost, for doing the same task. So, this model does not actually consider division of labor but the evolution of different helper types where both helper types are just as good at doing the single task but perhaps do it differently and so pay different types of costs. In this revision, the authors introduced a modified model where "work" and "defense" must be performed to a similar extent. Although I appreciate their effort, this model modification is rather unnatural and forces the evolution of different helper types if any help is to evolve.

      I should end by saying that these comments don't aim to discourage the authors, who have worked hard to put together a worthwhile model and have patiently attended to my reviews. My hope is that these comments can be helpful to build upon what has been done to address the question posed.

      [Editors' note: the authors have provided responses to each of these points.]

    3. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #2 (Public review):

      Summary:

      This paper formulates an individual-based model to understand the evolution of division of labor in vertebrates. The model considers a population subdivided in groups, each group has a single asexually-reproducing breeder, other group members (subordinates) can perform two types of tasks called "work" or "defense", individuals have different ages, individuals can disperse between groups, each individual has a dominance rank that increases with age, and upon death of the breeder a new breeder is chosen among group members depending on their dominance. "Workers" pay a reproduction cost by having their dominance decreased, and "defenders" pay a survival cost. Every group member receives a survival benefit with increasing group size. There are 6 genetic traits, each controlled by a single locus, that control propensities to help and disperse, and how task choice and dispersal relate to dominance. To study the effect of group augmentation without kin selection, the authors cross-foster individuals to eliminate relatedness. The paper allows for the evolution of the 6 genetic traits under some different parameter values to study the conditions under which division of labor evolves, defined as the occurrence of different subordinates performing "work" and "defense" tasks. The authors envision the model as one of vertebrate division of labor.

      The main conclusion of the paper is that group augmentation is the primary factor causing the evolution of vertebrate division of labor, rather than kin selection. This conclusion is drawn because, for the parameter values considered, when the benefit of group augmentation is set to zero, no division of labor evolves and all subordinates perform "work" tasks but no "defense" tasks.

      Strengths:

      The model incorporates various biologically realistic details, including the possibility to evolve age polyethism where individuals switch from "work" to "defense" tasks as they age or vice versa, as well as the possibility of comparing the action of group augmentation alone with that of kin selection alone.

      Weaknesses:

      The model and its analysis are limited, which in my view makes the results insufficient to reach the main conclusion that group augmentation and not kin selection is the primary cause of the evolution of vertebrate division of labor. There are several reasons.

      (1) First, although the main claim is that group augmentation drives the evolution of division of labor in vertebrates, the model is rather conceptual in that it doesn't use quantitative empirical data that applies to all/most vertebrates and vertebrates only. So, I think the approach has a conceptual reach rather than being able to achieve such a conclusion about a real taxon.

      We appreciate the reviewer’s point that our model does not incorporate quantitative empirical data across vertebrate taxa. This is indeed a limitation and reflects the current lack of fine-scale datasets on task division, the influence of life-history traits, and the fitness consequences of different cooperative activities in vertebrates. One of our aims, however, is precisely to stimulate such empirical work by highlighting the value of examining division of labor in species inhabiting harsh environments, considering age/size/dominance structure when evaluating variation in cooperative activities, and incorporating defense behaviors more consistently into analyses of helping, especially since defenders are often overlooked relative to the classic helpers-at-the-nest that provision offspring. The model therefore remains directly relevant to vertebrate systems because it departs from insect-inspired approaches that focus on fitness outcomes based solely on maximizing colony productivity. Instead, it incorporates direct fitness benefits to group members, an essential feature of vertebrate cooperative breeding and of other systems with fertile “workers,” as we clarified in the discussion.

      (2) Second, I think that the model strongly restricts the possibility that kin selection is relevant. The two tasks considered essentially differ only by whether they are costly for reproduction or survival. "Work" tasks are those costly for reproduction and "defense" tasks are those costly for survival. The two tasks provide the same benefits for reproduction (eqs. 4, 5) and survival (through group augmentation, eq. 3.1). So, whether one, the other, or both helper types evolve presumably only depends on which task is less costly, not really on which benefits it provides. As the two tasks give the same benefits, there is no possibility that the two tasks act synergistically, where performing one task increases a benefit (e.g., increasing someone's survival) that is going to be compounded by someone else performing the other task (e.g., increasing that someone's reproduction). So, there is very little scope for kin selection to cause the evolution of labor in this model. Note synergy between tasks is not something unusual in division of labor models, but is in fact a basic element in them, so excluding it from the start in the model and then making general claims about division of labor is unwarranted. In their reply, the authors point out that they only consider fertility benefits as this, according to them, is what happens in cooperative breeders with alloparental care; however, alloparental care entails that workers can increase others' survival *without group augmentation*, such as via workers feeding young or defenders reducing predator-caused mortality, as I mentioned in my previous review, but these potentially kin-selected benefits are not allowed here.

      We understand the reviewer’s concern that our model restricts the scope for kin-selected benefits by not including task-specific synergy effects—specifically, help that directly increases the survival of group members (e.g., load-lightening via feeding young, or predator defense that reduces mortality of breeders or offspring independently of group augmentation). We agree that such effects can occur in some cooperative breeders, and that they can, in principle, generate indirect fitness benefits. However, even when helpers increase the survival of breeders or reduce parental investment per offspring, these effects generally translate into higher breeder productivity—either via increased fecundity, increased survival to the next breeding attempt, or increased investment in subsequent broods. Thus, although we treat benefits in terms of enhanced breeder productivity, this formulation implicitly captures a range of help-related effects that ultimately improve the reproductive output of the breeders, including those mediated through increased survival. For this reason, we believe that the model remains relevant for vertebrate systems despite not representing each pathway separately.

      (3) Third, the parameter space is understandably little explored. This is necessarily an issue when trying to make general claims from an individual-based model where only a very narrow parameter region of a necessarily particular model can be feasibly explored. As in this model the two tasks ultimately only differ by their costs, the parameter values specifying their costs should be varied to determine their effects. In the main results, the model sets a very low survival cost for work (yh=0.1) and a very high survival cost for defense (xh=3), the latter of which can be compensated by the benefit of group augmentation (xn=3). Some limited variation of xh and xn is explored, always for very high values, effectively making defense unevolvable except if there is group augmentation. In this revision, additional runs have been included varying yh and keeping xh and xn constant (Fig. S6), so without addressing my comment as xn remains very high. Consequently, the main conclusion that "division of labor" needs group augmentation seems essentially enforced by the limited parameter exploration, in addition to the second reason above.

      As we have explained in previous revisions, the costs associated with work and defense are not directly comparable because they affect different fitness components: work costs reduce dominance, whereas defense costs reduce survival. Whether a particular cost is “high” or “low” can only be evaluated by examining the evolved reaction norms and identifying the ranges over which these norms change. For this reason, we focused on parameter ranges that actually generate shifts in reaction norms rather than presenting large regions of parameter space where nothing changes.

      We also reiterate that we did in fact explore broader parameter ranges than those shown in the main text. Additional analyses, including those specifically designed to identify conditions under which division of labor evolves under kin selection alone, are provided in the Supplementary Material. Specifically, Figure S1 addresses the point raised by the “need” of group augmentation benefits for defense to evolve, by increasing the baseline survival x0.

      We now include one additional figure in the Supplementary Material with a lower value for the benefit of group size (xn = 1 instead of xn = 3), and we extended the range of xh to include lower values (xh = 1). As we can see in Figure S7 and Table S8, group augmentation benefits are still the primary reason for individuals to group (see dispersal values). For low benefits of group augmentation, defense evolves in harsh environments in the absence of kin selection, and in benign environments when both direct and indirect fitness benefits take place. We have also now expanded the results section to include these last results. Note that we also checked even lower values for xh under the only kin selection implementation, with results being qualitatively similar, but chose not to include them in the manuscript since it is already a very long Supplementary Material. Here are the averages for two examples with xh = 0.1 and when we promote division of labor:

      Author response table 1.

      In short, the conclusion that division of labor requires group augmentation is not an artifact of limited parameter exploration. It arises because kin selection alone favors division of labor only under highly restrictive parameter combinations, whereas including direct fitness benefits substantially expands the conditions under which division of labor evolves. This pattern is consistent across the full set of parameter combinations we examined.

      (4) Fourth, my view is that what is called "division of labor" here is an overinterpretation. When the two helper types evolve, what exists in the model is some individuals that do reproduction-costly tasks (so-called "work") and survival-costly tasks (so-called "defense"). However, there are really no two tasks that are being completed, in the sense that completing both tasks (e.g., work and defense) is not necessary to achieve a goal (e.g., reproduction). In this model there is only one task (reproduction, equation 4,5) to which both helper types contribute equally and so one task doesn't need to be completed if completing the other task compensates for it; instead, it seems more fitting to say that there are two types of helpers, one that pays a fertility cost and another one a survival cost, for doing the same task. So, this model does not actually consider division of labor but the evolution of different helper types where both helper types are just as good at doing the single task but perhaps do it differently and so pay different types of costs. In this revision, the authors introduced a modified model where "work" and "defense" must be performed to a similar extent. Although I appreciate their effort, this model modification is rather unnatural and forces the evolution of different helper types if any help is to evolve.

      In previous models of division of labor in eusocial insects, the implicit benefit is also colony-level productivity (see Beshers & Fewell, 2001, for a review of division of labor in insects). Even in humans, division of labor functions as a means to increase efficiency toward achieving a shared goal. Our model adopts this same interpretation, as outlined in the Introduction, but extends it by considering that different tasks may impose different fitness costs, an aspect that has been largely overlooked in the existing literature. It is precisely because fitness outcomes are not fully shared among group members in vertebrates that distinguishing these cost structures matters. Unlike eusocial insects with sterile workers, vertebrate helpers can obtain direct fitness benefits, and the model explicitly accounts for these direct benefits—something absent from most insect-inspired approaches even when direct fitness benefits can also arise in some of those systems. Thus, our framework is not simply evolving “two types of helpers doing the same task,” but instead evolving specialization in different cooperative roles that carry different fitness consequences. It is therefore suitable for our model to treat contributions to breeder productivity as a common currency, while allowing individuals to specialize in different cost-distinct forms of help.

      Finally, regarding synergy: with the extension introduced in the previous revision, we now incorporate the requirement that multiple forms of help must be performed for the group to achieve maximal reproductive output. This directly addressed the reviewer’s concern about synergistic dependencies between tasks and aligns our framework with the kinds of complementarity highlighted in other models of division of labor.

      In summary, the structure of the model is consistent with both the theoretical literature on division of labor and the biological realities of vertebrate cooperative systems. We believe it is important for future models to explicitly consider the different fitness benefits and costs associated with distinct cooperative behaviors, and hope that our framework encourages more targeted empirical research on division of labor in vertebrates (e.g. inclusion of data on defense, life-history traits and environmental challenges) to better inform future modelling efforts.

      I should end by saying that these comments don't aim to discourage the authors, who have worked hard to put together a worthwhile model and have patiently attended to my reviews. My hope is that these comments can be helpful to build upon what has been done to address the question posed.

      We appreciate the reviewer’s thoughtful and constructive comments, as well as the time invested in evaluating our work. These insights have greatly helped us improve the clarity and overall quality of the manuscript. We hope that the revisions and additional clarifications we have provided adequately address all remaining concerns.

    1. eLife Assessment

      This manuscript provides valuable novel insights into the role of interpersonal guilt in social decision-making by showing that responsibility for a partner's bad lottery outcomes influences happiness. Through the integration of neuroimaging and computational modelling methods, and by combining findings from two studies, the authors provide solid support for their claims. The findings will be of interest to researchers in the field of social neuroscience and decision making.

    2. Reviewer #1 (Public review):

      Summary:

      The authors aimed to characterize neurocomputational signals underlying interpersonal guilt and responsibility. Across two studies, one behavioral and one fMRI, participants made risky economic decisions for themselves or for themselves and a partner; they also experienced a condition in which the partners made decisions for themselves and the participant. The authors also assessed momentary happiness intermittently between choices in the task. Briefly, the results demonstrated that participants' self-reported happiness decreased after disadvantageous outcomes for themselves and when both they and their partner were affected; and this effect was exacerbated when participants were responsible for their partner's low outcome, rather than the opposite, reflecting experienced guilt. Consistent with previous work, BOLD signals in the insula correlated with experienced guilt and insula-right IFG connectivity was enhanced when participants made risky choices for themselves and safe choices for themselves and a partner.

      Strengths:

      This study implements an interesting approach to investigating guilt and responsibility; the paradigm in particular is well-suited to approach this question, offering participants the chance to make risky vs. safe choices that affect both themselves and others. I appreciate the assessment of happiness as a metric for assessing guilt across the different task/outcome conditions, as well as the implementation of both computational models and fMRI.

      Weaknesses:

      In spite of the overall strengths of the study, I think there are a few areas in which the paper fell a bit short and could be improved.

      Comment on the revised submission:

      I appreciate the authors' attention to all of my comments and questions regarding the initial version of the paper. However, I still do not believe that the point about the small volume correction in the insula has been adequately addressed. The authors claim that, because the SVC was done using an anatomically defined ROI, it is valid and not double dipping. I understand where the authors are coming from. However, there are a few issues here. First, any use of ROIs is best done via pre-registration (Gentili et al., 2021, European Journal of Neuroscience). Second, the whole set of analyses in this section leading up to the SVC seems somewhat circular. The first step was a whole brain contrast of lottery vs. safe outcomes, which revealed activation in many areas including the insula. Then, it appears that the parameter estimates from the insula were extracted and submitted offline to linear mixed models probing for effects of outcome magnitude, social condition and time, which revealed that the insula activation demonstrated the 'sought after' effect. Next, the manuscript states that the authors attempted to confirm these results with a univariate analysis for the so-called guilt effect within regions showing a stronger response to outcomes of risky relative to safe outcomes, which again showed activation in the insula (not surprisingly), and then a small volume correction was applied to these insula voxels. While an anatomical ROI from a different study was used for the correction, the issue is that multiple analyses already revealed that the insula was involved in the effect of interest. It is unclear why this is even necessary given that the LMM analysis demonstrated the expected result.

    3. Reviewer #2 (Public review):

      Summary

      This manuscript focuses on the role of social responsibility and guilt in social decision making by integrating neuroimaging and computational modeling methods. Across two studies, participants completed a lottery task in which they made decisions for themselves or for a social partner. By measuring momentary happiness throughout the task, the authors show that being responsible for a partner's bad lottery outcome leads to decreased happiness compared to trials in which the participant was not responsible for their partner's bad outcome. At the neural level, this guilt effect was reflected in increased neural activity in the anterior insula, and altered functional connectivity between the insula and the inferior frontal gyrus. Using computational modeling, the authors show that trial by trial fluctuations in happiness were successfully captured by a model including participant and partner rewards and prediction errors (a 'responsibility' model), and model-based neuroimaging analyses suggested that prediction errors for the partner were tracked by the superior temporal sulcus. Taken together, these findings suggest that responsibility and interpersonal guilt influence social decision making.

      Strengths

      This manuscript investigates the concept of guilt in social decision making through both statistical and computational modeling. It integrates behavioral and neural data, providing a more comprehensive understanding of the psychological mechanisms. For the behavioral results, data from two different studies is included, and although minor differences are found between the two studies, the main findings remain consistent. The authors share all their code and materials, leading to transparency and reproducibility of their methods.

      The manuscript is well-grounded in prior work. The task design is inspired by a large body of previous work on social decision making, and includes the necessary conditions to support their claims (i.e., Solo, Social, and Partner conditions). The computational models used in this study are inspired by previous work, and build on well-established economic theories of decision making. The research question and hypotheses clearly extend previous findings, and the more traditional univariate results align with prior work.

      The authors conducted extensive analyses, as supported by the inclusion of different linear models and computational models described in the supplemental materials. Psychological concepts like risk preferences are defined and tested in different ways, and different types of analyses (e.g., univariate and multivariate neuroimaging analyses) are used to try to answer the research questions. The inclusion and comparison of different computational models provides compelling support for the claim that partner prediction errors indeed influence task behavior, as illustrated by the multiple model comparison metrics and the good model recovery.

      The authors did a good job acknowledging other factors that could differ between the conditions, including the role of other emotions (like empathy) or agency in the decision making process. These additional analyses and nuances strengthen the manuscript and the interpretability of the findings.

      Weaknesses

      As the authors already note, they did not directly ask participants to report their feelings of guilt. The authors clearly describe this limitation, and also note that in addition to guilt, other emotions like empathy could also be at play in interpersonal decisions. Despite this limitation, this study provides insights into the neural and behavioral mechanisms of responsibility and guilt in social decision making, and how they influence behavior.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary

      The authors aimed to characterize neurocomputational signals underlying interpersonal guilt and responsibility. Across two studies, one behavioral and one fMRI, participants made risky economic decisions for themselves or for themselves and a partner; they also experienced a condition in which the partners made decisions for themselves and the participant. The authors also assessed momentary happiness intermittently between choices in the task. Briefly, results demonstrated that participants' self-reported happiness decreased after disadvantageous outcomes for themselves and when both they and their partner were affected; this effect was exacerbated when participants were responsible for their partner's low outcome, rather than the opposite, reflecting experienced guilt. Consistent with previous work, BOLD signals in the insula correlated with experienced guilt, and insula-right IFG connectivity was enhanced when participants made risky choices for themselves and safe choices for themselves and a partner.

      Strengths:

      This study implements an interesting approach to investigating guilt and responsibility; the paradigm in particular is well-suited to approach this question, offering participants the chance to make risky vs. safe choices that affect both themselves and others. I appreciate the assessment of happiness as a metric for assessing guilt across the different task/outcome conditions, as well as the implementation of both computational models and fMRI.

      We thank Reviewer 1 for their positive assessment of our manuscript.

      Weaknesses:

      In spite of the overall strengths of the study, I think there are a few areas in which the paper fell a bit short and could be improved.

      We thank Reviewer 1 for their comments, which we have used to improve our manuscript. We hope that these changes address the issues raised by the Reviewer.

      (1) While the framing and goal of this study was to investigate guilt and felt responsibility, the task implemented - a risky choice task with social conditions - has been conducted in similar ways in past research that were not addressed here. The novelty of this study would appear to be the additional happiness assessments, but it would be helpful to consider the changes noted in risk-taking behavior in the context of additional studies that have investigated changes in risky economic choice in social contexts (e.g., Arioli et al., 2023 Cerebral Cortex; Fareri et al., 2022 Scientific Reports).

      We certainly agree that several previously published studies have relied on risky choice tasks with social conditions. In this revised version, we now mention these two studies in the substantially revised Introduction.

      (2) The authors note they assessed changes in risk preferences between social and solo conditions in two ways - by calculating a 'risk premium' and then by estimating rho from an expected utility model. I am curious why the authors took both approaches (this did not seem clearly justified, though I apologize if I missed it). Relatedly, in the expected utility approach, the authors report that since 'the number of these types of trials varied across participants', they 'only obtained reliable estimates for [gain and loss] trials in some participants' - in study 1, 22 participants had unreliable estimates and in study 2, 28 participants had unreliable estimates. Because of this, and because the task itself only had 20 gains, 20 losses, and 20 mixed gambles per condition, I wonder if the authors can comment on how interpretable these findings are in the Discussion. Other work investigating loss aversion has implemented larger numbers of trials to mitigate the potential for unreliable estimates (e.g., Sokol-Hessner et al., 2009).

      We agree that we have not clearly justified why we have taken two approaches to assess risk preferences. In short, while the expected utility approach is a more comprehensive method to model a participant’s choices, we had not sufficiently considered the need for the large number of trials required to fit such models when designing our experiment. Calculating the risk premium was the less comprehensive, simpler alternative that we could calculate for all participants. We have now mentioned this fact in the Results section. As the only difference in risk aversion across conditions was found in Study 1 using the expected utility method, which could only be successfully applied in a minority of participants, we believe that this difference should not be taken as a strong finding. We have now mentioned this fact in the revised Discussion.
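
      The relation between the two approaches mentioned in this response can be illustrated with a minimal sketch. It assumes the textbook power utility u(x) = x^ρ for non-negative outcomes and defines the risk premium as the gamble's expected value minus its certainty equivalent; the manuscript's exact definitions are not reproduced in this response, so the function names and the example gamble below are hypothetical.

```python
def expected_value(gamble):
    """Expected value of a gamble given as (probability, outcome) pairs."""
    return sum(p * x for p, x in gamble)


def certainty_equivalent(gamble, rho):
    """Sure amount with the same utility as the gamble under u(x) = x**rho.

    Assumption: outcomes are non-negative and rho > 0 (gain domain only).
    """
    if rho <= 0:
        raise ValueError("rho must be positive")
    if any(x < 0 for _, x in gamble):
        raise ValueError("this sketch only handles non-negative outcomes")
    expected_utility = sum(p * x ** rho for p, x in gamble)
    return expected_utility ** (1.0 / rho)


def risk_premium(gamble, rho):
    """Expected value minus certainty equivalent (positive for risk aversion)."""
    return expected_value(gamble) - certainty_equivalent(gamble, rho)


# Hypothetical 50/50 gamble between 0 and 100 points.
example_gamble = [(0.5, 0.0), (0.5, 100.0)]
for rho in (0.5, 1.0, 1.5):
    print(rho, risk_premium(example_gamble, rho))
```

      Under these assumptions, ρ < 1 gives a positive premium (risk aversion), ρ = 1 gives zero (risk neutrality), and ρ > 1 gives a negative premium (risk seeking). Estimating ρ from choices requires many trials per condition, whereas a premium-style summary can be computed for every participant.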

      (3) One thing seemingly not addressed in the Discussion is the fact that the behavioral effect did not replicate significantly in study 2.

      We agree that we had not sufficiently discussed the fact that there were (slight but significant) differences in risk preferences between the Solo and Social conditions in Study 1 but not in Study 2. We now do so in the revised Discussion, and write the following:

      “Participants made slightly more risk-seeking choices when deciding for themselves than for both themselves and the partner in Study 1, but this difference disappeared in Study 2. The ρ parameter on which this finding in Study 1 is based could only be estimated in a minority of participants due to a relatively low number of trials, which suggests that this finding may not be very reliable. The simpler and more robust method (evaluation of a risk premium) showed no difference in risk aversion across conditions in either study. Overall, we believe that we do not have strong evidence of differences in risk preferences across conditions.”

      (4) Regarding the computational models, the authors suggest that the Responsibility and Responsibility Redux models provided the best fit, but they are claiming this based on separate metrics (e.g., in study 1, the redux model had the lowest AIC, but the responsibility only model had the highest R^2; additionally, the basic model had the lowest BIC). I am wondering if the authors considered conducting a direct model comparison to statistically compare model fits.

      We agree that we should run formal, direct model comparison tests. We have now run likelihood-ratio tests, which showed that the Responsibility model fitted best. We now report this in the Results section, just below Table 1:

      “A likelihood ratio test (Equation 9) revealed that the Responsibility model fitted better than all the other models, including the Responsibility Redux model (Study 1: all LR ≥ 47.36, p < 0.0001; Study 2: all LR ≥ 77.83, p < 0.0001).”
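
      For readers unfamiliar with this kind of comparison, the following is a minimal sketch of a likelihood-ratio test between two nested models. It assumes the standard form LR = 2(ln L_full - ln L_reduced), referred to a chi-square distribution whose degrees of freedom equal the number of extra parameters; Equation 9 of the manuscript is not reproduced here and may differ in detail. The function names and the log-likelihood values are hypothetical, and the chi-square tail is computed with the standard library only.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    y = x / 2.0
    # Closed-form starting points: df = 1 (a = 1/2) or df = 2 (a = 1).
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(y))
    else:
        a, q = 1.0, math.exp(-y)
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1).
    while 2 * a < df:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q


def likelihood_ratio_test(loglik_reduced, loglik_full, extra_params):
    """Return (LR statistic, p value) for a reduced model nested in a full model."""
    lr = 2.0 * (loglik_full - loglik_reduced)
    return lr, chi2_sf(lr, extra_params)


# Hypothetical log-likelihoods; the full model has one extra parameter.
lr, p = likelihood_ratio_test(loglik_reduced=-100.0, loglik_full=-90.0, extra_params=1)
print(lr, p)
```

      Unlike AIC, BIC and R², which can rank models differently, this test gives a single statistical criterion, but it is only valid when one model is a restricted version of the other.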

      (5) In the reporting of imaging results, the authors report in a univariate analysis that a small cluster in the left anterior insula showed a stronger response to low outcomes for the partner as a result of participant choice rather than from partner choice. It then seems as though the authors performed small volume correction on this cluster to see whether it survived. If that is accurate, then I would suggest that this result be removed because it is not recommended to perform SVC where the volume is defined based on a result from the same whole-brain analysis (i.e., it should be done a priori).

      As indicated in the manuscript, the small insula cluster centered at [-28 24 -4] and shown in Figure 4F survived correction for multiple tests within the anatomically-defined anterior insula (based on the anatomical maximum probability map described in Faillenot et al., 2017), which is independent of the result of our analysis. Functionally defining the small volume based on the same data would indeed be circular and misleading “double-dipping”. We have most certainly NOT done this. We selected the anterior insula because it is one of the regions most frequently associated with guilt (see the explanations in our Introduction, which refers for example to Bastin et al., 2016; Lamm & Singer, 2010; Piretti et al., 2023). Thus we feel that performing small-volume correction within the anatomically-defined anterior insula is a valid analysis. We fully acknowledge that, independently of any correction, the effect and the cluster are small. We now write:

      “We found a weak response in a small cluster within the left anterior insula (peak T = 3.95, d = 0.59, 22 voxels, peak intensity at [-28 24 -4]; Figure 4F). Given the documented association between anterior insula and guilt (see Introduction), we proceeded to test whether this result survived correction for family-wise errors due to multiple comparisons restricted to the left anterior insula gray matter [defined anatomically and thus independently from our findings, as the anterior short gyrus, middle short gyrus, and anterior inferior cortex in an anatomical maximum probability map (Faillenot et al., 2017)]. This correction resulted in a p value of 0.024. This result, although it is only a small effect in a small cluster, is consistent with the mixed model analysis reported earlier.”

      Reviewer #2 (Public review):

      Summary

      This manuscript focuses on the role of social responsibility and guilt in social decision-making by integrating neuroimaging and computational modeling methods. Across two studies, participants completed a lottery task in which they made decisions for themselves or for a social partner. By measuring momentary happiness throughout the task, the authors show that being responsible for a partner's bad lottery outcome leads to decreased happiness compared to trials in which the participant was not responsible for their partner's bad outcome. At the neural level, this guilt effect was reflected in increased neural activity in the anterior insula, and altered functional connectivity between the insula and the inferior frontal gyrus. Using computational modeling, the authors show that trial-by-trial fluctuations in happiness were successfully captured by a model including participant and partner rewards and prediction errors (a 'responsibility' model), and model-based neuroimaging analyses suggested that prediction errors for the partner were tracked by the superior temporal sulcus. Taken together, these findings suggest that responsibility and interpersonal guilt influence social decision-making.

      Strengths

      This manuscript investigates the concept of guilt in social decision-making through both statistical and computational modeling. It integrates behavioral and neural data, providing a more comprehensive understanding of the psychological mechanisms. For the behavioral results, data from two different studies is included, and although minor differences are found between the two studies, the main findings remain consistent. The authors share all their code and materials, leading to transparency and reproducibility of their methods.

      The manuscript is well-grounded in prior work. The task design is inspired by a large body of previous work on social decision-making and includes the necessary conditions to support their claims (i.e., Solo, Social, and Partner conditions). The computational models used in this study are inspired by previous work and build on well-established economic theories of decision-making. The research question and hypotheses clearly extend previous findings, and the more traditional univariate results align with prior work.

      The authors conducted extensive analyses, as supported by the inclusion of different linear models and computational models described in the supplemental materials. Psychological concepts like risk preferences are defined and tested in different ways, and different types of analyses (e.g., univariate and multivariate neuroimaging analyses) are used to try to answer the research questions. The inclusion and comparison of different computational models provide compelling support for the claim that partner prediction errors indeed influence task behavior, as illustrated by the multiple model comparison metrics and the good model recovery.

      We thank the reviewer very much for their comprehensive description of our study and the positive assessment of our study and approach.

      Weaknesses

      As the authors already note, they did not directly ask participants to report their feelings of guilt. The decrease in happiness reported after a bad choice for a partner might thus be something other than guilt, for example, empathy or feelings of failure (not necessarily related to guilt towards the other person). Although the patterns of neural activity evoked during the task match with previously found patterns of guilt, there is no direct measure of guilt included in the task. This warrants caution in the interpretation of these findings as guilt per se.

      We fully agree that not directly asking participants about feelings of guilt is a clear limitation of our study. While we already mention this in our Discussion, we have expanded our discussion of the consequences on the interpretation of our results along the lines described by the reviewer in the revised manuscript. We would like to thank the reviewer for proposing these lines of thought, and have now made the following changes to the text:

      In the first paragraph of the discussion, we now write: “Being responsible for choosing a lottery that yielded a low outcome for a partner made our participants feel worse than witnessing the same outcome resulting from their partner’s choice, which we interpret as interpersonal guilt; although we note that we have not asked participants specifically about which emotion they felt in these situations.”

      Later on, in the third paragraph focusing on the anterior insula, we now write: “This replicates a large body of evidence associating aIns with feelings of guilt evoked during social decisions (see Introduction). Because we have neither asked our participants specifically what they felt in these situations, nor specifically whether they experienced guilt, we cannot exclude the possibility that they have instead or in addition felt empathy for their partner, a feeling of failure or bad luck, or some other emotion.”

      As most comparisons contrast the social condition (making the decision for your partner) against either the partner condition (watching your partner make their decision) or the solo condition (making your own decision), an open question remains of how agency influences momentary happiness, independent of potential guilt. Other open questions relate to individual differences in interpersonal guilt, and how those might influence behavior.

      How agency influences momentary happiness or variations thereof during the course of an experiment such as ours is an interesting question in itself. We have now run linear mixed models assessing agency (i.e. we compared happiness in the Solo and Social conditions vs. the Partner condition), which revealed lower happiness in the Solo and Social conditions (i.e. when it was the participant’s turn to decide) in both studies. This may reflect the drive behind responsibility aversion reported by Edelson et al. (2018): being assigned the role of the decider in a social setting may make people slightly unhappy, perhaps due to the “weight of the responsibility”. We now report these findings in the Results section, including this proposed explanation; because we were not specifically interested in responsibility aversion, we do not discuss this further in the Discussion. The edited text is under the new subsection entitled ‘Momentary happiness: effects of agency, responsibility and guilt’, on page 12:

      “Next, we assessed whether happiness varied depending on the participant’s agency (Social + Solo vs. Partner), and found happiness to be lower when the participant chose, independent of the outcome (Study 1: t(3600) = -3.92, p = 0.00009, β = -0.14, 95% CI = [-0.20 -0.07]; Study 2: t(2870) = -6.07, p = 0.000000001, β = -0.24, 95% CI = [-0.31 -0.16]). This is interesting in itself and may reflect the drive behind responsibility aversion reported by Edelson et al.’s 2018 study: being assigned the role of the decider in a social setting may make people slightly unhappy, perhaps due to “weight of the responsibility”. To specifically search for a sign of interpersonal guilt, [...]”

      Regarding individual differences: this is a very interesting topic that we have not addressed here due to the (relatively) small number of participants in our studies, but we might consider this for future follow-up studies, which we mention in the Discussion paragraph regarding open questions.

      This manuscript is an impressive combination of multiple approaches, but how these different approaches relate to each other and how they can aid in answering slightly different questions is not very clearly described. The authors could improve this by more clearly describing the different methods and their added value in the introduction, and/or by including a paragraph on implications, open questions, and future work in the discussion.

      We thank the reviewer for their appreciation of our complementary approach, and agree that we had not sufficiently explained the reasons why we used several methods. We have now added a paragraph explaining this at the end of the Introduction (page 5):

      “We analysed our behavioural data using several complementary methods: choices were modelled with mixed-effects regressions serving as manipulation checks; risk preferences expressed in choices were assessed using a comprehensive expected utility model as well as with a simpler, more robust “risk premium” approach; and happiness data were fitted, in addition to the computational models, with several linear mixed models to assess the impact of both the participant’s and their partner’s rewards, the impact of agency and their interactions. Inspired by findings reported in previous neuroimaging of social emotions, we also used several methods to analyse our fMRI data, including conventional methods (both region-of-interest and mass univariate); mixed-effects regression models; computational model-based analyses (inspired by e.g. Konovalov et al., 2021; Rutledge et al., 2014); and functional connectivity (e.g. Edelson et al., 2018; Konovalov et al., 2021). The behavioural modelling is thus complemented by neuroimaging analyses that offer insight about both the activity in regions associated with guilt as well as their place in a wider network, providing an in-depth comprehensive analysis of the mechanisms behind guilt evoked by social responsibility.”

      In addition, as suggested we added the following paragraph on open questions and future work in the Discussion:

      “Several open questions remain at the end of this study. As discussed above, asking participants directly about which emotions they have felt during the different stages of this task would allow us to link subjective experience with our analytical measures. Testing more participants would allow us to assess the impact of inter-individual variations in personality traits on the experience as well as the behavioural and neural correlates of guilt and responsibility. Using more trials in the experiment would allow separate modelling of risk preferences in gain and loss trials in each experimental condition using expected utility models, and could allow testing whether changes in momentary happiness affect subsequent choices. Varying partner identities (friends, strangers, artificial agent) could reveal the impact of social discounting on guilt and responsibility. In sum, we believe that this experimental approach lends itself very well to the study of several aspects of social emotions.”

      However, taken together, this study provides useful insights into the neural and behavioral mechanisms of responsibility and guilt in social decision-making and how they influence behavior. 

      We thank the reviewer again for their appreciation of our work and hope that our revisions improved the manuscript.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      The majority of my suggestions are in the public review, so I will not repeat them here. But in general, I like the paper, and in addition to my other comments, I think that there should be more discussion of the potential limitations of the study and conclusions that can be drawn. I also thought parts of the results were a little hard to follow, particularly in the 'momentary happiness' section. Perhaps an additional subsection here might help with flow.

      We agree that we could have discussed further the limitations of our study and the conclusions that can be drawn from it, which we have now done in the last paragraphs of the Discussion in this revised version.

      To improve the structure of the section on ‘momentary happiness’, we separated this section into two, entitled: ‘Momentary happiness: links to reward’ and ‘Momentary happiness: effects of agency, responsibility and guilt’, which should facilitate the reading of this long section. We proceeded in a similar manner for the Choices section, which is now subdivided into ‘Choices: manipulation check’ and ‘Choices: risk preferences’. We believe that these changes have indeed improved the readability of our manuscript.

      Reviewer #2 (Recommendations for the authors):

      Overall, I believe this manuscript was well-designed, consists of extensive analyses, and provides interesting new insights into the mechanisms underlying social decision-making. I mostly have some clarifying questions and minor comments, which are described below. 

      (1) Integration of prior findings in the first paragraphs of the Introduction. Although all the previous work described in the 2nd-5th paragraph introduction is interesting, it felt a bit like an enumeration of findings rather than an integrated introduction leading to the current research question. At the end of paragraph 5, it becomes clear how these findings relate to the current research question, but I believe it will improve the flow and readability of the introduction if this becomes clear earlier on.

      We agree that we could have integrated the cited previous work into the Introduction so that the text builds up to the research question. We have now extensively reworked several paragraphs in the Introduction (pages 3-5) and hope that these changes have made it easier to follow.

      (2) For the risk attitudes (Choices), you describe pooling the gains and losses and then comparing the social and solo conditions. I was wondering whether you also looked at potential differences between gains and losses (delta measure) for social versus the solo condition (so a comparison of the delta). Based on prior work, I can imagine that the difference in risk attitudes for gains and losses might differ when making decisions for yourself versus when you're doing it for a partner. In general, I was wondering how you explain these findings, as there is also a lot of work showing differences in risk-taking patterns for gains and losses.

      We agree that we could have compared delta measures between solo and social conditions. However, as we describe in the Results section and comment on in the Discussion, the relatively low number of trials made separate fitting of gain and loss trials across conditions difficult. While this question could thus be addressed in subsequent versions of our experiment with more trials, such a fine-grained analysis of the decisions was not the focus of our current study.

      (3) On page 11, you state: "in particular the partner's reward prediction errors resulting from the participants' decisions, i.e. those pRPE for which participants were responsible." From the results described in the paragraph above, this doesn't become clear (e.g., there's no distinction made between social_pRPE and partner_pRPE in the text), as it only discusses differences in weights between pRPE and sRPE. I would recommend including some more information in the main text on these main modeling findings, so one doesn't have to go to the Supplemental Materials to understand them.

      We did indeed fail to report these findings in the text! We thank the reviewer for pointing this out. We have now edited this passage as follows:

      “Crucially, we find here that the partner’s reward prediction errors (social_pRPE and partner_pRPE) contributed to explaining changes in participants’ momentary happiness: the Responsibility and ResponsibilityRedux models explained the data better than the models without these parameters (see Table 1). In particular, the partner’s reward prediction errors resulting from the participants’ decisions (social_pRPE), i.e. those pRPE for which participants were responsible, contributed to explaining our data (weights for social_pRPE were greater than 0: Responsibility model: Study 1: Z = 2.85, p = 0.004, Study 2: Z = 3.26, p = 0.001; Responsibility Redux model: Study 1: Z = 2.93, p = 0.003, Study 2: Z = 3.30, p = 0.001; weights for social_pRPE tended to be higher than weights for partner_pRPE: Responsibility model: Study 1: Z = 2.14, p = 0.033; Study 2: Z = 1.41, p = 0.16).”

      (4) The functional connectivity findings seem to come out of nowhere and are not introduced or described anywhere prior in the manuscript. It is therefore not completely clear why you conducted these analyses, or what they add above and beyond previous analyses. Already introducing this method earlier on would fix that.

      We agree that we could have introduced functional connectivity analyses earlier in the text, particularly given the many previous studies in our field using this technique. We have now done this at the end of a new last paragraph of the Introduction:

      “Inspired by findings reported in previous neuroimaging of social emotions, we also used several methods to analyse our fMRI data, including conventional methods (both region-of-interest and mass univariate); mixed-effects regression models; computational model-based analyses (inspired by e.g. Konovalov et al., 2021; Rutledge et al., 2014); and functional connectivity (e.g. Edelson et al., 2018; Konovalov et al., 2021). The behavioural modelling is thus complemented by neuroimaging analyses that offer insight about both the activity in regions associated with guilt as well as their place in a wider network, providing an in-depth comprehensive analysis of the mechanisms behind guilt evoked by social responsibility.”

      (5) For the functional connectivity findings: I was wondering why you only looked at the choice phase, and not at the feedback phase. I understand that previous work focused on the choice phase, but for the purpose of this study (focus on guilt), I can imagine it is also interesting to see what happens with feedback. In the discussion, you also state "How we feel when we witness our decisions' consequences on others is an important signal to consider when attempting to make good social decisions." (p. 19), which is more focused on the feedback rather than choice, and also supports the idea that looking at the feedback moment might be relevant.

      We agree that we could also have looked at the functional connectivity during the feedback phase. The main reason why we had originally not done so was time constraints. At the current time we would in addition point out that the manuscript is already very long and contains many analyses of behavioural and fMRI data. Adding this analysis would cost additional time and would further delay the publication of our manuscript, which we would prefer to avoid. However, one could of course look at these effects in subsequent analyses of the same data or in subsequent versions of this experiment. We have now mentioned this in the Discussion, in the paragraphs on open questions.

      Minor comments:

      (1) For some of the Figures, it would be helpful if the subtitles were more informative. For Figure 2 and Figure 3 for example, it would be nice if Study 1 and Study 2 were not only mentioned in the figure description but also in the actual figure. For Figures 3 and 4, it would be helpful to have significance stars for the bar plots as well.

      We agree that these changes make the figures more easily understandable and have implemented them all, except for adding stars on Figure 4, because all bar plots in panels C and E would have been labeled with two or more stars, which would have made the figure difficult to read. We have now mentioned the fact that all these coefficients were significant in the figure legend.

      (2) For some of the Supplementary Results, it would be very helpful if there was a legend or description. This is already the case for most of the SR, but not for all.

      We have now added a legend to all elements of the Supplementary Results.

      Some questions that came to mind while going through them:

      - Supplementary Table 1: which p-values correspond to the significance stars? This information is included for Supplementary Table 2, but not for ST1. 

      We have now added the missing information in ST1.

      - Supplementary Figure 1: do the colors correspond to different participants? 

      We have now specified that the colors do indeed correspond to different participants.

      - Supplementary Table 5 (final table): what do the - represent? As in, why is there no value for "run" for the MPFC? At first, I thought you only included the significant values, but then I noticed a few non-significant values as well, so it wasn't completely clear to me why some of the values were missing. This also applies to Supplementary Table 6.

      We had indeed forgotten to explain this. The ‘-’ in Supplementary Tables 4 and 6 indicates that the linear mixed model without the factor ‘run’ was the better-fitting one. We have now added the following explanation in the text accompanying Supplementary Table 4:

      “We tested these models both with and without the factor Run and associated interaction, and we report the best-fitting model in the table below: a dash (‘-’) in the row displaying parameters for the run and socialVsSolo:run regressors indicates that the model without factor run was better-fitting for this ROI.”

      (3) I came across a few minor typos or sentences that were not completely clear to me.

      - On page 3: "Patients with damage to ventromedial prefrontal cortex (vmPFC) seem insensitive to guilt when playing social economic games (Krajbich et al., 2009)." This sentence felt a bit out of nowhere and doesn't logically follow from the previous sentences. 

      We have now revised the descriptions of this previous study as well as several others and how they fit into the research question.

      - On page 3: "In another study, participant errors in a difficult perception task lead to a partner feeling pain and evoked activations in left aIns and dlPFC (Koban et al., 2013)." This sentence doesn't really flow, and from the wording, it is not completely clear whether it's the errors or the partner pain that led to the aIns and dlPFC activation.

      We have now revised the description of this study as well, as follows:

      “In another study, partners received painful stimuli when participants made errors during a difficult perception task. These errors evoked activations in the left aIns and dlPFC in the participants (Koban et al., 2013).”

      - Supplementary Figure 1: there is a missing period after the sentence "We then compared these new estimated parameters to the actual parameters from which the synthetic data were generated"

      We have now added a missing comma after “generated”.

      - On page 5: "We ran two experiments, Study 1 outside fMRI and Study 2 during fMRI, with separate groups of participants." I would change "outside fMRI" to outside the MRI scanner or something like that, as it's not completely correct to say "outside fMRI".

      We have changed the sentence to “outside the MRI scanner”.

      - On page 6: for the first result, there are currently two p-values reported (p < 2.5e-20 and p < 2e-16). I believe this is an error?

      This was indeed an error! We have re-run this analysis, noticed that the degrees of freedom had also been miscalculated, and have updated this result and the effect of condition (solo vs social). The results are almost identical to the previous ones and all conclusions hold. We have also checked the other analyses reported in this paragraph – all results replicate exactly.

      - On page 6: "Supplemental Table 1" should be "Supplementary Table 1" (for consistency).

      Done.

      On page 8: "participants in both conditions of both studies", I would change "of both studies" to "for both studies".

      Done.

      On page 8: for the "Momentary Happiness" paragraph, it would be helpful if you could briefly describe the Rutledge method here, for people who are unfamiliar with the approach.

      We now write the following at the beginning of this paragraph:

      “Following Rutledge and colleagues’ methodology, which considers that changes in momentary happiness in response to outcomes of a probabilistic reward task are explained by the combined influence of recent reward expectations and prediction errors arising from those expectations, we fitted computational models to each participant’s happiness data.”
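      The structure of the model described in this sentence can be illustrated with a minimal sketch. It assumes only the two terms named above (exponentially decaying sums of recent reward expectations and prediction errors) plus a baseline; the function name, the weights `w0`, `w_ev`, `w_rpe` and the forgetting factor `gamma` are illustrative, and the model actually fitted in the manuscript may contain further terms.

```python
import numpy as np

def predicted_happiness(ev, rpe, w0, w_ev, w_rpe, gamma):
    """Happiness on each trial as a baseline plus decayed sums of past
    expected values (ev) and reward prediction errors (rpe).

    gamma in [0, 1] is an assumed forgetting factor: each earlier trial is
    down-weighted by gamma per trial elapsed.
    """
    ev = np.asarray(ev, dtype=float)
    rpe = np.asarray(rpe, dtype=float)
    happiness = np.empty(len(ev))
    ev_trace = 0.0
    rpe_trace = 0.0
    for t in range(len(ev)):
        # Recursive form of sum_j gamma**(t - j) * x_j
        ev_trace = gamma * ev_trace + ev[t]
        rpe_trace = gamma * rpe_trace + rpe[t]
        happiness[t] = w0 + w_ev * ev_trace + w_rpe * rpe_trace
    return happiness

# Hypothetical two-trial example (values are not from the study).
example = predicted_happiness(ev=[1.0, 0.0], rpe=[0.0, 2.0],
                              w0=0.5, w_ev=1.0, w_rpe=0.5, gamma=0.5)
```

      Fitting such a model to a participant's ratings would then amount to choosing the weights and `gamma` that minimise the squared difference between predicted and reported happiness.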

      On page 10: "Wilkoxon sign-rank tests", should be "Wilcoxon".

      Done.

      We thank the reviewer for their careful reading of our manuscript. We believe that these changes have indeed improved our manuscript.

    1. eLife Assessment

      This study investigates how RNA molecules modulate phase separation, aggregation, and cytotoxicity of the staphylococcal virulent peptide PSMα3 and the human host‑defence peptide LL‑37 using an array of biophysical and cell‑based assays. If validated, these findings would be important, as they suggest that nucleic acids can tune the material state and bioactivity of amyloids, with implications for host-pathogen interactions and for the design of therapeutics that target phase behaviour. However, the evidence is incomplete: many key claims rest on qualitative imaging and contested assumptions about "functional" amyloids, and the absence of quantitative binding data, phase diagrams, and appropriate controls limits confidence in the conclusions.

    2. Reviewer #1 (Public review):

      Summary:

      The manuscript by Rayan et al. aims to elucidate the role of RNA as a context-dependent modulator of liquid-liquid phase separation (LLPS), aggregation, and bioactivity of the amyloidogenic peptides PSMα3 and LL-37, motivated by their structural and functional similarities.

      Strengths:

      The authors combine extensive biophysical characterization with cell-based assays to investigate how RNA differentially regulates peptide aggregation states and associated cytotoxic and antimicrobial functions.

      Weaknesses:

      While the study addresses an interesting and timely question with potentially broad implications for host-pathogen interactions and amyloid biology, several aspects of the experimental design and data analysis require further clarification and strengthening.

      Major Comments:

      (1) In Figure 1A, the author showed "stronger binding affinity" based on shifts at lower peptide concentrations, but no quantitative binding parameters (e.g., apparent Kd, fraction bound, or densitometric analysis) are presented. This claim would be better supported by including: (i) A binding curve with quantification of free vs bound RNA band intensities, (ii) Replicates and error estimates (mean ± SD).

      (2) The authors report droplet formation at low RNA (50 ng/µL) but protein aggregation at high RNA (400 ng/µL) through fluorescence microscopy. However, no intermediate RNA concentrations (e.g., 100-300 ng/µL) are tested or discussed, leaving a critical gap in understanding the full phase diagram and transition mechanisms. Additionally, the behaviour of PSMα3 in the absence of RNA under LLPS conditions is not shown. Without protein-only data, it is difficult to assess if droplets are RNA-induced or if protein has a weak baseline LLPS that RNA tunes. The saturation concentration (csat) for PSMα3 phase separation, either in the absence or presence of RNA, should be reported.

      (3) For a convincing LLPS claim, it is important to show: Quantitative FRAP curves (mobile fraction and half-time of recovery) rather than only microscopy images and qualitative statements.

      (4) The manuscript highly relies on fluorescence microscopy to show colocalization. However, the colocalization is presented in a qualitative manner only. The manuscript would benefit from the inclusion of quantitative metrics (e.g., Pearson's correlation coefficient, Manders' overlap coefficients, or intensity correlation analysis).

      (5) In Figures 3 B and 3C, the contrast between "no AT630 at 30 min, strong at 2 h" (50 ng/μL) and "strong at 30 min" (400 ng/μL) is compelling, but a simple quantification (e.g., mean fluorescence intensity per area) would greatly increase rigor.

      (6) In Figure S3 ssCD data, if possible, indicate whether the α-helical signal increases with RNA concentration or shows a non-linear dependence, which might link to the LLPS vs solid aggregate regimes.

      (7) In Figure 5B, FRAP recovery in dying cells may reflect artifactual mobility rather than biological relevance. Additionally, the absence of quantification data limits interpretation; providing recovery curves would clarify relevance.

      (8) The narrative conflates cytotoxicity endpoints (membrane damage, PI staining, aggregates) with localization data (nucleolar foci), creating ambiguity about whether nucleolar targeting drives toxicity or is a consequence of cell death. Separating toxicity assessment from localization analysis, or clearly demonstrating that nucleolar accumulation precedes cytotoxicity, would resolve this ambiguity.

      (9) In Figure 8, to strengthen the LLPS assignment for LL-37, additional evidence, such as FRAP analysis or observation of droplet fusion events, would be valuable. This is particularly relevant given that the heat shock conditions (65 °C for 15 minutes) could potentially induce partial denaturation or nonspecific coacervation.

    3. Reviewer #2 (Public review):

      In this paper, Rayan et al. report that RNA influences cytotoxic activity of the staphylococcal secreted peptide cytolysin PSMalpha3 versus human cells and E. coli by impacting its aggregation. The authors used sophisticated methods of structural analysis and described the associated liquid-liquid phase separation. They also compare the influence of RNA on the aggregation and activity of LL-37, which shows differences from that on PSMalpha3.

      Strengths:

      That RNA impacts PSM cytotoxicity when co-incubated in vitro becomes clear.

      Weaknesses:

      I have two major and fundamental problems with this study:

      (1) The premise, as stated in the introduction and elsewhere, that PSMalpha3 amyloids are biologically functional, is highly debatable and has never been conclusively substantiated. The property that matters most for the present study, cytotoxicity, is generally attributed to PSM monomers, not amyloids. The likely erroneous notion that PSM amyloids are the predominant cytotoxic form is derived from an earlier study by the authors that has described a specific amyloid structure of aggregated PSMalpha3. Other authors have later produced evidence that, quite unsurprisingly, indicated that aggregation into amyloids decreases, rather than increases, PSM cytotoxicity. Unfortunately, yet other groups have, in the meantime, published in-vitro studies on "functional amyloids" by PSMs without critically challenging the concept of PSM amyloid "functionality". Of note, the authors' own data in the present study, which show strongly decreased cytotoxicity of PSMalpha3 after prolonged incubation, are in agreement with monomer-associated cytotoxicity as they can be easily explained by the removal of biologically active monomers from the solution.

      (2) That RNA may interfere with PSM aggregation and influence activity is not very surprising, given that PSM attachment to nucleic acids - while not studied in as much detail as here - has been described. Importantly, it does not become clear whether this effect has biologically significant consequences beyond influencing, again not surprisingly, cytotoxicity in vitro. The authors do show in nice microscopic analyses that labeled PSMalpha3 attaches to nuclei when incubated with HeLa cells. However, given that the cells are killed rapidly by membrane perturbation by the applied PSM concentrations, it remains unclear and untested whether the attachment to nucleic acids in dying cells makes any contribution to PSM-induced cell death or has any other biological significance.

      Overall, the findings can be explained in a much more straightforward way with the common concept of cytotoxicity being due to monomeric PSMs, and the impact of nucleic acids on cytotoxicity being due to lowering of the concentration of that active form by RNA attachment. Further limiting the significance of the findings, whether this interaction has any biological significance on the physiology or infectivity of the PSM producer remains largely unexplored.

    4. Reviewer #3 (Public review):

      Summary:

      The manuscript by Rayan et al. aims to investigate the role of RNA in modulating both virulent amyloid and host-defense peptides, with the objective of understanding their self-assembly mechanisms, morphological features, and aggregation pathways.

      Strengths:

      The overall content is well-structured with a logical flow of ideas that effectively conveys the research objectives.

      Weaknesses:

      (1) Figure 2 displays representative FRAP images demonstrating fluorescence recovery within seconds. To gain a more comprehensive understanding of how recovery after photobleaching varies under different conditions, it is recommended to supplement these images with corresponding quantitative fluorescence recovery curves for analysis.

      (2) Ostwald ripening typically leads to the shrinkage or even disappearance of smaller droplets, accompanied by the further growth of large droplets. However, the droplet size in Figure 2D decreases significantly after 2 h of incubation. This observation prompts the question, what is the driving force underlying RNA-regulated phase separation and phase transition?

      (3) The manuscript aims to study the role of RNA in modulating PSMα3 aggregation by using solution-state NMR to obtain residue-specific structural information. The current NMR data, as described in the method and figure captions, were recorded in the absence of RNA. Does RNA binding induce conformational changes in PSMα3, and if so, how do these changes alter the NMR spectra? Also, the sequential NOE walk between neighboring residues could be annotated on the spectrum for clarity.

      (4) The authors claim that LL-37 shares functional, sequence, and structural similarities with PSMα3. However, no droplet formation was observed for LL-37 in the presence of RNA alone. The authors then applied thermal stress to induce phase separation of LL-37. What are the main factors contributing to the different phase behaviors exhibited by LL-37 and PSMα3? For these two peptides, how do the conformation of the amyloid aggregates and the kinetics of aggregation differ between condensation-induced aggregation in the presence of RNA and the conventional nucleation-elongation process in the absence of RNA?

    5. Author response:

      We thank the reviewers for their thoughtful and constructive comments, which greatly helped us to clarify, quantify, and strengthen both our findings and interpretations. Below, we provide a point-by-point response to each comment and describe the corresponding changes made.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The manuscript by Rayan et al. aims to elucidate the role of RNA as a context-dependent modulator of liquid-liquid phase separation (LLPS), aggregation, and bioactivity of the amyloidogenic peptides PSMα3 and LL-37, motivated by their structural and functional similarities.

      Strengths:

      The authors combine extensive biophysical characterization with cell-based assays to investigate how RNA differentially regulates peptide aggregation states and associated cytotoxic and antimicrobial functions.

      Weaknesses:

      While the study addresses an interesting and timely question with potentially broad implications for host-pathogen interactions and amyloid biology, several aspects of the experimental design and data analysis require further clarification and strengthening.

      Major Comments:

      (1) In Figure 1A, the author showed "stronger binding affinity" based on shifts at lower peptide concentrations, but no quantitative binding parameters (e.g., apparent Kd, fraction bound, or densitometric analysis) are presented. This claim would be better supported by including: (i) A binding curve with quantification of free vs bound RNA band intensities, (ii) Replicates and error estimates (mean ± SD).

      We thank the reviewer for this suggestion. To quantitatively support the binding differences observed in Figure 1A, we have now performed densitometric analysis of the EMSA data and included the results in Figure S1. The analysis showed that the Kd values for PSMα3 binding to polyAU and polyA RNA are of the same order of magnitude, but the Kd is lower for polyAU, indicating stronger binding. A description was added to the results in lines 137-145 of the revised version.
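      As an illustration of how an apparent Kd can be extracted from EMSA band intensities, the following sketch converts free and bound band intensities into fraction bound and fits a single-site binding curve, f = P / (Kd + P), by least squares. The single-site model, the function names, and the lane values are assumptions made for the example; they are not taken from Figure S1, and the sketch ignores ligand depletion and cooperativity.

```python
import numpy as np

def fraction_bound(bound_intensity, free_intensity):
    """Fraction of RNA in the shifted band for each lane."""
    bound = np.asarray(bound_intensity, dtype=float)
    free = np.asarray(free_intensity, dtype=float)
    return bound / (bound + free)

def fit_apparent_kd(peptide_conc, frac_bound, n_grid=2000):
    """Least-squares estimate of Kd for f = P / (Kd + P).

    Uses a log-spaced grid search over Kd, spanning two decades below the
    lowest and above the highest non-zero peptide concentration.
    """
    conc = np.asarray(peptide_conc, dtype=float)
    f = np.asarray(frac_bound, dtype=float)
    positive = conc[conc > 0]
    grid = np.logspace(np.log10(positive.min()) - 2,
                       np.log10(positive.max()) + 2, n_grid)
    model = conc[None, :] / (grid[:, None] + conc[None, :])
    sse = ((model - f[None, :]) ** 2).sum(axis=1)
    return float(grid[np.argmin(sse)])

# Hypothetical, noise-free lanes generated with Kd = 25 (arbitrary units).
conc = np.array([0.0, 5.0, 10.0, 20.0, 40.0, 80.0, 160.0])
free = 1000.0 * 25.0 / (25.0 + conc)
bound = 1000.0 - free
kd_hat = fit_apparent_kd(conc, fraction_bound(bound, free))
```

      Repeating the fit on each replicate gel would give the mean ± SD requested by the reviewer.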

      (2) The authors report droplet formation at low RNA (50 ng/µL) but protein aggregation at high RNA (400 ng/µL) through fluorescence microscopy. However, no intermediate RNA concentrations (e.g., 100-300 ng/µL) are tested or discussed, leaving a critical gap in understanding the full phase diagram and transition mechanisms.

      Our initial choice of 50 ng/µL (low RNA) and 400 ng/µL (high RNA) was guided by a broader RNA titration performed by turbidity measurements across 0, 10, 20, 50, 100, 200, and 400 ng/µL (Figure S2 in the revised version). In this screen, turbidity increased up to 50 ng/µL and then decreased dose-dependently from 100–400 ng/µL. We interpret this non-monotonic behavior as consistent with a transition from a droplet-rich regime (maximal light scattering at intermediate dense-phase volume) toward conditions where assemblies become larger and/or more compact and sediment out of the optical path. This is described in lines 158-161 of the revised version.

      Of note, additional intermediate RNA conditions (100 and 200 ng/µL) are included in Figure S14 (of the revised version). While these experiments were performed under the heat-shock perturbation, they nevertheless support the central point that RNA tunes assembly state across intermediate concentrations rather than producing a binary low/high outcome.

      Importantly, we agree with the reviewer that a full phase diagram would be the most rigorous way to define the transition mechanism. However, establishing csat and constructing a complete phase diagram would require systematic measurements of dilute-phase concentrations (e.g., centrifugation/quantification or fluorescence calibration), controlled ionic strength titrations, and time-resolved mapping, which is beyond the scope of the present study. We have therefore revised the text to avoid implying that we provide a complete phase diagram. Instead, we frame our results as a qualitative, multi-assay characterization showing that RNA concentration drives a shift from liquid-like condensates (at low RNA) toward solid-like assemblies (at high RNA), with an intermediate regime suggested by the turbidity transition and supported by additional imaging under stress. Finally, to address the “critical gap” concern directly, we have added a sentence (lines 239-241) stating that: “Future work will be required to quantitatively define the phase boundaries and delineate the dominant mechanisms, such as sedimentation, dissolution, or coarsening/aging, across intermediate RNA concentrations.”

      (3) Additionally, the behaviour of PSMα3 in the absence of RNA under LLPS conditions is not shown. Without protein-only data, it is difficult to assess if droplets are RNA-induced or if protein has a weak baseline LLPS that RNA tunes. The saturation concentration (csat) for PSMα3 phase separation, either in the absence or presence of RNA, should be reported.

      In response to the reviewer’s request, we have added Figure 2F, which shows PSMα3 alone in the absence of RNA under the same conditions. PSMα3 does not form droplets in this condition, indicating that condensate formation is RNA-dependent in the tested conditions. This is referred to in the text in lines 190-193 of the revised version. Please see our response about determining the csat in the response to the previous comment.

      (4) For a convincing LLPS claim, it is important to show: Quantitative FRAP curves (mobile fraction and half-time of recovery) rather than only microscopy images and qualitative statements.

      We have included quantitative FRAP analysis in Figure S4 of the revised version, showing normalized recovery curves along with extracted mobile fractions and half-times of recovery (t₁/₂). These quantitative measurements support the dynamic nature of the PSMα3–RNA assemblies. This is referred to in the text in lines 179-184 of the revised version.
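      For readers unfamiliar with how a mobile fraction and t₁/₂ are obtained from a recovery curve, a minimal sketch follows. It assumes a single-exponential recovery, I(t) = I0 + A(1 − exp(−kt)), on a trace normalized to the pre-bleach intensity; the function name and the synthetic trace are hypothetical, and the analysis behind Figure S4 may use a different model or include corrections (e.g., for acquisition photobleaching) that are omitted here.

```python
import numpy as np

def fit_frap(t, intensity, pre_bleach, n_grid=1000):
    """Fit I(t) = I0 + A * (1 - exp(-k t)) to a post-bleach trace.

    The rate k is found by grid search; for each k, I0 and A are solved
    by linear least squares. Returns the mobile fraction A / (1 - I0)
    and the half-time ln(2) / k.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(intensity, dtype=float) / pre_bleach  # pre-bleach = 1
    best = None
    for k in np.logspace(-3, 2, n_grid):
        basis = np.column_stack([np.ones_like(t), 1.0 - np.exp(-k * t)])
        coef, *_ = np.linalg.lstsq(basis, y, rcond=None)
        sse = float(np.sum((basis @ coef - y) ** 2))
        if best is None or sse < best[0]:
            best = (sse, k, coef)
    _, k, (i0, amp) = best
    return {"mobile_fraction": float(amp / (1.0 - i0)),
            "t_half": float(np.log(2.0) / k)}

# Synthetic trace: bleached to 0.2, recovering to 0.8 with k = 0.3 per second.
t = np.linspace(0.0, 30.0, 61)
trace = 0.2 + 0.6 * (1.0 - np.exp(-0.3 * t))
result = fit_frap(t, trace, pre_bleach=1.0)
```

      In this convention, a mobile fraction near 1 with a short t₁/₂ indicates liquid-like exchange, whereas a trace that barely recovers gives a mobile fraction near 0.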

      (5) The manuscript highly relies on fluorescence microscopy to show colocalization. However, the colocalization is presented in a qualitative manner only. The manuscript would benefit from the inclusion of quantitative metrics (e.g., Pearson's correlation coefficient, Manders' overlap coefficients, or intensity correlation analysis).

      In response, we have added quantitative colocalization analysis to the revised manuscript. Specifically, we now report Pearson’s correlation coefficients and Manders’ overlap coefficients for the dual-channel fluorescence microscopy datasets in Figure S5 of the revised version. These metrics provide an objective measure of codistribution and complement the qualitative imaging.

      The analysis supports that at low RNA concentrations (droplet/condensate conditions), PSMα3 and RNA show strong colocalization, consistent with RNA being incorporated within, or closely associated with, the peptide-rich phase. In contrast, at high RNA concentrations, where the assemblies are more solid-like/amyloid-positive, the quantitative coefficients decrease, consistent with reduced overlap and an apparent spatial demixing in which RNA becomes partially excluded from the peptide-rich structures. This is referred to in the text in lines 194-203 of the revised version.
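      The two coefficients mentioned above can be computed from a pair of registered, background-subtracted channel images as sketched below. The function names, the zero thresholds, and the toy images are assumptions for illustration only; they do not reproduce the analysis in Figure S5, where thresholds would normally be set from the data.

```python
import numpy as np

def pearson_coefficient(ch1, ch2):
    """Pearson correlation of pixel intensities across two channels."""
    a = np.asarray(ch1, dtype=float).ravel()
    b = np.asarray(ch2, dtype=float).ravel()
    return float(np.corrcoef(a, b)[0, 1])

def manders_coefficients(ch1, ch2, thr1=0.0, thr2=0.0):
    """Manders' M1 and M2.

    M1: fraction of channel-1 intensity in pixels where channel 2 exceeds thr2.
    M2: fraction of channel-2 intensity in pixels where channel 1 exceeds thr1.
    """
    a = np.asarray(ch1, dtype=float).ravel()
    b = np.asarray(ch2, dtype=float).ravel()
    m1 = a[b > thr2].sum() / a.sum()
    m2 = b[a > thr1].sum() / b.sum()
    return float(m1), float(m2)

# Toy 2x2 images: RNA either follows the peptide signal or is excluded from it.
peptide = np.array([[0.0, 2.0], [4.0, 0.0]])
rna_colocalized = np.array([[0.0, 1.0], [2.0, 0.0]])
rna_excluded = np.array([[3.0, 0.0], [0.0, 3.0]])

r_co = pearson_coefficient(peptide, rna_colocalized)
m_co = manders_coefficients(peptide, rna_colocalized)
r_ex = pearson_coefficient(peptide, rna_excluded)
m_ex = manders_coefficients(peptide, rna_excluded)
```

      Pearson's coefficient is sensitive to how the intensities co-vary, while Manders' coefficients report only the fraction of signal that overlaps, so the two are usually reported together.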

      (6) In Figures 3 B and 3C, the contrast between "no AT630 at 30 min, strong at 2 h" (50 ng/μL) and "strong at 30 min" (400 ng/μL) is compelling, but a simple quantification (e.g., mean fluorescence intensity per area) would greatly increase rigor.

      We have included quantitative analysis of AmyTracker630 fluorescence intensity in Figure S6 of the revised version, reporting the mean fluorescence intensity per area for the indicated conditions and time points. This quantification supports the qualitative differences observed in Figures 3B and 3C. This is now referred to in the text in lines 233-236 of the revised version.

      (7) In Figure S3 ssCD data, if possible, indicate whether the α-helical signal increases with RNA concentration or shows a non-linear dependence, which might link to the LLPS vs solid aggregate regimes.

      The ssCD spectra displayed in Figure S7 in the revised version (corresponding to Figure S3 in the original submission) show that the α-helical signature of PSMα3 is markedly enhanced in the presence of RNA compared to peptide alone, as evidenced by increased signal intensity, deeper minima, and more pronounced spectral features characteristic of α-helical structure. Importantly, this enhancement is more pronounced at 400 ng/µL Poly(AU) RNA than at 50 ng/µL, particularly after 2 hours of coincubation, indicating that RNA concentration influences the stabilization of α-helical assemblies. This is now more specifically detailed in the text in lines 258-263 of the revised version.

      We note that solid-state CD does not allow direct quantitative deconvolution of secondary structure content (e.g., % helix) in the same manner as solution CD, due to sample anisotropy, scattering, and orientation effects inherent to dried or aggregated films. Consequently, our interpretation is qualitative rather than strictly quantitative. The ssCD data therefore suggest a non-linear dependence on RNA concentration, rather than a simple linear dose–response. This is also expected considering that phase transition, suggested by the other findings, is intrinsically non-linear.

      (8) In Figure 5B, FRAP recovery in dying cells may reflect artifactual mobility rather than biological relevance. Additionally, the absence of quantification data limits interpretation; providing recovery curves would clarify relevance.

      We added quantitative FRAP analysis of the effect on PSMα3 within HeLa cells, shown in Figure S8 of the revised version. Compared to PSMα3 assemblies in vitro, nucleolar PSMα3 exhibits slower fluorescence recovery and a reduced mobile fraction. The nucleolus represents a highly crowded, RNA-rich cellular environment, which is expected to impose additional constraints on molecular mobility and likely contributes to the slower recovery kinetics observed in cells. This is now more specifically detailed in the text in lines 324-333 and discussed in lines 597-607 of the revised version.

      (9) The narrative conflates cytotoxicity endpoints (membrane damage, PI staining, aggregates) with localization data (nucleolar foci), creating ambiguity about whether nucleolar targeting drives toxicity or is a consequence of cell death. Separating toxicity assessment from localization analysis, or clearly demonstrating that nucleolar accumulation precedes cytotoxicity, would resolve this ambiguity.

      We thank the reviewer for raising this important point. We agree that, in the current dataset, cytotoxicity readouts (membrane damage, PI staining, aggregate formation) and subcellular localization (nucleolar accumulation) are observed in close temporal proximity, which limits our ability to unambiguously assign causality. In the experiments presented here, PSMα3 was applied at concentrations known to induce rapid membrane disruption and cytotoxicity in HeLa cells. Under these conditions, PSMα3 accumulates on cellular membranes and penetrates into the cell and nucleus on very short timescales (seconds to minutes), likely preceding the temporal resolution accessible by standard live-cell fluorescence microscopy. As a result, nucleolar accumulation and cytotoxic endpoints are detected essentially concurrently, precluding a definitive determination of whether nucleolar association actively drives toxicity or occurs as a downstream consequence of membrane permeabilization and cell damage.

      We therefore emphasize that, in this study, nucleolar localization is presented as a phenomenological observation consistent with RNA-rich compartment association, rather than as a demonstrated causal mechanism of cytotoxicity. We have revised the Discussion (lines 597-607) to clarify this distinction and to avoid implying that nucleolar targeting is the primary driver of cell death.

      We agree that resolving this ambiguity would require systematic time-resolved and concentration-dependent experiments, including analysis at sub-toxic PSMα3 concentrations below the membrane-disruptive threshold, combined with orthogonal imaging approaches. Such experiments are planned for future work but are beyond the scope of the present study.

      (10) In Figure 8, to strengthen the LLPS assignment for LL-37, additional evidence, such as FRAP analysis or observation of droplet fusion events, would be valuable. This is particularly relevant given that the heat shock conditions (65 °C for 15 minutes) could potentially induce partial denaturation or nonspecific coacervation.

      In response to this comment, we have added FRAP analysis of LL-37 assemblies in the revised manuscript (Figure S12), including representative images and corresponding fluorescence recovery curves. The FRAP measurements show minimal fluorescence recovery over the acquisition window, indicating that the LL-37–RNA assemblies formed under these conditions are largely immobile and solid-like, rather than liquid-like droplets. This is now referred to in the text in lines 458-462 of the revised version.

      Reviewer #2 (Public review):

      In this paper, Rayan et al. report that RNA influences cytotoxic activity of the staphylococcal secreted peptide cytolysin PSMalpha3 versus human cells and E. coli by impacting its aggregation. The authors used sophisticated methods of structural analysis and described the associated liquid-liquid phase separation. They also compare the influence of RNA on the aggregation and activity of LL-37, which shows differences from that on PSMalpha3.

      Strengths:

      That RNA impacts PSM cytotoxicity when co-incubated in vitro becomes clear.

      Weaknesses:

      I have two major and fundamental problems with this study:

      (1) The premise, as stated in the introduction and elsewhere, that PSMalpha3 amyloids are biologically functional, is highly debatable and has never been conclusively substantiated. The property that matters most for the present study, cytotoxicity, is generally attributed to PSM monomers, not amyloids. The likely erroneous notion that PSM amyloids are the predominant cytotoxic form is derived from an earlier study by the authors that has described a specific amyloid structure of aggregated PSMalpha3. Other authors have later produced evidence that, quite unsurprisingly, indicated that aggregation into amyloids decreases, rather than increases, PSM cytotoxicity. Unfortunately, yet other groups have, in the meantime, published in-vitro studies on "functional amyloids" by PSMs without critically challenging the concept of PSM amyloid "functionality". Of note, the authors' own data in the present study, which show strongly decreased cytotoxicity of PSMalpha3 after prolonged incubation, are in agreement with monomer-associated cytotoxicity as they can be easily explained by the removal of biologically active monomers from the solution.

      We thank the reviewer for this important critique and agree that direct cytotoxicity is most plausibly mediated by soluble PSM species, while extensive fibrillation generally reduces toxicity by depleting these forms, a conclusion supported by our data and by other studies (e.g., Zheng et al 2018 and Yao et al 2019). We do not propose mature amyloid fibrils as the primary toxic entities. Rather, we use the term functional amyloid in a regulatory sense, consistent with other biological amyloids whose fibrillar states modulate activity (e.g., hormone storage amyloids or RNA-binding proteins).

      In line with emerging findings, we interpret PSMα3 toxicity as arising from a dynamic assembly process rather than from a single static molecular species. We previously showed that PSMα3 forms cross-α fibrils that are thermodynamically and mechanically less stable than cross-β amyloids and readily disassemble upon heat stress, fully restoring cytotoxic activity (Rayan et al., 2023). This behavior contrasts with PSMα1, which forms highly stable cross-β fibrils that do not recover activity after heat shock, suggesting that the limited thermostability of PSMα3 is an evolved feature enabling reversible switching between inactive (stored) and active states.

      Consistent with this view, both PSMα1 and PSMα3 are cytotoxic in their soluble states, yet mutants unable to fibrillate lose activity, indicating that fibrillation is required but not itself the toxic end state (Tayeb-Fligelman et al., 2017, 2020; Malishev et al., 2018). Our other studies further show that cytotoxicity toward human cells correlates with inherent or lipid-induced α-helical assemblies, rather than with inert β-sheet amyloids (Ragonis-Bachar et al., 2022, 2026; Salinas 2020; Bücker 2022). Together, these findings support a model in which membrane-associated, dynamic α-helical assembly, which requires continuous exchange between soluble species and growing fibrils, drives membrane disruption, potentially through lipid recruitment or extraction, analogous to mechanisms proposed for human amyloids such as islet amyloid polypeptide (Sparr et al., 2004).

      In the present study, we further show that RNA reshapes this dynamic landscape: while PSMα3 alone progressively loses activity upon incubation, co-incubation with RNA preserves cytotoxicity by stabilizing bioactive polymorphs and condensate-like states, whereas high RNA concentrations promote solid aggregation but nevertheless preserve activity. Thus, aggregation is neither inherently functional nor toxic, but context-dependent and environmentally regulated. Taken together, our data support a model in which PSMα3 amyloids act as a dynamic reservoir, enabling S. aureus to tune virulence by reversibly shifting between dormant and active states in response to environmental cues such as heat or RNA.

      This is now discussed in lines 56-76 and 523-553 of the revised version.

      (2) That RNA may interfere with PSM aggregation and influence activity is not very surprising, given that PSM attachment to nucleic acids - while not studied in as much detail as here - has been described. Importantly, it does not become clear whether this effect has biologically significant consequences beyond influencing, again not surprisingly, cytotoxicity in vitro. The authors do show in nice microscopic analyses that labeled PSMalpha3 attaches to nuclei when incubated with HeLa cells. However, given that the cells are killed rapidly by membrane perturbation by the applied PSM concentrations, it remains unclear and untested whether the attachment to nucleic acids in dying cells makes any contribution to PSM-induced cell death or has any other biological significance.

      We thank the reviewer for this important point and agree that PSM–nucleic acid interactions are not unexpected and that our data do not support a direct intracellular role for RNA binding in mediating cytotoxicity. Accordingly, we do not propose nucleolar or nuclear association of PSMα3 as a causal mechanism of cell death. At the concentrations used, PSMα3 induces rapid membrane disruption, and nucleic acid association is observed along with membrane attachment, precluding conclusions about intracellular function. This limitation is now explicitly clarified in the revised manuscript. The biological significance of our findings lies instead in extracellular and environmental contexts, where PSMα3 encounters abundant nucleic acids, such as RNA or DNA released from damaged host cells or present in biofilms, as now addressed in lines 622-631. Our data show that RNA modulates PSMα3 aggregation trajectories, shifting the balance between liquid-like condensates and solid aggregates, and thereby regulates the persistence and timing of cytotoxic activity. In this framework, RNA acts as a context-dependent regulator of virulence, rather than as an intracellular cytotoxic cofactor, an aspect that will be studied in depth in future work. This is now addressed in the text in lines 597-607 of the revised version.

      Reviewer #3 (Public review):

      Summary:

      The manuscript by Rayan et al. aims to investigate the role of RNA in modulating both virulent amyloid and host-defense peptides, with the objective of understanding their self-assembly mechanisms, morphological features, and aggregation pathways.

      Strengths:

      The overall content is well-structured with a logical flow of ideas that effectively conveys the research objectives.

      Weaknesses:

      (1) Figure 2 displays representative FRAP images demonstrating fluorescence recovery within seconds. To gain a more comprehensive understanding of how recovery after photobleaching varies under different conditions, it is recommended to supplement these images with corresponding quantitative fluorescence recovery curves for analysis.

      In response to this comment, we have supplemented the representative FRAP images with quantitative fluorescence recovery curves, reporting normalized recovery kinetics for the indicated conditions. These data are now provided in Figure S4 of the revised manuscript, allowing direct comparison of recovery behavior across conditions (shown by microscopy in Figure 2). In addition, we have included quantitative FRAP analyses for the cellular imaging shown in Figure 5 (presented in Figure S8) and for LL-37 assemblies formed under heat-shock conditions (Figure S12). Together, these additions provide a quantitative framework for interpreting the FRAP results and strengthen the distinction between liquid-like and solid-like assembly states.

      (2) Ostwald ripening typically leads to the shrinkage or even disappearance of smaller droplets, accompanied by the further growth of large droplets. However, the droplet size in Figure 2D decreases significantly after 2 h of incubation. This observation prompts the question, what is the driving force underlying RNA-regulated phase separation and phase transition?

      We thank the reviewer for this observation. Across multiple samples, we consistently observe a coexistence of small droplets and larger aggregates, rather than systematic growth of larger droplets at the expense of smaller ones or a uniform decrease in droplet size. In addition, the timescales examined do not allow us to reliably assess whether diffusion-driven droplet coalescence is fast enough to draw firm conclusions about droplet size evolution. This is now addressed in the text in lines 181-184 of the revised version.

      A decrease in droplet size over time is nevertheless observed in some instances and is more consistent with a time-dependent conversion of initially liquid-like condensates into more solid-like assemblies, which would reduce molecular mobility and suppress droplet coalescence. In parallel, progressive fibril formation may act as a sink for soluble peptide, leading to partial dissolution or shrinkage of less mature condensates. Together, these observations are consistent with a non-equilibrium aging process, in which RNA-regulated assemblies evolve from dynamic condensates toward more solid structures rather than following equilibrium Ostwald ripening.

      (3) The manuscript aims to study the role of RNA in modulating PSMα3 aggregation by using solution-state NMR to obtain residue-specific structural information. The current NMR data, as described in the Methods and figure captions, were recorded in the absence of RNA. Does RNA binding induce conformational changes in PSMα3, and how do these changes alter the NMR spectra? Also, the sequential NOE walk between neighboring residues could be annotated on the spectrum for clarity.

      The solution-state NMR experiments were performed specifically to characterize the potential binding of EGCG to PSMα3. Due to the strong tendency of PSMα3 to undergo rapid aggregation and line broadening upon RNA addition, solution-state NMR spectra in the presence of RNA could not be obtained at sufficient quality for residue-specific analysis. As suggested, we have updated and annotated the sequential NOE walk between neighboring residues on the relevant NOESY spectra to improve clarity.

      (4) The authors claim that LL-37 shares functional, sequence, and structural similarities with PSMα3. However, no droplet formation of LL-37 was observed in the presence of RNA alone. The authors then applied thermal stress to induce phase separation of LL-37. What are the main factors contributing to the different phase behaviors exhibited by LL-37 and PSMα3? What are the differences in the conformation of amyloid aggregates and the kinetics of aggregation between the condensation-induced aggregation in the presence of RNA and the conventional nucleation-elongation process in the absence of RNA for these two proteins?

      We appreciate this important question and have clarified both the basis of the comparison and the origin of the divergent phase behaviors of LL-37 and PSMα3. While PSMα3 and LL-37 share key properties as short, cationic, amphipathic α-helical peptides that self-assemble and interact with nucleic acids, they differ fundamentally in their assembly architectures. PSMα3 is an amyloidogenic peptide that forms cross-α amyloid fibrils, in which α-helices stack perpendicular to the fibril axis. In contrast, LL-37 can form fibrillar or sheet-like assemblies (observed in cryo grids), but these lack canonical amyloid features, showing no clear cross-α or cross-β amyloid order in the crystal structures determined so far. This is now clarified in different parts of the text of the revised version. Thus, the comparison between the two peptides is functional and physicochemical rather than implying identical amyloid mechanisms. These structural differences likely underlie their distinct phase behaviors.

      Because LL-37 does not follow a classical amyloid nucleation–elongation pathway, and high-resolution structural information (e.g., cryo-EM) is currently lacking, partly due to its sheet-like, non-twisted morphology (unpublished results), it is not possible to directly compare aggregation kinetics or nucleation mechanisms between LL-37 and PSMα3. It is possible that amyloidogenic systems such as PSMα3 exhibit greater flexibility in prefibrillar and fibrillar polymorphism, enabling RNA-regulated phase behavior, whereas non-amyloid assemblies such as LL-37 are more prone to stress-induced solid aggregation. We note that this interpretation is necessarily tentative and does not imply a general rule, but rather reflects differences evident in the present system.

    1. eLife Assessment

      In this valuable study, the authors present traces of bone modification on ~1.8 million-year-old proboscidean remains from Tanzania, which they infer to be the earliest evidence for stone-tool-assisted megafaunal consumption by hominins. Challenging published claims, the authors argue that persistent megafaunal exploitation roughly coincided with the earliest Acheulean tools. Notwithstanding the rich descriptive and spatial data, the behavioral inferences about hominin agency rely on traces (such as bone fracture patterns and spatial overlap) that are not unequivocal; the evidence presented to support the inferences thus remains incomplete. Given the implications of the timing and extent of hominin consumption of nutritious and energy-dense food resources, as well as of bone toolmaking, the findings of this study will be of interest to paleoanthropologists and other evolutionary biologists.

    2. Reviewer #1 (Public review):

      Domínguez-Rodrigo and colleagues make a moderately convincing case for habitual elephant butchery by Early Pleistocene hominins at Olduvai Gorge (Tanzania), ca. 1.8-1.7 million years ago. They present this at the site scale (the EAK locality, which they excavated), as well as across the penecontemporaneous landscape, analyzing a series of findspots that contain stone tools and large-mammal bones. The latter are primarily elephants, but giraffids and bovids were also butchered in a few localities. The authors claim that this is the earliest well-documented evidence for elephant butchery; doing so requires debunking other purported cases of elephant butchery in the literature, or in one case, reinterpreting elephant bone manipulation as being nutritional (fracturing to obtain marrow) rather than technological (to make bone tools). The authors' critical discussion of these cases may not command consensus, but it surely advances the scientific discourse. The authors conclude by suggesting that an evolutionary threshold was reached at ca. 1.8 Ma, at which regular elephant consumption (rich in fats and perhaps providing a food surplus), more advanced extractive technology (the Acheulian toolkit), and larger human group size coincided.

      The fieldwork and spatial statistics methods are presented in detail and are solid and helpful, especially the excellent description (all too rare in zooarchaeology papers) of bone conservation and preservation procedures. However, the methods of the zooarchaeological and taphonomic analysis - the core of the study - are peculiarly missing. Some of these are explained in passing throughout the manuscript, but not in a standard Methods paragraph with suitable references and an explicit account of how the authors recorded bone-surface modifications and the mode of bone fragmentation. This seems more of a technical omission that can be easily fixed than a true shortcoming of the study. The results are detailed and clearly presented.

      By and large, the authors achieved their aims, showcasing recurring elephant butchery in 1.8-1.7 million-year-old archaeological contexts. Nevertheless, some ambiguity surrounds the evolutionary significance part. The authors emphasize the temporal and spatial correlation of (1) elephant butchery, (2) Acheulian toolkits, and (3) larger sites, but do not actually discuss how these elements may be causally related. Is it not possible that larger group size or the adoption of Acheulian technology has nothing to do with megafaunal exploitation? Alternative hypotheses exist, and at the least, the authors should try to defend the causation, not just put forward the correlation. The only exception is a brief mention of food surplus as a "significant advantage", but how exactly, in the absence of food-preservation technologies? Moreover, in a landscape full of aggressive scavengers, such excess carcass parts may become a death trap for hominins, not an advantage. I do think that demonstrating habitual butchery bears very significant implications for human evolution, but more effort should be invested in explaining how this might have worked.

      Overall, this is an interesting manuscript of broad interest that presents original data and interpretations from the Early Pleistocene archaeology of Olduvai Gorge. These observations and the authors' critical review of previously published evidence are an important contribution that will form the basis for building models of Early Pleistocene hominin adaptation.

    3. Reviewer #2 (Public review):

      The authors argue that the Emiliano Aguirre Korongo (EAK) assemblage from the base of Bed II at Olduvai Gorge shows systematic exploitation of elephants by hominins about 1.78 million years ago. They describe it as the earliest clear case of proboscidean butchery at Olduvai and link it to a larger behavioral shift from the Oldowan to the Acheulean.

      The manuscript makes a valuable contribution to the Olduvai Gorge record, offering a detailed description of the EAK faunal assemblage. In particular, the paper provides a high-resolution record of a juvenile Elephas recki carcass, associated lithic artifacts, and several green-broken bone specimens. These data are inherently valuable and will be of significant interest to researchers studying Early Pleistocene taphonomy.

      Comments on previous round of revisions:

      The revised manuscript does a good job of using less definitive language, particularly by adding "possible" qualifiers to several interpretations. This addresses the concern about overstatement.

      The main issue raised in the original review, however, remains unresolved. Only two elephant bone specimens at EAK show green-bone breakage interpreted as anthropogenic, and the diagnostic basis for that interpretation is not demonstrated clearly on the EAK material itself. The manuscript discusses a suite of fracture attributes described as diagnostic of dynamic percussive breakage, but these attributes are not explicitly documented on the EAK specimens. Instead, the diagnostic traits are illustrated using material from other Olduvai contexts, and that behavior is then extrapolated to make similar claims at EAK. For a paper making a potentially important behavioral argument, the key diagnostic evidence is not clearly demonstrated at the focal assemblage.

      This problem is evident in the presentation of the EAK specimens. In their response, the authors state that one EAK specimen shows "overlapping scars" and constitutes a "long bone flake"; however, these features are not clearly identifiable in the figures or captions as currently presented. The authors state that Figures S21-S23 clearly indicate human agency, including a long bone flake with overlapping scars and a view of the medullary surface, but it is unclear which specimens or surfaces these descriptions refer to. Figure S21 does appear to show green fracture and is described only as an "elephant-sized flat bone fragment with green-bone curvilinear break." Figure S22 shows the same bone and cortical surface in a different orientation, providing no additional information. In Figure S23, I cannot clearly identify a medullary surface or evidence of green-bone fracture from this image. None of these images clearly demonstrates overlapping scars, and the figures would be substantially improved by explicitly identifying the features described in the text. Even if both EAK specimens are accepted as green-broken, they do not demonstrate the co-occurrence of multiple diagnostic fracture traits such as multiple green breaks, large step fractures, hackle marks, and overlapping scars that the authors state is required to attribute dynamic percussive activity to hominins and address equifinality.

      I appreciate that the authors are careful to state that spatial association between stone tools and fossils alone does not demonstrate hominin behavior, and that they treat the spatial analyses as supportive rather than decisive. While the association is intriguing, the problem is downstream: spatial association is used to strengthen an interpretation of butchery at EAK that still depends on fracture evidence that is not clearly documented at the assemblage level.

      The critique concerning Nyayanga is not addressed in the revision. The manuscript proposes alternative explanations for the Nyayanga material but does not demonstrate why these are more plausible than the interpretation advanced by Plummer et al. (2023). I am not arguing that the Nyayanga material should be accepted as butchery; rather, showing that trampling is possible does not establish it as more probable than cut marks. In contrast, the EAK material is treated as evidence of butchery on the basis of evidence that, in my opinion, is more limited and less clearly demonstrated. Even if this is not the authors' intention, the uneven treatment removes an earlier megafaunal case from the comparison and strengthens the case for interpreting EAK as marking a behavioral shift toward megafaunal butchery by excluding other early cases.

      While I remain concerned about how the EAK evidence is documented and interpreted, I think the manuscript is appropriate for publication and will generate useful discussion. Readers can then assess for themselves whether the available evidence supports the strength of the behavioral claims.

      [Editors' note: the authors are encouraged to make this version the Version of Record.]

    4. Author response:

      The following is the authors’ response to the previous reviews

      Reviewer #2 (Public review):

      This problem is evident in the presentation of the EAK specimens. In their response, the authors state that one EAK specimen shows "overlapping scars" and constitutes a "long bone flake"; however, these features are not clearly identifiable in the figures or captions as currently presented. The authors state that Figures S21-S23 clearly indicate human agency, including a long bone flake with overlapping scars and a view of the medullary surface, but it is unclear which specimens or surfaces these descriptions refer to. Figure S21 does appear to show green fracture and is described only as an "elephant-sized flat bone fragment with green-bone curvilinear break." Figure S22 shows the same bone and cortical surface in a different orientation, providing no additional information. In Figure S23, I cannot clearly identify a medullary surface or evidence of green-bone fracture from this image. None of these images clearly demonstrates overlapping scars, and the figures would be substantially improved by explicitly identifying the features described in the text. Even if both EAK specimens are accepted as green-broken, they do not demonstrate the co-occurrence of multiple diagnostic fracture traits such as multiple green breaks, large step fractures, hackle marks, and overlapping scars that the authors state is required to attribute dynamic percussive activity to hominins and address equifinality.

      We appreciate the reviewer’s careful evaluation of the EAK specimens. We acknowledge that the overlapping scars and medullary surface of the specimen originally shown in Figure S23 were not sufficiently clear. To address this, we have extensively revised Figure S23. In the updated Supplementary File, we have provided new annotations and line drawings that explicitly trace the outlines of the overlapping scars and clearly show the green-bone fracture features. These enhancements ensure that the diagnostic traits discussed in the text are now directly identifiable in the visual record. This demonstrates the co-occurrence of traits, namely green-broken outlines and overlapping scars, which meets the criteria for identifying dynamic percussive activity. This is so following Reviewer 2’s partial handling of our arguments, since we argued in our previous response that clear, simple green-broken elephant long limb bones were an anthropogenic signature per se, given that currently no durophagous predator/scavenger (including spotted hyenas) is able to produce them. Additional secondary features like hackle marks are supportive but not necessary to attribute human agency.

      I appreciate that the authors are careful to state that spatial association between stone tools and fossils alone does not demonstrate hominin behavior, and that they treat the spatial analyses as supportive rather than decisive. While the association is intriguing, the problem is downstream: spatial association is used to strengthen an interpretation of butchery at EAK that still depends on fracture evidence that is not clearly documented at the assemblage level.

      The association is inferred (not demonstrated) from the strong statistical spatial association between lithics and bones. Additional taphonomic evidence (like cut marks or green-broken bones) does further support the inference but does not demonstrate it, given the highly subjective nature of cut mark identification and the plethora of alternative scenarios: one green-broken bone would not demonstrate complete elephant butchery (it could result from a marginal exploitation of just that bone); one cut-marked bone could equally reflect several alternative access types to the remains. The reviewer recognized above the presence of green-broken elements at EAK; again, this supports anthropogenic agency better than any other alternative scenario, because one of the green-broken bones is a long bone and modern hyenas are not able to produce this kind of specimen.

      The critique concerning Nyayanga is not addressed in the revision. The manuscript proposes alternative explanations for the Nyayanga material but does not demonstrate why these are more plausible than the interpretation advanced by Plummer et al. (2023). I am not arguing that the Nyayanga material should be accepted as butchery; rather, showing that trampling is possible does not establish it as more probable than cut marks. In contrast, the EAK material is treated as evidence of butchery on the basis of evidence that, in my opinion, is more limited and less clearly demonstrated. Even if this is not the authors' intention, the uneven treatment removes an earlier megafaunal case from the comparison and strengthens the case for interpreting EAK as marking a behavioral shift toward megafaunal butchery by excluding other early cases.

      Again, it was never our intention to “demonstrate” anything. The reviewer is misusing this term. These types of arguments are epistemologically impossible to demonstrate. One can just discuss the heuristics of alternative scenarios. The point that we tried to make was that the Nyayanga purported cut marks on megafaunal remains are (as identified and published) impossible to differentiate from natural sedimentary abrasive marks (like trampling). Therefore, they cannot be argued to represent anthropogenic butchery on a secure basis, especially when they do not occur in conjunction with green-broken elements of clear dynamic loading nature.

    1. eLife Assessment

      Mitochondrial DNA (mtDNA) exhibits a degree of resistance to mutagenesis under genotoxic stress, and this study on the mitochondrial Transcription Factor A (TFAM) presents important data concerning the possible mechanisms involved. The presented data are solid, technically rigorous, and consistent with established literature findings. The experiments are well-executed, providing convincing evidence on the change of TFAM-DNA interactions following UVC irradiation.

    2. Reviewer #1 (Public review):

      Summary:

      The authors investigate how UVC-induced DNA damage alters the interaction between the mitochondrial transcription factor TFAM and mtDNA. Using live-cell imaging, qPCR, atomic force microscopy (AFM), fluorescence anisotropy, and high-throughput DNA-chip assays, they show that UVC irradiation reduces TFAM sequence specificity and increases mtDNA compaction without protecting mtDNA from lesion formation. From these findings, the authors suggest that TFAM acts as a "sensor" of damage rather than a protective or repair-promoting factor.

      Strengths:

      (1) The focus on UVC damage offers a clean system to study mtDNA damage sensing independently of more commonly studied repair pathways, such as oxidative DNA damage. The impact of UVC damage is not well understood in the mitochondria and this study fills that gap in knowledge.

      (2) In particular, the custom mitochondrial genome DNA chip provides high resolution mapping of TFAM binding and reveals a global loss of sequence specificity following UVC exposure.

      (3) The combination of in vitro TFAM-DNA biophysical approaches with cellular responses (gene expression, mtDNA turnover) provides a coherent multi-scale view.

      (4) The authors demonstrate that TFAM-induced compaction does not protect mtDNA from UVC lesions, an important contribution given assumptions about TFAM providing protection.

      Weaknesses:

      (1) The authors show a decrease in mtDNA levels and increased lysosomal colocalization but do not define the pathway responsible for degradation. Distinguishing between replication dilution, mitophagy, or targeted degradation would strengthen the interpretation and justify future experiments.

      (2) The manuscript briefly notes enrichment of TFAM at certain regions of the mitochondrial genome but provides little interpretation of why these regions are favored. Discussion of whether high-occupancy sites correspond to regulatory or structural elements would add valuable context.

      (3) The authors report a discrepancy between the anisotropy and binding array results. The reason for this is not clear, and one wonders if an orthogonal approach for the binding experiments would elucidate this difference (minor point).

      Assessment of conclusions:

      The manuscript successfully meets its primary goal of testing whether TFAM protects mtDNA from UVC damage and the impact this has on the mtDNA. While their data point to an intriguing model in which TFAM acts as a sensor of damaged mtDNA, this model requires further validation to be convincing, which likely warrants a follow-up study. Also, the biological impact of this compaction, such as altered transcription levels, is not clear in this study.

      Impact and utility of the methods:

      This work advances our understanding of how mitochondria manage UVC genome damage and proposes a structural mechanism for damage "sensing" independent of canonical repair. The methodology, including the custom TFAM DNA chip, will be broadly useful to the scientific community.

      Context: The study supports a model in which mitochondrial genome integrity is maintained not only by repair factors, but also by selective sequestration or removal of damaged genomes. The demonstration that TFAM compaction correlates with damage rather than protection reframes an interesting role in mtDNA quality control.

      Comments on revised version:

      The authors addressed all concerns during the revision.

    3. Reviewer #2 (Public review):

      Summary:

      King et al. present several sets of experiments aimed at addressing the potential impact of UV irradiation on human mitochondrial DNA, as well as the possible role of the mitochondrial TFAM protein in handling UV-irradiated mitochondrial genomes. The carefully worded conclusion, derived from the results of experiments performed with human HeLa cells, in vitro small plasmid DNA, PCR-generated human mitochondrial DNA, and UV-irradiated small oligonucleotides, is presented in the title of the manuscript: "UV irradiation alters TFAM binding to mitochondrial DNA". The authors also interpret the results of somewhat unconnected experimental approaches to speculate about "TFAM as a potential DNA damage sensing protein in that it promotes UVC-dependent conformational changes in the [mitochondrial] nucleoids, making them more compact". They further propose that such compaction might trigger removal of UV-damaged mitochondrial genomes as well as facilitate replication of undamaged mitochondrial genomes.

      Strengths:

      (1) Authors presented convincing evidence that a very high dose (1500 J/m2) of UVC applied to oligonucleotides covering the entire mitochondrial DNA genome alleviates sequence specificity of TFAM binding (Figure 3). This high dose was sufficient to cause UV-lesions in a large fraction of individual oligonucleotides. The method has been developed in the lab of one of the corresponding authors (ref. 74) and is technically well refined. This result can be published as is or in combination with other data.

      (2) The manuscript also presents AFM evidence (Figure 4) that TFAM, which has long been known to facilitate compaction of the mitochondrial genome (Alam et al., 2003; PMID 12626705 and follow-up citations), causes in vitro compaction of a small pUC19 plasmid, and that approximately 3 UVC lesions per plasmid molecule result in a slight, albeit detectable, increase in TFAM compaction of the plasmid.

      Both results are discussed in light of a possible extrapolation to in vivo phenomena. The revised version of the discussion includes a clear statement that no in vivo support was provided within the set of experiments presented in the manuscript.

      Weaknesses:

      The experiments presented in Figures 3 and 4 may support the speculation that TFAM can play a protective role by eliminating mitochondrial genomes with bulky lesions through excessive compaction and removal of damaged genomes from the in vivo pool; however, extensive additional studies that go well beyond the experiments described in this paper are needed to fill the gap between this set of results and the proposed explanations.

    4. Reviewer #3 (Public review):

      Summary

      The study is grounded in the observation that mitochondrial DNA (mtDNA) shows some resistance to mutagenesis under genotoxic stress. The manuscript focuses on the effects of UVC-induced DNA damage on TFAM-DNA binding in vitro and in cells. The authors demonstrate increased TFAM-DNA compaction following UVC irradiation in vitro, as assessed by high-throughput protein-DNA binding assays and atomic force microscopy (AFM). The authors did not observe a similar trend in fluorescence polarization assays and suggested differences in the extent of TFAM oligomerization as a potential reason. In cells, the authors found that UVC exposure increased mRNA levels of TFAM, POLG, and POLRMT without altering mitochondrial membrane potential. Overexpressing TFAM in cells or varying TFAM concentration in reconstituted nucleoids did not alter the accumulation or disappearance of mtDNA damage. Based on their data, the authors proposed a plausible model: following UVC-induced DNA damage, TFAM facilitates nucleoid compaction, which may signal damage in the mitochondrial genome. The proposed model may inspire follow-up studies on the role of TFAM in sensing UVC-induced damage.

      Comments on revised version:

      The authors have addressed the reviewer's concerns.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors investigate how UVC-induced DNA damage alters the interaction between the mitochondrial transcription factor TFAM and mtDNA. Using live-cell imaging, qPCR, atomic force microscopy (AFM), fluorescence anisotropy, and high-throughput DNA-chip assays, they show that UVC irradiation reduces TFAM sequence specificity and increases mtDNA compaction without protecting mtDNA from lesion formation. From these findings, the authors suggest that TFAM acts as a "sensor" of damage rather than a protective or repair-promoting factor.

      Strengths:

      (1) The focus on UVC damage offers a clean system to study mtDNA damage sensing independently of more commonly studied repair pathways, such as oxidative DNA damage. The impact of UVC damage is not well understood in the mitochondria, and this study fills that gap in knowledge.

      (2) In particular, the custom mitochondrial genome DNA chip provides high-resolution mapping of TFAM binding and reveals a global loss of sequence specificity following UVC exposure.

      (3) The combination of in vitro TFAM DNA biophysical approaches, combined with cellular responses (gene expression, mtDNA turnover), provides a coherent multi-scale view.

      (4) The authors demonstrate that TFAM-induced compaction does not protect mtDNA from UVC lesions, an important contribution given assumptions about TFAM providing protection.

      Weaknesses:

      (1) The authors show a decrease in mtDNA levels and increased lysosomal colocalization but do not define the pathway responsible for degradation. Distinguishing between replication dilution, mitophagy, or targeted degradation would strengthen the interpretation.

      We thank the reviewer for their careful reading of our manuscript and thoughtful suggestions. We agree that distinguishing between replication dilution, mitophagy, and/or targeted degradation would strengthen our understanding of how UV-induced DNA damage is handled in the mitochondria. We are currently undertaking experiments to tease this apart, but consider those experiments to be beyond the scope of this manuscript and expect to publish them in a subsequent paper. We added text explicitly stating that these possibilities are not distinguished by our results on pages 8-9 in the Discussion under the subsection ‘Mitochondria respond to UVC-induced mtDNA damage in the absence of apparent mitochondrial dysfunction’.

      (2) The sudden induction of mtDNA replication genes and transcription at 24 h suggests that intermediate timepoints (e.g., 12 hours) could clarify the kinetics of the response and avoid the impression that the sampling coincidentally captured the peak.

      We agree and have added additional timepoints of 12 hours and 18 hours post exposure. We have updated Figure 2 to include the new data and have added text on page 4 to include these results.

      (3) The authors report no loss of mitochondrial membrane potential, but this single measure is limited. Complementary assays such as Seahorse analysis, ATP quantification, or reactive oxygen species measurement could more fully assess functional integrity.

      We focused on membrane potential because loss of membrane potential is such a well-understood mechanism for triggering mitophagy, but agree that these additional measurements are useful. We have added experiments to assess ATP levels, but did not see changes; we have added these data to Figure 2. We have also added text highlighting that we previously assessed mtROS following the same levels of UV exposure and observed no changes (in the results section on page 5 and in the discussion section on page 9). Given that we observe no changes in membrane potential or ATP, we have opted not to move forward with Seahorse analysis for the purposes of this paper.

      (4) The manuscript briefly notes enrichment of TFAM at certain regions of the mitochondrial genome but provides little interpretation of why these regions are favored. Discussion of whether high-occupancy sites correspond to regulatory or structural elements would add valuable context.

      We agree a discussion of these findings provides context and insight into where the field is currently in understanding TFAM sequence specificity. We have updated text in the discussion (pages 9-10) to include our thoughts on the drivers of TFAM sequence specificity with regard to the discrepancy with the anisotropy data and the lack of overlap with regulatory/structural elements.

      (5) It remains unclear whether the altered DNA topology promotes TFAM compaction or vice versa. Addressing this directionality, perhaps by including UVC-only controls for plasmid conformation, would help disentangle these effects if UVC is causing compaction alone.

      We have added an additional control making this comparison and updated the text on page 7 in the results section. UVC by itself (without TFAM being present) does not alter the plasmid compaction; see new supplemental Figure S16.

      (6) The authors provide a discrepancy between the anisotropy and binding array results. The reason for this is not clear, and one wonders if an orthogonal approach for the binding experiments would elucidate this difference (minor point).

      The discrepancy between anisotropy and the binding array results is certainly unusual and contrary to previous studies that have used these arrays. In addition to the anisotropy experiments, we selected a ‘high occupancy’ and ‘low occupancy’ sequence from the binding array and performed oligomerization experiments using atomic force microscopy, which allowed us to detect small changes in cooperativity (see supplemental Figure S15). We previously only discussed this briefly in the results section on page 6, but we have now updated the discussion section (pages 9-10) to highlight this finding and put forth ideas for the field as to why we think this might be the case. While we do see that the binding array data aligns with oligomerization and cooperativity of TFAM, we still do not know what it is about these sequences that would drive such differences in TFAM binding, but we speculate that it could have something to do with flexibility of the DNA sequences.

      Assessment of conclusions:

      The manuscript successfully meets its primary goal of testing whether TFAM protects mtDNA from UVC damage and the impact this has on the mtDNA. While their data points to an intriguing model that TFAM acts as a sensor of damaged mtDNA, the validation of this model requires further investigation to make the model more convincing. This is likely warranted for a follow-up study. Also, the biological impact of this compaction, such as altering transcription levels, is not clear in this study.

      We have updated wording in the Abstract, Introduction, and elsewhere in the text (as detailed in other portions of our response) to make as explicit and clear as possible which results are supported by the in vitro versus in vivo data, and which parts are conclusions supported by the data versus hypothesized models to be tested in future work.

      Impact and utility of the methods:

      This work advances our understanding of how mitochondria manage UVC genome damage and proposes a structural mechanism for damage "sensing" independent of canonical repair. The methodology, including the custom TFAM DNA chip, will be broadly useful to the scientific community.

      Context:

      The study supports a model in which mitochondrial genome integrity is maintained not only by repair factors, but also by selective sequestration or removal of damaged genomes. The demonstration that TFAM compaction correlates with damage rather than protection reframes an interesting role in mtDNA quality control.

      Reviewer #2 (Public review):

      Summary:

      King et al. present several sets of experiments aimed to address the potential impact of UV irradiation on human mitochondrial DNA as well as the possible role of mitochondrial TFAM protein in handling UV-irradiated mitochondrial genomes. The carefully worded conclusion derived from the results of experiments performed with human HeLa cells, in vitro small plasmid DNA, with PCR-generated human mitochondrial DNA, and with UV-irradiated small oligonucleotides is presented in the title of the manuscript: "UV irradiation alters TFAM binding to mitochondrial DNA". The authors also interpret results of somewhat unconnected experimental approaches to speculate that "TFAM is a potential DNA damage sensing protein in that it promotes UVC-dependent conformational changes in the [mitochondrial] nucleoids, making them more compact." They further propose that such a proposed compaction triggers the removal of UV-damaged mitochondrial genomes as well as facilitates replication of undamaged mitochondrial genomes.

      Strengths:

      (1) The authors presented convincing evidence that a very high dose (1500 J/m2) of UVC applied to oligonucleotides covering the entire mitochondrial DNA genome alleviates sequence specificity of TFAM binding (Figure 3). This high dose was sufficient to cause UV lesions in a large fraction of individual oligonucleotides. The method was developed in the lab of one of the corresponding authors (reference 74) and is technically well-refined. This result can be published as is or in combination with other data.

      (2) The manuscript also presents AFM evidence (Figure 4) that TFAM, which was long known to facilitate compaction of the mitochondrial genome (Alam et al., 2003; PMID 12626705 and follow-up citations), causes in vitro compaction of a small pUC19 plasmid and that approximately 3 UVC lesions per plasmid molecule result in a slight, albeit detectable, increase in TFAM compaction of the plasmid. Both results can be discussed in line with a possible extrapolation to in vivo phenomena, but such a discussion should include a clear statement that no in vivo support was provided within the set of experiments presented in the manuscript.

      We thank this reviewer for their careful reading and interpretation of the manuscript. We agree that discussion of in vivo implications and extrapolations needs clear statements indicating where there is not currently in vivo support. We have updated the text throughout the paper to include this.

      Weaknesses:

      Besides the experiments presented in Figures 3 and 4, the other results neither support nor contradict the speculation that TFAM can play a protective role, eliminating mitochondrial genomes with bulky lesions by way of excessive compaction and removing damaged genomes from the in vivo pool.

      To specify these weaknesses:

      (1) Figure 1 - presents evidence that UVC causes a reduction in the number of mitochondrial spots in cells. The role of TFAM is not assessed.

      We are working to understand the role of TFAM in vivo following UV irradiation, but believe that work should be included in follow up studies rather than this publication.

      (2) Figure 2 - presents evidence that UVC causes lesions in mitochondrial genomes in vivo, detectable by qPCR. The role of TFAM in damage repair or mitochondrial DNA turnover is not directly assessed, despite the statements in the title of Figure 2 or in the associated text. An approximately 2-fold change in expression of TFAM and of the three other genes does not provide reasonable support for the suggestion of increased mitochondrial DNA turnover over multiple alternative explanations related to mitochondrial DNA maintenance.

      We agree and have updated the title of Figure 2 to better reflect the findings outlined in the figure as well as the text.

      The new title is, “UVC causes mtDNA damage that decreases over time and is associated with upregulation of mtDNA replication genes, in the absence of apparent mitochondrial dysfunction.”

      We agree that there are numerous mechanistic hypotheses that could explain the decrease in mtDNA damage over time. In Figure 1, we show that there is an overall decrease in mtDNA spots, and an increase in mtDNA-lysosome colocalization, suggestive of mtDNA degradation, which could serve to remove damaged genomes. One possibility is that TFAM is playing a role in the damage removal (but not repair per se, as these lesions are not repaired). Another is a change in mtDNA turnover via an increase in the replication machinery in order to synthesize non-damaged mtDNA molecules that dilute out the damage. These and other possibilities are not mutually exclusive. We have added text (pages 8-9) to make explicit that additional work will be required to distinguish these possibilities. We note that we have also added an additional experiment showing that TFAM knockdown affects mtDNA damage at baseline, as well as after UVC exposure (Figure 5J).

      (3) Figure 5 shows that TFAM does not protect either mitochondrial nucleoids formed in vitro or mitochondrial DNA in vivo from UVC lesions, and has no effect on in vivo repair of UV lesions.

      We agree that Figure 5 shows that TFAM does not protect DNA from UVC-induced lesions, and that a roughly 2-fold increase in TFAM protein does not alter damage reduction over time. We have added new data showing that in vivo, knockdown of TFAM results in an increase in baseline (control conditions) mtDNA damage, and also alters the rate of decrease of mtDNA damage over time after UVC (Figure 5J).

      (4) Figure 6: Based on the above analysis, the model of the role of TFAM in sensing mtDNA damage and elimination of damaged genomes in vivo appears unsupported.

      We have updated the legend for Figure 6 in which we outline our hypothesized role of TFAM in sensing mtDNA damage to ensure that readers know this has yet to be fully tested in vivo. We have also updated the Figure legend title from “proposed model” to “hypothesized model,” and changed the wording in the conclusion section (page 11) to highlight more clearly that this is a working model.

      (5) Additional concern about Figure 3 and relevant discussion: It is not clear if more uniform TFAM binding to UV irradiated oligonucleotides with varying sequence as compared to non-irradiated oligonucleotides can be explained by just overall reduced binding eliminating sequence specific peaks.

      We do not believe this is the case given the similar KD values for the sequences tested. In our hands and in other publications (reviewed in PMID: 34440420), it has been well established that TFAM binds damaged DNA very well, essentially just as well as nondamaged DNA or better.

      Additionally, a reduction in overall binding on these DNA arrays tends to make sequence specific peaks more apparent. We ran our experiments at both 30 nM and 300 nM TFAM specifically to be able to assess this question. The 300 nM data can be found in supplemental Figure S7. In this figure, we notice that the peaks appear more uniform at the high concentration (comparing Figure 3A to Figure S7A). That is presumably because there is so much more binding happening across the array that the peaks associated with the strongest binders become less pronounced. For the sake of brevity, we have not added this reasoning to the text, but are willing to do so if the Reviewers and Editor feel that it is important to include.

      Reviewer #3 (Public review):

      Summary:

      The study is grounded in the observation that mitochondrial DNA (mtDNA) exhibits a degree of resistance to mutagenesis under genotoxic stress. The manuscript focuses on the effects of UVC-induced DNA damage on TFAM-DNA binding in vitro and in cells. The authors demonstrate increased TFAM-DNA compaction following UVC irradiation in vitro based on high-throughput protein-DNA binding and atomic force microscopy (AFM) experiments. They did not observe a similar trend in fluorescence polarization assays. In cells, the authors found that UVC exposure upregulated TFAM, POLG, and POLRMT mRNA levels without affecting the mitochondrial membrane potential. Overexpressing TFAM in cells or varying TFAM concentration in reconstituted nucleoids did not alter the accumulation or disappearance of mtDNA damage. Based on their data, the authors proposed a plausible model that, following UVC-induced DNA damage, TFAM facilitates nucleoid compaction, which may serve to signal damage in the mitochondrial genome.

      Strengths:

      The presented data are solid, technically rigorous, and consistent with established literature findings. The experiments are well-executed, providing reliable evidence on the change of TFAM-DNA interactions following UVC irradiation. The proposed model may inspire future follow-up studies to further study the role of TFAM in sensing UVC-induced damage.

      Weaknesses:

      The manuscript could be further improved by refining specific interpretations and ensuring terminology aligns precisely with the data presented.

      (1) In line 322, the claim of increased "nucleoid compaction" in cells should be removed, as there is a lack of direct cellular evidence. Given that non-DNA-bound TFAM is subject to protease digestion, it is uncertain to what extent the overexpressed TFAM actually integrates into and compacts mitochondrial nucleoids in the absence of supporting immunofluorescence data.

      We would like to thank this reviewer for their comments and suggestions. We feel these specific language changes have strengthened the interpretability of the text. The TFAM overexpression cells used in this experiment were given to us by Isaac et al., who demonstrated that when TFAM was overexpressed in this specific cell line, the nucleoids were indeed more compact, measured by Fiber-seq (Isaac et al., 2024; PMID: 38347148). We have removed the claim “increased compaction” from the section title, Figure 5 legend title, and from line 322 (now on page 8), and have also added an additional sentence to ensure the reader knows these cells have been shown to have presumed increased compaction by other groups.

      (2) In lines 405 and 406, the authors should avoid equating TFAM overexpression with compaction in the cellular context unless the compaction is directly visualized or measured.

      We have updated the text to ensure that it is clear that this was tested by other groups. We also changed the wording to “inaccessible (presumably compacted) nucleoids.” While we did not demonstrate altered compaction in our study, we think that based on the results from Isaac et al., it is likely that there was increased compaction. In addition, some readers might not have the context to make the connection between compaction and accessibility, so eliminating all reference to compaction could obscure the point.

      (3) In lines 304 and 305 (and several other places throughout the manuscript), the authors use the term "removal rates". A "removal rate" requires a direct comparison of accumulated lesion levels over a time course under different conditions. Given the complexity of UV-induced DNA damage, which involves both damage formation and potential removal via multiple pathways, a more accurate term that reflects the net result of these opposing processes is "accumulated DNA damage levels." This terminology better reflects the final state measured and avoids implying a single, active 'removal' pathway without sufficient kinetic data.

      We agree and have updated the language throughout the text as well as the results heading for this section.

      (4) In line 357, the authors refer to the decrease in the total DNA damage level as "The removal of damaged mtDNA". The decrease may be simply due to the turnover and resynthesis of non-damaged mtDNA molecules. The term "removal" may mislead the casual reader into interpreting the effect as an active repair/removal process.

      We agree and have restructured this sentence for clarity. We do believe there is some removal happening, given the increase in mtDNA colocalization in lysosomes alongside decrease of mtDNA spots in our live cell imaging. We have written it to reflect the inclusion of removal and resynthesis of nondamaged mtDNA molecules (see pages 8-9).

      Recommendations for the authors:

      Reviewing Editor Comments:

      The reviewers appreciate the quality of the presented data but concur that they do not support the primary claims in the title and abstract. The reviewers also realize that in vivo evidence for the model would require extensive new experimentation that goes beyond a reasonable revision. The recommendation is to change the title and significantly revise text, figure titles and legends for transparency, and conclusions within results and discussion sections.

      We thank the editor and all the reviewers for their feedback. We have added additional experiments, updated text throughout the entire paper to ensure our claims are supported, and revised our title. We feel that the changes we have made have indeed made the paper stronger, more transparent, and that the evidence put forth in this paper provides support for all claims made.

      Reviewer #1 (Recommendations for the authors):

      (1) Clarify mitochondrial response kinetics by adding an intermediate (e.g., 12 hrs) recovery timepoint for transcriptional analysis to resolve when TFAM and replication genes are induced.

      We have added additional timepoints of 12 and 18 hours following exposure in Figure 2. These results strengthen our finding that the nuclear transcriptional program supporting mtDNA replication appears to be activated prior to the nuclear transcriptional program supporting mitochondrial transcription, in that POLG and TFAM come up before POLRMT and ND1.

      (2) Strengthen functional readouts by assessing additional parameters of mitochondrial function to substantiate the claim that UVC does not impair mitochondrial performance.

      We have referenced our previously-published data on mtROS and added a measurement of ATP following UVC exposure in Figure 2.

      (3) Consider exploring whether mtDNA degradation occurs via mitophagy, nucleoid-phagy, or another pathway, potentially by using inhibitors or markers of these processes.

      While we agree that this is an important follow up question and are currently working on experiments to address this, those experiments are outside the scope of this manuscript.

      (4) Provide additional details for the high occupancy TFAM sites. Provide brief annotation or discussion of genomic regions showing strong TFAM binding under non-irradiated conditions that are lost during UVC treatment. This would be helpful to the field as a whole.

      We have updated our discussion section to include this.

      (5) Include or discuss a control using UVC-irradiated pUC19 without TFAM to confirm that the observed compaction categories are TFAM-dependent rather than a UVC-induced DNA distortion.

      We have added a supplemental figure (Figure S16) containing a comparison of the area analysis of control pUC19 and UV-irradiated pUC19, and we have added associated text in the results section of the paper.

      (6) It would be interesting to explore the link between compaction and transcriptional output. In the TFAM overexpression model, the authors could measure expression of mtDNA-encoded transcripts (e.g., ND1, COX1) to connect increased compaction with altered mitochondrial transcription.

      While we agree that understanding how the compactional status alters mitochondrial transcription is worthwhile, we believe this is beyond the scope of this paper. Furthermore, this connection has previously been shown by Bruser et al., 2021 (PMID: 34818548) who showed that more compact nucleoids are not undergoing active transcription. It will be interesting to see in future work if mtDNA damage drives changes in both compaction as well as transcriptional activity.

      (7) Clarify quantitative presentation in figure 2F to explicitly note whether the observed increase in fluorescence intensity was statistically insignificant and confirm that the assay sensitivity is sufficient to detect small potential changes. As presented it is not clear if there is a change.

      We have changed the presentation of Figure 2F. There is a slight increase in membrane potential at the 24-hour time point and we have made that clear in the text as well. We included FCCP as a (standard) positive control, for which we can detect the associated decrease in membrane potential. While it is always possible that a very small decrease occurred that we were unable to detect, we note that none of the six UVC-exposed groups that we tested even trended towards a decrease in MMP, making it less likely that there was an effect that we simply lacked the power or sensitivity to detect.

      (8) It would be interesting if the authors can comment on whether TFAM induced compaction after UVC might shield mtDNA from other, repairable lesions (e.g., oxidative or alkylation damage), offering a broader context for this mechanism beyond just UVC.

      In theory, we believe this is possible. It will also be interesting to see if the increased compaction following UVC also protects or shields the mtDNA from other enzymatic processes, such as repair proteins that may be searching for repairable lesions such as oxidative or alkylation damage. In this case, it seems as though the increased compaction would prevent the repair from happening at genomes harboring damage.

      In this study we show with our in vitro nucleoids that the increased compaction does not protect against UVC, but this is likely because UVC does not need physical access to the DNA in order to damage it, as the wavelengths of UVC (centered in this case at 254nm) are readily absorbed by proteins and thus can go right through the proteins. Currently, we know that increased compaction by TFAM makes the DNA inaccessible to the enzymes required to methylate DNA used in Fiber-seq (PMID: 38347148), but we do not know if the compaction is tight enough to prevent ROS or alkylating agents from damaging the DNA. We have updated text in the discussion on page 10 to highlight some of these ideas.

      Reviewer #2 (Recommendations for the authors):

      Please, go over all display items and text and clarify details that can help readers to understand important specifics of the experiments. Examples are provided below:

      (1) Abstract and Introduction - indicate species and cell line

      We have updated the text to include this information.

      (2) Table 1 "TFAM KD measurements"- title and footnotes are entirely cryptic. Please, clarify the experimental design, question(s) addressed and conclusions drawn from data.

      We have updated the title of Table 1 to “Binding of TFAM to array sequences, measured using fluorescence anisotropy,” and clarified the footnotes to make sure it is clear which sequences were selected for AFM oligomerization experiments.

      (3) Figure 3 and Material and Methods - specify UVC dose.

      We have added this information to both the figure legend and the methods section.

      (4) Figure 4 - specify UVC dose.

      We have added this information to the figure legend.

      (5) Figure 5, panel B: indicate which band is TFAM and which is HA-tagged. Indicate clearly which panels show in vivo or in vitro results.

      We have updated the figure to label the untagged TFAM and HA-tagged TFAM and changed the panel titles to specify if they are in vivo results.

    1. eLife Assessment

      This valuable study provides convincing evidence that MgdE, a conserved mycobacterial nucleomodulin, downregulates inflammatory gene transcription by interacting with the histone methyltransferase COMPASS complex and altering histone H3 lysine methylation. This work will interest microbiologists as well as cell and cancer biologists.

    2. Reviewer #1 (Public review):

      Summary:

      This fundamental study identifies a new mechanism in which a mycobacterial nucleomodulin manipulates the host histone methyltransferase COMPASS complex to promote infection. Although other intracellular pathogens are known to manipulate histone methylation, this is the first report demonstrating specific targeting of the COMPASS complex by a pathogen. The rigorous experimental design, which uses state-of-the-art bioinformatic analysis, protein modeling, molecular and cellular interaction and functional approaches, and culminates in in vivo infection modeling, provides convincing, unequivocal evidence that supports the authors' claims. This work will be of particular interest to cellular microbiologists working on microbial virulence mechanisms and effectors, specifically nucleomodulins, and to cell/cancer biologists who examine COMPASS dysfunction in cancer biology.

      Strengths:

      (1) The strengths of this study include the rigorous and comprehensive experimental design that involved numerous state-of-the-art approaches to identify potential nucleomodulins, define molecular nucleomodulin-host interactions, cellular nucleomodulin localization, intracellular survival, and inflammatory gene transcriptional responses, and confirmation of the inflammatory and infection phenotype in a small animal model.

      (2) The use of bioinformatic, cellular and in vivo modeling that are consistent and support the overall conclusions is a strength of the study. In addition, the rigorous experimental design and data analysis, including the supplemental data provided, further strengthen the evidence supporting the conclusions.

      Comments on revisions:

      The authors have previously addressed the weaknesses that were identified by this reviewer by providing rational explanation and specific references that support the findings and conclusions.

    3. Reviewer #2 (Public review):

      Summary:

      The manuscript by Chen et al addresses an important aspect of pathogenesis for mycobacterial pathogens, seeking to understand how bacterial effector proteins disrupt the host immune response. To address this question the authors sought to identify bacterial effectors from M. tuberculosis (Mtb) that localize to the host nucleus and disrupt host gene expression as a means of impairing host immune function. Their revised manuscript has strengthened their observations by performing additional experiments with BCG strains expressing tagged MgdE.

      Strengths:

      The researchers conducted a rigorous bioinformatic analysis to identify secreted effectors containing mammalian nuclear localization signal (NLS) sequences, which formed the basis of quantitative microscopy analysis to identify bacterial proteins that had nuclear targeting within human cells. The study used two complementary methods to detect protein-protein interaction: yeast two-hybrid assays and reciprocal immunoprecipitation (IP). The combined use of these techniques provides strong evidence of interactions between MgdE and SET1 components and suggests the interactions are in fact direct. The authors also carried out rigorous analysis of changes in gene expression in macrophages infected with MgdE mutant BCG. They found strong and consistent effects on key cytokines such as IL6 and CSF1/2, suggesting that nuclear-localized MgdE does in fact alter gene expression during infection of macrophages. The revised manuscript contains additional biochemical analyses of BCG strains expressing tagged MgdE that further support their microscopy findings.

    4. Reviewer #3 (Public review):

      In this study, Chen L et al. systematically analyzed the mycobacterial nucleomodulins and identified MgdE as a key nucleomodulin in pathogenesis. They found that MgdE enters the host cell nucleus through two nuclear localization signals, KRIR108-111 and RLRRPR300-305, and then interacts with the COMPASS complex subunits ASH2L and WDR5 to suppress H3K4 methylation-mediated transcription of pro-inflammatory cytokines, thereby promoting mycobacterial survival.

      Comments on revisions:

      The authors have adequately addressed the previous concerns through additional experimentation. The revised data robustly support the main conclusions, demonstrating that MgdE engages the host COMPASS complex to suppress H3K4 methylation, thereby repressing pro-inflammatory gene expression and promoting mycobacterial survival. This work represents a significant conceptual advance.

    5. Author response:

      The following is the authors’ response to the previous reviews

      Reviewer #2 (Recommendations for the authors):

      Major:

      Over-interpretation of data. There are a few instances of this:

      The authors claim "Our work shows that MgdE interacts with both WDR5 and ASH2L and inhibits the methyltransferase activity of the COMPASS complex" (Line 318). However, they provide no biochemical analysis of methyltransferase activity to support this claim. While they cite Figure 4A-C and Figure 5, these data simply show (slightly) decreased cellular levels of H3K4Me. There are multiple ways H3K4Me could decrease including blocking recruitment of COMPASS to promoters or the enzymatic activity of MgdE itself.

      The data related to H3K4Me changes (Figure 5D) are difficult to interpret in light of the controls they now provide. Examining the blot itself, there seems to be a massive increase in H3K4Me in control cells expressing GFP that is not reflected in the quantification, which shows only a ~2x increase in GFP-expressing cells. In addition, there is very little decrease in H3K4Me in the MgdE-expressing cells relative to controls or site-mutant (no change apparent visually and ~10% change per their quantification). However, the authors interpret this as "revealed that cells expressing WT MgdE exhibited lower levels of H3K4me3". In both these cases I would recommend the authors consider modifying their interpretation of the data.

      We thank the reviewer for the comment.

      (1) We have now revised this interpretation in the manuscript as follows:

      Lines 311-312: “Our work shows that MgdE interacts with both WDR5 and ASH2L, leading to a decrease in H3K4me3 levels.”

      (2) Figure 5D presents the results of three independent biological replicates. The bar graph shows the average signal intensity of H3K4me3 normalized to the corresponding loading controls. Accordingly, we have revised the analysis and description of the experimental results.

      Lines 214-217: “Immunoblot analysis of nuclear extracts showed that cells expressing WT MgdE had ~25% lower H3K4me3 levels than EGFP-expressing cells and ~40% lower levels than those expressing the D244A/H47A mutant (Figure 5D).”

      Minor

      What is "CK"? Please clarify (Figure 2F).

      We thank the reviewer for the comment. In this context, "CK" refers to the uninfected control group, which serves as the negative control in the experiment. We have revised the label in Figure 2F.

      How many times was the BCG mouse experiment performed? This should be indicated in the figure legend (Figure 7A).

      We thank the reviewer for the comment. The BCG mouse experiment was performed once, and we have added this information to the figure legend of Figure 7A.

      It is unclear why the secreted protein (after signal peptide removal) migrates at the same size as the full-length protein (Figure S2).

      We thank the reviewer for the comment. After translation in the cytoplasm, the precursors of secreted proteins are immediately translocated into the periplasm. Therefore, most of the MgdE or Ag85B obtained from the whole-cell lysate (Figure S2A) has already had its signal peptide removed. This has also been validated for Rv0455c secretion by Mtb (Zhang et al., Nature Communications, 2022). This explains why MgdE (or Ag85B) proteins from whole-cell lysates and from supernatants show the same size in SDS-PAGE gels.

      It is still unclear why the transcripts with very little fold-change in expression (in grey) have the most significant p-values for being different (Figure 6).

      We thank the reviewer for the comment. The p-value calculation takes into account not only the magnitude of expression change but also the consistency of expression levels within each group and the number of biological replicates. When the variation among replicates is minimal, even a small difference in group means can result in a statistically significant p-value. In our RNA-seq analysis, we used DESeq2 with three biological replicates per group. DESeq2 employs a model based on the negative binomial distribution and accounts for multiple factors, including the mean expression level, within-group variance (dispersion), sample size, and normalization accuracy. As a result, it is common to observe that genes with small variability and strong consistency between replicates may show significant p-values even with modest fold changes. Conversely, genes with larger fold changes but greater variability might not reach statistical significance.
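      The relationship described above between replicate consistency and significance can be illustrated with a minimal sketch. It uses Welch's t statistic on made-up normalized counts rather than the negative binomial model that DESeq2 fits, so it is only an analogy under stated assumptions: all values, variable names and helper functions below are hypothetical and are not taken from the study.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples (b minus a)."""
    var_a = statistics.variance(group_a)  # sample variance, n - 1 denominator
    var_b = statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (statistics.mean(group_b) - statistics.mean(group_a)) / standard_error


def log2_fold_change(group_a, group_b):
    """log2 ratio of the group means (b over a)."""
    return math.log2(statistics.mean(group_b) / statistics.mean(group_a))


# Hypothetical normalized counts, three replicates per group.
# "Tight" gene: small change in the mean, very consistent replicates.
tight_control = [1000.0, 1005.0, 995.0]
tight_treated = [1100.0, 1104.0, 1096.0]
# "Noisy" gene: large change in the mean, highly variable replicates.
noisy_control = [100.0, 400.0, 250.0]
noisy_treated = [900.0, 300.0, 1500.0]

tight_lfc = log2_fold_change(tight_control, tight_treated)
noisy_lfc = log2_fold_change(noisy_control, noisy_treated)
tight_t = welch_t(tight_control, tight_treated)
noisy_t = welch_t(noisy_control, noisy_treated)

print(f"tight gene: log2FC = {tight_lfc:.2f}, t = {tight_t:.1f}")
print(f"noisy gene: log2FC = {noisy_lfc:.2f}, t = {noisy_t:.1f}")
```

      In this toy case the gene with the smaller fold change has the much larger test statistic, because its standard error is small. A larger statistic at the same sample size corresponds to a smaller p-value, which is the pattern the reviewer asked about.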

      Reference

      Zhang L, Kent JE, Whitaker M, Young DC, Herrmann D, Aleshin AE, Ko YH, Cingolani G, Saad JS, Moody DB, Marassi FM, Ehrt S, Niederweis M (2022) A periplasmic cinched protein is required for siderophore secretion and virulence of Mycobacterium tuberculosis. Nat Commun 13(1):2255.

    1. eLife Assessment

      In their valuable study, Beaudet, Berger and Hendricks provide a mechanistic link between disease-associated tau hyperphosphorylation, loss of cooperative tau envelope formation on microtubules, and dysregulation of axonal transport prior to aggregation. Using complementary in vitro reconstitution and human iPSC-derived neuronal assays with phosphodeficient and phosphomimetic tau constructs targeting 14 disease-relevant sites, the authors convincingly show that phosphorylation state alters tau organization on microtubules and differentially impacts kinesin- and lysosome-based transport. The evidence is solid and well aligned with the conclusions, yet the work could be further strengthened by incorporating additional controls and motor-specific assays to refine the mechanistic depth.

    2. Reviewer #1 (Public review):

      Summary:

      This work by Beaudet and colleagues aims at exploring the effect of phosphorylation on the formation of tau envelopes and consequently on axonal transport, both in vitro on reconstituted microtubules and in human excitatory neurons derived from iPSCs.

      The authors found that a relatively widely used construct in which 14 serine or threonine residues, often hyperphosphorylated in Alzheimer's disease, are mutated to alanines (phosphodeficient), increases the density of tau envelopes compared to wildtype tau, whereas a phosphomimetic (same residues mutated to glutamic acid) reduces envelope density both in vitro and in human excitatory neurons derived from iPSCs.

      By analysing the trafficking of different kinesins (KIF1A and KIF5C), they observed different effects of tau phosphorylation status on the movement of these two motors.

      They then analyse transport of lysosomes by employing live imaging of lysotracker in human excitatory neurons derived from iPSCs transfected with wildtype, phosphodeficient or phosphomimetic tau, observing that phosphodeficient tau seems to reduce transport of lysosomes while phosphomimetic increases transport compared to wildtype tau.

      Strengths:

      (1) The work aims to study a novel and underexplored topic in the tau field, tau envelopes, and investigate their relevance to Alzheimer's disease pathology.

      (2) Experiments are well conducted and of high quality.

      Weaknesses:

      Relying only on in vitro reconstituted microtubules and human neurons derived from iPSCs leaves some doubts about the relevance of these results for Alzheimer's disease, considering the embryonic state of iPSC-derived neurons.

    3. Reviewer #2 (Public review):

      This manuscript examines how disease-associated hyperphosphorylation disrupts tau's role as a cooperative microtubule-binding regulator of intracellular transport. Using in vitro reconstitution assays and live-cell imaging in iPSC-derived neurons, the authors employ phosphomutant tau constructs (E14 to mimic hyperphosphorylation, AP to prevent phosphorylation) at 14 disease-associated residues to isolate phosphorylation effects independent of expression system-dependent PTM heterogeneity. The results show that hyperphosphorylated tau fails to form cooperative envelope-like structures on microtubules, instead binding diffusely and dissociating rapidly. In contrast, wild-type and phospho-resistant tau form cohesive envelopes that regulate motor protein access. At the single-molecule level, hyperphosphorylation reduces KIF5C inhibition while maintaining or enhancing KIF1A inhibition through altered processivity and detachment rates. In live neurons, hyperphosphorylated tau phenocopies tau knockout conditions, weakening tau-mediated inhibition of lysosome transport and increasing processive motility. The authors quantify tau binding using Gaussian mixture model-based image analysis and measure tau kinetics via FRAP, demonstrating that hyperphosphorylation-induced loss of cooperative binding correlates with dysregulated organelle transport. These findings establish a mechanism by which phosphorylation-driven disruption of tau's gatekeeper function on microtubules compromises axonal transport prior to aggregation in tauopathies. The paper provides interesting new knowledge for the field, but there are outstanding concerns that could be further addressed by the authors to strengthen and clarify the current manuscript:

      (1) Lack of Phosphatase-Treated Control and Explicit WT Phosphorylation Quantification

      Wild-type tau expressed in insect and mammalian cells is known to be phosphorylated by endogenous kinases (e.g., GSK3, CDK5, MARK). The manuscript acknowledges this in the Discussion but provides no phosphatase-treated lysate control or quantification of endogenous phosphorylation on WT tau via phospho-specific Western blots. This leaves ambiguity about whether observed differences between WT and E14 reflect purely the introduced mutations or confounding baseline differences in phosphostate content.

      (2) Limited Normalization of Motor Effects to Measured Tau Lattice Occupancy

      Although kinesin trajectories are classified inside vs. outside tau envelopes (inherently normalizing to local tau density), motor parameters are not systematically reported as functions of tau fluorescence intensity across all constructs. Co-purifying MAPs or microtubule-modifying enzymes in cell lysates is not quantified or excluded, leaving residual uncertainty about tau-specificity of observed motor inhibition. This should be at least acknowledged in the results section.

      (3) Insufficient Citation of Prior Neuronal Tau Envelope Evidence

      In the Introduction, the authors state, "it was an open question if tau forms envelopes in neurons," but this understates existing evidence. Tan et al. (2019) report tau neuronal staining consistent with envelope formation, while Siahaan et al. (2021) provide more direct evidence in non-neuronal cells. The framing should acknowledge and integrate these prior findings.

      (4) Unclear Wording on Expression System-Dependent Phosphorylation

      The sentence "The phosphostate of tau is strongly dependent on the expression system" requires rewording. It is ambiguous whether this refers to the final phosphostate achieved after expression or the inherent phosphorylating capacity of each system. Clearer language would strengthen the methodological justification.

      (5) Insufficient Quantification of Motor and Lysosome Transport Effect Magnitudes in Results Section

      The data on molecular motor motility and lysosome transport are densely described. The magnitude of effects (fold-changes, percentage differences) should be explicitly stated in the Results section when first presenting findings to orient readers to biological significance. For example, effect magnitudes for lysosome run lengths, velocities, and directional bias should be quantified in text, not left to figure inspection.

      (6) Incomplete Discussion of Projection Domain Necessity for Envelope Formation

      The Discussion states the projection domain is "a critical regulator of both tau-tau and tau-microtubule interactions," but does not engage with prior domain dissection work. Tan et al. (2019) found that the entire projection domain is not necessary for envelope formation in vitro. The authors should discuss which projection domain regions are specifically regulated by phosphorylation vs. required for cooperativity, providing a more nuanced interpretation than implied by their current framing.

    4. Author response:

      We thank the reviewers for their thoughtful and constructive feedback. Addressing these points will strengthen the manuscript and improve its clarity.

      A primary concern involved the justification for using COS7 cell lysates in reconstitution approaches and iPSC-derived neuronal model systems as models for AD. We will clarify the language throughout the manuscript to state the study’s goals more explicitly, to emphasize that these systems were selected as robust, well-controlled platforms to test the mechanisms through which tau hyperphosphorylation affects microtubule interactions and tau’s role in regulating intracellular transport, and to acknowledge the limitations of in vitro and iPSC models.

      Reviewers also raised the possibility that background phosphorylation could contribute to the effects observed in the pseudo-phosphorylation model. We cite two recent preprints that provide insight into this question through quantitatively assessing tau phosphorylation across expression systems. In the revised manuscript, we will elaborate on how their assessment of tau phosphorylation fits within the scope of our approach and clarify how our experimental controls effectively minimize uncertainty related to background phosphorylation.

      Another point concerned the potential influence of other microtubule-associated proteins in lysates and the impact of tau lattice occupancy on motility outcomes. To further strengthen this aspect, we will include additional analyses correlating tau intensity along microtubules with kinesin intensity and motility behavior, and we will more clearly explain how the AP and WT controls provide confidence in the robustness of the system.

      Detailed responses to each reviewer comment are provided below point by point. The planned revisions, which include clearer language, stronger justification of the experimental approaches, and additional supporting analyses, will substantially improve the clarity, rationale, and overall impact of the study.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This work by Beaudet and colleagues aims at exploring the effect of phosphorylation on the formation of tau envelopes and consequently on axonal transport, both in vitro on reconstituted microtubules and in human excitatory neurons derived from iPSCs.

      The authors found that a relatively widely used construct in which 14 serine or threonine residues, often hyperphosphorylated in Alzheimer's disease, are mutated to alanines (phosphodeficient), increases the density of tau envelopes compared to wildtype tau, whereas a phosphomimetic (same residues mutated to glutamic acid) reduces envelope density both in vitro and in human excitatory neurons derived from iPSCs.

      By analysing the trafficking of different kinesins (KIF1A and KIF5C), they observed different effects of tau phosphorylation status on the movement of these two motors.

      They then analyse transport of lysosomes by employing live imaging of lysotracker in human excitatory neurons derived from iPSCs transfected with wildtype, phosphodeficient or phosphomimetic tau, observing that phosphodeficient tau seems to reduce transport of lysosomes while phosphomimetic increases transport compared to wildtype tau.

      Strengths:

      (1) The work aims to study a novel and underexplored topic in the tau field, tau envelopes, and investigate their relevance to Alzheimer's disease pathology.

      (2) Experiments are well conducted and of high quality.

      Weaknesses:

      Relying only on in vitro reconstituted microtubules and human neurons derived from iPSCs leaves some doubts about the relevance of these results for Alzheimer's disease, considering the embryonic state of iPSC-derived neurons.

      We agree with the reviewer that iPSC-derived neurons represent an immature state compared with the neurons affected in Alzheimer’s disease. However, iPSC-derived neurons, together with in vitro reconstitution, provide insight into (1) whether tau hyperphosphorylation influences its association with microtubules and its ability to form envelope-like structures thought to regulate transport, (2) how tau hyperphosphorylation affects the motility of kinesin motors that are strongly inhibited by tau, and (3) how transport of endogenous degradative organelles such as lysosomes is impacted by tau hyperphosphorylation. We hope that our work will help to inform future studies examining how tau-related dysfunction evolves in more mature neurons and contributes to the more severe pathological effects observed at later disease stages.

      We will include a paragraph in the Discussion section addressing the limitations of this study to better contextualize our findings within the broader effort to understand tauopathies and Alzheimer’s disease.

      Reviewer #2 (Public review):

      This manuscript examines how disease-associated hyperphosphorylation disrupts tau's role as a cooperative microtubule-binding regulator of intracellular transport. Using in vitro reconstitution assays and live-cell imaging in iPSC-derived neurons, the authors employ phosphomutant tau constructs (E14 to mimic hyperphosphorylation, AP to prevent phosphorylation) at 14 disease-associated residues to isolate phosphorylation effects independent of expression system-dependent PTM heterogeneity. The results show that hyperphosphorylated tau fails to form cooperative envelope-like structures on microtubules, instead binding diffusely and dissociating rapidly. In contrast, wild-type and phospho-resistant tau form cohesive envelopes that regulate motor protein access. At the single-molecule level, hyperphosphorylation reduces KIF5C inhibition while maintaining or enhancing KIF1A inhibition through altered processivity and detachment rates. In live neurons, hyperphosphorylated tau phenocopies tau knockout conditions, weakening tau-mediated inhibition of lysosome transport and increasing processive motility. The authors quantify tau binding using Gaussian mixture model-based image analysis and measure tau kinetics via FRAP, demonstrating that hyperphosphorylation-induced loss of cooperative binding correlates with dysregulated organelle transport. These findings establish a mechanism by which phosphorylation-driven disruption of tau's gatekeeper function on microtubules compromises axonal transport prior to aggregation in tauopathies. The paper provides interesting new knowledge for the field, but there are outstanding concerns that could be further addressed by the authors to strengthen and clarify the current manuscript:

      (1) Lack of Phosphatase-Treated Control and Explicit WT Phosphorylation Quantification

      Wild-type tau expressed in insect and mammalian cells is known to be phosphorylated by endogenous kinases (e.g., GSK3, CDK5, MARK). The manuscript acknowledges this in the Discussion but provides no phosphatase-treated lysate control or quantification of endogenous phosphorylation on WT tau via phospho-specific Western blots. This leaves ambiguity about whether observed differences between WT and E14 reflect purely the introduced mutations or confounding baseline differences in phosphostate content.

      Tau contains ~85 putative phosphorylation sites and is modified by several kinases in cells. Studies by Siahaan et al. (2024) and Fan et al. (2025) provide detailed insight into tau phosphorylation, its role in protecting the microtubule lattice from severing enzymes, and the implications of phosphorylation patterns for aggregate formation. Specifically, Fan et al. (2025) show that HEK-expressed tau is phosphorylated by endogenous kinases at 58 residues, with most phospho-occupancy levels below 15%, indicating substantial heterogeneity among individual tau molecules. In the revised manuscript, we will (1) provide justification for the use of the pseudo-phosphorylation model system as an approach to limit heterogeneity among tau molecules, (2) clarify the importance of the WT and AP controls, (3) discuss that E14, WT, and AP tau likely exhibit similar degrees of background phospho-heterogeneity, with WT tau likely exhibiting some overlap between background phosphorylation and the 14 AD-associated sites examined, and (4) expand the discussion to emphasize that although background phosphorylation is present, our results do not suggest that it contributes significantly to the observations reported in this study.

      (2) Limited Normalization of Motor Effects to Measured Tau Lattice Occupancy

      Although kinesin trajectories are classified inside vs. outside tau envelopes (inherently normalizing to local tau density), motor parameters are not systematically reported as functions of tau fluorescence intensity across all constructs. Co-purifying MAPs or microtubule-modifying enzymes in cell lysates is not quantified or excluded, leaving residual uncertainty about tau-specificity of observed motor inhibition. This should be at least acknowledged in the results section.

      The reviewer raises a valid point. It is challenging to compare constructs at similar levels of tau occupancy on microtubules, as phosphorylation strongly affects the interaction between tau and microtubules. We will quantify and report tau intensity in single-molecule motility assays. On the second point, while other MAPs or motor proteins could potentially affect kinesin motility, we would expect these effects to be similar for all tau phosphomutant constructs, such that the effect of tau phospho-states on kinesin motility can still be assessed.

      (3) Insufficient Citation of Prior Neuronal Tau Envelope Evidence

      In the Introduction, the authors state, "it was an open question if tau forms envelopes in neurons," but this understates existing evidence. Tan et al. (2019) report tau neuronal staining consistent with envelope formation, while Siahaan et al. (2021) provide more direct evidence in non-neuronal cells. The framing should acknowledge and integrate these prior findings.

      We agree with the reviewer that several studies using reconstitution systems, fixed neurons, and live cultured cells provide evidence of tau envelope formation in neurons. Specifically, tau envelopes have been observed along taxol-stabilized or GMPCPP-capped GDP microtubules in vitro (e.g., Dixit et al., 2008; Monroy et al., 2018; Tan et al., 2019; Siahaan et al., 2019), in 4% PFA-fixed and Triton X-100–extracted DIV7 mouse hippocampal neurons (Tan et al., 2019), and in live, non-neuronal U-2 OS cells following taxol treatment (Siahaan et al., 2022) or elevated pH (Siahaan et al., 2024). However, to our knowledge, our study is the first to demonstrate tau envelope formation in live neuronal cells under normal cell culture conditions. We will revise this sentence in the manuscript to more precisely position our findings within the context of prior studies.

      (4) Unclear Wording on Expression System-Dependent Phosphorylation

      The sentence "The phosphostate of tau is strongly dependent on the expression system" requires rewording. It is ambiguous whether this refers to the final phosphostate achieved after expression or the inherent phosphorylating capacity of each system. Clearer language would strengthen the methodological justification.

      We agree that the wording here is ambiguous and requires clarification. In the revised manuscript, we will clarify that tau phosphorylation depends on the expression system used; bacterial systems lack the capacity for many post-translational modifications compared with insect and mammalian systems. We will also emphasize that in insect and mammalian expression systems, tau phosphorylation occurs heterogeneously, as demonstrated in previous studies by Siahaan et al. (2024) and Fan et al. (2025).

      (5) Insufficient Quantification of Motor and Lysosome Transport Effect Magnitudes in Results Section

      The data on molecular motor motility and lysosome transport are densely described. The magnitude of effects (fold-changes, percentage differences) should be explicitly stated in the Results section when first presenting findings to orient readers to biological significance. For example, effect magnitudes for lysosome run lengths, velocities, and directional bias should be quantified in text, not left to figure inspection.

      Our initial justification for omitting quantitative data from the results text was to improve readability; however, in doing so, we may have reduced the accessibility and clarity regarding the significance of the findings. In the revised manuscript, we will incorporate the relevant quantifications and statistical significance for the motility data in the text.

      (6) Incomplete Discussion of Projection Domain Necessity for Envelope Formation

      The Discussion states the projection domain is "a critical regulator of both tau-tau and tau-microtubule interactions," but does not engage with prior domain dissection work. Tan et al. (2019) found that the entire projection domain is not necessary for envelope formation in vitro. The authors should discuss which projection domain regions are specifically regulated by phosphorylation vs. required for cooperativity, providing a more nuanced interpretation than implied by their current framing.

      We agree with the reviewer. Tan et al. (2019) demonstrated that the proline-rich region (residues 198–244) within the projection domain of full-length 2N4R tau is the minimal region required to maintain tau’s ability to form envelopes along microtubules. We will incorporate this work on the dissection of the projection domain and discuss how the phosphorylation sites examined in our study are primarily located within this region. Together, these data highlight the proline-rich region as a potential major regulator of tau–tau cooperativity.

    1. eLife Assessment

      This study is a valuable contribution that comprehensively identifies and characterizes LC3B-binding peptides through a bacterial cell-surface display screen covering approximately 500,000 human peptides. The data presented are solid, although this approach has limitations (e.g., it cannot assess the effects of post-translational modifications, which are often relevant to LIR-mediated interactions). Validation of the newly identified binding peptides by demonstrating their interactions with full-length proteins in cells would further strengthen this manuscript.

    2. Reviewer #1 (Public review):

      Summary:

      This study uses high-throughput bacterial cell-surface display to identify LC3B-interacting peptides in the human proteome. The screen is unbiased, and this type of assay has not previously been used for selecting LC3B-interacting peptides. The screen was done with a library of 500,000 peptides, and they ended up with 427 peptides that they scored as high-confidence LC3B binders. The experiments performed are solid, and data are analyzed using well-documented methods and statistics.

      The aim of the authors was to isolate LC3B-interacting peptides from the human proteome, and the screen succeeded in doing so. The selected set of peptides included several previously reported LIR motifs, but also many novel LC3B binding peptides that either contained or did not contain the canonical core LIR motif [WFY]xx[LVI].
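      The canonical core LIR motif [WFY]xx[LVI] mentioned above can be written as a simple pattern search. The following is a minimal sketch, not the authors' analysis pipeline: the function name and the example peptide strings are hypothetical, and a sequence match alone says nothing about whether a peptide actually binds LC3B.

```python
import re

# Core LIR motif [WFY]xx[LVI]: an aromatic residue, any two residues,
# then an aliphatic residue. The lookahead lets overlapping matches be found.
CORE_LIR = re.compile(r"(?=([WFY]..[LVI]))")


def find_core_lir(peptide):
    """Return (zero-based start index, matched 4-mer) for every core LIR match."""
    return [(m.start(1), m.group(1)) for m in CORE_LIR.finditer(peptide.upper())]


# Illustrative peptide strings (hypothetical, not taken from the screen).
print(find_core_lir("DDDWTHLSS"))   # one match starting at index 3
print(find_core_lir("GGSAGGSAGG"))  # no aromatic residue, so no match
print(find_core_lir("FYALVI"))      # two overlapping matches
```

      A scan of this kind only flags candidates containing the canonical core; the screen described here also recovered binders that lack it, which a pattern search by construction cannot find.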

      Another aim was to identify binding determinants important for the LC3B interaction, and they made an interesting sequence logo based on selected LIR-containing peptides. However, this study does not really extend our knowledge related to binding determinants essential for LIR motifs in LC3B binding. They basically verify known characteristics, including the importance of varied types of electrostatic interactions supporting the docking of the core LIR into the LDS of LC3B.

      Strengths:

      The approach used here (high-throughput bacterial-surface-display) is new. The screen is unbiased, and the fact that peptides are directly tested for LC3B binding may facilitate the discovery of non-canonical LIR motifs. The screen appears to be highly selective and manages to distinguish between peptides that interact with LC3B and peptides that do not interact.

      Weaknesses:

      It is a limitation that no proteins are analyzed in this study. Further work is therefore needed to verify that identified LIR motifs are functional in full-length proteins and in cells.

    3. Reviewer #2 (Public review):

      Summary:

      To discover peptides that interact with autophagy-related protein LC3B and profile the key binding determinants, the authors screened a library of ~500,000 36-residue peptides derived from the human proteome using bacterial cell-surface display. Analysis of the screening data revealed exceptions to the reported LIR motif and a strong preference for negatively charged residues adjacent to the LIR. These results support a refinement of the LIR motif definition and expand the network of candidate LC3B interaction partners.

      Strengths:

      High-throughput approach.

      Weaknesses:

      Lack of in vitro data and molecular dynamics simulations.

    4. Reviewer #3 (Public review):

      Summary:

      The LC3 family of proteins, which includes LC3B, are ubiquitin-like proteins that are covalently linked to phosphatidylethanolamine in the expanding autophagosomal membrane during autophagy. LC3 family members bind to short sequences of amino acids that reside within dynamic regions in a wide variety of proteins. These sequences, termed LC3 Interacting Regions (LIRs), were initially thought to function primarily to link LIR-containing autophagy cargo receptors to LC3 family members to help facilitate their capture during autophagy. However, the functional importance of LIRs has since broadened to include more general roles in autophagy. While a general consensus for LIR sequences has been described as [FWY]0-X1-X2-[LVI]3, recent work has suggested that additional sequences outside of the canonical LIR sequence can bind LC3 family members and play important roles in autophagy. In this manuscript by Kosmatka et al., the authors perform a high-throughput screen using bacterial surface display coupled with fluorescence-activated cell sorting to identify which human sequences can bind to LC3B. They identify a variety of peptides capable of binding LC3B, including peptides from proteins that have not previously been described as LC3B-binding proteins. The results from the bacterial surface display were then used to guide sequence analysis, mutational analysis, and structural studies to further characterize the range of LIR sequences that are capable of binding LC3B. Taken together, this work adds to the growing knowledge of how LIR sequences interact with LC3 family members and demonstrates which amino acids both inside and outside of the LIR sequence aid in binding. This work also identifies new potential LC3 binding proteins, which may play unknown roles in autophagy regulation. Lastly, this work reinforces the importance of alternative LIR sequences such as the [WFY]0-X1-X2-[WFY]3 sequence, which the authors have dubbed the LIR+ sequence.

      Strengths:

      The manuscript uses a robust approach to identify and characterize different peptide sequences that can interact with LC3B. They validate a large number of sequences using biolayer interferometry (BLI) and attempt to correlate different amino acids with their binding affinity for LC3B. The large number of LC3B binding sequences and their dissociation constants adds significant new information to the field that will help others understand what sequences can bind to LC3B. The authors are also very careful to accurately report on their data and not overly interpret their findings.

      Weaknesses:

      After the authors identify proteins from their bacterial display assay, the remainder of the manuscript is focused on characterizing the different types of sequences that are identified in addition to validating the LC3B-LIR interactions using biochemical approaches, including BLI and X-ray crystallography. However, it's not entirely clear if the screen identified novel LC3B binders that interact with LC3B in cells. While I acknowledge that the focus of the manuscript is on the characterization of LIR sequences that can bind LC3B, it seems like a missed opportunity not to validate a few of the novel LC3B binders in vivo. This may result in the demonstration of novel binders of LC3B in cells and may further demonstrate the strength of this approach for identifying LC3 family member binding partners. Therefore, it would be helpful to look at a few proteins identified in the HC set that have not previously been identified as LC3B binders in cells to determine if they co-IP with LC3B or interact with LC3B using a different approach.

    1. eLife Assessment

      The work convincingly demonstrates the role of the mycobacterial secreted effector protein MmpE, which translocates to the host nucleus and exhibits phosphatase activity. The study is particularly valuable in showing that both the nuclear localization signal sequences and residues critical for phosphatase function are essential for host gene regulation, lysosomal biogenesis, and intracellular survival. Future studies will be needed to explore additional host pathways modulated by MmpE, particularly in the context of infection with a fully virulent Mycobacterium tuberculosis strain.

    2. Reviewer #1 (Public review):

      Summary:

      The study provides an insightful characterization of the mycobacterial secreted effector protein MmpE, which translocates to the host nucleus and exhibits phosphatase activity. The study characterizes the nuclear localization signal sequences and residues critical for the phosphatase activity, both of which are required for intracellular survival.

      Strengths:

      (1) The study addresses the role of nucleomodulins, an understudied aspect in mycobacterial infections.

      (2) The authors employ a combination of biochemical and computational analyses along with in vitro and in vivo validations to characterize the role of MmpE.

      Weaknesses:

      (1) While the study establishes that the phosphatase activity of MmpE operates independently of its NLS, there is a clear gap in understanding how this phosphatase activity supports mycobacterial infection. The investigation lacks experimental data on specific substrates of MmpE or pathways influenced by this virulence factor.

      (2) The study does not explore whether the phosphatase activity of MmpE is dependent on the NLS within macrophages, which would provide critical insights into its biological relevance in host cells. Conducting experiments with double knockout/mutant strains and comparing their intracellular survival with single mutants could elucidate these dependencies and further validate the significance of MmpE's dual functions.

      (3) The study does not provide direct experimental validation of the MmpE deletion on lysosomal trafficking of the bacteria.

      (4) The role of MmpE as a mycobacterial effector would be more relevant using virulent mycobacterial strains such as H37Rv.

      Comments on revisions:

      I appreciate the work the authors have done to address the reviewers' comments. The revised manuscript looks significantly improved. My major concern in the revised version is the microscopy data, where the BCG staining using the DiD fluorescent stain does not bring out the rod-shaped bacilli structure. I suggest the authors either use a GFP reporter or some other fluorescent stain to address this issue.

    3. Reviewer #2 (Public review):

      Summary:

      In this paper, the authors have characterized Rv2577 as a Fe3+/Zn2+-dependent metallophosphatase and a nucleomodulin protein. The authors have also identified His348 and Asn359 as critical residues for Fe3+ coordination. The authors show that the protein encodes two nuclear localization signals. Using C-terminal Flag expression constructs, the authors have shown that the MmpE protein is secretory. The authors have prepared genetic deletion strains and show that MmpE is essential for intracellular survival of M. bovis BCG in THP-1 macrophages, RAW264.7 macrophages and a mouse model of infection. The authors have also performed RNA-seq analysis to compare the transcriptional profiles of macrophages infected with wild-type and mmpE mutant strains. The relative levels of ~175 transcripts were altered in mmpE mutant-infected macrophages, and the majority of these were associated with various immune and inflammatory signalling pathways. Using these deletion strains, the authors proposed that MmpE inhibits inflammatory gene expression by binding to the promoter region of the vitamin D receptor. The authors also showed that MmpE arrests phagosome maturation by regulating the expression of several lysosome-associated genes such as TFEB, LAMP1, LAMP2, etc. These findings reveal a sophisticated mechanism by which a bacterial effector protein manipulates gene transcription and promotes intracellular survival.

      Strength:

      The authors have used a combination of cell biology, microbiology and transcriptomics to elucidate the mechanisms by which Rv2577 contributes to intracellular survival.

      Weakness:

      The authors should thoroughly check the mouse data and show individual replicate values in bar graphs.

      Comments on revisions:

      Thanks to the authors for addressing the concerns raised during the review of the original manuscript. The data is now presented with clarity, and discrepancies in mouse experiments have also been addressed with additional experiments.

    4. Reviewer #3 (Public review):

      Summary:

      In this manuscript titled "Mycobacterial Metallophosphatase MmpE Acts as a Nucleomodulin to Regulate Host Gene Expression and Promote Intracellular Survival", Chen et al describe biochemical characterisation, localisation and potential functions of the gene using a genetic approach in M. bovis BCG and perform macrophage and mice infections to understand the roles of this potentially secreted protein in the host cell nucleus. The findings demonstrate the role of a secreted phosphatase of M. bovis BCG in shaping the transcriptional profile of infected macrophages, potentially through nuclear localisation and direct binding to transcriptional start sites, thereby regulating the inflammatory response to infection.

      Strengths:

      The authors demonstrate, using a transient transfection method, that MmpE, when expressed as a GFP-tagged protein in HEK293T cells, exhibits nuclear localisation. The authors identify two NLS motifs that together are required for nuclear localisation of the protein. A deletion of the gene in M. bovis BCG results in poorer survival compared to the wild-type parent strain, which is also killed by macrophages. Relative to the WT strain-infected macrophages, macrophages infected with the ∆mmpE strain exhibited differential gene expression. Overexpression of the gene in HEK293T led to occupancy of the transcription start site of several genes, including the Vitamin D Receptor. Expression of VDR in THP1 macrophages was lower in the case of ∆mmpE infection compared to WT infection. This data supports the utility of the overexpression system in identifying potential target loci of MmpE using the HEK293T transfection model. The authors also demonstrate that the protein is a phosphatase and that the phosphatase activity of the protein is partially required for bacterial survival but not for regulation of the VDR gene expression.

      Weaknesses:

      There are significant differences in lysosomal retention between M. tuberculosis and M. bovis BCG. This study uses BCG and MmpE overexpression to draw conclusions about the impact of the mmpE gene on host gene expression and the bacteria's lysosomal localisation. While the authors have convincingly supported their claims with this model system, the relevance of this mechanism in M. tuberculosis infection remains unaddressed.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Review of the manuscript titled "Mycobacterial Metallophosphatase MmpE acts as a nucleomodulin to regulate host gene expression and promotes intracellular survival".

      The study provides an insightful characterization of the mycobacterial secreted effector protein MmpE, which translocates to the host nucleus and exhibits phosphatase activity. The study characterizes the nuclear localization signal sequences and residues critical for the phosphatase activity, both of which are required for intracellular survival.

      Strengths:

      (1) The study addresses the role of nucleomodulins, an understudied aspect in mycobacterial infections.

      (2) The authors employ a combination of biochemical and computational analyses along with in vitro and in vivo validations to characterize the role of MmpE.

      Weaknesses:

      (1) While the study establishes that the phosphatase activity of MmpE operates independently of its NLS, there is a clear gap in understanding how this phosphatase activity supports mycobacterial infection. The investigation lacks experimental data on specific substrates of MmpE or pathways influenced by this virulence factor.

      We thank the reviewer for this insightful comment and agree that identification of the substrates of MmpE is important to fully understand its role in mycobacterial infection. MmpE is a putative purple acid phosphatase (PAP) and a member of the metallophosphoesterase (MPE) superfamily. Enzymes in this family are known for their catalytic promiscuity and broad substrate specificity, acting on phosphomonoesters, phosphodiesters, and phosphotriesters (Matange et al., Biochem J, 2015). In bacteria, several characterized MPEs have been shown to hydrolyze substrates such as cyclic nucleotides (e.g., cAMP) (Keppetipola et al., J Biol Chem, 2008; Shenoy et al., J Mol Biol, 2007), nucleotide derivatives (e.g., AMP, UDP-glucose) (Innokentev et al., mBio, 2025), and pyrophosphate-containing compounds (e.g., Ap4A, UDP-DAGn) (Matange et al., Biochem J., 2015). Although the binding motif of MmpE has been identified, determining its physiological substrates remains challenging due to the low abundance and instability of potential metabolites, as well as the limited sensitivity and coverage of current metabolomic technologies in mycobacteria.

      (2) The study does not explore whether the phosphatase activity of MmpE is dependent on the NLS within macrophages, which would provide critical insights into its biological relevance in host cells. Conducting experiments with double knockout/mutant strains and comparing their intracellular survival with single mutants could elucidate these dependencies and further validate the significance of MmpE's dual functions.

      We thank the reviewer for the comment. Deletion of the NLS motifs did not impair MmpE’s phosphatase activity in vitro (Figure 2F), indicating that MmpE's enzymatic function operates independently of its nuclear localization. Indeed, we confirmed that Fe<sup>3+</sup>-binding ability via the residues H348 and N359 is required for enzymatic activity of MmpE. We have expanded on this point in the Discussion section “MmpE is a bifunctional virulence factor in Mtb”.

      (3) The study does not provide direct experimental validation of the MmpE deletion on lysosomal trafficking of the bacteria.

      We thank the reviewer for the comment. To validate the role of MmpE in lysosome maturation during infection, we conducted fluorescence colocalization assays in THP-1 macrophages infected with BCG strains, including WT, ∆MmpE, Comp-MmpE, Comp-MmpE<sup>ΔNLS1</sup>, Comp-MmpE<sup>ΔNLS2</sup>, and Comp-MmpE<sup>ΔNLS1-2</sup>. These strains were stained with the lipophilic membrane dye DiD, while macrophages were treated with the acidotropic probe LysoTracker<sup>TM</sup> Green (Martins et al., Autophagy, 2019). The results indicated that the ΔMmpE and MmpE<sup>ΔNLS1-2</sup> mutants exhibited significantly higher co-localization with LysoTracker compared to the WT and Comp-MmpE strains (New Figure 5G), suggesting that MmpE deletion leads to enhanced lysosomal maturation during infection.

      (4) The role of MmpE as a mycobacterial effector would be more relevant using virulent mycobacterial strains such as H37Rv.

      We thank the reviewer for the comment. Previously, the role of Rv2577/MmpE as a virulence factor has been demonstrated in M. tuberculosis CDC 1551, where its deletion significantly reduced bacterial replication in mouse lungs at 30 days post-infection (Forrellad et al., Front Microbiol, 2020). However, that study did not explore the underlying mechanism of MmpE function. In our study, we found that MmpE enhances M. bovis BCG survival in macrophages (both THP-1 and RAW264.7) and in mice (Figure 3, Figure 7A), consistent with its proposed role in virulence. To investigate the molecular mechanism by which MmpE promotes intracellular survival, we used M. bovis BCG as a biosafe surrogate; this model is widely accepted for studying mycobacterial pathogenesis (Wang et al., Nat Immunol, 2015; Wang et al., Nat Commun, 2017; Péan et al., Nat Commun, 2017).

      Reviewer #2 (Public review):

      Summary:

      In this paper, the authors have characterized Rv2577 as a Fe3+/Zn2+-dependent metallophosphatase and a nucleomodulin protein. The authors have also identified His348 and Asn359 as critical residues for Fe3+ coordination. The authors show that the protein encodes two nuclear localization signals. Using C-terminal Flag expression constructs, the authors have shown that the MmpE protein is secretory. The authors have prepared genetic deletion strains and show that MmpE is essential for intracellular survival of M. bovis BCG in THP-1 macrophages, RAW264.7 macrophages, and a mouse model of infection. The authors have also performed RNA-seq analysis to compare the transcriptional profiles of macrophages infected with wild-type and MmpE mutant strains. The relative levels of ~175 transcripts were altered in MmpE mutant-infected macrophages and the majority of these were associated with various immune and inflammatory signalling pathways. Using these deletion strains, the authors proposed that MmpE inhibits inflammatory gene expression by binding to the promoter region of the vitamin D receptor. The authors also showed that MmpE arrests phagosome maturation by regulating the expression of several lysosome-associated genes such as TFEB, LAMP1, LAMP2, etc. These findings reveal a sophisticated mechanism by which a bacterial effector protein manipulates gene transcription and promotes intracellular survival.

      Strength:

      The authors have used a combination of cell biology, microbiology, and transcriptomics to elucidate the mechanisms by which Rv2577 contributes to intracellular survival.

      Weakness:

      The authors should thoroughly check the mouse data and show individual replicate values in bar graphs.

      We thank the reviewer for the advice. We have now updated the relevant mouse data in the revised manuscript.

      Reviewer #3 (Public review):

      Summary:

      In this manuscript titled "Mycobacterial Metallophosphatase MmpE Acts as a Nucleomodulin to Regulate Host Gene Expression and Promote Intracellular Survival", Chen et al describe biochemical characterisation, localisation and potential functions of the gene using a genetic approach in M. bovis BCG and perform macrophage and mice infections to understand the roles of this potentially secreted protein in the host cell nucleus. The findings demonstrate the role of a secreted phosphatase of M. bovis BCG in shaping the transcriptional profile of infected macrophages, potentially through nuclear localisation and direct binding to transcriptional start sites, thereby regulating the inflammatory response to infection.

      Strengths:

      The authors demonstrate, using a transient transfection method, that MmpE, when expressed as a GFP-tagged protein in HEK293T cells, exhibits nuclear localisation. The authors identify two NLS motifs that together are required for nuclear localisation of the protein. A deletion of the gene in M. bovis BCG results in poorer survival compared to the wild-type parent strain, which is also killed by macrophages. Relative to the WT strain-infected macrophages, macrophages infected with the ∆mmpE strain exhibited differential gene expression. Overexpression of the gene in HEK293T led to occupancy of the transcription start site of several genes, including the Vitamin D Receptor. Expression of VDR in THP1 macrophages was lower in the case of ∆mmpE infection compared to WT infection. This data supports the utility of the overexpression system in identifying potential target loci of MmpE using the HEK293T transfection model. The authors also demonstrate that the protein is a phosphatase, and that the phosphatase activity of the protein is partially required for bacterial survival but not for the regulation of the VDR gene expression.

      Weaknesses:

      (1) While the motifs can most certainly behave as NLSs, the overexpression of a mycobacterial protein in HEK293T cells can also result in artefacts of nuclear localisation. This is not unprecedented. Therefore, to prove that the protein is indeed secreted from BCG, and is able to elicit transcriptional changes during infection, I recommend that the authors (i) establish that the protein is indeed secreted into the host cell nucleus, and (ii) the NLS mutation prevents its localisation to the nucleus without disrupting its secretion.

      We thank the reviewer for this insightful comment. To confirm the translocation of MmpE into the host nucleus during BCG infection, we first detected the secretion of MmpE by M. bovis BCG, using Ag85B as a positive control and GlpX as a negative control (Zhang et al., Nat Commun, 2022). Our results showed that MmpE-Flag was present in the culture supernatant, indicating that MmpE is indeed secreted by BCG (New Figure S1C).

      Next, we performed immunoblot analysis of the nuclear fractions from infected THP-1 macrophages expressing FLAG-tagged wild-type MmpE and NLS mutants. The results revealed that only wild-type MmpE was detected in the nucleus, while MmpE<sup>ΔNLS1</sup>, MmpE<sup>ΔNLS2</sup> and MmpE<sup>ΔNLS1-2</sup> were not detectable in the nucleus (New Figure S1D). Taken together, these findings demonstrated that MmpE is a secreted protein and that its nuclear translocation during infection requires both NLS motifs.

      Demonstration that the protein is secreted: Supplementary Figure 3 - Immunoblotting should be performed for a cytosolic protein, also to rule out detection of proteins from lysis of dead cells. Also, for detecting proteins in the secreted fraction, it would be better to use Sauton's media without detergent, and grow the cultures without agitation or with gentle agitation. The method used by the authors is not a recommended protocol for obtaining the secreted fraction of mycobacteria.

      We thank the reviewer for the advice. To avoid the effects of bacterial lysis, we cultured the BCG strains expressing MmpE-Flag in Middlebrook 7H9 broth with 0.5% glycerol, 0.02% Tyloxapol, and 50 µg/mL kanamycin at 37 °C with gentle agitation (80 rpm) until an OD<sub>600</sub> of approximately 0.6 (Zhang et al., Nat Commun, 2022). Subsequently, we assessed the secretion of MmpE-Flag in the culture supernatant, using Ag85B as a positive control and GlpX as a negative control (New Figure S1C). The results showed that GlpX was not detected in the supernatant, while MmpE and Ag85B were detected, indicating that MmpE is indeed a secreted protein in BCG.

      Demonstration that the protein localises to the host cell nucleus upon infection: Perform an infection followed by immunofluorescence to demonstrate that the endogenous protein of BCG can translocate to the host cell nucleus. This should be done for an NLS1-2 mutant expressing cell also.

      We thank the reviewer for the suggestion. We agree that this experiment would be helpful to further verify the ability of MmpE to undergo nuclear import. However, an MmpE-specific antibody is not available to us for immunofluorescence experiments. As an alternative, we performed nuclear-cytoplasmic fractionation of THP-1 cells infected with the M. bovis BCG strains expressing FLAG-tagged wild-type MmpE, as well as the NLS deletion mutants (MmpE<sup>ΔNLS1</sup>, MmpE<sup>ΔNLS2</sup>, and MmpE<sup>ΔNLS1-2</sup>). WT MmpE was detectable in both the cytoplasmic and nuclear compartments, while MmpE<sup>ΔNLS1</sup>, MmpE<sup>ΔNLS2</sup> and MmpE<sup>ΔNLS1-2</sup> were almost undetectable in the nuclear fractions (New Figure S1D), suggesting that both NLS motifs are necessary for nuclear import.

      (2) In the RNA-seq analysis, the directionality of change of each of the reported pathways is not apparent in the way the data have been presented. For example, are genes in the cytokine-cytokine receptor interaction or TNF signalling pathway expressed more, or less in the ∆mmpE strain?

      We thank the reviewer for the comment. The KEGG pathway enrichment diagrams in our RNA-seq analysis primarily reflect the statistical significance of pathway enrichment based on differentially expressed genes, but do not indicate the directionality of gene expression changes. To address this concern, we conducted qRT-PCR on genes associated with the cytokine-cytokine receptor interaction pathway, specifically IL23A, CSF2, and IL12B. The results showed that, compared to the WT strain, infection with the ΔMmpE strain resulted in significantly increased expression levels of these genes in THP-1 cells (Figure 4F, Figure S4B), consistent with the RNA-seq data. Furthermore, we have submitted the complete RNA-seq dataset to the NCBI GEO repository [GSE312039], which includes normalized expression values and differential expression results for all detected genes.

      (3) Several of these pathways are affected as a result of infection, while others are not induced by BCG infection. For example, BCG infection does not, on its own, produce changes in IL1β levels. As the authors did not compare the uninfected macrophages as a control, it is difficult to interpret whether ∆mmpE induced higher expression than the WT strain, or simply did not induce a gene while the WT strain suppressed expression of a gene. This is particularly important because the strain is attenuated. Does the attenuation have anything to do with the ability of the protein to induce lysosomal pathway genes? Does induction of this pathway lead to attenuation of the strain? Similarly, for pathways that seem to be downregulated in the ∆mmpE strain compared to the WT strain, these might have been induced upon infection with the WT strain but not sufficiently by the ∆mmpE strain due to its attenuation/lower bacterial burden.

      We thank the reviewer for the comment. Previous studies have shown that wild-type BCG induces relatively low levels of IL-1β, while retaining partial capacity to activate the inflammasome (Qu et al., Sci Adv, 2020). Our data (Figure 3G) show that infection with the ΔMmpE strain results in enhanced IL-1β expression, consistent with findings by Master et al. (Cell Host Microbe, 2008), in which deletion of zmp1 in BCG or M. tuberculosis led to increased IL-1β levels due to reduced inhibition of inflammasome activation.

      In the revised manuscript, we have provided additional qRT-PCR data using uninfected macrophages as a baseline control. These results demonstrate that the WT strain suppresses lysosome-associated gene expression, whereas the ΔMmpE strain upregulates these genes, indicating that MmpE inhibits lysosome-related gene expression (Figure 4G). Furthermore, bacterial burden analysis revealed that ∆mmpE exhibited ~3-fold lower intracellular survival than the WT strain in THP-1 cells. However, when lysosomal maturation was inhibited, the difference in bacterial load between the two strains was reduced to ~1-fold (New Figures S6B and C). These findings indicate that MmpE promotes intracellular survival primarily by inhibiting lysosomal maturation, which is consistent with a previous study (Chandra et al., Sci Rep, 2015).

      (4) CHIP-seq should be performed in THP1 macrophages, and not in HEK293T. Overexpression of a nuclear-localised protein in a non-relevant line is likely to lead to several transcriptional changes that do not inform us of the role of the gene as a transcriptional regulator during infection.

      We thank the reviewer for the comment. We performed ChIP-seq in HEK293T cells based on their high transfection efficiency, robust nuclear protein expression, and well-annotated genome (Lampe et al., Nat Biotechnol, 2024; Marasco et al., Cell, 2022). These characteristics make HEK293T an ideal system for the initial identification of genome-wide chromatin binding profiles by MmpE.

      Further, we performed comprehensive validation of the ChIP-seq findings in THP-1 macrophages. First, CUT&Tag and RNA-seq analyses in THP-1 cells revealed that MmpE modulates genes involved in the PI3K–AKT signaling and lysosomal maturation pathways (Figure 4C; Figure S5A-B). Correspondingly, we found that, compared to infection with the WT strain, infection with the ΔMmpE strain led to reduced phosphorylation of AKT (S473), mTOR (S2448), and p70S6K (T389) (New Figure 5E-F) and to upregulation of lysosomal genes such as TFEB, LAMP1, and LAMP2 (Figure 4G). Lysosomal maturation was also more pronounced in cells infected with the ΔMmpE strain (New Figure 5G). Additionally, CUT&Tag profiling identified MmpE binding at the promoter region of the VDR gene, which was further validated by EMSA and ChIP-qPCR. Also, qRT-PCR demonstrated that MmpE suppresses VDR transcription, supporting its role as a transcriptional regulator (Figure 6). Collectively, these data confirm the biological relevance and functional significance of the ChIP-seq findings obtained in HEK293T cells.

      (5) I would not expect to see such large inflammatory reactions persisting 56 days post-infection with M. bovis BCG. Is this something peculiar for an intratracheal infection with 1×10<sup>7</sup> bacilli? For images of animal tissue, the authors should provide images of the entire lung lobe with the zoomed-in image indicated as an inset.

      We thank the reviewer for the comment. The lung inflammation peaked at days 21–28 and had clearly subsided by day 56 across all groups (New Figure 7B), consistent with the expected resolution of immune responses to an attenuated strain like M. bovis BCG. This temporal pattern is in line with previous studies using intravenous or intratracheal BCG vaccination in mice and macaques, which also demonstrated robust early immune activation followed by resolution over time (Smith et al., Nat Microbiol, 2025; Darrah et al., Nature, 2020).

      In this study, the infectious dose (1×10<sup>7</sup> CFU intratracheal) was selected based on previous studies in which intratracheal delivery of 1×10<sup>7</sup> CFU produced consistent and measurable lung immune responses and pathology without causing overt illness or mortality (Xu et al., Sci Rep, 2017; Niroula et al., Sci Rep, 2025). We have provided whole-lung lobe images with zoomed-in insets in the source dataset.

      (6) For the qRT-PCR based validation, infections should be performed with the MmpE-complemented strain in the same experiments as those for the WT and ∆mmpE strain so that they can be on the same graph, in the main manuscript file. Supplementary Figure 4 has three complementary strains. Again, the absence of the uninfected, WT, and ∆mmpE infected condition makes interpretation of these data very difficult.

      We thank the reviewer for the comment. As suggested, we have conducted the qRT-PCR experiment including the uninfected, WT, ∆mmpE, Comp-MmpE, and the three complementary strains infecting THP-1 cells (Figure 4F and G; New Figure S4B–D).

      (7) The abstract mentions that MmpE represses the PI3K-Akt-mTOR pathway, which arrests phagosome maturation. There is not enough data in this manuscript in support of this claim. Supplementary Figure 5 does provide qRT-PCR validation of genes of this pathway, but the data do not indicate that higher expression of these pathways, whether by VDR repression or otherwise, is driving the growth restriction of the ∆mmpE strain.

      We thank the reviewer for the comment. In the updated manuscript, we have provided more evidence. First, the RNA-seq analysis indicated that MmpE affects the PI3K-AKT signaling pathway (Figure 4C). Second, CUT&Tag analysis suggested that MmpE binds to the promoter regions of key pathway components, including PRKCB, PLCG2, and PIK3CB (Figure S5A). Third, confocal microscopy showed that the ΔMmpE strain promotes significantly increased lysosomal maturation compared to the WT, a process downstream of the PI3K-AKT-mTOR axis (New Figure 5G).

      Further, we measured protein phosphorylation for validating activation of the pathway (Zhang et al., Stem Cell Reports, 2017). Our results showed that cells infected with WT strains exhibited significantly higher phosphorylation of Akt, mTOR, and p70S6K compared to those infected with ΔMmpE strains (New Figures 5E and F). Moreover, the dual PI3K/mTOR inhibitor BEZ235 abolished the survival advantage of WT strains over ΔMmpE mutants in THP-1 macrophages (New Figure S6B and C). Collectively, these results support that MmpE activates the PI3K–Akt–mTOR signaling pathway to enhance bacterial survival within the host.

      (8) The relevance of the NLS and the phosphatase activity is not completely clear in the CFU assays and in the gene expression data. Firstly, there needs to be immunoblot data provided for the expression and secretion of the NLS-deficient and phosphatase mutants. Secondly, CFU data in Figure 3A, C, and E must consistently include both the WT and ∆mmpE strain.

      We thank the reviewer for the comment. We have now added immunoblot analysis for expression and secretion of the MmpE mutants. The results show that the NLS-deficient and phosphatase mutants can be detected in the supernatant (New Figure S1C). Additionally, we have revised Figures 3A, 3C, and 3E to consistently include both the WT and ΔMmpE strains in the CFU assays.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      The authors should attempt to address the following comments:

      (1) Please perform densitometric analysis for the western blot shown in Figure 1E.

      We sincerely thank the reviewer for the suggestion. In the updated manuscript, we have performed densitometric analysis of the western blot shown in New Figure 1F and G.

      (2) Is it possible to measure the protein levels for MmpE in lysates prepared from infected macrophages?

      We thank the reviewer for the comment. In the revised manuscript, we performed immunoblot analysis to measure MmpE levels in lysates from infected macrophages. The results demonstrated that wild-type MmpE was present in both the cytoplasmic and nuclear fractions during infection in THP-1 cells (New Figure S1D).

      (3) The authors should perform circular dichroism studies to compare the secondary structure of wild type and mutant proteins (in particular MmpEHis348 and MmpEAsn359).

      We thank the reviewer for this valuable suggestion. We agree that circular dichroism spectroscopy could provide useful information for comparing the secondary structures. However, due to technical limitations, we instead compared the structures of wild-type MmpE and the His348 and Asn359 mutant proteins predicted by AlphaFold. These structural models showed almost no differences in secondary structure between the wild-type and mutant proteins (Figure S1B).

      (4) The authors should perform more experiments to determine the binding motif for MmpE in the promoter region of VDR.

      We thank the reviewer for this suggestion. In the current study, we have identified the MmpE-binding motif within the promoter region of VDR using CUT&Tag sequencing. This prediction was further validated by ChIP-qPCR and EMSA (Figure 6). These complementary approaches collectively support the identification of a specific MmpE-binding motif and demonstrate its functional relevance. This approach has been accepted in previous publications (Wen et al., Commun Biol, 2020; Li et al., Nat Commun, 2022).

      (5) Were the transcript levels of VDR also measured in the lung tissues of infected animals?

      We thank the reviewer for this suggestion. In the revised manuscript, we have performed qRT-PCR to assess VDR transcript levels in the lung tissues of infected mice (New Figure S8B).

      (6) How does MmpE regulate the expression of lysosome-associated genes?

      We thank the reviewer for this question. Our experiments suggested that MmpE suppresses lysosomal maturation, probably by activating the host PI3K–AKT–mTOR signaling pathway (New Figure 5E–I). This pathway is well established as a negative regulator of lysosome biogenesis and function (Yang et al., Signal Transduct Target Ther, 2020; Cui et al., Nature, 2023; Cui et al., Nature, 2025). During infection, THP-1 cells infected with the WT showed increased phosphorylation of Akt, mTOR, and p70S6K compared to those infected with ΔMmpE (New Figure S5C, New Figure 5E and F), and concurrently downregulated key lysosomal maturation markers, including TFEB, LAMP1, LAMP2, and multiple V-ATPase subunits (Figure 4G). Given that PI3K–AKT–mTOR signaling suppresses TFEB activity and lysosomal gene transcription (Palmieri et al., Nat Commun, 2017), we propose that MmpE modulates lysosome-associated gene expression and lysosomal function, probably through the PI3K–AKT–mTOR signaling pathway.

      (7) Mice experiment:

      (a) The methods section states that mice were infected intranasally, but the legend for Figure 6 states intratracheally. Kindly check?

      (b) Supplementary Figure 7 - this is not clear. The legend says bacterial loads in spleens (CFU/g) instead of DNA expression, as shown in the figure.

      (c) The data in Figure 6 and Figure S7 seem to be derived from the same experiment, but the number of animals is different. In Figure 6, it is n = 6, and in Figure S7, it is n=3.

      We thank the reviewer for the comments.

      (a) The infection was performed intranasally, and the figure legend for New Figure 7 has now been corrected.

      (b) We used a quantitative PCR method to measure bacterial DNA levels in the spleens of infected mice. We have now revised the legend.

      (c) We have conducted new experiments in which each experiment includes six mice. The results are shown in Figure 7B and C, as well as in the new Figure S8.

      (8) The authors should show individual values for various replicates in bar graphs (for all figures).

      We thank the reviewer for this helpful suggestion. We have now updated all relevant bar graphs to include individual data points for each biological replicate.

      (9) The authors should validate the relative levels of a few DEGs shown in Figure 3F, Figure 3G, and Figure S4C, in the lung tissues of mice infected with wild-type, mutant, and complemented strains.

      We thank the reviewer for this suggestion. In the revised manuscript, we have performed qRT-PCR to validate the expression levels of selected DEGs, including inflammation-related and lysosome-associated genes, in lung tissues from mice infected with wild-type, mutant, and complemented strains (New Figure S8C-H).

      (10) Did the authors perform an animal experiment using a mutant strain complemented with the phosphatase-deficient MmpE (Comp-MmpE-H348AN359H)?

      We appreciate the reviewer's comment. We agree that an additional animal experiment would be useful to assess the effects of the phosphatase activity. However, our study mainly focused on interpreting the function of the nuclear localization of MmpE during BCG infection. Additionally, we have assessed the role of the phosphatase activity of MmpE during infection in a cell model (Figure 3E).

      Minor comment:

      The mutant strain should be verified by either Southern blot or whole genome sequencing.

      We thank the reviewer for this comment. We verified deletion of the mmpE gene by PCR (Figure S3A-D), an approach that has been accepted in previous publications (Zhang et al., PLoS Pathog, 2020; Zhang et al., Nat Commun, 2022).

      Reviewer #3 (Recommendations for the authors):

      (1) Line 195: cytokine.

      We thank the reviewer for the comments. We have now corrected it.

      (2) Line 225: rewording required.

      Corrected.

      (3) Figure 4A. "No difference" instead of "No different".

      Corrected.

      (4) "KommpE" should be replaced with "∆mmpE strain" (∆=delta symbol).

      Corrected.

      (5) Supplementary Figure 7. The figure legend states CFU assays, but the y-axis and the graph seem to depict IS1081 quantification.

      We thank the reviewer for the comment. The figure is based on IS1081 quantification using qRT-PCR, not CFU assays. We have now revised the legend for New Figure S8A.

      References

      Chandra P, Ghanwat S, Matta SK, Yadav SS, Mehta M, Siddiqui Z, Singh A, Kumar D (2015) Mycobacterium tuberculosis Inhibits RAB7 Recruitment to Selectively Modulate Autophagy Flux in Macrophages Sci Rep 5:16320.

      Darrah PA, Zeppa JJ, Maiello P, Hackney JA, Wadsworth MH 2nd, Hughes TK, Pokkali S, Swanson PA 2nd, Grant NL, Rodgers MA, Kamath M, Causgrove CM, Laddy DJ, Bonavia A, Casimiro D, Lin PL, Klein E, White AG, Scanga CA, Shalek AK, Roederer M, Flynn JL, Seder RA (2020) Prevention of tuberculosis in macaques after intravenous BCG immunization Nature 577:95-102. 

      Forrellad MA, Blanco FC, Marrero Diaz de Villegas R, Vázquez CL, Yaneff A, García EA, Gutierrez MG, Durán R, Villarino A, Bigi F (2020) Rv2577 of Mycobacterium tuberculosis Is a virulence factor with dual phosphatase and phosphodiesterase functions Front Microbiol 11:570794.

      Innokentev A, Sanchez AM, Monetti M, Schwer B, Shuman S (2025) Efn1 and Efn2 are extracellular 5'-nucleotidases induced during the fission yeast response to phosphate starvation mBio 16: e0299224.

      Keppetipola N, Shuman S (2008) A phosphate-binding histidine of binuclear metallophosphodiesterase enzymes is a determinant of 2',3'-cyclic nucleotide phosphodiesterase activity J Biol Chem 283:30942-9.

      Lampe GD, King RT, Halpin-Healy TS, Klompe SE, Hogan MI, Vo PLH, Tang S, Chavez A, Sternberg SH (2024) Targeted DNA integration in human cells without double-strand breaks using CRISPR-associated transposases Nat Biotechnol 42:87-98.

      Li Z, Sheerin DJ, von Roepenack-Lahaye E, Stahl M, Hiltbrunner A (2022) The phytochrome interacting proteins ERF55 and ERF58 repress light-induced seed germination in Arabidopsis thaliana Nat Commun 13:1656.

      Marasco LE, Dujardin G, Sousa-Luís R, Liu YH, Stigliano JN, Nomakuchi T, Proudfoot NJ, Krainer AR, Kornblihtt AR (2022) Counteracting chromatin effects of a splicing-correcting antisense oligonucleotide improves its therapeutic efficacy in spinal muscular atrophy Cell 185:2057-2070.e15.

      Martins WK, Santos NF, Rocha CS, Bacellar IOL, Tsubone TM, Viotto AC, Matsukuma AY, Abrantes ABP, Siani P, Dias LG, Baptista MS (2019) Parallel damage in mitochondria and lysosomes is an efficient way to photoinduce cell death Autophagy 15:259-279.

      Master SS, Rampini SK, Davis AS, Keller C, Ehlers S, Springer B, Timmins GS, Sander P, Deretic V (2008) Mycobacterium tuberculosis prevents inflammasome activation Cell Host Microbe 3:224-32.

      Matange N, Podobnik M, Visweswariah SS (2015) Metallophosphoesterases: structural fidelity with functional promiscuity Biochem J 467:201-16.

      Niroula N, Ghodasara P, Marreros N, Fuller B, Sanderson H, Zriba S, Walker S, Shury TK, Chen JM (2025) Orally administered live BCG and heat-inactivated Mycobacterium bovis protect bison against experimental bovine tuberculosis Sci Rep 15:3764.

      Palmieri M, Pal R, Nelvagal HR, Lotfi P, Stinnett GR, Seymour ML, Chaudhury A, Bajaj L, Bondar VV, Bremner L, Saleem U, Tse DY, Sanagasetti D, Wu SM, Neilson JR, Pereira FA, Pautler RG, Rodney GG, Cooper JD, Sardiello M (2017) mTORC1-independent TFEB activation via Akt inhibition promotes cellular clearance in neurodegenerative storage diseases Nat Commun 8:14338.

      Péan CB, Schiebler M, Tan SW, Sharrock JA, Kierdorf K, Brown KP, Maserumule MC, Menezes S, Pilátová M, Bronda K, Guermonprez P, Stramer BM, Andres Floto R, Dionne MS (2017) Regulation of phagocyte triglyceride by a STAT-ATG2 pathway controls mycobacterial infection Nat Commun 8:14642.

      Qu Z, Zhou J, Zhou Y, Xie Y, Jiang Y, Wu J, Luo Z, Liu G, Yin L, Zhang XL (2020) Mycobacterial EST12 activates a RACK1-NLRP3-gasdermin D pyroptosis-IL-1β immune pathway Sci Adv 6: eaba4733.

      Shenoy AR, Capuder M, Draskovic P, Lamba D, Visweswariah SS, Podobnik M (2007) Structural and biochemical analysis of the Rv0805 cyclic nucleotide phosphodiesterase from Mycobacterium tuberculosis J Mol Biol 365:211-25.

      Smith AA, Su H, Wallach J, Liu Y, Maiello P, Borish HJ, Winchell C, Simonson AW, Lin PL, Rodgers M, Fillmore D, Sakal J, Lin K, Vinette V, Schnappinger D, Ehrt S, Flynn JL (2025) A BCG kill switch strain protects against Mycobacterium tuberculosis in mice and non-human primates with improved safety and immunogenicity Nat Microbiol 10:468-481.

      Wang J, Ge P, Qiang L, Tian F, Zhao D, Chai Q, Zhu M, Zhou R, Meng G, Iwakura Y, Gao GF, Liu CH (2017) The mycobacterial phosphatase PtpA regulates the expression of host genes and promotes cell proliferation Nat Commun 8:244.

      Wang J, Li BX, Ge PP, Li J, Wang Q, Gao GF, Qiu XB, Liu CH (2015) Mycobacterium tuberculosis suppresses innate immunity by coopting the host ubiquitin system Nat Immunol 16:237-245.

      Wen X, Wang J, Zhang D, Ding Y, Ji X, Tan Z, Wang Y (2020) Reverse Chromatin Immunoprecipitation (R-ChIP) enables investigation of the upstream regulators of plant genes Commun Biol 3:770.

      Xu X, Lu X, Dong X, Luo Y, Wang Q, Liu X, Fu J, Zhang Y, Zhu B, Ma X (2017) Effects of hMASP-2 on the formation of BCG infection-induced granuloma in the lungs of BALB/c mice Sci Rep 7:2300.

      Zhang L, Hendrickson RC, Meikle V, Lefkowitz EJ, Ioerger TR, Niederweis M (2020) Comprehensive analysis of iron utilization by Mycobacterium tuberculosis PLoS Pathog 16: e1008337.

      Zhang L, Kent JE, Whitaker M, Young DC, Herrmann D, Aleshin AE, Ko YH, Cingolani G, Saad JS, Moody DB, Marassi FM, Ehrt S, Niederweis M (2022) A periplasmic cinched protein is required for siderophore secretion and virulence of Mycobacterium tuberculosis Nat Commun 13:2255.

      Zhang X, He X, Li Q, Kong X, Ou Z, Zhang L, Gong Z, Long D, Li J, Zhang M, Ji W, Zhang W, Xu L, Xuan A (2017) PI3K/AKT/mTOR Signaling Mediates Valproic Acid-Induced Neuronal Differentiation of Neural Stem Cells through Epigenetic Modifications Stem Cell Reports 8:1256-1269.

    1. eLife Assessment

      The authors provide a useful resource and approach to identify early-stage biomarkers of MASLD progression, notably when no other apparent symptoms have arisen. The strength of evidence to support new MASLD signatures is solid as the work combines metabolomic and transcriptomic measures in blood and liver biopsies.

      [Editors' note: this paper was reviewed by Review Commons.]

    2. Reviewer #1 (Public review):

      Summary:

      Metabolic dysfunction-associated steatotic liver disease (MASLD) ranges from simple steatosis through steatohepatitis and fibrosis/cirrhosis to hepatocellular carcinoma. In the current study, the authors aimed to determine the early molecular signatures differentiating patients with MASLD-associated fibrosis from those patients with early MASLD but no symptoms. The authors recruited 109 obese individuals before bariatric surgery. They separated the cohorts as no MASLD (without histological abnormalities) and MASLD. The liver samples were then subjected to transcriptomic and metabolomic analysis. The serum samples were subjected to metabolomic analysis. The authors identified dysregulated lipid metabolism, including glyceride lipids, in the liver samples of MASLD patients compared to the no MASLD ones. Circulating metabolomic changes in lipid profiles correlated only slightly with MASLD, possibly because the no MASLD samples were derived from obese patients. Several genes involved in lipid droplet formation were also found elevated in MASLD patients. In addition, elevated levels of amino acids, which are possibly related to collagen synthesis, were observed in MASLD patients. Several antioxidant metabolites were increased in MASLD patients. Furthermore, dysregulated genes involved in mitochondrial function and autophagy were identified in MASLD patients, likely linking oxidative stress to MASLD progression. The authors then determined the representative gene signatures in the development of fibrosis by comparing this cohort with the other two published cohorts. Top enriched pathways in fibrotic patients included GTPase signaling and innate immune responses, suggesting the involvement of GTPases in MASLD progression to fibrosis. The authors then challenged a human patient-derived 3D spheroid system with a dual PPARa/d agonist and found that this treatment restored the expression levels of GTPase-related genes in MASLD 3D spheroids. In conclusion, the authors suggested the involvement of upregulated GTPase-related genes during fibrosis initiation.

      Significance:

      Overall, the current study might provide some new resources regarding transcriptomic and metabolomic data derived from obese patients with and without MASLD. The MASLD research community will be interested in the resource data.

      Comments on revised version:

      I have no further comments. Thank you.

    3. Reviewer #3 (Public review):

      Summary:

      Metabolic dysfunction-associated steatotic liver disease (MASLD) describes a spectrum of progressive liver pathologies linked to lifestyle-associated metabolic alterations (such as increased body weight and elevated blood sugar levels), ranging from steatosis through steatohepatitis to fibrosis and finally end-stage complications, such as liver failure and hepatocellular carcinoma. Treatment options for MASLD include diet adjustments, weight loss, and the thyroid hormone receptor-β (THR-β) agonist resmetirom, but remain limited at this stage, motivating further studies to elucidate molecular disease mechanisms to identify novel therapeutic targets.

      In their present study, the authors aim to identify early molecular changes in MASLD linked to obesity. To this end, they study a cohort of 109 obese individuals with no or early-stage MASLD combining measurements from two anatomic sites: 1. bulk RNA-sequencing and metabolomics of liver biopsies, and 2. metabolomics from patient blood. Their major finding is that GTPase-related genes are transcriptionally altered in livers of individuals with steatosis with fibrosis compared to steatosis without fibrosis.

      Major comments:

      (1) Confounders (such as (pre-)diabetes)

      The patient table shows significant differences in non-MASLD vs. MASLD individuals, with the latter suffering more often from diabetes or hypertriglyceridemia. Rather than just stating corrections, subgroup analyses should be performed (accompanied with designated statistical power analyses) to infer the degree to which these conditions contribute to the observations. I.e., major findings stating MASLD-associated changes should hold true in the subgroup of MASLD patients without diabetes/of female sex and so forth (testing for each of the significant differences between groups).

      Post-rebuttal update: The authors have performed the requested sub-group analysis and find the gene signatures hold for the non-diabetic sub-cohort, but not the diabetic subgroup. They denote a likely interaction between fibrosis and diabetes, that was not corrected for in the original analysis.

      Post-post-rebuttal update: I thank the authors for having added Figure 5-figure supplement 2 to show this analysis.

      (2) External validation

      Additionally, to back up the major GTPase signature findings, it would be desirable to analyze an external dataset of (pre)diabetes patients (other biased groups) for alterations in these genes. It would be important to know if this signature also shows in non-MASLD diabetic patients vs. healthy patients or is a feature specific to MASLD. Also, could the matched metabolic data be used to validate metabolite alterations that would be expected under GTPase-associated protein dysregulation?

      Post-rebuttal update: The authors confirm that with the present data, insulin resistance cannot be fully ruled out as a confounder to the GTPase-related gene signature. They however plan future mouse model experiments to study whether the GTPase-fibrosis signature differs in diabetic vs. non-diabetic conditions.

      (3) 3D liver spheroid MASH model, Fig. 6D/E

      This 3D experiment is technically not an external validation of GTPase-related genes being involved in MASLD, since patient-derived cells may only retain changes that have happened in vivo. To demonstrate that the GTPase expression signature is specifically invoked by fibrosis the LX-2 set up is more convincing, however, the up-regulation of the GTPase-related genes upon fibrosis induction with TGF-beta, in concordance with the patient data, needs to be shown first (qPCR or RNA-seq). Additionally, the description of the 3D model is too uncritical. The maintenance of functional PHHs is a major challenge (PMID: 38750036, PMID: 21953633, PMID: 40240606, PMID: 31023926). It cannot be ruled out that their findings are largely attributable to either 1) the (other present) mesenchymal cells (i.e., mesenchyme-derived cells, such as for example hepatic stellate cells, not to be confused with mesenchymal stem cells, MSCs), or 2) related to potential changes in PHHs in culture, and these limitations need to be stated.

      Post-rebuttal update: To address the concern of other cells than hepatocytes contributing to the observed effects in culture, the authors performed TGF-beta treatment in independent mono-cultures (Figure R4): LX-2 and hepatocytes, and the spheroid system. Surprisingly, important genes highlighted in Figure 6E for the spheroid system (RAB6A, ARL4A, RAB27B, DIRAS2) are all absent from this qPCR(?) validation experiment. The authors evaluate instead RAC1, RHOU, VAV1, DOCK2, RAB32. In spheroids, RHOU and RAB32 are down-regulated with TGF-B. In hepatocytes DOCK2 and RAC seemed up-regulated. They find no difference in these genes in LX-2 cells. Surprisingly, ACTA2 expression values are missing for LX-2 cells. Together, it is hard to judge which individual cell type recapitulates the changes observed in patients in this validation experiment, as the major genes called out in Figure 6E are not analyzed.

      Post-post-rebuttal update: I thank the authors for having added Figure 6-figure supplement 5 to show qPCR results for this question.

      Unfortunately, the 3D liver spheroid model used (as presented in PMID: 39605182) lacks important functional validation tests of maintained hepatocyte identity in culture (at the very least Albumin expression and secretion plus CYP3A4 assay). This functional data (acquired at the time point in culture when the RNA expression analysis in 6E was performed) is indispensable prior to stating that mature hepatocytes cause the observed effects.

      Post-post-rebuttal update: I thank the authors for having added more references, I still think a quick functional validation of the system (at the time point in culture when the RNA expression analysis in 6E was performed) would be beneficial.

      (4) Novelty / references

      Similar studies that also combined liver and blood lipidomics/metabolomics in obese individuals with and without MASLD (e.g. PMID 39731853, 39653777) should be cited. Additionally, it would benefit the quality of the discussion to state how findings in this study add new insights over previous studies, if their findings/insights differ, and if so, why.

      Post-rebuttal update: The authors have included the studies into their discussion.

      Overall post-post-rebuttal update: I thank the authors for having added more data, important discussion points, and references, and have no further requests.

    4. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      Thank you for the authors' responses to my concerns. I do not have any further comments.

      We thank this reviewer for the positive and constructive evaluation of our manuscript.

      Reviewer #2 (Public Review):

      I have no further comment about this amended version, aside from suggesting to add (if known) the time at which biopsies were collected. Time-of-day is an important yet often overlooked parameter of gene expression variation, and along the same line, the imposed fasting to bariatric surgery patients is also a matter of variation of gene expression and of metabolite abundance. It is hoped that future investigations will more precisely characterize the role of the newly identified targets in MASLD.

      We agree with this and are fully aware that metabolism in the liver is controlled by circadian rhythm and therefore the time-of-day is an important parameter when liver samples are collected. All liver samples were collected between 8am and 1pm, and this information has been added to the Methods section. We are already working on the characterization of the newly identified targets. Thank you for the positive and constructive evaluation of our manuscript.

      Reviewer #3 (Public Review):

      (1) Confounders (such as (pre-)diabetes)

      The patient table shows significant differences in non-MASLD vs. MASLD individuals, with the latter suffering more often from diabetes or hypertriglyceridemia. Rather than just stating corrections, subgroup analyses should be performed (accompanied with designated statistical power analyses) to infer the degree to which these conditions contribute to the observations. I.e., major findings stating MASLD-associated changes should hold true in the subgroup of MASLD patients without diabetes/of female sex and so forth (testing for each of the significant differences between groups).

      Post-rebuttal update: The authors have performed the requested sub-group analysis and find the gene signatures hold for the non-diabetic sub-cohort, but not the diabetic subgroup. They denote a likely interaction between fibrosis and diabetes, that was not corrected for in the original analysis.

      (2) External validation

      Additionally, to back up the major GTPase signature findings, it would be desirable to analyze an external dataset of (pre)diabetes patients (other biased groups) for alterations in these genes. It would be important to know if this signature also shows in non-MASLD diabetic patients vs. healthy patients or is a feature specific to MASLD. Also, could the matched metabolic data be used to validate metabolite alterations that would be expected under GTPase-associated protein dysregulation?

      Post-rebuttal update: The authors confirm that with the present data, insulin resistance cannot be fully ruled out as a confounder to the GTPase-related gene signature. They however plan future mouse model experiments to study whether the GTPase-fibrosis signature differs in diabetic vs. non-diabetic conditions.

      (3) 3D liver spheroid MASH model, Fig. 6D/E

      This 3D experiment is technically not an external validation of GTPase-related genes being involved in MASLD, since patient-derived cells may only retain changes that have happened in vivo. To demonstrate that the GTPase expression signature is specifically invoked by fibrosis the LX-2 set up is more convincing, however, the up-regulation of the GTPase-related genes upon fibrosis induction with TGF-beta, in concordance with the patient data, needs to be shown first (qPCR or RNA-seq). Additionally, the description of the 3D model is too uncritical. The maintenance of functional PHHs is a major challenge (PMID: 38750036, PMID: 21953633, PMID: 40240606, PMID: 31023926). It cannot be ruled out that their findings are largely attributable to either 1) the (other present) mesenchymal cells (i.e., mesenchyme-derived cells, such as for example hepatic stellate cells, not to be confused with mesenchymal stem cells, MSCs), or 2) related to potential changes in PHHs in culture, and these limitations need to be stated.

      Post-rebuttal update: To address the concern of other cells than hepatocytes contributing to the observed effects in culture, the authors performed TGF-beta treatment in independent mono-cultures (Figure R4): LX-2 and hepatocytes, and the spheroid system. Surprisingly, important genes highlighted in Figure 6E for the spheroid system (RAB6A, ARL4A, RAB27B, DIRAS2) are all absent from this qPCR(?) validation experiment. The authors evaluate instead RAC1, RHOU, VAV1, DOCK2, RAB32. In spheroids, RHOU and RAB32 are down-regulated with TGF-B. In hepatocytes DOCK2 and RAC seemed up-regulated. They find no difference in these genes in LX-2 cells. Surprisingly, ACTA2 expression values are missing for LX-2 cells. Together, it is hard to judge which individual cell type recapitulates the changes observed in patients in this validation experiment, as the major genes called out in Figure 6E are not analyzed.

      All biological experiments show variation, and especially when analyzing various cell types (lines), we are not surprised that not all results are completely aligned. In other words, some of the GTPases will be upregulated in hepatocytes, while others may be upregulated in hepatic stellate cells due to the complex signaling arrangement in each cell. To address this reviewer’s concerns, we have done qPCR for RAB6A, ARL4A, RAB27B, DIRAS2 in LX-2 cells and the results are shown in the new Figure 6–figure supplement 5. To align all three graphs displaying the same genes analyzed, we have now depicted the gene expression for the co-culture (hepatocytes, hepatic stellate cells, and Kupffer cells) and mono-culture (hepatocytes only) from RNAseq analysis.

      Unfortunately, the 3D liver spheroid model used (as presented in PMID: 39605182) lacks important functional validation tests of maintained hepatocyte identity in culture (at the very least Albumin expression and secretion plus CYP3A4 assay). This functional data (acquired at the time point in culture when the RNA expression analysis in 6E was performed) is indispensable prior to stating that mature hepatocytes cause the observed effects.

      We agree that the characterization of the liver spheroid model derived from human patient samples is important. The functional characterization has already been published in these papers:

      (1) Bell, C. C. et al. Transcriptional, Functional, and Mechanistic Comparisons of Stem Cell–Derived Hepatocytes, HepaRG Cells, and Three-Dimensional Human Hepatocyte Spheroids as Predictive In Vitro Systems for Drug-Induced Liver Injury. Drug Metab. Dispos. 45, 419–429 (2017).

      (2) Bell, C. C. et al. Characterization of primary human hepatocyte spheroids as a model system for drug-induced liver injury, liver function and disease. Sci. Rep. 6, 25187 (2016).

      (3) Vorrink, S. U. et al. Endogenous and xenobiotic metabolic stability of primary human hepatocytes in long‐term 3D spheroid cultures revealed by a combination of targeted and untargeted metabolomics. FASEB J. 31, 2696–2708 (2017).

      (4) Messner, S. et al. Transcriptomic, Proteomic, and Functional Long-Term Characterization of Multicellular Three-Dimensional Human Liver Microtissues. Appl. In Vitro Toxicol. 4, 1–12 (2018).

      (5) Bell, C. C. et al. Comparison of Hepatic 2D Sandwich Cultures and 3D Spheroids for Long-term Toxicity Applications: A Multicenter Study. Toxicol. Sci. 162, 655–666 (2018).

      We have now mentioned this in the manuscript on page 18 to make this point clear.

      (4) Novelty / references

      Similar studies that also combined liver and blood lipidomics/metabolomics in obese individuals with and without MASLD (e.g. PMID 39731853, 39653777) should be cited. Additionally, it would benefit the quality of the discussion to state how findings in this study add new insights over previous studies, if their findings/insights differ, and if so, why.

      Post-rebuttal update: The authors have included the studies into their discussion.

      Recommendations for the authors:

      Reviewer #3 (Recommendations for the authors):

      (1) Add the plots showing diabetes/non-diabetes sub-group analysis and power estimates to the Supplementary Figures (rather than just as a Supplementary table)

      We have added this as Figure 5-figure supplement 2 in the revised manuscript (R2).

      (2) Add a short note on the validity of the results limiting to the non-diabetes subgroup to the limitations section

      We have done this in the revised manuscript (R2).

      (3) Add a short note on the missing adjustment for fibrosis/diabetes interactions in the study to the limitations paragraph

      We appreciate the reviewer’s suggestion to address the lack of adjustment for potential fibrosis–diabetes interaction. We added a note to the limitations paragraph in the Limitations section. Although diabetes considerably modulates the risk for steatohepatitis, only a small number of participants had diabetes (29 of 109) in our study, undermining statistical power to detect meaningful interaction effects.

      Author response table 1.

      (4) Fig S10/6E: In vitro TGF-b stimulation on spheroids, LX-2 cells, hepatocytes: evaluate expression of RAB6A, ARL4A, RAB27B, DIRAS2 genes from 6E to create consistency between the findings. Confirm ACTA2 up-regulation in LX-2 cells treated with TGF-β as a positive control. Also specify methods for gene expression analysis in spheroids and the cell types in the figure legends (RNA-Seq? qPCR?)

      To address this reviewer’s concerns, we have done qPCR for RAB6A, ARL4A, RAB27B, DIRAS2 in LX-2 cells stimulated with TGF-β and the results are shown in the new Figure 6–figure supplement 5. To align all three graphs displaying the same genes analyzed, we have now depicted the gene expression for the co-culture (hepatocytes, hepatic stellate cells, and Kupffer cells) and mono-culture (hepatocytes only) from RNAseq analysis. We have also updated the figure legend to state the methods that we used.

      (5) Validate the functionality of hepatocytes in the 3D liver spheroid model used (PMID: 39605182) at the time points of which the experiments have been performed (e.g. Albumin secretion, CYP-assays).

      We agree that characterizing liver spheroids derived from human patients using fully differentiated cells is important, but this has already been done and is published in the following papers:

      (1) Bell, C. C. et al. Transcriptional, Functional, and Mechanistic Comparisons of Stem Cell–Derived Hepatocytes, HepaRG Cells, and Three-Dimensional Human Hepatocyte Spheroids as Predictive In Vitro Systems for Drug-Induced Liver Injury. Drug Metab. Dispos. 45, 419–429 (2017).

      (2) Bell, C. C. et al. Characterization of primary human hepatocyte spheroids as a model system for drug-induced liver injury, liver function and disease. Sci. Rep. 6, 25187 (2016).

      (3) Vorrink, S. U. et al. Endogenous and xenobiotic metabolic stability of primary human hepatocytes in long‐term 3D spheroid cultures revealed by a combination of targeted and untargeted metabolomics. FASEB J. 31, 2696–2708 (2017).

      (4) Messner, S. et al. Transcriptomic, Proteomic, and Functional Long-Term Characterization of Multicellular Three-Dimensional Human Liver Microtissues. Appl. In Vitro Toxicol. 4, 1–12 (2018).

      (5) Bell, C. C. et al. Comparison of Hepatic 2D Sandwich Cultures and 3D Spheroids for Long-term Toxicity Applications: A Multicenter Study. Toxicol. Sci. 162, 655–666 (2018).

      We now mention this in the manuscript on page 18 and in the Limitations section to make this point clear.

      (6) Add a note on limitations of the PHH-spheroid and cell line in vitro models to the limitations section and discuss the need for future experiments to examine the cellular crosstalk and cell types potentially responsible for the proposed GTPase-gene dysregulation.

      We have added this to the Limitations section on page 13 of the revised manuscript (R2).

    1. eLife Assessment

      Kambali et al use optogenetic manipulations to examine whether the ventral hippocampal Schaffer collateral (vCA3-to-vCA1) and temporoammonic (EC-to-vCA1) pathways regulate anxiety- and fear-related behaviors in mice. They find that both pathways regulate the expression of fear (freezing) responses to a context and auditory conditioned stimulus paired with foot shock (trace conditioning protocol), but only the Schaffer collateral pathway regulates the expression of anxiety-related behaviors in the elevated plus maze, open field test, and Vogel conflict test. Overall, the study is valuable: it detects bidirectional effects of optogenetic excitation and inhibition in both pathways. However, the strength of the evidence in support of its main claims is incomplete.

    2. Reviewer #1 (Public review):

      Summary:

      The hippocampus, especially the ventral subregion, has been related to emotional processing. However, the specific circuitry involved deserves further investigation. By using a bidirectional optogenetic modulation, Kambali et al. have investigated the role of different inputs to vCA1 (i.e., from vCA3 and entorhinal cortex) in anxiety- and fear-related responses. The major findings of this work suggested that both inputs to vCA1 control fear-related responses, whereas only the projection between vCA3 and vCA1 controls anxiety-related behavior. Overall, the authors used an advanced methodological approach, which allows them to modulate specific brain circuits, to study specific hippocampal projections, providing some new information regarding the hippocampal function in anxiety and fear.

      Strengths:

      (1) The manuscript is well written, clear and has a detailed and specific discussion.

      (2) Results from each optogenetic manipulation are clear in different anxiety- and fear-related tasks, demonstrating the robustness of the findings.

      (3) The overall conclusions are very interesting and might be relevant for the field of mental health disorders accompanied by anxiety- and fear-related alterations.

      Weaknesses:

      (1) The major differences in basal behavioral performance in the different paradigms between the two optogenetic modulations prevent the achievement of strong conclusive results.

      (2) Data presentation and representative figures need a major revision.

      (3) No analysis has been performed to analyze potential sex differences in behavioral domains where sex is important.

    3. Reviewer #2 (Public review):

      Summary:

      This paper uses an optogenetic approach to either activate or inhibit separate neural pathways projecting to the ventral CA1 hippocampal subregion, from either CA3 or the entorhinal cortex. The authors report that manipulation of the vCA3→vCA1 pathway affected behavioural performance on a number of tasks: elevated plus maze, open field, Vogel conflict test and freezing behaviour to both context and a trace CS cue. In contrast, optogenetic manipulation of neural activity in the EC→vCA1 pathway only affected behaviour on the trace CS/context fear memory test but had no effect on the elevated plus maze, open field or Vogel conflict test. The authors suggest different roles for these two ventral hippocampal pathways in fear versus anxiety.

      Strengths:

      This is an interesting study addressing an important question in a highly topical subject area. The experiments are well conducted and have generated interesting and important data.

      Weaknesses:

      While I am broadly sympathetic to the overall narrative of the paper, I have some questions/comments around the specific interpretation of the results presented. In my view, the authors' claims may not be completely supported by their data, but the data are interesting nonetheless.

      In terms of the framework presented by the authors for interpreting their data, many would argue that freezing (or at least reduced activity/behavioural inhibition) to the context provides a readout of conditioned anxiety rather than fear. In this sense, the context is a signal of potential threat (i.e. the context becomes associated with both shock and with the absence of shock) and thus generates anxiety rather than fear. Likewise, the trace CS cue could be considered as an ambiguous predictor of shock in that the shock doesn't occur straight away. In contrast, a punctate CS cue which co-terminates with shock would be a reliable signal of imminent threat and thus generates a fear response. Thus, it might be argued that all of the assays adopted by the authors are readouts of anxiety (albeit comprising tests of both conditioned and unconditioned anxiety). For example, from the authors' perspective, it is not clear a priori why the Vogel conflict test is considered anxiety, but contextual freezing is considered fear? Indeed, in the Discussion, the authors mention another study in which the data from the Vogel conflict test align with fear assays rather than anxiety tests. Can the authors elaborate on their distinction? I appreciate that, in practice, it might be difficult to distinguish between fear and anxiety at the behavioural level in rodents (although opposing effects of fear and anxiety on pain responses might be one option). At the very least, this issue merits further discussion.

      Another question is whether rather than representing a qualitative difference between the contributions of the vCA3→vCA1 and EC→vCA1 pathways to different aspects of fear/anxiety behaviours, the different results reflect a quantitative difference between the magnitude of effects in vCA1 that are generated from optogenetic manipulation of the two pathways, coupled with the possibility that behaviour on the trace CS/context fear memory task is more sensitive to manipulation than the "anxiety tests". The possibility that vCA3→vCA1 stimulation is more effective is potentially supported by the c-fos measurements in vCA1. vCA3→vCA1 stimulation produced a much bigger vCA1 c-fos response (approx. 350% c-fos cell activation; see Figure 1E) compared to activation of the EC→vCA1 pathway (approx. 170% c-fos cell activation; see Figure 4E).

      Furthermore, in some studies, there seem to be quite large differences between the laser OFF conditions for the different groups (which presumably one would not expect to be different). For example, compare laser OFF for the Inhibition group for time in open arms of EPM in Figure 5C (> 40%) versus laser OFF for the Inhibition group for time in open arms of EPM in Fig. 2C (< 20%). This could potentially result in ceiling effects, such that it is very hard to see a further increase in time in the open arms from a level already above 40% when the laser is then switched on. This could complicate the interpretation of the laser ON condition.

      Likewise, there is a big difference between the behavioral performance of the two SHAM groups in Figure 3 (compare SHAM in 3 B, C and SHAM in 3 D, E). How is this explained? Could this generate a ceiling effect? This may also merit some discussion. More details on the SHAM procedure(s) in the main manuscript may also be helpful.

      According to Figure 3A, the test of freezing response to the trace Tone CS is conducted in a different context from the conditioning context. The data presented in Figure 3 for tone fear are the levels of freezing during the presentation of this cue in the different contexts. It would be important to present both pre-CS and CS freezing levels here to determine how much of the freezing is actually driven by the punctate tone CS. The pre-CS freezing levels in this different context would also provide a nice control for the contextual fear conditioning.

    4. Reviewer #3 (Public review):

      Summary:

      In their paper entitled "Ventral hippocampal temporoammonic and Schaffer collateral pathways differential control fear- and anxiety-related behaviors" the authors use a bidirectional optogenetic approach to elucidate the role of temporoammonic (TA) and Schaffer collateral (SC) inputs to the ventral hippocampus (CA1) in modulating both fear and anxiety-related behaviors. While fear and anxiety behaviors are often considered on a continuous spectrum, identifying neural pathways that are differentially activated represents an important open question in the field. The authors find that optogenetic stimulation or inhibition of the Schaffer Collateral pathway in the ventral hippocampus (CA3-CA1) bidirectionally modulates both fear-related and anxiety-related behavioral paradigms. More specifically, optogenetic excitation of the CA3-CA1 pathway using ChR2-expressing viral constructs increases anxiety-like behaviors in numerous behavioral paradigms (elevated plus maze, open field, Vogel conflict test). Conversely, optogenetic inhibition using halorhodopsin reduced anxiety-like behaviours. To examine fear behaviors, the authors examined contextual and trace fear conditioning. Similar to their results with anxiety-like behaviors, the authors observed bidirectional fear modulation following optogenetic stimulation of the vCA3-vCA1 pathway. The authors next examined the temporoammonic pathway originating from the lateral entorhinal cortex to vCA1. Unlike with SC stimulation, stimulation of the TA pathway had no effect on anxiety-like behaviors but did bidirectionally modulate contextual fear conditioning. Together, these results differentiate the SC and TA pathways in the ventral hippocampus as distinct regulators of affective behavior.

      Strengths:

      The paper has numerous technical strengths, including dissecting the role of both excitation and inhibition of both pathways and the use of behavioral measures of anxiety and fear. This balanced and internally controlled design allows readers to evaluate the effects of both pathways in a single study, thereby reducing technical complications from experiments being completed across laboratories and experimental conditions.

      Weaknesses:

      There are a few limitations of the study, however, which bear discussion.

      (1) The authors use halorhodopsin to achieve optogenetic inhibition. Halorhodopsin is generally considered a first-generation optogenetic actuator, as it is a Cl- pump rather than an ion channel. This limits the degree of inhibition (i.e. by preventing shunting inhibition) and can result in altered chloride gradients in the period immediately following optogenetic stimulation. This is of particular concern in this paper as the stimulation parameters and behavioral analysis are not temporally correlated, therefore confounds of disrupted chloride cannot be experimentally accounted for or controlled.

      (2) The authors use an AAV-CaMKII-eGFP as a control (Sham) throughout the dataset; however, in the trace fear conditioning experiments, there are no AAV-CaMKII-ChR2-eYFP or AAV-CaMKII-eNpHR3.0-eYFP controls without optogenetic stimulation. Therefore, it is unclear the extent to which viral expression of optogenetic actuators impacts behavior. Additionally, the authors only provided optogenetic stimulation during contextual fear recall and tone fear recall. Additional experiments disrupting each pathway during trace conditioning would have provided additional insight into the role of each pathway in the initial encoding of fear memories.

      (3) The location and extent of viral expression across animals were not systematically quantified.

      Overall, however, these weaknesses do not significantly detract from the main conclusions of the paper. The authors' data convincingly demonstrate that disruption of the trisynaptic circuit bidirectionally modulates both fear- and anxiety-like behaviors while disruption of the temporoammonic pathway has no effect on anxiety-like behaviors but disrupts fear-related behaviors. It is interesting to note, however, that the TA activation had no effect on tone-related fear conditioning, suggesting a potential specialized role of the temporoammonic pathway specifically in contextual fear memory.

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The hippocampus, especially the ventral subregion, has been related to emotional processing. However, the specific circuitry involved deserves further investigation. By using a bidirectional optogenetic modulation, Kambali et al. have investigated the role of different inputs to vCA1 (i.e., from vCA3 and entorhinal cortex) in anxiety- and fear-related responses. The major findings of this work suggested that both inputs to vCA1 control fear-related responses, whereas only the projection between vCA3 and vCA1 controls anxiety-related behavior. Overall, the authors used an advanced methodological approach, which allows them to modulate specific brain circuits, to study specific hippocampal projections, providing some new information regarding the hippocampal function in anxiety and fear.

      Strengths:

      (1) The manuscript is well written, clear and has a detailed and specific discussion.

      (2) Results from each optogenetic manipulation are clear in different anxiety- and fear-related tasks, demonstrating the robustness of the findings.

      (3) The overall conclusions are very interesting and might be relevant for the field of mental health disorders accompanied by anxiety- and fear-related alterations.

      Weaknesses:

      (1) The major differences in basal behavioral performance in the different paradigms between the two optogenetic modulations prevent the achievement of strong conclusive results.

      The two projections to ventral CA1 were studied independently in different cohorts of animals tested at different times during the study. This difference in timing may have contributed to variations in basal behavioral performance between the two projections. Importantly, we found that within each cohort (control and optogenetic manipulation), basal performance within each set of experiments (i.e., for each projection) is highly consistent, e.g., basal cued and contextual freezing responses and responses to OFF conditions in the Vogel conflict test. Moreover, the ANOVA statistics conducted across the baseline and ON conditions for each task revealed robust, significant effects of bidirectional optogenetic modulation for each cohort. In the case of the fear responses, note that the freezing levels in SHAM controls differ between projections but are consistent between the two types of assessment (tone and context) within each projection. We will mention these limitations in the revised manuscript.

      (2) Data presentation and representative figures need a major revision.

      The figures will be rearranged according to the projections. The anxiety-related figures and fear response related figures will be grouped for each projection to improve clarity and readability. The revised manuscript will include representative heat maps for each behavioral task for both projections in addition to population quantification data.

      (3) No analysis has been performed to analyze potential sex differences in behavioral domains where sex is important.

      This assessment was not done in the original submission. We will perform statistical analysis for male and female mice separately and if the results are sex-dependent, we will present separate figures. Otherwise, the combined data presentation will be followed.

      Reviewer #2 (Public review):

      Summary:

      This paper uses an optogenetic approach to either activate or inhibit separate neural pathways projecting to the ventral CA1 hippocampal subregion, from either CA3 or the entorhinal cortex. The authors report that manipulation of the vCA3→vCA1 pathway affected behavioural performance on a number of tasks: elevated plus maze, open field, Vogel conflict test and freezing behaviour to both context and a trace CS cue. In contrast, optogenetic manipulation of neural activity in the EC→vCA1 pathway only affected behaviour on the trace CS/context fear memory test but had no effect on the elevated plus maze, open field or Vogel conflict test. The authors suggest different roles for these two ventral hippocampal pathways in fear versus anxiety.

      Strengths:

      This is an interesting study addressing an important question in a highly topical subject area. The experiments are well conducted and have generated interesting and important data.

      Weaknesses:

      While I am broadly sympathetic to the overall narrative of the paper, I have some questions/comments around the specific interpretation of the results presented. In my view, the authors' claims may not be completely supported by their data, but the data are interesting nonetheless.

      In terms of the framework presented by the authors for interpreting their data, many would argue that freezing (or at least reduced activity/behavioural inhibition) to the context provides a readout of conditioned anxiety rather than fear. In this sense, the context is a signal of potential threat (i.e. the context becomes associated with both shock and with the absence of shock) and thus generates anxiety rather than fear. Likewise, the trace CS cue could be considered as an ambiguous predictor of shock in that the shock doesn't occur straight away.

      In contrast, a punctate CS cue which co-terminates with shock would be a reliable signal of imminent threat and thus generates a fear response. Thus, it might be argued that all of the assays adopted by the authors are readouts of anxiety (albeit comprising tests of both conditioned and unconditioned anxiety).

      We agree with the reviewer that context and trace fear conditioning do not represent an “imminent” threat as severe as would likely be internalized in delay fear conditioning. However, the goal of the study was to probe hippocampus-dependent processes (contextual and trace fear conditioning are strongly modulated by the hippocampus while delay conditioning is not). Consistent with several other studies, we believe the conditional nature of the task (context and trace are invariably linked to shock) provides support for a “non-ambiguous” relationship that is conducive to assessing fear-based behavior.

      Several studies show clear differences in the involvement of the amygdala and hippocampus in delay vs. trace fear conditioning. Inactivating the amygdala led to deficits in contextual and delay conditioning but had no effect on trace conditioning. In contrast, inactivating the hippocampus led to deficits in trace and contextual but not delay fear conditioning. These findings suggest that a temporal gap between the CS and US can generate amygdala-independent but hippocampus-dependent fear conditioning (Raybuck J. D., Lattal K. M 2011, PMID: 21283812). Lesions of the entorhinal cortex impair the acquisition of trace fear conditioning but not the acquisition of delay fear conditioning (Raybuck J. D., Lattal K. M 2011, PMID: 21283812). Further, using single-unit recording during fear retention tests after delay or trace fear conditioning, a study showed that entorhinal neurons specifically respond after trace but not after delay fear conditioning (Kong et al 2023, PMID: 36919333). These findings demonstrate that trace fear conditioning and delay fear conditioning may involve overlapping but largely different neuronal circuits. A knockdown of the expression of the α5-subunit–containing GABA<sub>𝐴</sub> receptors in the CA1 region (α5CA1KO mice) leads to improved spatial learning and enhanced trace fear conditioning memory, to the level of delay fear conditioning (Engin et al 2020, PMID: 32934095). This suggests that α5-GABA<sub>𝐴</sub>Rs in CA1 pyramidal neurons normally constrain hippocampus-dependent memory processes and that, in the absence of α5-GABA<sub>𝐴</sub> receptors in CA1, trace fear conditioning has the same effect size as delay fear conditioning, supporting the view that trace fear conditioning is not “ambiguous”.

      For example, from the authors' perspective, it is not clear a priori why the Vogel conflict test is considered anxiety, but contextual freezing is considered fear? Indeed, in the Discussion, the authors mention another study in which the data from the Vogel conflict test align with fear assays rather than anxiety tests. Can the authors elaborate on their distinction? I appreciate that, in practice, it might be difficult to distinguish between fear and anxiety at the behavioral level in rodents (although opposing effects of fear and anxiety on pain responses might be one option). At the very least, this issue merits further discussion.

      We will make this distinction clearer in the revisions. Briefly, behavioral actions in the Vogel conflict test are generally considered to be most pertinent to general anxiety disorders in humans and anxiolytics have high predictive validity in animals in this task. In particular, the robust actions of benzodiazepines and 5-HT<sub>1A</sub> partial agonists parallel their clinical efficacy in patients (McMillan and Brocco, 2003, PMID: 12600703).

      Our previous study (Engin et al 2016, PMID: 26971710) used global diazepam-induced neuronal inhibition and identified that positive modulation of α2-GABA<sub>𝐴</sub>Rs in dentate gyrus granule cells and CA3 pyramidal neurons is required to reduce anxiety-like behaviors while inhibition of positive modulation of α2-GABA<sub>𝐴</sub>Rs in CA1 pyramidal neurons is required to reduce fear-related behaviors. The effects were absent when α2-GABA<sub>𝐴</sub>Rs were knocked out in the respective subregions. These results indicate that these intrahippocampal subregions can modulate fear and anxiety-like behaviors independently of the amygdala. In the previous study we used conditional α2-GABA<sub>𝐴</sub>R knockouts in hippocampal subregions and subjected these mice to systemic diazepam. In these experiments, diazepam still acts on α1-, α3- and α5-GABA<sub>𝐴</sub>Rs in the hippocampal subregions and cell types in which α2-GABA<sub>𝐴</sub>Rs are lacking. Therefore, for example, when α2CA1KO mice were administered diazepam, diazepam still led to inhibition of pyramidal neurons in CA3 and DG via α1-, α2-, α3- and α5-GABA<sub>𝐴</sub>Rs, and in addition, diazepam also acted on α1-, α3- and α5-GABA<sub>𝐴</sub>Rs in CA1 itself. Diazepam also acted on GABA<sub>𝐴</sub>Rs in the amygdala and other brain regions. These are fundamentally different experimental conditions compared to the optogenetic experiments described in this paper. Moreover, in contrast to the current paper, the previous work did not examine projections but used global diazepam-induced neuronal inhibition as a baseline. Furthermore, whereas the previous paper examined whether a specific neuronal cell type was required for anxiolytic-like or fear-reducing actions, the current manuscript examined whether activation or inhibition of neuronal projections is sufficient to modulate anxiety- and fear-related behaviors. Overall, one cannot easily compare the results in the Vogel conflict test between the two papers.

      Another question is whether rather than representing a qualitative difference between the contributions of the vCA3→vCA1 and EC→vCA1 pathways to different aspects of fear/anxiety behaviours, the different results reflect a quantitative difference between the magnitude of effects in vCA1 that are generated from optogenetic manipulation of the two pathways, coupled with the possibility that behaviour on the trace CS/context fear memory task is more sensitive to manipulation than the "anxiety tests". The possibility that vCA3→vCA1 stimulation is more effective is potentially supported by the c-fos measurements in vCA1. vCA3→vCA1 stimulation produced a much bigger vCA1 c-fos response (approx. 350% c-fos cell activation; see Figure 1E) compared to activation of the EC→vCA1 pathway (approx. 170% c-fos cell activation; see Figure 4E).

      Furthermore, in some studies, there seem to be quite large differences between the laser OFF conditions for the different groups (which presumably one would not expect to be different). For example, compare laser OFF for the Inhibition group for time in open arms of EPM in Figure 5C (> 40%) versus laser OFF for the Inhibition group for time in open arms of EPM in Fig. 2C (< 20%). This could potentially result in ceiling effects, such that it is very hard to see a further increase in time in the open arms from a level already above 40% when the laser is then switched on. This could complicate the interpretation of the laser ON condition.

      The magnitude of activation, as evidenced by c-fos measurements, differs between the two projections. This might reflect different levels of modulation of CA1 neuronal activity. The fact that the two projections were studied at different time points (see response to Reviewer #1) may also have contributed to the difference. The revised manuscript will include a formal discussion of how the magnitude of modulation could contribute to differential sensitivity for the modulation of anxiety-like behaviors. However, the inputs from these two projection systems target different regions of CA1 pyramidal neurons and each pathway has distinct roles in other processes (sensory versus memory-based completion); thus a dissociation may also be present for other types of behavior, including the modulation of anxiety-like behaviors.

      While it is possible that ceiling effects could impact our interpretation, we believe ceiling effects would only impact one direction of the optogenetic manipulation and there was no effect of activation (Fig. 5C) or bidirectional modulation of anxiety-related behavior in the novel open field test (Fig. 5F) which has levels of behavior comparable to Figure 2F.

      Likewise, there is a big difference between the behavioral performance of the two SHAM groups in Figure 3 (compare SHAM in 3 B, C and SHAM in 3 D, E). How is this explained? Could this generate a ceiling effect? This may also merit some discussion. More details on the SHAM procedure(s) in the main manuscript may also be helpful.

      With respect to contextual fear, ceiling effects are not a major factor as we still see enhanced freezing in the activation condition. With tone fear, we cannot formally exclude a ceiling effect, and this will be addressed as a potential confound in the manuscript.

      According to Figure 3A, the test of freezing response to the trace Tone CS is conducted in a different context from the conditioning context. The data presented in Figure 3 for tone fear are the levels of freezing during the presentation of this cue in different contexts. It would be important to present both pre-CS and CS freezing levels here to determine how much of the freezing is actually driven by the punctate tone CS. The pre-CS freezing levels in this different context would also provide a nice control for the contextual fear conditioning.

      We agree and will analyze and report the pre-CS freezing data in the revision.

      Reviewer #3 (Public review):

      Summary:

      In their paper entitled "Ventral hippocampal temporoammonic and Schaffer collateral pathways differential control fear- and anxiety-related behaviors" the authors use a bidirectional optogenetic approach to elucidate the role of temporoammonic (TA) and Schaffer collateral (SC) inputs to the ventral hippocampus (CA1) in modulating both fear and anxiety-related behaviors. While fear and anxiety behaviors are often considered on a continuous spectrum, identifying neural pathways that are differentially activated represents an important open question in the field. The authors find that optogenetic stimulation or inhibition of the Schaffer Collateral pathway in the ventral hippocampus (CA3-CA1) bidirectionally modulates both fear-related and anxiety-related behavioral paradigms. More specifically, optogenetic excitation of the CA3-CA1 pathway using ChR2-expressing viral constructs increases anxiety-like behaviors in numerous behavioral paradigms (elevated plus maze, open field, Vogel conflict test). Conversely, optogenetic inhibition using halorhodopsin reduced anxiety-like behaviours. To examine fear behaviors, the authors examined contextual and trace fear conditioning. Similar to their results with anxiety-like behaviors, the authors observed bidirectional fear modulation following optogenetic stimulation of the vCA3-vCA1 pathway. The authors next examined the temporoammonic pathway originating from the lateral entorhinal cortex to vCA1. Unlike with SC stimulation, stimulation of the TA pathway had no effect on anxiety-like behaviors but did bidirectionally modulate contextual fear conditioning. Together, these results differentiate the SC and TA pathways in the ventral hippocampus as distinct regulators of affective behavior.

      Strengths:

      The paper has numerous technical strengths, including dissecting the role of both excitation and inhibition of both pathways and the use of behavioral measures of anxiety and fear. This balanced and internally controlled design allows readers to evaluate the effects of both pathways in a single study, thereby reducing technical complications from experiments being completed across laboratories and experimental conditions.

      Weaknesses:

      There are a few limitations of the study, however, which bear discussion.

      (1) The authors use halorhodopsin to achieve optogenetic inhibition. Halorhodopsin is generally considered a first-generation optogenetic actuator, as it is a Cl- pump rather than an ion channel. This limits the degree of inhibition (i.e. by preventing shunting inhibition) and can result in altered chloride gradients in the period immediately following optogenetic stimulation. This is of particular concern in this paper as the stimulation parameters and behavioral analysis are not temporally correlated, therefore confounds of disrupted chloride cannot be experimentally accounted for or controlled.

      Choice of halorhodopsin was in part influenced by a report that spontaneous archaerhodopsin activation was paradoxically associated with increased spontaneous release of neurotransmitter from presynaptic terminals, whereas activation of chloride-reducing halorhodopsin triggered neurotransmitter release upon light onset (Mahn et al., PMID: 26950004), suggesting that halorhodopsin may be advantageous in studies inhibiting presynaptic nerve terminals. Halorhodopsin has been used in several studies to effectively silence activity, and in our studies it had a substantial influence on behavior that was inverse to the effect of ChR2 stimulation. While perhaps not optimal, we chose it over archaerhodopsin out of an abundance of caution, based on the cited literature.

      (2) The authors use an AAV-CaMKII-eGFP as a control (Sham) throughout the dataset; however, in the trace fear conditioning experiments, there are no AAV-CaMKII-ChR2-eYFP or AAV-CaMKII-eNpHR3.0-eYFP controls without optogenetic stimulation. Therefore, it is unclear the extent to which viral expression of optogenetic actuators impacts behavior. Additionally, the authors only provided optogenetic stimulation during contextual fear recall and tone fear recall. Additional experiments disrupting each pathway during trace conditioning would have provided additional insight into the role of each pathway in the initial encoding of fear memories.

      Thank you for your observation. We have used a SHAM control that was injected with the AAV vector without any opsins. In fear conditioning experiments we performed optogenetic manipulations only during the fear response either with context or cue recall. This aligned well with our hypothesis to test whether the intrahippocampal projections play any role in fear response modulation. Investigating the role of each pathway during acquisition of trace and/or contextual fear conditioning is also highly relevant; however, evaluating these projections in fear memory formation was beyond the scope of this study. The observation that we can bidirectionally modulate fear responses with light is consistent with (although it does not prove) a light-specific modulation. In any case, even if there were baseline effects without light, they would still be suggestive of the effects observed being mediated by the optogenetic actuators.

      (3) The location and extent of viral expression across animals were not systematically quantified.

      Overall, however, these weaknesses do not significantly detract from the main conclusions of the paper. The authors' data convincingly demonstrate that disruption of the trisynaptic circuit bidirectionally modulates both fear- and anxiety-like behaviors while disruption of the temporoammonic pathway has no effect on anxiety-like behaviors but disrupts fear-related behaviors. It is interesting to note, however, that the TA activation had no effect on tone-related fear conditioning, suggesting a potential specialized role of the temporoammonic pathway specifically in contextual fear memory.

      Thank you for your thoughtful description of the present study. It is true that the TA pathway is distinct from the vCA3 to vCA1 pathway in various ways. One is that the two projections form synapses at different locations or layers on vCA1 neurons: the TA pathway synapses in the stratum lacunosum-moleculare (LMol) layer, while the vCA3 to vCA1 pathway synapses in the stratum radiatum (Rad), close to the CA1 pyramidal cell layer, which is in line with differential functions of the two projections. They modulate pyramidal cell activity in different ways, with TA pathway synapses being distinct from vCA3 to vCA1 synapses on the pyramidal cell layer, which may result in different computational properties of the two projections. Additionally, TA projections are modulated by dopamine while projections from vCA3 are not, but the projections from vCA3 receive inputs from various sources, including collaterals and the entorhinal cortex via the dentate gyrus. These distinct features of the two projections may contribute to differential modulation of vCA1 activity. We note that cue-related fear is not affected by TA activation; however, even in this case, TA pathway activation by channelrhodopsin or inhibition by halorhodopsin results in a decrease or an increase of the contextual fear response, respectively.

  2. Feb 2026
    1. eLife Assessment

      This study provides valuable insights into the regulation of myogenic differentiation by identifying Leiomodin 1 as a modulator of proteome dynamics during myogenic differentiation. The combination of quantitative proteomics with functional perturbation experiments offers solid evidence supporting the idea that SIRT1 influences perturbations of myogenic differentiation upon LMOD1 inactivation. These findings advance our understanding of muscle differentiation and will be of interest to researchers studying muscle development and related pathologies.

    2. Reviewer #1 (Public review):

      The main significance of this work is characterizing the function of a new gene Lmod1 in muscle stem cell biology. The study suggests an intriguing regulatory mechanism by which Sirt1 sequesters Lmod1 in a specific temporal window during myogenesis.

      Comments on revisions:

      The authors have satisfactorily addressed my inquiries. Thank you.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors identify Leiomodin-1 (LMOD1) as a key regulator of early myogenic differentiation, demonstrating its interaction with SIRT1 to influence SIRT1's cellular localization and gene expression. The authors propose that LMOD1 translocates SIRT1 from the nucleus to the cytoplasm to permit the expression of myogenic differentiating genes such as MYOD or Myogenin.

      Strengths:

      A major strength of this work lies in the robust temporal resolution achieved through a time-course mass spectrometry analysis of in vitro muscle differentiation. This provides novel insights into the dynamic process of myogenic differentiation, often under-explored in terms of temporal progression. The authors provide a strong mechanistic case for how LMOD1 exerts its role in muscle differentiation, which opens avenues to modulate it.

      Weaknesses:

      In the revised manuscript, the authors begin to translate their in vitro findings to an in vivo context by examining SIRT1 expression across a regeneration time course (Fig. 4I). They observe an increase in SIRT1 expression concomitant with LMOD1, supporting a potential role for SIRT1 in myogenic differentiation. Future studies will be required to provide deeper mechanistic insight into SIRT1 function in vivo.

      Discussion:

      Overall, the study emphasizes the importance of understanding the temporal dynamics of molecular players during myogenic differentiation and provides valuable proteomic data that will benefit the field. Future studies should explore whether LMOD1 modulates the nuclear-cytoplasmic shuttling of other transcription factors during muscle development and how these processes are mechanistically achieved. Investigating whether LMOD1 can be therapeutically targeted to enhance muscle regeneration in contexts such as exercise, aging, and disease will be critical for translational applications. Additionally, elucidating the interplay among LMOD1, LMOD2, and LMOD3 could uncover broader implications for actin cytoskeletal regulation in muscle biology. The authors have nicely updated their discussion.

    4. Reviewer #3 (Public review):

      Summary:

      In this manuscript, the investigators identified LMOD1 as one of a subset of cytoskeletal proteins whose levels increase in early stages of myogenic differentiation. Lmod1 is understudied in striated muscle and in particular in myogenic differentiation. Thus, this is an important study. It is also a very thorough study, with perhaps even too much data presented. Importantly, the investigators observed that LMOD1 appears to be important for skeletal muscle regeneration and myogenic differentiation, and that it interacts with SIRT1. Both primary myoblast differentiation and skeletal muscle regeneration were studied. Rescue experiments confirmed these observations: SIRT1 can rescue perturbations of myogenic differentiation as a result of LMOD1 knockdown.

      Strengths:

      Particular strengths include: an important topic, the use of primary skeletal muscle cultures, the use of both cell culture and in vivo approaches, careful biomarker analysis of primary mouse myoblast differentiation, the use of two methods (depletion approaches and inhibitors) to probe the function of the Lmod1/SIRT1 pathway, and the generation of six independent myoblast cultures. Results support their conclusions.

      Weaknesses:

      (1) Figure 1. Images of cells in Figure 1A are too small to be meaningful (especially in comparison to the other data presented in this figure). Perhaps make graphs smaller?

      (2) Line 148 "We found LMOD2 to be the most abundant Lmod in whole skeletal muscle." This is confusing since most, if not all, prior studies have shown that Lmod3 is the predominant isoform in skeletal muscle. The two papers that are cited are incorrectly cited. Clarification to resolve this discrepancy is needed.

      (3) Figure 2. Immunofluorescence (IF) panels are too small to be meaningful. Perhaps the graphs could be made smaller and more space allocated for the IF panels? This issue is apparent for just about all IF panels - they are simply too small to be meaningful. Additionally, in many of the immunofluorescence figures, the colors that were used make it difficult to discern the stained cellular structures. For example, in Figure S1, orange and purple are used - they do not stand out as well as other colors that are more commonly used.

      (4) There is huge variability in many experiments presented; as such, more samples appear to be required to allow for meaningful data to be obtained. For example, in Figure S2, many experimental groups only have 3 samples. This is highly problematic; I would estimate that 5-6 would be the minimum.

      (5) Ponceau S staining is often used as a loading control in this manuscript for western blots. The area/molecular weight range actually used should be specified. Not clear why in some experiments GAPDH staining is used, in other experiments Ponceau S staining is used, and in some, both are used. In some experiments the variability of total protein loaded from lane-to-lane is disconcerting. For example, in Figure S4C there appears to be more than normal variability. Can the protein assay be redone and the samples run again?

      (6) Figure S3 - Lmod3 is included in the figure but no mention of it occurs in the title of the figure and/or legend.

      (7) Abstract, line 25. "overexpression accelerates and improves the formation of myotubes". This is a confusing sentence. How is it improving the formation? A little more information about how they are different from developing myotubes in normal/healthy muscle would be helpful.

      (8) Impossible from IF figures presented to determine where Lmod1 localizes in the myocytes. Information on its subcellular localization is important. Does it localize with Lmod2 and Lmod3 at thin filament pointed ends?

      Comments on revisions:

      Many comments have been adequately addressed. However, some concerns remain.

      Former Concern #2. The issue with the lack of detection of LMOD3 in their muscle samples is troublesome and has not been adequately resolved in the revised manuscript. It is a fact that most, if not all, studies on Lmod3 report that it is the most abundant isoform in skeletal muscle. This issue should be discussed in the manuscript. It is recognized that a different assay was utilized in this paper. The cited papers remain incorrectly cited. Specifically:

      Tsukada et al. report the abundance of LMOD2 in cardiac muscle, not in skeletal muscle.

      Nworu et al., 2015 report on LMOD3 in skeletal muscle.

      Kiss et al., 2020. While this paper reveals an important function for Lmod2 in thin filament length regulation, it clearly shows many examples of high expression of Lmod3 in various skeletal muscles isolated from mice.

      Former Concern #4. With respect to small sample numbers: hopefully a statistical editor is available to comment. While this reviewer is happy that other assays were used to verify their data, the problem still remains that many experimental groups only have 3 samples (with high variability).

      Former Concern #3. Many immunofluorescence panels are hard to evaluate because of their small size.

    5. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This valuable study offers insights into the role of Leiomodin-1 (LMOD1) in muscle stem cell biology, advancing our understanding of myogenic differentiation and indicating LMOD1 as a regulator of muscle regeneration, aging, and exercise adaptation. The integration of in vitro and in vivo approaches, complemented by proteomic and imaging methodologies, is solid. However, certain aspects require further attention to improve the clarity, impact, and overall significance of the work, particularly in substantiating the in vivo relevance. This work will provide a starting point that will be of value to medical biologists and biochemists working on LMOD and its variants in muscle biology.

      Thank you for the positive feedback on our manuscript and the constructive criticism provided by the reviewers that helped us improve our manuscript.

      Public Reviews:

      Reviewer #1 (Public review):

      This manuscript by Ori and colleagues investigates the role of Lmod1 in muscle stem cell activation and differentiation. The study begins with a time-course mass spectrometry analysis of primary muscle stem cells, identifying Lmod1 as a pro-myogenic candidate (Figure 1). While the initial approach is robust, the subsequent characterization lacks depth and clarity. Although the data suggest that Lmod1 promotes myogenesis, the underlying mechanisms remain vague, and key experiments are missing. Please find my comments below.

      We thank the reviewer for the positive feedback on our manuscript and the helpful comments, which helped improve it.

      (1) The authors mainly rely on coarse and less-established readouts such as myotube length and spherical Myh-positive cells. More comprehensive and standard analyses, such as co-staining for Pax7, MyoD, and Myogenin, would allow quantification of quiescent, activated, and differentiating stem cells in knockdown and overexpression experiments. The exact stage at which Lmod1 functions (stem cell, progenitor, or post-fusion) is unclear due to the limited depth of the analysis. Performing similar experiments on cultured single EDL fibers would add valuable insights.

      We thank the reviewer for this comment. In addition to performing standard measurements such as staining for Myogenin and Myosin Heavy Chain (Figure S2H), we focused on morphological readouts, such as myotube formation, because LMOD1 is an actin cytoskeleton-associated protein. Therefore, we reasoned its function would be most directly reflected in structural changes during differentiation, rather than solely in early transcriptional markers. 

      Regarding the use of standard markers, we have already performed co-staining for Myogenin and Myosin Heavy Chain (MHC), which effectively quantifies early myogenically committed (Myogenin+/MHC-) and terminally differentiating (Myogenin+/MHC+) cells (Figure S2H). We did not include Pax7 as our primary culture system consists of already activated myoblasts, where Pax7 is not a reliable marker of quiescence. Our data also suggest that Lmod1 is important in regulating differentiation, with comparatively mild effects on proliferation (Figure S2D-E); therefore, we focused on this stage of myogenesis.

      Our focus on differentiation over activation is further supported by multiple lines of evidence. First, analysis of publicly available transcriptome datasets reveals that Lmod1 mRNA levels actually decrease upon Muscle Stem Cell (MuSC) activation, suggesting its primary role is not during this initial phase. We added this data for clarification to Figure S1B. This aligns perfectly with our in vivo data from cardiotoxin-induced muscle regeneration, where abundance of LMOD1 protein peaks at days 4-7 post-injury — a time point coinciding with new myofiber formation and maturation — rather than during the initial activation and proliferation phase (days 1-3) (Figure 4I).

      Given this strong evidence pointing to a primary role for LMOD1 during the later stages of differentiation, we believe our current analyses are the most relevant. While single EDL fiber cultures are valuable for studying the quiescence-to-activation transition, they would not provide significant additional insight into the specific differentiation-centric mechanism we are investigating here. We are confident that our chosen readouts appropriately address Lmod1's function in the differentiation of myoblasts and formation of myotubes.

      (2) In supplementary Figure 2E, the distinction between Hoechst-positive cells and total cell counts is unclear. The authors should clarify why Hoechst-positive cells increase and relabel "reserve cells," as the term is confusing without reading the legend.

      We thank the reviewer for pointing out the confusion regarding the naming of the cell populations and the increase in Hoechst-positive cells. We have now revised the terminology used in Figure S2E to improve clarity. Specifically, we have relabeled "reserve cells" as "non-proliferating myoblasts (Ki67-/Hoechst+)" to describe these cells more accurately without requiring the legend for interpretation. Regarding the increase in Hoechst-positive cells, we observed a slight (26%) but significant decrease in the number of proliferating myoblasts (Ki67+/Hoechst+) (Figures S2D and S2E). The relative increase in non-proliferating (Ki67-/Hoechst+) cells is a consequence of the significant reduction in the number of proliferating (Ki67+/Hoechst+) cells. Importantly, the total cell count (the sum of Ki67-/Hoechst+ and Ki67+/Hoechst+ cells) remained stable. This has been clarified in the revised figure legend and main text as follows:

      “This was accompanied by a proportional increase in non-proliferating myoblasts (Ki67-/Hoechst+), while the total Hoechst-positive cell count (Ki67+/Hoechst+ and Ki67-/Hoechst+) remained unchanged (Figure S2E).”

      (3) The specificity of Lmod1 and Sirt1 immunostaining needs validation using siRNA-treated samples, especially as these data form the basis of the mechanistic conclusions.

      We have validated the specificity of the LMOD1 antibody using multiple approaches. Specifically, we performed immunofluorescence and immunoblotting on Lmod1 siRNA-transfected samples, where we observed a significant reduction in the Lmod1 protein signal compared to control conditions (see manuscript data from Figure S2G).

      Additionally, LMOD1 overexpression experiments demonstrated a corresponding increase in the signal for LMOD1 using immunofluorescence analyses, confirming the specificity of the antibody for detecting LMOD1.

      For the reviewers’ interest, we add Author response image 1:

      Author response image 1.

      Specificity of antibodies detecting LMOD1. Representative immunofluorescence images of LMOD1 in primary myoblast cultures following siLmod1 knockdown, LMOD1 overexpression, or controls transfected with a non-targeting siRNA (siCtrl) after one day of differentiation. LMOD1 (purple), SIRT1 (yellow), and nuclei (Hoechst, blue). Scale bar: 10 µm.

      For the SIRT1 antibody used in our immunostaining, the specificity was validated by transfecting primary myoblasts with siRNA targeting Sirt1 and performing immunoblot analyses (Figure S5A). These showed a significant reduction in SIRT1 protein levels, confirming both the effectiveness of the siRNA and, critically, the antibody's ability to specifically recognize and detect SIRT1 protein. Furthermore, the same SIRT1 antibody was utilized in our nuclear-cytoplasmic fractionation experiments (Figure S4C), and its ability to detect SIRT1 in the expected subcellular compartments further supports its specific binding to SIRT1. While direct immunofluorescence on Sirt1 siRNA-transfected samples was not performed, the robust demonstration of the antibody's specificity for SIRT1 protein via immunoblotting (i.e., correct molecular weight band, significantly reduced by Sirt1 siRNA) and its distribution in subcellular fractions, which is fully consistent with the localization immunostaining performed at the same time points (compare Figure S4C and 5A), provide strong evidence of the antibody’s specificity, also in immunofluorescence experiments.

      (4) The authors must test the effect of Lmod1 siRNA on Sirt1 localization, as only overexpression experiments are shown.

      We carefully considered performing this experiment. However, the knockdown of Lmod1 significantly impairs myogenic differentiation, a crucial cellular process that itself can influence protein localization. Consequently, if SIRT1 localization were altered following knockdown of Lmod1, it would be challenging to disentangle whether this was a direct result of LMOD1 absence impacting SIRT1 trafficking or an indirect consequence of the cells failing to differentiate properly. This would make it difficult to draw clear conclusions regarding a direct causal link between LMOD1 and SIRT1 localization from such an experiment. Therefore, we focused on overexpression experiments, where we could demonstrate that altering LMOD1 levels is sufficient to affect SIRT1 localization. Our nuclear-cytoplasmic fractionation experiments clearly show that LMOD1 overexpression leads to changes in SIRT1 distribution (Figure 5H-K). These findings provide evidence that LMOD1 can directly modulate SIRT1 localization, supporting our mechanistic conclusions.

      (5) In Figure S3, the biotin signal in LMOD2 samples appears weak. The authors need to address whether comparing LMOD1 and LMOD2 is valid given the apparent difference in reaction efficiency. It would also help to highlight where Sirt1 falls on the volcano plot in S3B.

      We agree that the overall biotin signal on the streptavidin blot for the LMOD2-BirA* sample appears weaker than for LMOD1-BirA*. To provide a more direct comparison of the bait proteins themselves, we have now added a bar graph to the revised Figure S3D, which quantifies the relative abundance of LMOD1 and LMOD2 bait proteins in the pull-down experiments. This analysis shows that the levels of LMOD1-BirA* and LMOD2-BirA* were comparable in our BioID samples. Furthermore, the validity of the LMOD2 BioID experiment is strongly supported by the identification of several known LMOD1 and LMOD2 interaction partners. As shown in the dataset, well-established interactors such as TMOD1, TPM3, and TMOD3 were identified, with some even showing stronger enrichment with LMOD2 than with LMOD1. This confirms that the biotinylation reaction was efficient enough to capture proximal proteins for both baits.

      Regarding SIRT1, we have now highlighted its position in yellow on the volcano plot in the revised Figure S3E. As can be seen, SIRT1 was identified in the LMOD1-BirA* sample and showed enrichment. We believe these clarifications, along with the additional expression data and the successful identification of known interactors, confirm the validity of our comparative BioID analysis.

      (6) The immunostaining data suggest that Lmod1 remains cytoplasmic throughout differentiation, whereas Sirt1 shows transient cytoplasmic localization at day 1 of differentiation. The authors should explain why Sirt1 is not constantly sequestered if Lmod1's cytoplasmic localization is consistent. It is also unclear whether day 1 is the key time point for Lmod1 function, as its precise role during myogenesis remains ambiguous.

      We thank the reviewer for this comment. We have no data explaining why SIRT1 is not constantly sequestered while LMOD1 remains consistently cytoplasmic. We can only speculate that the transient cytoplasmic localization of SIRT1 may be linked to the availability and functional role of LMOD1 throughout the differentiation process. While LMOD1 is present at low levels in proliferating primary myoblasts, its expression increases upon the initiation of differentiation (Figure 2A). Initially, during the early stages of differentiation, LMOD1 may not be required for actin nucleation as the major remodeling of the cytoskeleton has not yet begun. During this phase, LMOD1 might have the capacity to sequester SIRT1 in the cytoplasm.

      However, as differentiation progresses and morphological changes take place, LMOD1 may switch its functional role to actin nucleation, thereby releasing SIRT1. This transition could explain why SIRT1 is free to localize transiently to the cytoplasm, particularly at day 1, when cytoskeletal remodeling is beginning but not yet fully established.

      Additionally, as LMOD1 and SIRT1 are known to colocalize in the nucleus, they may exit the nucleus together. Once in the cytoplasm, LMOD1 may become engaged in actin nucleation, allowing SIRT1 to function independently, which could explain the transient nature of SIRT1’s cytoplasmic localization.

      We have acknowledged this gap in our understanding in the discussion of the revised manuscript:

      “Our immunostaining data show that while LMOD1 is consistently cytoplasmic, its partner SIRT1 is only transiently localized in the cytoplasm. This suggests that their interaction is dynamically regulated. We hypothesize that the function of LMOD1 is determined by the changing availability of its binding partners during differentiation. During the initial phase, LMOD1 may primarily function to sequester SIRT1, a key regulator of myogenic genes. As differentiation proceeds, the increased expression of cytoskeletal components, such as its canonical partners TMODs and TPMs, likely shifts the function of LMOD1 towards its role in actin nucleation. This molecular switch, potentially driven by a change in the interactome of LMOD1, could then result in the release of SIRT1 from the cytoplasm. Such a mechanism may coordinate transcriptional regulation with cytoskeletal remodeling during myoblast differentiation.”

      (7) The introduction does not sufficiently establish the motivation or knowledge gap this work aims to address. Instead, it reads like a narration of disparate topics in a single paragraph. The authors should clarify the statement in line 150, "since this protein has been...,".

      We thank the reviewer for requesting clarification regarding our focus on LMOD1 (Introduction and Line 150 in the original submission). In the revised manuscript, we shortened the introduction and more clearly emphasized the motivation of our study:

      “Although these mechanisms contribute to remodeling the cellular architecture of MuSCs, a comprehensive understanding of the temporal dynamics of proteome remodeling during differentiation remains lacking. To address this knowledge gap, we performed an unbiased proteomic analysis of the early stages of myogenic differentiation to identify previously unrecognized proteins involved in this process and to examine how they functionally interact with established regulatory pathways.”

      Our decision to focus on LMOD1 was driven by its significant upregulation in our temporal proteome dataset, together with its previously uncharacterized role in primary myoblasts. Furthermore, to strengthen the interpretation of LMOD1’s role, particularly in the context of aging, we have integrated a new analysis of published transcriptomic datasets. This can be found in the main text as follows:

      “Surprisingly, we detected LMOD1 in freshly isolated muscle stem cells (MuSCs), but not LMOD2. Additionally, we observed that the protein levels of LMOD1 increased in MuSCs isolated from older mice (Figure 2C and Figure S1B). We further analyzed published transcriptomic data sets that describe changes between young and old MuSCs in both quiescent and activated states in young and old animals (Liu et al. 2013; Lukjanenko et al. 2016). In these analyzed transcriptomic data sets, Lmod1 was found to be significantly downregulated during the activation of MuSCs in both young and old mice (see Figure S1B).

      To assess the in vivo relevance of our finding, we queried two proteomic datasets of freshly isolated MuSCs and four different skeletal muscles (gastrocnemius, G; soleus, S; tibialis anterior, TA; extensor digitorum longus, EDL) (Schüler et al. 2021). We found LMOD2 to be the most abundant leiomodin protein in whole skeletal muscle, consistent with data from (Tsukada et al. 2010; Nworu et al. 2015; Kiss et al. 2020), while the overall abundance of LMOD1 was lower since this protein has been mainly associated with smooth muscle cells (Nanda and Miano 2012; Conley et al. 2001; Nanda et al. 2018) (Figure 2B).”

      Overall, while the identification of Lmod1 as a pro-myogenic factor is convincing, the mechanistic insights are insufficient, and the manuscript would benefit from addressing these concerns.

      We thank the reviewer for their constructive criticism. In the revised manuscript, we have strengthened our mechanistic insights and the validation of our findings by implementing the suggestions of the reviewers and including new experimental data to address their concerns.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors identify Leiomodin-1 (LMOD1) as a key regulator of early myogenic differentiation, demonstrating its interaction with SIRT1 to influence SIRT1's cellular localization and gene expression. The authors propose that LMOD1 translocates SIRT1 from the nucleus to the cytoplasm to permit the expression of myogenic differentiation genes such as MYOD or Myogenin.

      Strengths:

      A major strength of this work lies in the robust temporal resolution achieved through a time-course mass spectrometry analysis of in vitro muscle differentiation. This provides novel insights into the dynamic process of myogenic differentiation, often under-explored in terms of temporal progression. The authors provide a strong mechanistic case for how LMOD1 exerts its role in muscle differentiation, which opens avenues to modulate it.

      We thank the reviewer for the positive feedback on our manuscript and the insightful comments which helped to improve the manuscript!

      Weaknesses:

      One limitation of the study is the in vivo data. Although the authors do translate their findings in vivo for LMOD1 localization and expression, the cross-sectional imaging is not highly convincing. Longitudinal cuts or isolated fibers could have been more useful specimens to answer these questions. Moreover, the authors do not assess their in vitro SIRT1 findings in vivo. A few key experiments in regenerating or aged mice would strengthen the mechanistic insight of the findings.

      We agree that longitudinal cuts and isolated fibers can provide excellent morphological detail for specific questions. However, for our primary objective in this study, which was to assess the temporal expression and localization of LMOD1 across the tissue during the regeneration process, we decided that cross-sectional analysis provided the most robust and reliable overview. Cross-sectional imaging effectively captures the spatial distribution of LMOD1 across multiple myofibers and their surrounding microenvironment, simultaneously assessing the whole cross-sectional area. By using this approach, we were able to evaluate the broader tissue architecture and cellular context, which was essential for understanding the dynamic changes occurring during regeneration. We were also able to investigate all myofibers of a muscle, and not only a small proportion, which we would analyze with longitudinal sections and isolated myofibers. Therefore, we continued using cross-sections for further analyses.

      We fully agree with the reviewer that validating our in vitro SIRT1 findings in an in vivo context is an essential next step. To address this, we performed additional analyses on our existing regenerating muscle samples and incorporated new immunostainings for SIRT1 and PAX7 into the regeneration time-course (now shown in revised Figure 4I), providing further in vivo support for our proposed mechanism. We focused specifically on cross-sections collected at day 5 post-injury, a time point selected based on the peak in LMOD1 expression, to assess whether SIRT1 levels increase in parallel with LMOD1 during regeneration. Notably, SIRT1 abundance is elevated at day 5 post-injury, underscoring its involvement in early myogenic differentiation. This conclusion is further supported by the localization of SIRT1 within mononucleated cells and newly formed myofibers at this stage of regeneration.

      Finally, we agree that further mechanistic studies in vivo would be highly valuable. While we were able to address SIRT1 dynamics in our regeneration model as suggested, an aged mouse cohort was unfortunately not available to us for this kind of study. Furthermore, more extensive in vivo experiments, such as those involving genetic manipulation, were beyond the scope of the current study, partly due to constraints related to animal welfare regulations and our approved experimental protocols.

      Discussion:

      Overall, the study emphasizes the importance of understanding the temporal dynamics of molecular players during myogenic differentiation and provides valuable proteomic data that will benefit the field. Future studies should explore whether LMOD1 modulates the nuclear-cytoplasmic shuttling of other transcription factors during muscle development and how these processes are mechanistically achieved. Investigating whether LMOD1 can be therapeutically targeted to enhance muscle regeneration in contexts such as exercise, aging, and disease will be critical for translational applications. Additionally, elucidating the interplay among LMOD1, LMOD2, and LMOD3 could uncover broader implications for actin cytoskeletal regulation in muscle biology.

      We thank the reviewer for this excellent suggestion for future analyses. We have included these important considerations and future avenues in the Discussion of the revised manuscript:

      “Our immunostaining data show that while LMOD1 is consistently cytoplasmic, its partner SIRT1 is only transiently localized in the cytoplasm. This suggests that their interaction is dynamically regulated. We hypothesize that the function of LMOD1 is determined by the changing availability of its binding partners during differentiation. During the initial phase, LMOD1 may primarily function to sequester SIRT1, a key regulator of myogenic genes. As differentiation proceeds, the increased expression of cytoskeletal components, such as its canonical partners TMODs and TPMs, likely shifts the function of LMOD1 towards its role in actin nucleation. This molecular switch, potentially driven by a change in the interactome of LMOD1, could then result in the release of SIRT1 from the cytoplasm. Such a mechanism may coordinate transcriptional regulation with cytoskeletal remodeling during myoblast differentiation.”

      “Moreover, delineating the functional specialization and potential redundancy among leiomodin proteins represents an important next step. Our data indicate that LMOD1 primarily regulates early myogenic differentiation (Figure 3). In contrast, the lack of an early functional phenotype upon LMOD2 depletion, together with its upregulation at later stages (Figure S2A), suggests a temporal shift in regulatory control. Accordingly, a systematic comparative analysis of LMOD1, LMOD2, and LMOD3 will be required to elucidate their distinct roles in actin cytoskeleton regulation across the myogenic program, particularly with respect to myofibril maturation and maintenance.”

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      Major Changes:

      (1) In Vivo Data on SIRT1:

      The inclusion of in vivo data on SIRT1 localization and expression would significantly strengthen the manuscript. Similar staining techniques used for LMOD1 could be applied to SIRT1. Additionally, imaging muscle specimens such as longitudinal sections or isolated myofibers would provide clearer insights into SIRT1's spatial distribution and improve upon the less convincing cross-sectional images currently presented (Figure 2).

      We fully agree that providing in vivo data on SIRT1 localization and expression is a crucial step to support our in vitro findings. Following the reviewer's suggestion, we have performed new experiments on muscle regeneration samples using cross-sections, as was done for the analysis of LMOD1 localization. Specifically, we performed immunostaining for SIRT1 on cross-sections from muscle samples collected at day 5 post-injury, a time point selected based on the observed peak in LMOD1 expression. These new data (now included in revised Figure 4I) allowed us to assess whether SIRT1 levels increase during regeneration in parallel with an increase in LMOD1 abundance.

      Regarding the suggestion to use longitudinal sections or isolated myofibers, we agree that these preparations are well suited to certain questions. For the primary goal of our study, to assess the temporal expression changes across the entire regenerating tissue at different time points, we found that cross-sections provided the most comprehensive and robust overview and therefore did not use longitudinal sections or isolated myofibers.

      Performing additional animal experiments to obtain these specific preparations was beyond the scope of the current study and subject to constraints from our approved animal welfare protocols.

      (2) Morphology of siLmod1 Cells:

      The morphology of siLmod1-treated cells in vitro (Figure 3) raises concerns. Assessing cell viability or cell death in these experiments would help ensure that differences are not due to dead or unhealthy cells being quantified. There is also a notable discrepancy between the control panels in Figures 3C and 3H compared to the experimental conditions in 3F and 3K, particularly in terms of cell length and morphology. These inconsistencies should be addressed or clarified.

      We acknowledge the visual discrepancies in cell morphology noted by the reviewer (e.g., between Figures 3C/3H and 3F/3K). These differences can be attributed to biological variability between primary myoblast cultures isolated from different mice. Such variability includes differences in myogenic potential and the fact that cells are not synchronized, leading to variations in differentiation efficiency, baseline morphology, and cell length across cultures (Cornelison 2008; Vaughan and Lamia 2019). To account for this, we decided to use n=6 biological replicates, i.e., primary myoblast cultures isolated from 6 different mice, for immunofluorescence analysis, ensuring robust quantitative data. Furthermore, we confirmed that this phenotype was not an artifact of culture conditions, as we consistently observed the same effect of Lmod1 knockdown independently of the passage number of the myoblasts or the donor mouse.

      To address the concern that morphological changes in siLmod1-treated cells might reflect cell death, we performed a TUNEL assay (transfection at day 1, analysis at day 3 of differentiation). This revealed no significant increase in TUNEL-positive (apoptotic) cells in siLmod1- (or siSirt1-) transfected samples versus siCtrl-transfected cells. These new data have been added to the revised manuscript as Supplementary Figure S2I. The TUNEL data indicate that the observed morphological changes upon knockdown of Lmod1 are not due to induced cell death. Supported by these results, our interpretation is that knockdown of Lmod1 impairs or arrests differentiation rather than causing cell death. Furthermore, our quantification of different cell populations showed shifts indicative of impaired differentiation (e.g., accumulation of cells at earlier stages) without a significant loss in cell numbers. For example, the numbers of myogenin+/MHC- and myogenin+/MHC+ cells, and of differentiated myotubes, were not significantly reduced after transfection with siLmod1. We noted a slight, non-significant trend towards fewer non-proliferating myoblasts/reserve cells, characterized as Myogenin-/MHC-/Hoechst+ (Figure S2H). Overall, cells appeared to be 'stuck' in differentiation, consistent with Lmod1 knockdown impairing differentiation without causing cell death. We have further clarified this aspect in the revised manuscript.

      (3) LMOD1 and SIRT1 Interaction in Myogenic Cells:

      Strengthening the connection between LMOD1 and SIRT1 within the myogenic system would enhance the manuscript. Could proximity ligation assays (PLA) be performed in myogenic cells, as was done in HEK293T cells? Additionally, investigating whether SIRT1 remains in the nucleus upon LMOD1 knockdown using siRNA would provide mechanistic insight into their interaction during myogenic differentiation.

      We would like to clarify that the Proximity Ligation Assays (PLA) shown in Figure 4H were indeed performed in primary myoblasts, confirming the LMOD1-SIRT1 interaction directly in a myogenic context. We have modified the text to clarify that primary myoblasts were used for the PLA assays.

      Minor Points:

      (1) Was Lmod1 knockdown confirmed in vivo?

      To target Lmod1 in Muscle Stem Cells (MuSCs) in vivo, we utilized self-delivering Accell siRNAs. This delivery system has been previously validated and shown to be highly effective for targeting MuSCs in regenerating muscle (Bentzinger et al., Cell Stem Cell, 2013).

      While this is an established method for delivery, confirming knockdown specifically within the rare MuSC population is technically challenging using bulk tissue analysis, as the target signal is diluted by numerous other cell types. 

      Therefore, to ensure the efficacy of our specific siRNA, we performed in vitro validation. For the reviewers' interest, we add Author response image 2 showing the efficiency of the respective siRNAs:

      Author response image 2.

      Knockdown efficiency of siRNAs targeting Lmod1 and Lmod2 in proliferating primary myoblasts, using the same self-delivering siRNAs as in the in vivo experiments. Self-delivering Accell siRNA was added to primary myoblasts cultured in low serum media for 48 hours. Relative mRNA expression levels of Lmod1 and Lmod2 were measured after self-delivering Accell siRNA transfection targeting either Lmod1 (siLmod1) or Lmod2 (siLmod2). Expression levels were compared to control siRNA-transfected cells (siCtrl) and normalized to Gapdh expression.
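
      As an illustration of the normalization described in this legend, the following is a minimal sketch of a relative expression calculation. It assumes the common 2^-ΔΔCt method with roughly 100% amplification efficiency; the response does not state which quantification method was used, and the function name and Ct values below are hypothetical.

```python
def relative_expression(ct_target_sample: float,
                        ct_ref_sample: float,
                        ct_target_control: float,
                        ct_ref_control: float) -> float:
    """Fold change of a target gene versus control, normalized to a reference gene.

    Assumes the 2^-ddCt method (an assumption, not taken from the manuscript).
    """
    # Normalize the target to the reference gene (e.g. Gapdh) within each condition.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # Compare the knockdown condition to the control condition (e.g. siCtrl).
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: target in siLmod1 cells, Gapdh in siLmod1 cells,
# target in siCtrl cells, Gapdh in siCtrl cells.
fold_change = relative_expression(24.0, 18.0, 22.0, 18.0)
knockdown_percent = (1.0 - fold_change) * 100.0
```

      With values like these, a fold change below 1 indicates reduced transcript abundance relative to siCtrl, which is how a knockdown efficiency would typically be read off.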

      Based on the documented efficacy of this delivery system from prior literature and our own validation of the specific siRNAs used here, we are confident in the knockdown efficiency of the respective siRNAs. We decided not to perform additional animal experiments due to animal welfare considerations.

      (2) Some of the western blot bands do not appear to match the expected patterns for the tested proteins compared to controls (e.g., Figure S2J, S4C). Ensure that these are accurately labeled and include the entire membrane for transparency and reproducibility.

      Regarding Figure S2J, we agree that the presentation could be confusing to the reader. The blot shows LMOD1 and LMOD2 knockdown, while the bar plot quantifies only the change in LMOD2 levels. We have now revised the figure legend to explicitly state this. We hope this makes the presentation of our data clearer.

      For Figure S4C, we believe the concern about 'patterns' relates to loading variability. In this experiment, we manually counted the nuclei before lysis to ensure that each nuclear fraction started with an equal amount of material. We then loaded the cytoplasmic fractions in proportion to these counts. The purity of the fractions was additionally confirmed using nuclear (H4) and cytoplasmic (ALDOA) markers. As stated in the figure, the nuclear/cytoplasmic ratio of LMOD1 or SIRT1 was normalized across the entire lane of the Ponceau S staining, which we have now clarified in the relevant figure legends.

      Finally, regarding transparency, the presented immunoblot images are representative crops, which is standard practice for clarity. We are committed to reproducibility and will provide full, uncropped scans of all blots in the final version of the manuscript, in line with eLife publishing guidelines. 

      (3) Figure S1B appears to reuse images from Figure 2D (rotated). Verify that this is acceptable for the journal's guidelines, and if necessary, provide additional justification or clarification.

      We acknowledge that the image presented in Figure S1B was accidentally reused as a representative example in Figure 2D. To address this and prevent any potential redundancy or confusion, we have revised Figure S1B by replacing the duplicated image with a different, representative example from our dataset. The updated figure now contains unique image data, and we believe this revision fully resolves the concern.

      (4) Ensure consistent scale bars across images, particularly in Figures 3C and 3H, where discrepancies might affect interpretation.

      We thank the reviewer for pointing this out. We have now standardized all scale bars throughout the manuscript to ensure consistency. All immunofluorescence images of cultured cells (including Figures 3C and 3H) now have a 50 µm scale bar, and all tissue cross-sections have a 100 µm scale bar. This change has been implemented in the revised figures.

      Reviewer #3 (Public review):

      Summary:

      In this manuscript, the investigators identified LMOD1 as one of a subset of cytoskeletal proteins whose levels increase in the early stages of myogenic differentiation. Lmod1 is understudied in striated muscle and in particular in myogenic differentiation. Thus, this is an important study. It is also a very thorough study, with perhaps even too much data presented. Importantly, the investigators observed that LMOD1 appears to be important for skeletal muscle regeneration and myogenic differentiation, and that it interacts with SIRT1. Both primary myoblast differentiation and skeletal muscle regeneration were studied. Rescue experiments confirmed these observations: SIRT1 can rescue perturbations of myogenic differentiation that result from LMOD1 knockdown.

      Strengths:

      Particular strengths include: important topic, the use of primary skeletal cultures, the use of both cell culture and in vivo approaches, careful biomarker analysis of primary mouse myoblast differentiation, the use of two methods to probe the function of the Lmod1/SIRT1 pathway via using depletion approaches and inhibitors, and generation of six independent myoblast cultures. Results support their conclusions.

      We thank the reviewer for the positive assessment of our work and the helpful comments for improving our manuscript.

      Weaknesses:

      (1) Figure 1. Images of cells in Figure 1A are too small to be meaningful (especially in comparison to the other data presented in this figure). Perhaps the authors could make graphs smaller?

      We have adjusted the size of the images across all figure panels to ensure better visibility and clarity. We hope these adjustments improve the presentation of the data.

      (2) Line 148 "We found LMOD2 to be the most abundant Lmod in the whole skeletal muscle." This is confusing since most, if not all, prior studies have shown that Lmod3 is the predominant isoform in skeletal muscle. The two papers that are cited are incorrectly cited. Clarification to resolve this discrepancy is needed.

      We acknowledge that LMOD2 and LMOD3 are predominantly expressed in skeletal and cardiac muscles (Tsukada et al. 2010; Nworu et al. 2015; www.proteinatlas.org) and that LMOD3’s transcription is directly regulated by MRTF/SRF and MEF2 to coordinate sarcomeric assembly (Cenik et al. 2015). However, our statement refers specifically to the analysis of the proteomic datasets from freshly isolated MuSCs and four distinct skeletal muscles (G, S, TA, EDL) generated by Schüler et al. 2021. Crucially, LMOD3 was not detected in the quantitative mass spectrometry data for the EDL, G, S, or TA muscle samples analyzed in this specific study. In the context of this particular dataset, LMOD2 was the most abundant Leiomodin isoform detected in the whole skeletal muscle samples. This finding suggests differential expression and function of LMOD isoforms depending on the muscle type and/or developmental/regenerative state. We have revised the manuscript to clarify this point and have corrected the initial citations.

      (3) Figure 2. Immunofluorescence (IF) panels are too small to be meaningful. Perhaps the graphs could be made smaller and more space allocated for the IF panels? This issue is apparent for just about all IF panels - they are simply too small to be meaningful. Additionally, in many of the immunofluorescence figures, the colors that were used make it difficult to discern the stained cellular structures. For example, in Figure S1, orange and purple are used - they do not stand out as well as other colors that are more commonly used.

      We agree that the IF panels were too small for optimal interpretation and have adjusted them in Figure 2 and throughout the manuscript. Regarding the color choices, we appreciate the reviewer's comments. Our initial selection (e.g., orange and purple in Figure S1) was intended to enhance accessibility for individuals with common color vision deficiencies, including red-green color blindness. However, we acknowledge the reviewer's point that these combinations provided insufficient contrast for discerning cellular structures. Therefore, we have revised the color schemes to use green, red, and blue, which should offer improved contrast.

      (4) There is huge variability in many experiments presented - as such, more samples appear to be required to allow for meaningful data to be obtained. For example, Figure S2. Many experimental groups, only have 3 samples - this is highly problematic - I would estimate that 5-6 would be the minimum.

      We thank the reviewer for the comment regarding experimental variability and sample size. In our study, n=3 biological replicates, i.e., independent primary cell cultures obtained from different mice, were primarily used for immunoblots. We acknowledge that variability can be observed between distinct primary cell cultures due to factors such as inherent differences in myogenic potential, cell cycle state (as cells were not synchronized), and passage number. Importantly, despite this inter-sample variation, the investigated phenotypes showed consistent trends across biological replicates. Rather than increasing the number of replicates for immunoblots, we opted for validating our key findings using independent approaches with a higher number of replicates. For instance, qRT-PCR analyses (to confirm knockdown efficiency) and immunofluorescence analyses were mostly performed using five to six independent myoblast cultures (biological replicates).

      (5) Ponceau S staining is often used as a loading control in this manuscript for western blots. The area/molecular weight range actually used should be specified. Not clear why in some experiments GAPDH staining is used, in other experiments Ponceau S staining is used, and in some, both are used. In some experiments, the variability of total protein loaded from lane to lane is disconcerting. For example, in Figure S4C there appears to be more than normal variability. Can the protein assay be redone and samples run again?

      We have clarified in the relevant figure legends that Ponceau S normalization, when used, was based on the quantification of the entire lane. Our standard loading control is GAPDH. We used Ponceau S for normalization only when GAPDH was deemed unsuitable, e.g., in nuclear-cytoplasmic fractionation experiments where GAPDH is not present in all fractions.

      Concerning the variability observed in Figure S4C, we manually counted the nuclei before lysis to ensure that each nuclear fraction started with an equal amount of material. We then loaded the cytoplasmic fractions in proportion to these counts. The purity of the fractions was additionally confirmed using nuclear (H4) and cytoplasmic (ALDOA) markers. The nuclear/cytoplasmic ratio of LMOD1 or SIRT1 was normalized across the entire lane of the Ponceau S staining, which we have now clarified in the relevant figure legends.

      (6) Figure S3 - Lmod3 is included in the figure but no mention of it occurs in the title of the figure and/or legend.

      We wish to clarify that the protein identified in Figure S3 is TMOD3 (Tropomodulin 3), not LMOD3. TMOD3 is a known pointed-end capping protein regulating the actin filament nucleation process together with LMODs (Fowler and Dominguez 2017; Boczkowska et al. 2015), so its presence in our dataset was expected and helps validate our results.

      (7) Abstract, line 25. "overexpression accelerates and improves the formation of myotubes". This is a confusing sentence. How is it improving the formation? A little more information about how they are different than developing myotubes in normal/healthy muscles would be helpful.

      We thank the reviewer for the comment. To clarify, we have revised the sentence in line 25 to "improves the initiation of myotube formation." This change reflects our observation that overexpression of LMOD1 leads to a more rapid onset of myotube formation, as evidenced by earlier expression of differentiation markers and accelerated fusion of myoblasts into myotubes compared to a GFP-overexpressing myoblast cell line. These findings suggest that LMOD1 overexpression enhances the efficiency of the early stages of differentiation and fusion, thereby contributing to improved initiation of myotube formation.

      (8) It is impossible from the IF figures presented to determine where Lmod1 localizes in the myocytes. Information on its subcellular localization is important. Does it localize with Lmod2 and Lmod3 at thin filament pointed ends?

      Several publications suggest that LMODs are involved in actin nucleation and interact with TMODs at the thin filament pointed ends (Boczkowska et al. 2015; Fowler and Dominguez 2017; Fowler, Greenfield, and Moyer 2003; Tsukada et al. 2010; Rao, Madasu, and Dominguez 2014). We performed F-actin (Phalloidin) staining together with LMOD1 staining and observed possible co-localization (see Author response image 3). Specifically, we noted an accumulation of LMOD1 at the ends of myocytes, indicating that LMOD1 might play a role in the elongation and guidance of myotube differentiation. For the reviewer’s interest, we include Author response image 3 as it was not part of the original manuscript. While performing subcellular localization stainings, we added the F-actin/Phalloidin staining to explore potential interactions, but this aspect was not further investigated in the current study.

      Author response image 3.

      Co-staining of LMOD1 and Phalloidin in differentiating myocytes. Example image showing immunofluorescence staining of LMOD1 (purple) and F-actin (green; Phalloidin) in differentiating primary myocytes. LMOD1 appears to accumulate at the ends of elongated myocytes and co-localizes with actin structures (highlighted in boxes), suggesting a potential role in myotube elongation and guidance during differentiation.

      Our study focused on a distinct role for LMOD1, independent of its function in actin filament nucleation, and we therefore did not pursue further co-localization staining with LMOD2 or LMOD3. We recognize the potential importance of exploring these interactions and their relevance to thin filament organization in skeletal muscle. Although this was beyond the scope of our current work, we will investigate this aspect in the future.

      References

      Boczkowska, Malgorzata, Grzegorz Rebowski, Elena Kremneva, Pekka Lappalainen, and Roberto Dominguez. 2015. “How Leiomodin and Tropomodulin Use a Common Fold for Different Actin Assembly Functions.” Nature Communications 6 (1): 8314.

      Cenik, Bercin K., Ankit Garg, John R. McAnally, John M. Shelton, James A. Richardson, Rhonda Bassel-Duby, Eric N. Olson, and Ning Liu. 2015. “Severe Myopathy in Mice Lacking the MEF2/SRF-Dependent Gene Leiomodin-3.” The Journal of Clinical Investigation 125 (4): 1569–78.

      Cornelison, D. D. W. 2008. “Context Matters: In Vivo and in Vitro Influences on Muscle Satellite Cell Activity.” Journal of Cellular Biochemistry 105 (3): 663–69.

      Fowler, Velia M., and Roberto Dominguez. 2017. “Tropomodulins and Leiomodins: Actin Pointed End Caps and Nucleators in Muscles.” Biophysical Journal 112 (9): 1742–60.

      Fowler, Velia M., Norma J. Greenfield, and Jeannette Moyer. 2003. “Tropomodulin Contains Two Actin Filament Pointed End-Capping Domains.” The Journal of Biological Chemistry 278 (41): 40000–9.

      Liu, Ling, Tom H. Cheung, Gregory W. Charville, Bernadette Marie Ceniza Hurgo, Tripp Leavitt, Johnathan Shih, Anne Brunet, and Thomas A. Rando. 2013. “Chromatin Modifications as Determinants of Muscle Stem Cell Quiescence and Chronological Aging.” Cell Reports 4 (1): 189–204.

      Lukjanenko, Laura, M. Juliane Jung, Nagabhooshan Hegde, Claire Perruisseau-Carrier, Eugenia Migliavacca, Michelle Rozo, Sonia Karaz, et al. 2016. “Loss of Fibronectin from the Aged Stem Cell Niche Affects the Regenerative Capacity of Skeletal Muscle in Mice.” Nature Medicine 22 (8): 897–905.

      Nworu, Chinedu U., Robert Kraft, Daniel C. Schnurr, Carol C. Gregorio, and Paul A. Krieg. 2015. “Leiomodin 3 and Tropomodulin 4 Have Overlapping Functions during Skeletal Myofibrillogenesis.” Journal of Cell Science 128 (2): 239–50.

      Rao, Jampani Nageswara, Yadaiah Madasu, and Roberto Dominguez. 2014. “Mechanism of Actin Filament Pointed-End Capping by Tropomodulin.” Science 345 (6195): 463–67.

      Schüler, Svenja C., Joanna M. Kirkpatrick, Manuel Schmidt, Deolinda Santinha, Philipp Koch, Simone Di Sanzo, Emilio Cirri, Martin Hemberg, Alessandro Ori, and Julia von Maltzahn. 2021. “Extensive Remodeling of the Extracellular Matrix during Aging Contributes to Age-Dependent Impairments of Muscle Stem Cell Functionality.” Cell Reports 35 (10): 109223.

      Tsukada, Takehiro, Christopher T. Pappas, Natalia Moroz, Parker B. Antin, Alla S. Kostyukova, and Carol C. Gregorio. 2010. “Leiomodin-2 Is an Antagonist of Tropomodulin-1 at the Pointed End of the Thin Filaments in Cardiac Muscle.” Journal of Cell Science 123 (Pt 18): 3136–45.

      Vaughan, Megan, and Katja A. Lamia. 2019. “Isolation and Differentiation of Primary Myoblasts from Mouse Skeletal Muscle Explants.” Journal of Visualized Experiments: JoVE, no. 152 (October). https://doi.org/10.3791/60310.

    1. eLife Assessment

      This elegant study presents a valuable approach to probing the structural features of the full-length human Hv1 channel as a purified protein, supported by rigorous biochemical assays and spectral FRET analysis, which will interest biophysicists and physiologists studying Hv1 and other ion channels. Overall, the work introduces an interesting labeling strategy and provides methodological observations that are of value in investigating hHv1. However, the analysis appears incomplete, requiring additional structural interpretation and mechanistic insight.

    2. Reviewer #1 (Public review):

      In this study, the noncanonical amino acid acridon-2-ylalanine (Acd) was inserted at various positions within the human Hv1 protein using a genetic code expansion approach. The purified mutants with incorporated fluorophore were shown to be functional using a proton flux assay in proteoliposomes. FRET between native tryptophan and tyrosine residues and Acd were quantified using spectral FRET analysis. Predicted FRET efficiencies calculated from an AlphaFold model of the Hv1 dimer were compared to the corresponding experimental values. Spectral FRET analysis was also used to test whether structural rearrangements caused by Zn2+, a well-known Hv1 inhibitor, could be detected. The experimental data provide a good validation of the approach, but further expansion of the analysis will be necessary to differentiate between intra- and intersubunit structural features.

      Interestingly, the observed rearrangements induced by Zn2+ were not limited to the protein region proximal to the extracellular binding site but extended to the intracellular side of the channel. This finding agrees with previous studies showing that some extracellular Hv1 inhibitors, such as Zn2+ or AGAP/W38F, can cause long-range structural changes propagating to the intracellular vestibule of the channel (De La Rosa et al. J. Gen. Physiol. 2018, and Tang et al. Brit J. Pharm 2020). The authors should consider adding these references.

      Since one of the main goals of this work was to validate Acd incorporation and the spectral FRET analysis approach to detect conformational changes in hHv1 in preparation for future studies, the authors should consider removing one subunit from their dimer model, recalculating FRET efficiencies for the monomer, and comparing the predicted values to the experimental FRET data. This comparison could support the idea that the reported FRET measurements can inform not only on intrasubunit structural features but also on subunit organization.
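
      The monomer-versus-dimer comparison suggested above can be sketched numerically. The example below is only a sketch under stated assumptions: it uses the standard Förster relation, treats each donor as transferring independently to every acceptor within reach (so an intersubunit acceptor adds a second transfer pathway in the dimer), and averages over donors with equal weights. The R0 value, the distances, and all function names are hypothetical and are not taken from the manuscript or its AlphaFold model.

```python
from typing import Sequence


def rate_ratio(distance: float, r0: float) -> float:
    """Transfer rate relative to the donor decay rate: (R0 / r)^6."""
    return (r0 / distance) ** 6


def donor_efficiency(acceptor_distances: Sequence[float], r0: float) -> float:
    """FRET efficiency of one donor with one or more acceptors.

    With a single acceptor this reduces to E = 1 / (1 + (r / R0)^6).
    """
    k = sum(rate_ratio(d, r0) for d in acceptor_distances)
    return k / (1.0 + k)


def mean_efficiency(donors: Sequence[Sequence[float]], r0: float) -> float:
    """Equal-weight average over donors (a simplifying assumption)."""
    return sum(donor_efficiency(d, r0) for d in donors) / len(donors)


R0 = 20.0  # hypothetical Foerster distance in angstroms

# Hypothetical donor-to-acceptor distances (angstroms) for two donors.
# Monomer: each donor sees only the acceptor in its own subunit.
monomer = [[15.0], [25.0]]
# Dimer: each donor also sees the acceptor in the neighbouring subunit.
dimer = [[15.0, 30.0], [25.0, 18.0]]

e_monomer = mean_efficiency(monomer, R0)
e_dimer = mean_efficiency(dimer, R0)
```

      In this toy case the dimer prediction exceeds the monomer prediction because the second donor sits closer to the neighbouring subunit's acceptor than to its own. A gap of that kind between the two predictions is what would make the experimental values informative about subunit organization.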

    3. Reviewer #2 (Public review):

      This manuscript by Carmona, Zagotta, and Gordon is generally well-written. It presents a crude and incomplete structural analysis of the voltage-gated proton channel based on measured FRET distances. The primary experimental approach is Förster Resonance Energy Transfer (FRET), using a fluorescent probe attached to a noncanonical amino acid. This strategy is advantageous because the noncanonical amino acid likely occupies less space than conventional labels, allowing more effective incorporation into the channel structure.

      Fourteen individual positions within the channel were mutated for site-specific labeling, twelve of which yielded functional protein expression. These twelve labeling sites span discrete regions of the channel, including P1, P2, S0, S1, S2, S3, S4, and the dimer-connecting coiled-coil domain. FRET measurements are achieved using acridon-2-ylalanine (Acd) as the acceptor, with four tryptophan or four tyrosine residues per monomer serving as donors. In addition to estimating distances from FRET efficiency, the authors analyze full FRET spectra and investigate fluorescence lifetimes on the nanosecond timescale.

      Despite these strengths, the manuscript does not provide a clear explanation of how channel structure changes during gating. While a discrepancy between AlphaFold structural predictions and the experimental measurements is noted, it remains unclear whether this mismatch arises from limitations of the model or from the experimental approach. No further structural analysis is presented to resolve this issue or to clarify the conformational states of the protein.

      The manuscript successfully demonstrates that Acd can be incorporated at specific positions without abolishing channel function, and it is noteworthy that the reconstituted proteins function as voltage-activated proton channels in liposomes. The authors also report reversible zinc inhibition of the channel, suggesting that zinc induces structural changes in certain channel regions that can be reversed by EDTA chelation. However, this observation is not explored in sufficient depth to yield meaningful mechanistic insight.

      Overall, while the study introduces an interesting labeling strategy and provides valuable methodological observations, the analysis appears incomplete. Additional structural interpretation and mechanistic insight are needed.

      Major Points

      (1) Tryptophan and tyrosine exhibit similar quantum yields, but their extinction coefficients differ substantially. Is this difference accounted for in your FRET analysis? Please clarify whether this would result in a stronger weighting of tryptophan compared to tyrosine.

      (2) Is the fluorescence of acridon-2-ylalanine (Acd) pH-dependent? If so, could local pH variations within the channel environment influence the probe's photophysical properties and affect the measurements?

      (3) Several constructs (e.g., K125Tag, Y134Tag, I217Tag, and Q233Tag) display two bands on SDS-PAGE rather than a single band. Could this indicate incomplete translation or premature termination at the introduced tag site? Please clarify.

      (4) In Figure 5F, the comparison between predicted FRET values and experimentally determined ratio values appears largely uninformative. The discussion on page 9 suggests either an inaccurate structural model or insufficient quantification of protein dynamics. If the underlying cause cannot be distinguished, how do the authors propose to improve the structural model of hHV1 or better describe its conformational dynamics?

      (5) Cu²⁺, Ru²⁺, and Ni²⁺ are presented as suitable FRET acceptors for Acd. Would Zn²⁺ also be expected to function as an acceptor in this context? If so, could structural information be derived from zinc binding independently of Trp/Tyr?

      (6) The investigated structure is most likely dimeric. Previous studies report that zinc stabilizes interactions between hHV1 monomers more strongly than in the native dimeric state. Could this provide an explanation for the observed zinc-dependent effects? Additionally, do the detergent micelles used in this study predominantly contain monomers or dimers?

      (7) hHV1 normally inserts into a phospholipid bilayer, as used in the reconstitution experiments. In contrast, detergent micelles may form monolayers rather than bilayers. Could the authors clarify the nature of the micelles used and discuss whether the protein is expected to adopt the same fold in a monolayer environment as in a bilayer?
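      As a rough illustration of the concern raised in point (1), the share of excitation light absorbed by each donor type scales with the number of residues times the molar extinction coefficient. The sketch below uses commonly cited approximate coefficients at 280 nm (Trp about 5500, Tyr about 1490 M⁻¹ cm⁻¹); the excitation wavelength and the residue counts actually relevant to the authors' analysis are assumptions here, and differences in quantum yield or local environment are ignored.

```python
# Approximate molar extinction coefficients at 280 nm (M^-1 cm^-1).
# These are widely used textbook values, not numbers from the manuscript.
EPS_280 = {"Trp": 5500.0, "Tyr": 1490.0}


def absorption_fractions(counts, eps=EPS_280):
    """Fraction of absorbed light attributable to each residue type.

    counts maps residue name -> number of residues (assumed values).
    """
    weights = {name: n * eps[name] for name, n in counts.items()}
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one absorbing residue is required")
    return {name: w / total for name, w in weights.items()}


if __name__ == "__main__":
    # Assumed scenario: four Trp and four Tyr excited at the same wavelength.
    fractions = absorption_fractions({"Trp": 4, "Tyr": 4})
    for name, frac in fractions.items():
        print(f"{name}: {frac:.1%} of absorbed light")
```

      Under these assumptions tryptophan would dominate the absorbed light even with equal residue counts, which is why the question of weighting is worth an explicit answer.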

    1. eLife Assessment

      This study addresses a fundamental question in glycobiology by elucidating how a single-site processive enzyme orchestrates the alternating addition of sugars to synthesize complex polysaccharides such as hyaluronan. The findings are compelling, providing a clear mechanistic framework supported by strong experimental validation. Major strengths include the integration of high-resolution structural data with rigorous biochemical analyses, resulting in a well-supported model of hyaluronan assembly.

    2. Reviewer #1 (Public review):

      Summary:

      This revised manuscript describes critical intermediate reaction steps of a HA synthase at the molecular level; specifically, it examines the 2nd step, polymerization, adding GlcA to GlcNAc to form the initial disaccharide of the repeating HA structure. Unlike the vast majority of known glycosyltransferases, the viral HAS (a convenient proxy extrapolated to resemble the vertebrate forms) uses a single pocket to catalyze both monosaccharide transfer steps. The authors' work illustrates the interactions needed to bind & proof-read the UDP-GlcA using direct and '2nd layer' amino acid residues. This step also allows the HAS to distinguish the two UDP-sugars; this is very important as the enzymes are not known or observed to make homopolymers of only GlcA or GlcNAc, but only make the HA disaccharide repeats GlcNAc-GlcA.

      Strengths:

      Techniques & analysis; overview of HA synthase mechanisms

      Weaknesses:

      None

      Comments on revisions:

      Previous clarity issues in the original submission were all resolved. Again, this is a very well done body of work!!

    3. Reviewer #2 (Public review):

      Summary:

      The paper by Stephens and co-workers provides important mechanistic insight into how hyaluronan synthase (HAS) coordinates alternating GlcNAc and GlcA incorporation using a single Type-I catalytic centre. Through cryo-EM structures capturing both "proofreading" and fully "inserted" binding poses of UDP-GlcA, combined with detailed biochemical analysis, the authors show how the enzyme selectively recognizes the GlcA carboxylate, stabilizes substrates through conformational gating, and requires a priming GlcNAc for productive turnover.

      These findings clarify how one active site can manage two chemically distinct donor sugars while simultaneously coupling catalysis to polymer translocation.

      The work also reports a DDM-bound, detergent-inhibited conformation that possibly illuminates features of the acceptor pocket, although this appears to be a purification artefact (it is indeed inhibitory) rather than a relevant biological state.

      Overall, the study convincingly establishes a unified catalytic mechanism for Type-I HAS enzymes and represents a significant advance in understanding HA biosynthesis at the molecular level.

      Strengths:

      There are many strengths.

      This is a multi-disciplinary study with very high-quality cryo-EM and enzyme kinetics (backed up with orthogonal methods of product analysis) to justify the conclusions discussed above.

      Comments on revisions:

      The suggestions made in the initial comments have all been responded to very well.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript describes critical intermediate reaction steps of a HA synthase at the molecular level; specifically, it examines the 2nd step, polymerization, adding GlcA to GlcNAc to form the initial disaccharide of the repeating HA structure. Unlike the vast majority of known glycosyltransferases, the viral HAS (a convenient proxy extrapolated to resemble the vertebrate forms) uses a single pocket to catalyze both monosaccharide transfer steps. The authors' work illustrates the interactions needed to bind & proof-read the UDP-GlcA using direct and '2nd layer' amino acid residues. This step also allows the HAS to distinguish the two UDP-sugars; this is very important as the enzymes are not known or observed to make homopolymers of only GlcA or GlcNAc, but only make the HA disaccharide repeats GlcNAc-GlcA.

      Strengths:

      Overall, the strengths of this paper lie in its techniques & analysis.

      The authors make significant leaps forward towards understanding this process using a variety of tools and comparisons of wild-type & mutant enzymes. The work is well presented overall with respect to the text and illustrations (especially the 3D representations), and the robustness of the analyses & statistics is also noteworthy.

      Furthermore, the authors make some strides towards creating novel sugar polymers using alternative primers & work with detergent binding to the HAS. The authors tested a wide variety of monosaccharides and several disaccharides for primer activity and observed that GlcA could be added to cellobiose and chitobiose, which are moderately close structural analogs to HA disaccharides. Did the authors also test the readily available HA tetramer (HA4, [GlcA-GlcNAc]2) as a primer in their system? This is a highly recommended experiment; if it works, then this molecule may also be useful for cryo-EM studies of CvHAS as well.

      The reviewer requested testing whether an HA tetrasaccharide could also serve as a glycosyl transfer acceptor for HAS. The commercially available HA tetrasaccharide (HA4) is terminated at its non-reducing end by GlcA; therefore, we proceeded to measure its effect on UDP-GlcNAc turnover kinetics. Titration of HA4 failed to elicit any detectable change in UDP-GlcNAc turnover rate, indicating no priming. This is now mentioned in the main text and the data are shown in Fig. S9.

      Weaknesses:

      In the past, another report described a failed attempt to elongate short primers (HA4 & chitin oligosaccharides larger than the cello- or chitobiose that have activity in this report) with a vertebrate HAS, XlHAS1, an enzyme that seems to behave like the CvHAS (https://pubmed.ncbi.nlm.nih.gov/10473619/); this work should probably be cited and briefly discussed. It may be that the longer primers in the 1999 paper and/or the different construct or isolation specifics (detergent extract vs crude) were not conducive to the extension reaction, as the authors extracted recombinant enzyme.

      We apologize for the oversight. This reference is now cited (ref. 18) together with the description of the failed elongation of HA4 by CvHAS.

      There are a few areas that should be addressed for clarity and correctness, especially defining the class of HAS studied here (Class I-NR) as the results may (Class I-R) or may not (Class II) align (see comment (a) below), but overall, a very nicely done body of work that will significantly enhance understanding in the field.

      Done as requested

      Reviewer #2 (Public review):

      Summary:

      The paper by Stephens and co-workers provides important mechanistic insight into how hyaluronan synthase (HAS) coordinates alternating GlcNAc and GlcA incorporation using a single Type-I catalytic centre. Through cryo-EM structures capturing both "proofreading" and fully "inserted" binding poses of UDP-GlcA, combined with detailed biochemical analysis, the authors show how the enzyme selectively recognizes the GlcA carboxylate, stabilizes substrates through conformational gating, and requires a priming GlcNAc for productive turnover.

      These findings clarify how one active site can manage two chemically distinct donor sugars while simultaneously coupling catalysis to polymer translocation.

      The work also reports a DDM-bound, detergent-inhibited conformation that possibly illuminates features of the acceptor pocket, although this appears to be a purification artefact (it is indeed inhibitory) rather than a relevant biological state.

      Overall, the study convincingly establishes a unified catalytic mechanism for Type-I HAS enzymes and represents a significant advance in understanding HA biosynthesis at the molecular level.

      Strengths:

      There are many strengths.

      This is a multi-disciplinary study with very high-quality cryo-EM and enzyme kinetics (backed up with orthogonal methods of product analysis) to justify the conclusions discussed above.

      Weaknesses:

      There are few weaknesses.

      The abstract and introduction assume a lot of detailed prior knowledge about hyaluronan synthases, and in doing so, risk lessening the readership pool.

      A lot of discussion focuses on detergents (whose presence is totally inhibitory) and transfer to non-biological acceptors (at high concentrations). This risks weakening the manuscript.

      The abstract and parts of the introduction have been revised to address the reviewer’s concerns.

      Reviewer #1 (Recommendations for the authors):

      (1) As noted above, please state in the title, abstract & introduction that this work is focused on a "Class I-NR HAS" (as described in Ref. #4), and NOT all HAS families...this is truly essential to note as someone working with the Pasteurella HAS version (Class II) would be totally misled & at this point, no one knows the Streptococcus HAS (Class I-R) mechanistic details, which could be different due to its inverse molecular directionality of elongation compared to the CvHAS Class I-NR enzyme.

      Done as requested.

      (2) Page 6 - for the usefulness of the HAS mutants as being folded correctly, it was stated these mutants are suitable since they all 'purify' similarly...the use of the more proper term should probably be 'chromatograph', similarly suggesting similar hydrodynamic radii without massive folding issues.

      This has been revised to state that they all exhibited comparable size exclusion chromatography profiles.

      “All mutants share similar size exclusion chromatography profiles with the WT enzyme, suggesting that the substitutions do not cause a folding defect (Fig. S3).”

      (3) Page 7 - please check these sentences (& rest of paragraph?) as the meaning is not clear. "First, UDP-GlcNAc was titrated in the presence of excess UDP-GlcA, resulting in a response similar to the acceptor-free condition (Fig. 2C). However, the maximum reaction velocity at 20 mM UDP-GlcNAc was approximately 25% lower than that measured in the presence of UDP-GlcNAc only (Fig. 2C)."

      The paragraph has been revised to avoid confusion.

      (4) In Methods, please use an italicized 'g' for the centrifugation steps globally.

      Changed as requested

      (5) Please note the source/vendor for the HA standards on gels.

      Done

      (6) Page 35 - TLC section.

      (a) 'n-butanol' (with italic n) is the most widespread chemical name (not butan-1-ol).

      Done

      (b) Also, for all of the TLC images, the origin and the solvent front should be marked.

      Changed as suggested.

      Reviewer #2 (Recommendations for the authors):

      A number of minor issues should be addressed.

      (1) Abstract

      Two comments on the Abstract, which I found surprisingly weak given the quality of the work, and lacking a key detail.

      A major conceptual contribution of this work is the demonstration of how a single Type-I catalytic centre discriminates, positions, and transfers two chemically distinct substrates in an alternating pattern. This distinguishes HAS from dual-active-site (Type-II) glycosyltransferases and is important for understanding HA polymerization.

      However, this central point is not clearly articulated in the abstract. I suggest explicitly stating that HAS performs both GlcNAc and GlcA transfer reactions within a single catalytic site, and that the proofreading/inserted poses illuminate how this multifunctionality is achieved.

      The abstract currently ends with the observation of a DDM-bound, detergent-inhibited state. While this is interesting, it absolutely does not represent the central conceptual advance of the study and gives the abstract an artefactual ending.

      I strongly recommend revising the final sentences to emphasize the broader mechanistic insight and not an "artefact" (indeed, the enzyme is inactive in the presence of this detergent; it is thus a very unusual way to conclude an abstract).

      That is, finish with the wider implications of how HAS coordinates alternating substrate use, proofreading, and polymer translocation. Ending on the main mechanistic or biological significance would make the abstract considerably stronger and more aligned with the main message of the paper.

      The abstract has been revised thoroughly to reflect the important insights gained on CvHAS’ catalytic function and HA biogenesis in general.

      (2) Introduction

      The distinction between single active-centre enzymes, which transfer both sugars alternately, and twin catalytic domain enzymes that each perform one addition is surely central to the whole paper. But it is not discussed. Surely this has to be covered. There is a lot of work in this space, including, but not limited to:

      https://doi.org/10.1093/glycob/cwg085

      https://doi.org/10.1093/glycob/10.9.883

      https://doi.org/10.1093/glycob/cwad075 (includes this author team)

      Originally back to https://doi.org/10.1021/bi990270y

      If the authors instead assume such a level of knowledge for the reader, then surely they are writing for a specialist audience, not consistent with the wider readership ambitions of eLife?

      The Introduction has been revised as suggested by the reviewer, providing necessary background to frame our description of the Chlorella virus HAS. We made a deliberate effort to put new insights into a broader context.

      (3) Results and Discussion

      DDM "was observed for >50% of the analysed particles". I struggled with this. I couldn't understand how the authors selected particles that did or did not contain DDM. The main body text states: "To our surprise, careful sorting of the UDP-GlcA supplemented cryo EM dataset revealed a CvHAS subpopulation that was not bound to the substrate, but, instead, a DDM molecule near the active site (Fig 3A and S7). This was observed for >50% of the analyzed particles."

      That reads like there is one sample with two populations. But the figures and the methods section suggest differently: they suggest two samples with different data-collection regimes. That does not match the main text. Could this be clarified?

      Yes, that wasn’t explained well. We clarified the text to stress that the DDM-bound sample came from a dataset that was intended to resolve a UDP-GlcA-bound state, but instead revealed the inhibition by DDM.

      Also in this space, in the modern world, "nominal magnification" has no real meaning, and calibrated pixel size would be more appropriate. Can this be given, please?

      The relevant Methods section now states: “imaging of … was performed at a calibrated pixel size of 0.652 Å”.

      The discovery of DDM in the active site is surprising. But it is an inhibitory artefact. Is this section pushed a little too hard? Also, "The coordination of DDM's maltoside moiety, an α-linked glucose disaccharide, is consistent with priming by cellobiose and chitobiose." I'm not sure why an α-linked maltose is consistent with the binding of a β-linked cellobiose. That makes no sense. There will be no other enzymes where starch and cellulose oligos are mutually accepted. Consider rewriting.

      We would like to stress the DDM coordination because it could lead to the development of compounds that can really function as inhibitors, either for HAS or other related enzymes. In the observed DDM binding pose, the alpha-linkage is not recognized. Instead, the reducing end glucosyl unit stacks against Trp342 while the non-reducing unit extends into the catalytic pocket. Hence, a similar binding pose is conceivable for cellobiose and potentially also for chitobiose. The relevant section has been reworded.

    1. eLife Assessment

      This important study introduces an approach to discovering antibiotic resistance determinants by leveraging diverse susceptibility profiles among related mycobacterial species, with particular relevance to high-level resistance against natural product-derived antibiotics. The research provides convincing evidence for the role of ADP-ribosylation enzymes in rifamycin resistance among mycobacteria, whilst also demonstrating that antibiotic susceptibility is not correlated with growth rate or intracellular compound concentration. The revision is substantively improved, but some broader claims still require additional experimental support. This work lays a significant foundation for understanding the complexity of antibiotic resistance mechanisms in mycobacteria and opens new avenues for future antimicrobial research.

    2. Reviewer #1 (Public review):

      This work analyzes innate resistance to drugs in mycobacteria by comparing minimum inhibitory concentrations (MICs) across a diverse panel of mycobacterial species. The results show that MICs are poorly correlated with growth rate while phylogeny associated with horizontal gene transfer underlies the observed differences in MIC, an important demonstration. A further investigation into the driver for the vast differences in susceptibility profiles shows that for three drugs the MIC is not correlated with intrabacterial drug concentrations where intrabacterial drug concentration is comprised of cytosolic and cell wall associated drug. This is a striking observation. The authors delve into the mechanisms that drive resistance to rifamycins and confirm that resistance is driven by ADP-ribosyltransferases of which two variant groups exist, one of which is kinetically faster and apparently is superior at modifying more hydrophobic rifamycins. The relative role of the two ADP-ribosyltransferases in conferring resistance especially in the species with both orthologs is not fully understood since the modified drug can possibly be further modified and transcriptional downregulation experiments performed in this work do not provide genetic evidence of perturbation of mRNA levels of the respective open reading frames.

      Comments on revisions:

      Demonstration of the level of transcriptional downregulation of the two Arr orthologs would have been a nice demonstration of (1) the utility of CRISPRi in other mycobacteria, (2) that the difference in rifabutin susceptibility during knockdown of Arr-1 vs Arr-X can fully be ascribed to the role of Arr-X in modifying the drug.

    3. Reviewer #3 (Public review):

      This manuscript presents a macroevolutionary approach to identification of novel high-level antibiotic resistance determinants that takes advantage of the natural genetic diversity within a genus (mycobacteria, in this case) by comparing antibiotic resistance profiles across related bacterial species and then using computational, molecular, and cellular approaches to identify and characterize the distinguishing mechanisms of resistance. The approach is contrasted with "microevolutionary" approaches based on comparing resistant and susceptible strains of the same species and approaches based on ecological sampling that may not include clinically relevant pathogens or related species. The potential for new discoveries with the macroevolution-inspired approach is evident in the diversity of drug susceptibility profiles revealed amongst the selected mycobacterial species and the identification and characterization of a new group of rifamycin-modifying ADP-ribosyltransferase (Arr) orthologs of previously described mycobacterial Arr enzymes. Additional findings that intra-bacterial antibiotic accumulation does not always predict potency within this genus, that M. marinum is a better proxy for M. tuberculosis drug susceptibility than the commonly used saprophyte M. smegmatis, and that susceptibility to semi-synthetic antibiotic classes is generally less variable than susceptibility to antibiotics more directly derived from natural products strengthen the claim that the macroevolutionary lens is valuable for elucidating general principles of susceptibility within a genus.

      There are some limitations to the work. The argument for the novelty of the approach could be better articulated. While the opportunities for new discoveries presented by identification of discrepant susceptibility results between related species are evident, it is less clear how the macroevolutionary approach is further leveraged for the discovery of truly novel resistance mechanisms. The example of the discovery of Arr-X enzymes presented here relied upon foundational knowledge of previously characterized Arr orthologs. There is less clarity about what the pipeline would look like for discovery of previously unknown determinants when one is agnostic to putative mechanisms. From the point at which interspecies differences in susceptibility are noted, does the framework still remain distinct from other discovery frameworks and approaches?

      While the experimentation and analyses performed are generally well designed and rigorous, there are a few instances in which broad claims are based on inferences from sample sets or data sets that are, at present, too limited to provide robust support. For example, the claim that rifampicin modification, and precisely ADP-ribosylation, is the dominant mechanism of resistance to rifampicin in mycobacteria is still a bit premature or at least an over-generalization, as other enzymatic modification mechanisms and other mechanisms such as helR-mediated dissociation of rifampicin-stalled RNA polymerases, efflux, etc were not examined. CRISPR interference was used in a demonstrative example to support this assertion, but would need to be applied more systematically to be more conclusive. The general claim that intra-bacterial antibiotic accumulation does not predict potency in mycobacteria may be another over-generalization based on the limited set of drugs and species studied.

      Comments on revisions:

      Discussion, lines 321-323: "We found that resistance to these antibiotics in mycobacteria do not correlate with by uptake/efflux mechanisms in the species tested..." is an over-generalization and conflicts with the following statement on lines 199-201: "for BDQ we could observe some correlation between antibiotic potency and [BDQ]IB which could be indicative of efflux playing a role in antibiotic efficacy." Given that the current statement in the Discussion only applies to 2 of 3 drugs tested, a more specific or nuanced interpretation seems warranted.

    4. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      This work shows that resistance profiles to a variety of drugs are variable between different mycobacterial species and are not correlated with growth rate or intrabacterial compound concentration (at least for linezolid, bedaquiline, and Rifampicin). Note that intrabacterial compound concentration does not distinguish between cytosolic and periplasmic/cell wall-associated drugs. The susceptibility profiles for a wide range of mycobacteria tested under the same conditions against 15 commonly used antimycobacterial drugs provide the first recorded cross-species comparison, which will be a valuable resource for the scientific community. To understand the reasons for the high Rifampicin resistance seen in many mycobacteria, the authors confirm the presence of the arr gene, known to encode a Rif ribosyltransferase involved in Rif resistance in M. smegmatis, in the resistant mycobacteria after confirming the absence of on-target mutations in the RpoB RRDR. Metabolomic analyses confirm the presence of ribosylated Rif in some of the naturally resistant mycobacteria, which may not be entirely surprising but is an important confirmation. Presumably M. branderi is highly resistant despite lacking the arr homolog due to the rpoB S45N mutation. M. flavescens has an MIC similar to that of M. smegmatis, despite having both Arr-1 and Arr-X. Various Arr-1 and Arr-X proteins are expressed and characterized for catalytic activity, which shows that Arr-X is a faster enzyme, especially with respect to more hydrophobic rifamycins. M. flavescens has MIC values for Rifapentine and Rifabutin similar to those of M. smegmatis. Thus, the Arr-1 versus Arr-X comparison does not provide a complete explanation for the underlying reasons driving natural Rif resistance in mycobacteria. Downregulation of Arr-X expression in M. conceptionense confers increased sensitivity to Rifabutin, confirming its role as a rifamycin-inactivating enzyme.

      Overall, the comparison of cross-species susceptibility profiles is novel; the demonstration that MIC is not correlated with intracellular drug concentration is important but not sufficiently interrogated; the demonstration that Arr-X is also a Rif ADP-ribosyltransferase is a good confirmation, and the finding that it is more efficient than Arr-1 on hydrophobic rifamycins is interesting but maybe not entirely surprising. The manuscript seems to have two parts that are related, but the rifamycin modification aspect of the work is not strongly linked to the first part since it interrogates the modification of one drug but not the common cause of natural resistance for other drugs.

      Reviewer #2 (Public review):

      Summary:

      The authors use a variety of methods to investigate the mechanisms of innate drug resistance in mycobacteria. They end up focusing on two primary determinants - drug accumulation, which correlates rather poorly with resistance for many species, and, for the rifamycins, ADP-ribosyltransferases. The latter enzymes do appear to account for a good deal of resistance, though it is difficult to extrapolate quantitatively what their relative contributions are.

      Overall, they make excellent use of biochemical methods to support their conclusions. Though they set out to draw very broad lessons, much of the focus ends up being on rifamycins. This is still a very interesting set of conclusions.

      Strengths:

      (1) A very interesting approach and set of questions.

      (2) Outstanding technical approaches to measuring intracellular drug concentrations and chemical modification of rifamycins.

      (3) Excellent characterization of variant rifamycin ADP-ribosyltransferases

      Weaknesses:

      (1) Figure 3c/d: These panels show the same experiment done twice, yet they display substantially different results in certain cases. For instance, M. smegmatis appears to show an order of magnitude lower RIF accumulation in panel d compared to M. flavescens, despite them displaying equal accumulation in panel c. The authors should provide justification for this variation, particularly as quantitative intra-species comparisons are central to the conclusions of this figure.

      The data in panels 3c and 3d are from different sets of experiments. The reviewer is correct with regard to M. smegmatis. The data indeed differ by ~1 order of magnitude. However, the data for the other species are very similar. The reviewer may also have noticed that the error bars are larger in 3d than in 3c, indicating greater variation between the independent experiments used in 3d. We do not have a good explanation for this, other than that the experiments shown in 3d were associated with greater biological variability.

      (2) There are several technical concerns with Figure 3 that affect how to interpret the work. According to the methods, the authors did not appear to normalize to an internal standard, only to an external antibiotic standard (which may account for some of the technical variation alluded to above).

      We agree that using a labeled drug as an internal standard (IS) would be ideal. However, the experiment initially followed an untargeted metabolomics approach, which later shifted to relative drug quantification. At that stage, normalizing with IS was impractical because proper implementation would require multiple IS across the chromatographic range. Therefore, we opted for total ion current (TIC) normalization, which accounts for variability in overall metabolite abundance—even though the experimental setup was already adjusted for each bacterial species’ growth rate. Additionally, we prepared external standard curves for each drug to enable quantification, and the amount of drug added to each plate was considered when reporting these values.
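      The two steps described in this response, total ion current (TIC) normalization and quantification against an external standard curve, can be sketched as below. This is a minimal illustration with hypothetical numbers, not the authors' actual pipeline; in particular, whether the standard curve was built on raw or TIC-normalized signal is not stated, so the two steps are kept separate here.

```python
def tic_normalize(peak_area, total_ion_current):
    """Scale a peak area by the sample's total ion current."""
    if total_ion_current <= 0:
        raise ValueError("total ion current must be positive")
    return peak_area / total_ion_current


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantify(signal, slope, intercept):
    """Invert the standard curve to estimate the amount behind a signal."""
    return (signal - intercept) / slope


if __name__ == "__main__":
    # Hypothetical external standards: known amounts vs. measured peak areas.
    amounts = [1.0, 2.0, 4.0, 8.0]
    areas = [105.0, 198.0, 410.0, 795.0]
    slope, intercept = fit_line(amounts, areas)
    print("estimated amount:", quantify(300.0, slope, intercept))
    print("TIC-normalized signal:", tic_normalize(300.0, 1.2e6))
```

      An isotopically labeled internal standard would instead correct each sample for extraction and ionization losses individually, which is the gap the reviewer is pointing to.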

      Second, the authors used different concentrations of drug for each species to try to match the species' MICs. I appreciate the authors' thinking on this, but I think for an uptake experiment it would be more appropriate to treat with the same concentration of drug, since uptake is likely saturable at higher drug concentrations. In the current setup, the species with higher MIC have to take up substantially more antibiotic than the species with low MIC in order to end up with the same normalized uptake value in Figure 3d. It would be helpful to repeat this experiment with a single drug concentration in the media for all species and test whether that gives the same results seen here.

      We respectfully disagree with the reviewer. Experiments such as the one proposed by the reviewer work well when MIC values are a few fold apart, for strains of the same species, but have not been tested when MIC values are 100-1000-fold apart, with different species. Furthermore, what would be the interpretation of compound uptake at 1000-fold the MIC for one species and MIC level for another? By using antibiotic concentrations at the respective MIC for each species we are at least under conditions where we know the biological effect of the antibiotic across species is the same, based on its potency.
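      The trade-off between the two dosing designs can be made concrete with a small calculation. The MIC values and species labels below are hypothetical placeholders chosen only to span a 1000-fold range; they are not data from the study.

```python
def fold_over_mic(dose, mic):
    """Express a drug concentration as a multiple of a species' MIC."""
    if mic <= 0:
        raise ValueError("MIC must be positive")
    return dose / mic


if __name__ == "__main__":
    # Hypothetical MICs in ug/mL for two species, 1000-fold apart.
    mics = {"species_A": 0.5, "species_B": 500.0}

    # Design 1 (reviewer's suggestion): one fixed concentration for all species.
    fixed_dose = 500.0
    for name, mic in mics.items():
        print(name, "fixed dose =", fold_over_mic(fixed_dose, mic), "x MIC")

    # Design 2 (authors' choice): each species dosed at its own MIC.
    for name, mic in mics.items():
        print(name, "MIC-matched dose =", fold_over_mic(mic, mic), "x MIC")
```

      With a fixed dose, the external concentration is identical but the biological exposure differs by the full MIC ratio; with MIC-matched doses the reverse holds, so neither design controls both variables at once.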

      (3) Figure 4f: This panel seems to argue against the idea that the efficacy of RIF ribosylation is what's driving drug susceptibility. M. flavescens is similarly resistant to RIF as M. smegmatis, yet M. flavescens has dramatically lower ribosylation of RIF. This is perhaps not surprising, as the authors appropriately highlight the number of different rif-modifying enzymes that have been identified that likely also contribute to drug resistance. However, I do think this means that the authors can't make the claim that the resistance they observe is caused by rifamycin modification, so those claims in the text and figure legend should be altered unless the authors can provide further evidence to support them. This experiment also has results that are inconsistent with what appears to be an identical experiment performed in Supplemental Figure 5b. The authors should provide context for why these results differ.

      With regard to enzyme efficiency, the apparent rates at which the Arr-1 enzymes convert RIF into ADP-ribosyl-RIF are relatively similar between species. However, Arr-X is much more efficient than Arr-1 in both M. flavescens and M. conceptionense. This is indicated by the apparent rates measured and displayed in Figure 5c.

      Proteomics data show that Arr-1 and Arr-X are upregulated upon rifampicin treatment in M. flavescens and M. conceptionense. However, the same experiment was not performed in the Arr-1 KD. Therefore, we can’t verify through this approach whether the activity observed in vivo directly correlates with a higher expression of Arr-X alone. Of note, both enzymes likely contribute to resistance to rifamycins, as per our results with the Arr-X KD and sensitization of M. conceptionense to RIF.

      Author response image 1.

      It is also worth mentioning that there are other enzymes in the pathway of RIF ribosylation and their efficiency is unknown (Author response image 2). Therefore, ADP-ribosyl-RIF is not an “end-metabolite” and may not be the sole determinant of RIF resistance via ADP-ribosylation. Downstream enzymes can also account for the difference observed between M. flavescens and M. smegmatis.

      Author response image 2.

      It is correct that the Rifampicin MIC for M. flavescens is the same as M. smegmatis.

      (4) Fig 4f/5c: M. flavescens has both Arr-1 and Arr-X, yet it appears to not have ribosylated RIF. This result seems to undermine the authors' reliance on the enzyme assay shown in Fig 5c - in that assay, M. flavescens Arr-X is very capable of modifying rifampicin, yet that doesn't appear to translate to the in vivo setting. This is of importance because the authors use this enzyme assay to argue that Arr-X is a fundamentally more powerful RIF resistance mechanism than Arr-1 and that it has specificity for rifabutin. However, the result in Figure 4f would argue that the enzyme assay results cannot be directly translated to in vivo contexts. For the authors to claim that Arr-X is most potent at modifying rifabutin, they could test their CRISPRi knockdowns of Arr-X and Arr-1 under treatment with each of the rifamycins they use in the enzyme assay. The authors mentioned that they didn't do this because all the strains are resistant to those compounds; however, if Arr-X is important for drug resistance, it would be reasonable to expect to see sensitization of the bacteria to those compounds upon knockdown.

      The reviewer is reading Fig. 4f incorrectly, probably because it is plotted on a linear scale rather than a logarithmic scale. Ribosylated RIF is present in M. flavescens, just at lower levels than in M. conceptionense and M. smegmatis. In species where there is no Arr-1 or Arr-3, ribosylated RIF is not detected at all (e.g. M. tuberculosis), i.e., its concentration is zero. Therefore, any detection of ribosylated RIF can be considered significant. In addition, as mentioned before, ADP-ribosylated RIF is not the final product of the pathway, and further studies need to be undertaken to understand subsequent reactions.

      (5) Figure 5d: The authors use this CRISPRi experiment to claim that Arr-X from M. conceptionense is more potent at inactivating rifabutin than Arr-1. This claim depends on there being equal degrees of knockdown of Arr-1 and Arr-X, so the authors should validate the degree of knockdown they get. This is particularly important because, to my knowledge, nobody has used this system in M. conceptionense before.

      We agree with the reviewer that a qPCR should have been performed to define the extent of interference in the strains generated. Unfortunately, a qPCR was not performed at the time to confirm the extent of downregulation in the strains tested. Although it is best practice to validate the KD strains, there is no indication that the effect observed is due to unspecific downregulation. The genetic environment in which Arr-X is positioned is different from that of Arr-1, and the targeting oligonucleotides are specific and would not promiscuously bind to Arr-1. That said, this is indeed a fault in our setup.

      (6) The authors' arguments about Arr-X and Arr-1 would be strengthened by showing by LC/MS that Arr-X knockdown in M. conceptionense results in more loss of ribosyl-rifabutin than knockdown of Arr-1.

      We agree with the reviewer that performing the LC-MS analysis of the Arr-X knockdown would have strengthened the argument of our paper. Unfortunately, this experiment was not performed.

      Reviewer #3 (Public review):

      This manuscript presents a macroevolutionary approach to the identification of novel high-level antibiotic resistance determinants that takes advantage of the natural genetic diversity within a genus (mycobacteria, in this case) by comparing antibiotic resistance profiles across related bacterial species and then using computational, molecular, and cellular approaches to identify and characterize the distinguishing mechanisms of resistance. The approach is contrasted with "microevolutionary" approaches based on comparing resistant and susceptible strains of the same species and approaches based on ecological sampling that may not include clinically relevant pathogens or related species. The potential for new discoveries with the macroevolution-inspired approach is evident in the diversity of drug susceptibility profiles revealed amongst the selected mycobacterial species and the identification and characterization of a new group of rifamycin-modifying ADP-ribosyltransferase (Arr) orthologs of previously described mycobacterial Arr enzymes. Additional findings that intra-bacterial antibiotic accumulation does not always predict potency within this genus, that M. marinum is a better proxy for M. tuberculosis drug susceptibility than the commonly used saprophyte M. smegmatis, and that susceptibility to semi-synthetic antibiotic classes is generally less variable than susceptibility to antibiotics more directly derived from natural products strengthen the claim that the macroevolutionary lens is valuable for elucidating general principles of susceptibility within a genus.

      There are some limitations to the work. The argument for the novelty of the approach could be better articulated. While the opportunities for new discoveries presented by the identification of discrepant susceptibility results between related species are evident, it is less clear how the macroevolutionary approach is further leveraged for the discovery of truly novel resistance determinants. The example of the discovery of Arr-X enzymes presented here relied upon foundational knowledge of previously characterized Arr orthologs. There is little clarity on what the pipeline for identifying more novel resistance determinants would look like. In other words, what does the macroevolutionary perspective contribute to discovery from the point of finding interspecies differences in susceptibility? Does the framework still remain distinct from other discovery frameworks and approaches? If so, how?

      Thanks for pointing this out, as this is a critical feature of our study and method. Our approach relies on inter-species comparative genomics and phenotypes, and is therefore distinct from inter-strain comparison. This difference is dramatic, and it becomes clearer when we compare the core genome of M. tuberculosis (one species), 92%, with the core genome of the genus, circa 1%. While we focus on rifamycins in this manuscript, future manuscripts will investigate many of the other dozens of “inconsistencies” observed between the genetic makeup of different mycobacterial species and their actual performance in the presence of different antibiotics.

      While the experimentation and analyses performed appear well-designed and rigorous, there are a few instances in which broad claims are based on inferences from sample sets or data sets that are too limited to provide robust support. For example, the claim that rifampicin modification, and precisely ADP-ribosylation, is the dominant mechanism of resistance to rifampicin in mycobacteria may be a bit premature or an over-generalization, as other enzymatic modification mechanisms and other mechanisms such as helR-mediated dissociation of rifampicin-stalled RNA polymerases, efflux, etc were not examined nor were CRISPRi knockdown experiments conducted beyond an experiment to tease out the role of Arr-X and Arr-1 in one strain. The general claim that intra-bacterial antibiotic accumulation does not predict potency in mycobacteria may be another over-generalization based on the limited number of drugs and species studied, but perhaps the intended assertion was that antibiotic accumulation ALONE does not predict potency.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Major comments

      (1) The metabolomics is done using mycobacteria grown on filters. Initially, mycobacterial cells are grown on the filters for 5 doublings before being transferred to drug-containing (or free) agar for one doubling. Is this based on calculated doubling time in liquid culture or a true determination of the fact that the biomass increases to what would amount to 5 doublings?

      The doubling time used is the one determined in liquid media. Although it is possible that the growth kinetics in solid media are slightly different from those in liquid (±10%), this experimental design is well established for M. tuberculosis (since Proc Natl Acad Sci U S A. 2010 May 25;107(21):9819-24.) and M. smegmatis (unpublished). Therefore, we used the growth rate as a proxy for having the same biomass of cells for each species tested. A maximum difference of 10% was observed between M. tuberculosis growth in liquid and in solid media; however, cells grow exponentially for much longer on filters. This makes filter-based experiments more reliable, as few growth phase-derived differences are present.
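
      The arithmetic behind using a doubling time as a proxy for biomass can be sketched as below. It is a minimal illustration that assumes purely exponential growth; the function names and the 3-hour doubling time are hypothetical and are not taken from the study.

```python
import math

def doublings(elapsed_time, doubling_time):
    """Number of doublings completed after `elapsed_time` (same units as doubling_time)."""
    return elapsed_time / doubling_time

def fold_increase(elapsed_time, doubling_time):
    """Fold increase in biomass under purely exponential growth."""
    return 2.0 ** doublings(elapsed_time, doubling_time)

# Hypothetical example: a species with a 3 h doubling time in liquid medium,
# incubated on a filter for the time corresponding to five liquid-culture doublings.
liquid_doubling_time = 3.0
incubation_time = 5 * liquid_doubling_time

nominal_fold = fold_increase(incubation_time, liquid_doubling_time)

# If the doubling time on solid medium were 10% longer (an assumed deviation),
# the same incubation time would give fewer doublings.
slower_fold = fold_increase(incubation_time, 1.1 * liquid_doubling_time)
```

      Because incubation times are set per species from their own doubling times, a similar relative deviation across species would shift all samples in the same direction.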

      (2) The demonstration that intrabacterial drug concentrations vary between mycobacterial species in a manner not related to MIC for at least LZD and RIF, is an important finding. However, intrabacterial does not mean cytoplasmic since a considerable fraction could be present in the periplasmic/cell wall layers. Ideally, this would need to be determined but would of course be a massive undertaking since the method needs validation & optimization for each mycobacterial species. Nevertheless, this has to be mentioned. In addition, three drugs are limiting. Measuring additional drug concentrations in these 5 mycobacteria would at least establish some confirmation about the extent of this lack of correlation. Thus, could the authors measure concentrations of additional drugs with intracellular targets?

      Testing additional drugs would be beneficial and would be an expansion of our paper; it is definitely part of our plans for future studies focusing on other antibiotics described here. It would also provide new insights into other possible mechanisms of resistance in mycobacterial species. However, in this study we aimed to first determine the antibiotic response profile in different mycobacterial species, and once we identified interesting resistance phenotypes that could not be readily explained by known mechanisms of resistance, we narrowed it down to certain drugs and species that would potentially provide insights into new mechanisms of antibiotic resistance. Finally, exploring drug concentration across multiple bacterial compartments is a daunting task and it has not been done extensively with any species, not to mention with multiple species, many of which still lack any study of their actual cell envelope.

      (3) CRISPRi was used to reduce transcription in M. conceptionense. What was the level of gene downregulation?

      As mentioned previously, a limitation of our setup is that the level of KD was not measured.

      Minor comments:

      (1) The introduction mentions the fast and slow-growing mycobacteria which are classified based on the time that it takes to observe colonies on solid agar. However, in liquid medium, there is less correlation between the reported growth on agar and doubling time in liquid (Figure 1b, Figure 2d). This could be mentioned in the results section. In Figure 2d, the filled circles represent fast-growers but this does not hold well for liquid culture and it might make more sense to not distinguish between fast- and slow-growers in these graphs. A small complication would also be the fact that the doubling time represents growth in a liquid medium with Tyloxapol as a detergent whereas the MIC and metabolomics are done on solid agar with no detergent. The metabolomics is done after a doubling but for those where agar growth and liquid growth have large discrepancies in growth rate, there could be some differences.

      Apologies for this misunderstanding. Fast- and slow-growth phenotypes are determined in Lowenstein-Jensen (LJ) agar, not in 7H10 agar (used in our study and most studies of mycobacteria). Furthermore, this is a qualitative definition, not a quantitative one. Therefore, our measurements do not need to correlate with fast- and slow-growth phenotypes, unless we had used that one specific medium. Furthermore, in liquid medium, we determined growth rate directly, which is never done with LJ medium.

      In addition to adding the same amount of cells to each filter, we also perform TIC normalization, which should account for how rich the samples were – and therefore how much material we had. Therefore, we do not observe discrepancies due to differences in growth rate and the presence/absence of detergent in the media.

      It is also worth mentioning that this experimental setup has been well established in many M. tuberculosis labs that study metabolism. Importantly, detergent drastically affects mass spectrometry and therefore cannot be used.

      (2) Figure 1g in the text should be Figure 1f.

      Apologies, it has been fixed.

      (3) Figure S1 would be ideal to have in (supplementary) table format.

      This data is now being provided in a table format.

      (4) Table S1 - ethambutol misspelt.

      Spelling has been corrected.

      (5) MIC for species such as M. abscessus could depend on medium (7H9-based medium can give different MIC values than CAMH).

      Indeed, different media can significantly change MIC values, and this is true for many bacterial species, if not all. For this study we used only species that could be grown in 7H9 broth containing 10% ADC, 0.05% glycerol and 0.05% tyloxapol, and on 7H10 plates containing 10% OADC and 0.05% glycerol. MIC<sub>99</sub> was determined on the latter, as we found it more efficient and robust to run our tests on solid media. The goal of our experiment was not to determine the “true” MIC for the antibiotics tested, as this value does not exist. It was to find a lack of correlation between relative values and the presence of genes that can account for them.

      (6) The statement "the experiment was performed at a concentration of antibiotic equal to its MIC" initially seems confusing. It was not equal to the MIC but performed at 6-fold the respective MIC of the species in question. Maybe re-phrasing this would help.

      Apologies for this oversight. It has been corrected.

      (7) Note that some mutations outside the RRDR (eg. V170F and I491F) can also cause Rif resistance.

      Author response image 3.

      (A) Rainbow diagram of the RpoB X-ray structure coloured according to sequence conservation. Dark purple indicates high conservation, whereas dark orange indicates low conservation. RIF (shown in magenta) is bound to RpoB. The zoomed view shows that the RIF-binding pocket is considerably conserved. (B) The RpoB protein sequence has an 81 bp region called the Rifampicin Resistance Determining Region (RRDR) that is known to be important for RIF binding and is where most mutations occur in drug-resistant TB. Sequence alignment shows that the RRDR is conserved with the exception of M. branderi, which has an Asn instead of a Ser residue in position 456 (numbering relative to the M. tuberculosis sequence), highlighted in bold.

      Attached is a structural alignment of RpoB from the species highlighted in this paper. Although there is variability within the sequences, which is also displayed in Author response image 3 with the conservation analysis, the residues that have been implicated in resistance (including V170 and I491) are conserved. The alignment is provided as a .fasta file that can be opened in Jalview.

      (8) Discuss how the RpoB S450N mutation in M. branderi confers the observed level of resistance.

      That’s a great point, thank you. Now it reads as:

      “The rifampicin (RIF) binding pocket is generally conserved, but Mycobacterium branderi has an S450N mutation in the RRDR region. While this specific mutation hasn't been found in clinical isolates, it's located at the binding site and may confer resistance (273). Although both serine (S) and asparagine (N) have similar side chains, related mutations like S450Q have been linked to resistance (156). Thus, M. branderi may be RIF-resistant due to this mutation. In contrast, M. conceptionense, M. flavescens, and M. smegmatis show no target sequence differences that explain their resistance”

      (9) The statement that the three tested NTM are sensitive to rifabutin ("resistant to all rifamycins except for rifabutin") needs to be interpreted considering what sensitivity means. The MIC is still high (1.6-3.1 ug/mL) when compared to that of Mtb. The 2-fold differences in MIC between M. smegmatis and M. conceptionense do not really prove or disprove the role of Arr-X in rifabutin resistance.

      We revised the sentence to be more careful with the language in the text. We agree, but it is worth mentioning that, for bacteria in general, susceptibility ranges are defined by the CLSI. Each bacterial species has a range that is considered sensitive or resistant, but these are not available for the species used in this study. In general, bacteria with MIC values above 8 µg/mL are considered resistant to rifampin (J Antibiot 2014 67:625).

      (10) Figure 1d: It's hard to quantify the sensitivity of the plates. Can this be done by MIC? Was only rifabutin tested or also rifampicin?

      The initial experiments described in the paper were all performed using rifampicin only. The MICs for the remaining rifamycins were then determined for M. smegmatis, M. flavescens and M. conceptionense, and can be found in “Supplementary table 4”. Figure 5d illustrates the effect of the KD on the sensitivity of M. conceptionense to rifabutin.

      (11) Is there data to show the ADP-ribosylation of rifabutin in M. conceptionense and the CRISPRi strains?

      Unfortunately, we did not perform LC-MS analysis on M. conceptionense CRISPRi strains exposed to rifabutin to measure potential ADP-ribosylation.

      Reviewer #2 (Recommendations for the authors):

      (1) It would be useful if the authors would complete Figure 1A by determining growth rates for the remaining 18 strains that they currently omitted.

      These growth rates were obtained using roller bottles and in at least 3 independent experiments; unfortunately, the throughput is far from ideal. The goal of the experiment was to highlight differences in growth rate, beyond fast- and slow-growth, which we did. Adding the remaining values would not change this conclusion. Growth rate variation in 7H9 is significant and the point is made in our figure.

      (2) The authors should justify their choice of species used in Figures 3-4. It would be useful to know, for instance, if the authors chose these species in an unbiased fashion, or if they were chosen because the authors had already determined that they possess rifamycin-modifying enzymes of interest. In that case, they wouldn't necessarily be a representative sample to use for the correlation analysis of antibiotic uptake and potency in Figure 3.

      They were chosen because of their resistance profile for BDQ, LZD and RIF. This has been addressed in the text, which now reads “Given the antibiotic response profiles observed, we selected BDQ, LZD and RIF to explore the molecular causes of these dramatic changes in antibiotic potency observed across the Mycobacterium genus.”

      (3) Figure 4b: The data in this panel appear inconsistent - for instance, M. houstonense appears to grow at 10X Mtb MIC, but fails to grow at 1X Mtb MIC. Repeating this experiment would better establish the validity of the authors' claims about the relative susceptibility of these strains to RIF.

      The figures were rotated when exported from Illustrator. The corrected figure has been uploaded, and the original plate photos have also been uploaded for clarity.

      (4) Figure 4e: Does Arr-X get upregulated in these proteomic datasets? The authors' argument that proteomic upregulation correlates with important drug resistance genes would imply that it might be, so that would be useful information to provide.

      Arr-X is slightly upregulated, but the change is not statistically significant – this could be due to the native expression of Arr-1. The data are shown in a previous answer.

      (5) I wasn't able to find the supplementary tables that the authors allude to - not sure if that was a file mixup, but those tables would be useful for interpreting the manuscript.

      We are sorry that you couldn’t access the tables. It must be a file corruption issue, as the other reviewers were able to access them. We will make sure that all tables are available and accessible.

      (6) For LC/MS, the authors use peak height instead of peak area, which they argue correlates better with the amount of drug in cells because of the poor peak shape they observed for linezolid. This is not standard practice, so the authors should provide evidence to support this claim by running an LC/MS standard curve, then showing the correlation between peak height and amount of compound added as well as the correlation between peak area and compound.

      Thank you for pointing that out; the accuracy has been calculated and is now displayed. Both peak area and peak height can be used, but area is indeed standard practice.
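
      The comparison the reviewer asks for, peak height versus peak area against the amount of compound in a standard curve, could be computed along the following lines. This is a sketch under stated assumptions: the peaks are synthetic, noise-free Gaussians generated in the code, so it demonstrates the calculation only and says nothing about the linezolid data; all names and values are hypothetical.

```python
import numpy as np

def gaussian_peak(t, amount, center=5.0, width=0.3):
    """Synthetic chromatographic peak whose size scales with the amount injected."""
    return amount * np.exp(-0.5 * ((t - center) / width) ** 2)

def peak_height(signal):
    """Maximum intensity of the peak."""
    return float(signal.max())

def peak_area(t, signal):
    """Integrated intensity using the trapezoidal rule."""
    return float(np.sum(0.5 * (signal[1:] + signal[:-1]) * np.diff(t)))

def r_squared(x, y):
    """Coefficient of determination of a straight-line fit of y against x."""
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical standard curve: five amounts of compound, one synthetic peak each.
t = np.linspace(0.0, 10.0, 1001)
amounts = np.array([1.0, 2.0, 5.0, 10.0, 20.0])
heights = np.array([peak_height(gaussian_peak(t, a)) for a in amounts])
areas = np.array([peak_area(t, gaussian_peak(t, a)) for a in amounts])

r2_height = r_squared(amounts, heights)
r2_area = r_squared(amounts, areas)
```

      With real data, distorted or tailing peaks would change `heights` and `areas` differently, and comparing the two R² values is one way to show which measure tracks the amount of compound more closely.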

      (7) The authors should provide methods information about the LC column and the gradient settings used for LC-MS, as well as the settings of the MS.

      The full method has been added to the paper.

      Reviewer #3 (Recommendations for the authors):

      I have only minor comments aside from the information in the Public Review:

      (1) Results, section on Intra-bacterial antibiotic accumulation, line 8: "experiment was performed at a concentration of antibiotic PROPORTIONAL to its MIC" would be more accurate?

      Agreed and adjusted according to Reviewer’s suggestion.

      (2) Results, section on A minor role for pre-existing target modification, last sentence: the mere presence of RIF-ribosylating enzymes does not, in and of itself indicate that "RIF modification, and precisely ADP-ribosylation, is the dominant mechanism of resistance to RIF in mycobacteria", as other mechanisms and other forms of modifying enzymes are known to confer rifamycin resistance, with redundancy (e.g., other rifampicin-modifying enzymes, or helR-mediated dissociation of rifampicin-stalled RNA polymerases from DNA). It would be more appropriate to suggest the results presented to this point indicate RIF modification is common among mycobacteria. The evidence from the CRISPRi knockdown of Arrs shown in Fig 5d is the kind of evidence that suggests ribosylation as a dominant mechanism, at least against rifabutin in this particular species.

      Absolutely, there are other possible modifying enzymes that could be encoded by these mycobacterial species. It is possible that M. flavescens and M. smegmatis encode a putative HelR (alignment attached), but further experiments would need to be carried out to confirm its ability to displace RIF from the RNAP. Interestingly, the presence of both Arr and HelR has been studied in M. abscessus, and those mechanisms of resistance are independent of each other (Molecular Cell 2022 82(17):3166-3177.e5).

      (3) Discussion, 2nd sentence needs grammatical editing.

      Rephrased and it reads “Using our mycobacterial library, we identified for the first time high- and ultra-high-level intrinsic resistance (3) to many of the antibiotics tested. Of note, the resistant phenotype is naturally occurring and not a result of mutations due to exposure to the antibiotic in the clinic – which is the more traditional approach for probing mechanisms of antibiotic resistance. Our observations revealed that resistance profiles are highly variable across the genus and do not follow phylogeny, implicating HGT as the key mechanism for acquisition of resistance determinants and evolution of antibiotic resistance in mycobacteria (42).”

      (4) Discussion, page 7, first line: the inclusion of LZD and BDQ in this statement seems at odds with Figure 2c and the statements in the first paragraph of page 5 highlighting these as examples of drugs to which most mycobacteria are susceptible.

      Indeed, many of the species are susceptible, however the MIC<sub>99</sub> levels observed have never been reported before, and therefore we found it to be an interesting finding to highlight. From a treatment perspective, knowing which species are sensitive to which drugs is of course the most useful outcome of our study.

      (5) The next sentence..."We found that resistance to these antibiotics in mycobacteria cannot be explained by uptake/efflux mechanisms..." is a bit of an over-generalization and conflicts with the evidence presented earlier that efflux could be playing a role in BDQ resistance and the published evidence establishing a clinically significant role for efflux-mediated BDQ resistance in M. tuberculosis, M. avium complex and M. abscessus complex.

      We rephrased it to make it more specific to our findings. It reads “We found that resistance to these antibiotics in mycobacteria does not correlate with uptake/efflux mechanisms in the species tested and it does not correlate with growth rate. Identification of mycobacterial species highly resistant to BDQ and LZD is worrisome as most of these species, if not all, have never been exposed to these drugs.”

      (6) Methods, section on In vitro activity assay of Arr enzymes, line 1: reference(s) should be provided for previously reported methods.

      Reference now added.

      (7) Figure 2d: the low end of the susceptibility range is not well defined.

      In this figure, susceptibility is not defined as the lowest area of the graph, but the lower concentrations are indeed harder to define. Hopefully, supplementary figure 1 and the additional table containing the MICs will help to address this comment.

      (8) Figures 3c,d: the presentation of the relative antibiotic concentrations could be harmonized between the graphs in 3c and those in 3d to enable a more ready comparison.

      We disagree. The goal of these different panels is exactly to illustrate two distinct points. C gives the relative concentration of antibiotic, while D correlates relative concentration with MIC99. The use of log scale in D further clarifies that there is no correlation between intracellular antibiotic concentration and potency (MIC). This information is not present in C.

      (9) Figure 4f and Supplementary Figure 5b: it is difficult to understand the limited amount of ribosyl-RIF in M. flavescens in Fig 4f relative to Supplementary Figure 5b (esp. when considering M. smeg as a common comparator); and, further, to understand the seeming lack of correlation between RIF susceptibility, ribosylation and Arr number and catalytic efficiency for these two strains without considering additional resistance mechanisms.

      In reality, the difference between Figure 4f and Supplementary Figure 5b is mainly due to M. smegmatis, which has an apparently lower production of ribosyl-RIF in the experiment described in the supplementary figure. The values for M. flavescens are relatively similar. In addition, ADP-ribosyl-RIF is not the final metabolite of the pathway.

      With regard to having the entire picture, it is true that we were unable to completely unravel and correlate MIC value, expression of Arr-1, expression of Arr-3, efficiency of each enzyme, production of ADP-ribosyl-RIF and the presence of other possible mechanisms of resistance. This is indeed a limitation of our study, and of most studies ever published, which usually focus on one resistance determinant.

    1. eLife Assessment

      This valuable study used genetic and pharmacological manipulations of insulin/IGF signaling in renal glomerular podocytes to address the role of insulin/IGF axis in podocytes. Solid data are presented to demonstrate that co-inhibition of insulin/IGF signaling in podocytes led to aberrant splicing of mRNAs, which could contribute to the loss of podocytes in vitro and in vivo in mice. As it stands, the study lacks the assessment of developmental phenotype of podocytes in the mouse model.

      [Editors' note: this paper was reviewed by Review Commons.]

    2. Reviewer #1 (Public review):

      Summary:

      In this manuscript, the roles of the insulin receptor and the insulin-like growth factor receptor were investigated in podocytes. Mice in which both receptors were deleted developed glomerular dysfunction, with proteinuria and glomerulosclerosis emerging over several months. Because of concerns about incomplete KO, the authors generated and studied podocyte cell lines where both receptors were deleted. Loss of both receptors was highly deleterious with greater than 50% cell death. To elucidate the mechanism of cell death, the authors performed global proteomics and found that spliceosome proteins were downregulated. They confirmed this directly by using long-read sequencing. These results suggest a novel role for insulin and IGF1R signaling in RNA splicing in podocytes.

      This is primarily a descriptive study and no technical concerns are raised. The mechanism of how insulin and IGF1 signaling regulates splicing is not directly addressed, but the data potentially implicate phosphorylation downstream of these receptors. In the revised manuscript, it is shown that the mouse KO is incomplete, potentially explaining the slow onset of renal insufficiency. Direct measurement of GFR and serial serum creatinines might also enhance our understanding of progression of disease, although proteinuria is a strong sign of renal injury. An attempt to rescue the phenotype by overexpression of SF3B4 would also be useful but may be masked by defects in other spliceosome genes. As insulin and IGF are regulators of metabolism, some assessment of metabolic parameters would be an optional add-on.

      Significance:

      With the GLP1 agonists providing renal protection, there is great interest in understanding the role of insulin and other incretins in kidney cell biology. It is already known that Insulin and IGFR signaling play important roles in other cells of the kidney. So, there is great interest in understanding these pathways in podocytes. The major advance is that these two pathways appear to have a role in RNA metabolism.

      Comments on revised version:

      I'm satisfied with the revised manuscript and the responses to my previous concerns.

    3. Reviewer #2 (Public review):

      Summary:

      In this manuscript, submitted to Review Commons (journal agnostic), Coward and colleagues report on the role of the insulin/IGF axis in podocyte gene transcription. They knocked out both the insulin receptor and IGF1R in mice. Dual KO mice manifested a severe phenotype, with albuminuria, glomerulosclerosis, renal failure and death at 4-24 weeks.

      Long-read RNA sequencing was used to assess splicing events. Podocyte transcripts manifesting intron retention were identified. Dual knock-out podocytes manifested more transcripts with intron retention (18%) compared with wild-type controls (14%), with an overlap between experiments of ~30%.

      Transcript productivity was also assessed using FLAIR-mark-intron-retention software. Intron retention was seen in 18% of ciDKO podocyte transcripts compared to 14% of wild-type podocyte transcripts (P=0.004), with an overlap between experiments of ~30% (indicating the variability of results with this method). Interestingly, ciDKO podocytes showed downregulation of proteins involved in spliceosome function and RNA processing, as suggested by LC/MS and confirmed by Western blot.

      Pladienolide (a spliceosome inhibitor) was cytotoxic to HeLa cells and to mouse podocytes but no toxicity was seen in murine glomerular endothelial cells.

      The manuscript is generally clear and well-written. Mouse work was approved in advance. The four figures are generally well-designed, with bars and superimposed dot plots.

      Methods are generally well described.

      Comments on revised version:

      Coward and colleagues have done an excellent job of responding to all the reviewer comments.

    4. Reviewer #4 (Public review):

      Summary and background:

      This report entitled "The insulin/IGF axis is critically important (for) controlling gene transcription in the podocyte" from Hurcombe et al is based on a mouse double knockdown of the IR and IGF1R and a parallel cultured mouse podocyte model. The insulin/IGF signaling system in mammals evolved as three gene reduplicated peptides (insulin, IGF-1, and IGF-2) and their two receptors IR and IGF1R that cross-react to variable extents with the peptides, are ubiquitously expressed, and signal through parallel pathways. The major downstream effect of insulin is to regulate glucose uptake and metabolism, while that of the IGF pathways is to regulate growth and cell cycling in part through mTORC1. The GH-IGF-1-IGF1R pathway regulates post-natal growth. IGF-2 signaling is thought to play a major role in regulating intrauterine growth and development, although IGF-2 is also present at high levels in post-natal life. Thus, one would anticipate that reducing IR/IGF1R signaling in any cell would slow growth and cell cycling by reducing growth factor and metabolic mTORC1-mediated and other processes including the splicing of RNA for protein synthesis.

      Mouse IR/IGF1R double knockdown model:

      A double knockdown mouse model was generated by interbreeding mice with different genetic backgrounds carrying floxed sites for IR and IGF-1R to produce mixed background offspring with both floxed IR and IGF-1R genes. These mice were crossed so that the podocin promoter driven-Cre (that comes on at about embryonic day 12 as podocytes are developing) would delete IR and IGF-1R genes. Since podocin is believed to be an absolutely podocyte-specific protein, this podocin promoter is predicted to specifically knock down the IR and IGF1R genes only in podocytes. The weight and growth of double KO offspring was not different from controls, but some proportion of the double knockdown mice subsequently developed proteinuria by 6 months and 20% died, although no specific data is provided to identify the cause of the deaths since eGFR was not decreased. Surviving mice were evaluated at 6 months of age. The efficacy of knockdown was not demonstrated in the mouse model itself, although a temperature-sensitive cell line developed from these double knockdown mice showed that expression of IR and IGF-1R proteins in the Cre-treated cell line were both reduced by about 50% (no statistical analysis of this result provided). In the knockout mice, proteinuria was significantly increased by 6 months, but not at earlier time points. Histologic analysis showed proteinaceous casts, glomerulosclerosis and interstitial fibrosis. Podocyte number was stated to be reduced by about 30% in double knockdown mice, although the method by which this was evaluated seems to have been by counting WT1 positive nuclei in glomerular cross-sections, an approach that is well-known not to be a reliable way of assessing true podocyte number. No information is provided about podocyte size, density or glomerular volume.

      Comment: If IR/IGF1R deletion plays a significant role in normal podocyte function sufficient to cause proteinuria and glomerulosclerosis, then the effect of reduced IR and IGF1R protein expression on podocyte function would have been expected to produce a phenotype before 6 months. A more likely scenario to explain the overall result is that deleting the IR and IGF1R genes at about embryonic day 12 impacted podocyte development to a variable extent such that some mice developed fewer podocytes per glomerulus than other mice. As mice grow and their glomeruli and glomerular capillary area increase, those mice with fewer podocytes would not be able to completely cover the filtration surface with foot processes and would develop proteinuria and glomerulosclerosis. If reduced podocyte number per glomerulus is the proximate cause of the observed proteinuria, then modulation of the body and kidney growth rate by calorie restriction to slow growth (lower circulating IGF-1 levels) would be expected to be protective, while a high protein high calorie diet (higher circulating IGF-1 levels) or uni-nephrectomy to increase kidney growth rate would be expected to enhance proteinuria and glomerulosclerosis.

      The model as used may be more representative of a variable degree of podocyte depletion than an effect of impaired IR/IGF1R signaling. Therefore, although the phenotype may be ultimately attributable to the IR/IGF1R gene deletions, the proteinuria and glomerulosclerotic phenotype itself was probably a consequence of defective podocyte development. Examining podocyte number, size, density and glomerular volume at earlier time points (4 weeks) would help to answer this question. Therefore, a more appropriate title would be "The insulin/IGF axis is critically important (for) normal podocyte development and deployment". In this context the effect of the knockdowns on splicing would make more sense.

      Cell culture studies. A cell line was generated using a temperature sensitive SV40 system that has been previously reported from this laboratory. A detailed analysis is provided to show that double knockout cells exhibited abnormal spliceosome activity. This forms the basis for the conclusion that "The insulin/IGF axis is critically important (for) controlling gene transcription in the podocyte". There are several concerns that weaken this conclusion.

      (1) In the double knockdown cell culture system about 30% of cells were "lost" by 3 days and about 70% of cells were "lost" by 5 days. The studies were done at the 3 day time point. It is not clear whether "lost" cells were in the process of dying, undergoing stress-induced detachment, or just growing more slowly than control due to reduced IR and IGF-1R signaling. These processes could have impacted splicing in a non-specific way independent of IR/IGF1R signaling itself.

      (2) Can a single cell line derived from the double floxed mice be relied on to provide an unbiased picture of the effect of deleting IR and IGF-1R? Presumably, the transfection and selection process will select for cells that survive thereby including unknown biases, possibly related to spliceosome function. Is a single cell line adequate? These investigators have extensive experience with this type of analysis, but this question is not addressed in the discussion.

      (3) To determine whether the effect is specific to reduced IR/IGFR signaling, the deletion of IR and IGF-1R could be corrected by transfecting full length IR and IGF-1R cDNAs into the cells to restore normal IR/IGF1R signaling. If restoring intact IR and IGF-1R expression and activity in transfected cells returns spliceosome activity to normal, this would be evidence that the receptors themselves play some role in spliceosome activity, as opposed to the downstream effect on growth limitation/stress on the cells.

      (4) Other ways of testing whether the splicing effect is specifically due to reduced IR/IGF-1R signaling would be to (a) block IR and IGF1R receptors using available inhibitors, (b) remove or reduce insulin, IGF-1 and IGF-2 levels in the culture medium, (c) use low glucose and amino acid culture medium to slow growth rate independent of receptor function, or (d) block intra-cellular signaling via the IR and IGF-1R receptors through mTORC1 inhibition using rapamycin or other signaling targets.

      (5) It would be useful to determine whether stressing the cultured cells in other ways (e.g. ischemia, toxins, etc.) also results in the same splicing abnormalities.

    5. Author response:

      The following is the authors’ response to the original reviews

      Many thanks for your helpful and constructive comments for our work examining the effect of inhibiting both the insulin receptor (IR) and IGF1 receptor (IGF1R) in the podocyte. We are pleased to submit an updated manuscript addressing your concerns.

      (1) A major concern was a lack of mechanistic insight into how deletion (or knock-down) of both receptors caused the spliceosomal phenotype (Reviewer 1 and Reviewer 3).

      We now think this is due to the lack of a network of insulin/IGF phospho-signalling events to a variety of spliceosomal proteins and kinases. The reasons for this are as follows:

      A. Since submitting our paper, Turewicz et al have published a comprehensive phospho-proteomic paper examining the effects of 100nM insulin on human primary myotubes (DOI: 10.1038/s41467-025-56335-6). They discovered that multiple post-translational phosphorylation events occur in a variety of spliceosomal proteins at differing time points (1 minute to 60 minutes). Furthermore, they show that mRNA splicing is rapidly modified in response to insulin stimulation in their cells. This follows elegant work from Batista et al who studied diabetic and non-diabetic iPSC-derived human myoblasts and also detected a spliceosome phosphorylation signature (DOI: 10.1016/j.cmet.2020.08.007).

      B. We have examined phospho-proteome changes that occur in wild-type podocytes (expressing both the IR and IGF1R) compared to double (IR and IGF1R) knockout cells using phospho-proteomics. We have done this 3 days after inducing receptor knockdown, before major cell loss, and have stimulated the cells with either 10nM insulin or 100mg IGF1.

      Interestingly, we detected several post-translational modifications (PTM) in our data set that are also present in Turewicz’s studies. Of note, 100nM insulin (as used by Turewicz) will signal through both the insulin and IGF1 receptor (and hybrid Insulin/IGF1 receptors) which is relevant to our studies.

      Our work shows a cascade of phospho-signalling events affecting multiple components of the spliceosomal complex and evidence of kinase modulation (phosphorylation) (New Figure 7 and supplementary Figure 5). This is also described in a new results section of the paper (lines 391-425 in track changes version). We acknowledge that we only studied a single time point after stimulation (10 minutes) and could have missed other PTM in the spliceosomal complex and other kinases. This is mentioned in our new limitations of study section (lines 595-606). This will be a focus of future work. We did not find major PTM differences when stimulating with either insulin or IGF1 in our studies and suspect that the doses of insulin (10nM) and IGF1 (100mg) used are still able to signal through cognate receptors.

      Furthermore, we have examined the relative contributions of the insulin and IGF1 receptor in detail in the model (addressed in point 13 below).

      (2) The phenotype of the mouse is only superficially addressed. The main issues are that the completeness of the mouse KO is never assessed nor is the completeness of the KO in cell lines. The absence of this data is a significant weakness. (Reviewer 1)

      We apologise for not making this clear, but we did assess the level of receptor knockdown in both the animal and cell models. The in vivo model showed variable and non-complete levels of insulin receptor and IGF1 receptor podocyte knock down (shown in supplementary Figure 1C). This is why we made the in vitro floxed podocyte cell lines in which we could robustly knockdown both the IR and IGF1R. We show this using Western blotting (shown in Figure 2A). We agree that calling the models knockout is misleading and have changed all to knock down (KD) now.

      (3) The mouse experiments would be improved if the serum creatinines were measured to provide some idea how severe the kidney injury is. (Reviewer 1)

      There is variability in creatinine levels, which is not uncommon in transgenic mouse models (probably partly due to variability in receptor knock down levels with the cre-lox system). This is part of the rationale for developing the double receptor knockout cell models, in which we robustly knocked out both receptors by >80%. We have added measured creatinine levels in a subset of mice in supplementary data (New Supplementary Figure 1E) and mention this in the text (lines 285-286). As some mice died we expect they may have developed acute kidney injury, but we did not serially measure the creatinines in every mouse over time. We could have assessed the GFR in a more sensitive way to look at differences. However, we consider that the highly significant levels of albuminuria and histological damage observed in our models show a significant kidney phenotype.

      (4) An attempt to rescue the phenotype by overexpression of SF3B4 would also be useful. If this didn't work, an explanation in the text would suffice. (Reviewer 1).

      We did consider doing this but on reflection think it is very unlikely to rescue the phenotype as an array of different spliceosomal proteins quantitatively changed and were differentially phosphorylated / dephosphorylated throughout the complex (as we hope our revised work illustrates now). We think a single protein rescue is highly unlikely to work. We hope this is an appropriate explanation for this action. We have mentioned this in the text now in our discussion (lines 601-602).

      (5) As insulin and IGF are regulators of metabolism, some assessment of metabolic parameters would be an optional add-on. (Reviewer 1).

      Thank you for this suggestion. We did not extensively examine the metabolism of the mice; however, we did measure blood glucose and weight, which are included in the paper (Figure 1A and Figure 1B).

      (6) The authors should caveat the cell experiments by discussing the ramifications of studying the 50% of the cells that survive vs the ones that died. (Reviewer 1).

      We appreciate this and this was the rationale behind cells being studied after 3 days differentiation for total and phospho-proteomics before significant cell loss to avoid the issue of studying the 50% of cells that survive (which happened at 7 days). We have made this clearer in the manuscript. We also have added the data showing less cell death at 3 days in the cell model (New Supp Figure 2B).

      (7) It would be helpful to say that tissue scoring was performed by an investigator masked to sample identity. (Reviewer 2)

      We did this and have added this to the manuscript (line 113).

      (8) Data are presented as mean/SEM. In general, mean/SD or median/IQR are preferred to allow the reader to evaluate the spread of the data. There may be exceptions where only SEM is reasonable. (Reviewer 2)

      All graphs have now been changed to SD rather than SEM.

      (9) It would be useful for the reader to be told the number of over-lapping genes (with similar expression between mouse groups) and the results of a statistical test comparing WT and KO mice. The overlap of intron retention events between experimental repeats was about 30% in both knock-out podocytes. This seems low and I am curious to know whether this is typical for this method; a reference could be helpful. (Reviewer 2)

      This is an excellent question. We had 30% overlap as the parameters used for analysis were very stringent. We suspect we could get more than 30% overlap by being less stringent, with events still being considered similar, and could do so if requested. Our methods were based on FLAIR analysis (PMID: 32188845). We have added this reference to the manuscript (Line 242 & 680).

      (10) With the GLP1 agonists providing renal protection, there is great interest in understanding the role of insulin and other incretins in kidney cell biology. It is already known that Insulin and IGFR signaling play important roles in other cells of the kidney. So, there is great interest in understanding these pathways in podocytes. The major advance is that these two pathways appear to have a role in RNA metabolism, the major limitations are the lack of information regarding the completeness of the KO's. If, for example, they can determine that in the mice, the KO is complete, that the GFR is relatively normal, then the phenotype they describe is relatively mild. (Reviewer 1)

      Thank you. The receptor knock-out (KO) in the mice is highly unlikely to be complete (please see comments above and Supplementary Figure 1C). There are many examples of “KO” animal models targeting other tissues showing that complete KO of these receptors seems difficult to achieve, particularly in reference to the IGF1 receptor. In the brain, which also contains terminally differentiated cells, barely 50% of IGF1R knockdown was achieved in the target cells (PMID:28595357). In ovarian granulosa cells (PMID:28407051), several tissue-specific drivers were tried but could not achieve any better than 80% knockdown. The paper states that 10% of IGF1R is sufficient for function in these cells so they conclude that their knockdown animals are probably still responding to IGF1. Finally, in our recent IGF1R podocyte knockdown model we found Cre levels were important for excision of a single homozygous floxed gene (PMID: 38706850), hence we were not surprised that trying to excise two homozygous floxed genes (insulin receptor and IGF1 receptor) was challenging. This was the rationale for making the double receptor knockout cell lines to understand processes / biology in more detail. As stated earlier, we have changed our description of the mice and cell lines from knock-out to knock-down throughout the revised manuscript as this is more accurate.

      (11) For the in vivo studies, the only information given is for mice at 24 weeks of age. There needs to be a full time course of when the albuminuria was first seen and the rate of development. Also, GFR was not measured. Since the podocin-Cre utilized was not inducible, there should be a determination of whether there was a developmental defect in glomeruli or podocytes. Were there any differences in either prenatal or post-natal development or number of glomeruli? (Reviewer 3)

      We have added further urinary albumin:creatinine ratio (uACR) data at 12, 16 and 20 weeks to the manuscript. We do not think there was a major developmental phenotype as albuminuria did not become significantly different until several months of age (new Supp Figure 1B). We did consider using a doxycycline inducible model but we know the excision efficiency is much less than the constitutive podocin-cre driven model (Author response image 1). This would likely give a very mild (if any) phenotype when attempting to knockout both receptors and not reveal the biology adequately. We acknowledge the weaknesses of the animal model and this was the rationale for generating the cell models.

      (12) Although the in vitro studies are of interest, there are no studies to determine if this is the underlying mechanism for the in vivo abnormalities seen in the mice. Cultured podocytes may not necessarily reflect what is occurring in podocytes in vivo. (Reviewer 3)

      This is a good point. We have now immunostained the DKD and WT mice for Sf3b4 (a spliceosomal protein that changed in our in vitro proteomics) and also find a significant reduction in this protein in podocytes of the DKD mice (New Figure 3F).

      (13) Given that both receptors are deleted in the podocyte cell line, it is not clear if the spliceosome defect requires deletion of both receptors or if there is redundancy in the effect. The studies need to be repeated in podocyte cell lines with either IR or IGFR single deletions. (Reviewer 3)

      We have now performed proteomics and phospho-proteomics in all 4 cell types (wild-type, insulin receptor knockdown, IGF1R knockdown and double knockdown) at 3 days (New Figure 8 and supplementary Figure 6; also new results section, lines 425 to 450). This shows that both receptors contribute to the pathways (and hence there is a high level of compensation built into the system). For total proteins we detected that spliceosomal tri-snRNP was only reduced when both receptors were lacking but other proteins / pathways had an incremental effect of losing the insulin or IGF1 receptor. Likewise, the spliceosomal phospho-signaling events can go through either the insulin or IGF1 receptors predominantly or through both. We think this reflects the complexity of this system and how it has developed evolutionarily in mammals to protect against its loss.

      Finally, in revision we have rewritten the discussion with a “limitations of the study” section and, we hope, in a form that is easier for the readership to read.

      Author response image 1.

      (A) mT/mG reporter mouse crossed to constitutional podocin Cre heterozygous mouse. Illustrates podocyte specificity for Cre driver and excision of reporter. Figure shows GFP expression in Cre-producing cells (top panel scale bar = 250 µm; bottom panel scale bar = 50 µm). Cre expression causes GFP to be switched on. (B) mT/mG reporter mouse crossed to podocin rtTA-tet-o-Cre heterozygous mouse shows podocyte specificity for driver and approximately 60% excision (top and bottom panels scale bar = 250 µm; middle panel scale bar = 50 µm). Doxycycline is required for expression, showing that the system is not leaky.

    1. eLife Assessment

      In this valuable study, through carefully executed and rigorously controlled experiments, the authors challenged a previously reported role of the Death Receptor 6 (DR6/Tnfrsf21) in Wallerian degeneration (WD). Using two DR6 knockout mouse lines and multiple WD assays, both in vitro and in vivo, the authors provided convincing evidence that loss of DR6 in mice does not protect peripheral axons from WD after injury, at least in the specific contexts of the mice and analyses performed in this study. Due to the lack of certain specific parameters from previous studies (sex, age, mouse strains etc.), the exact reasons underlying the observed inconsistencies between current and previous reports on the protective effects of DR6 remain to be determined. Overall, this is a carefully executed study providing invaluable information toward understanding DR6's role (or lack thereof) in axon degeneration.

    2. Reviewer #1 (Public review):

      Summary:

      The authors show that genetic deletion of the orphan tumor necrosis factor receptor DR6 in mice does not protect peripheral axons against degeneration after axotomy. Similarly, Schwann cells in DR6 mutant mice react to axotomy similarly to wild type controls. These negative results are important because previous work has indicated that loss or inhibition of DR6 is protective in disease models and also against Wallerian degeneration of axons following injury. This carefully executed counterexample is important for the field to consider.

      Strengths:

      A strength of the paper is the use of two independent mouse strains that knockout DR6 in slightly different ways. The authors confirm that DR6 mRNA is absent in these models (western blots for DR6 protein are less convincingly null, but given the absence of mRNA, this is likely an issue of antibody specificity). One of the DR6 knockout strains used is the same strain used in a previous paper examining the effects of DR6 on Wallerian degeneration.

      The authors use a series of established assays to evaluate axon degeneration, including light and electron microscopy on nerve histological samples and cultured dorsal root ganglion neurons in which axons are mechanically severed and degeneration is scored in time lapse microscopy. These assays consistently show a lack of effect of loss of DR6 on Wallerian degeneration in both mouse strains examined.

      Additional strengths are that the authors examine both the axonal response and the Schwann cell response to axotomy and use both in vivo and in vitro assays.

      Therefore, in the specific context of these experiments, the authors' data support their conclusion that loss of DR6 does not protect against Wallerian degeneration.

      Weaknesses:

      A weakness of this paper is that no effort is made to determine why the results presented here may differ from previous studies. A notable possibility is that the original mouse strain that showed 5 of 13 mice being protected from Wallerian degeneration was studied on a segregating C57BL/6.129S background.

      Finally, it is important to note that previously reported effects of DR6 inhibition, such as protection of cultured cortical neurons from beta-amyloid toxicity, are not necessarily the same as Wallerian degeneration of axons distal to an injury studied here. The negative results presented here showing that loss of DR6 is not protective against Wallerian degeneration induced by injury are important given the interest in DR6 as a therapeutic target. However, care should be taken in attempting to extrapolate these results to other disease contexts such as ALS or Alzheimer's disease.

    3. Reviewer #3 (Public review):

      Summary:

      The authors revisit the role of DR6 in axon degeneration following physical injury (Wallerian degeneration), examining both its effects on axons and its role in regulating the Schwann cell response to injury. Surprisingly, and in contrast to previous studies, they find that DR6 deletion does not delay the rate of axon degeneration after injury, suggesting that DR6 is not a mediator of this process.

      Overall, this is a valuable study. As the authors note, the current literature on DR6 is inconsistent, and these results provide useful new data and clarification. This work will help other researchers interpret their own data and re-evaluate studies related to DR6 and axon degeneration.

      Strengths:

      (1) The use of two independent DR6 knockout mouse models strengthens the conclusions, particularly when reporting the absence of a phenotype.

      (2) The focus on early time points after injury addresses a key limitation of previous studies. This approach reduces the risk of missing subtle protective phenotypes and avoids confounding results with regenerating axons at later time points after axotomy.

      Comments on revisions:

      I thank the authors for their thorough responses to my previous comments. The revisions have addressed the points raised and have improved the clarity and overall quality of the manuscript. I appreciate the effort taken to strengthen the presentation of the work.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors show that genetic deletion of the orphan tumor necrosis factor receptor DR6 in mice does not protect peripheral axons against degeneration after axotomy. Similarly, Schwann cells in DR6 mutant mice react to axotomy similarly to wild-type controls. These negative results are important because previous work has indicated that loss or inhibition of DR6 is protective in disease models and also against Wallerian degeneration of axons following injury. This carefully executed counterexample is important for the field to consider.

      Strengths:

      A strength of the paper is the use of two independent mouse strains that knock out DR6 in slightly different ways. The authors confirm that DR6 mRNA is absent in these models (western blots for DR6 protein are less convincingly null, but given the absence of mRNA, this is likely an issue of antibody specificity). One of the DR6 knockout strains used is the same strain used in a previous paper examining the effects of DR6 on Wallerian degeneration.

      The authors use a series of established assays to evaluate axon degeneration, including light and electron microscopy on nerve histological samples and cultured dorsal root ganglion neurons in which axons are mechanically severed and degeneration is scored in time-lapse microscopy. These assays consistently show a lack of effect of loss of DR6 on Wallerian degeneration in both mouse strains examined.

      Therefore, in the specific context of these experiments, the authors' data support their conclusion that loss of DR6 does not protect against Wallerian degeneration.

      Weaknesses:

      (1) The major weaknesses of this paper include the tone of correcting previously erroneous results and the lack of reporting on important details around animal experiments that would help determine whether the results here really are discordant with previous studies, and if so, why.

      The authors do not report the genetic strain background of the mice used, the sex distributions of their experimental cohorts, or the age of the mice at the time the experiments were performed. All of these are important variables.

      (Response 1) We thank the reviewer for emphasizing the importance of reporting the sex, age, and genetic background of the experimental animals used in our axon protection analyses. We have incorporated this information into the revised manuscript wherever available. The sole exception concerns the genetic background of the conditional DR6 mice generated by Genentech, which remains unknown. The original publication describing these mice (Tam et al., 2012, Dev Cell, PMID 22340501) did not report this information, and we were unable to obtain it directly from Genentech. Details regarding the genetic background of the Wld<sup>S</sup> and Phr1 mutant mice are provided in their respective original publications, which are cited in our manuscript. Because the Gamage et al. study from the Deppmann laboratory did not report the sex or age of the animals used, we were unable to assess whether these variables might contribute to the differences observed between the two studies. Moreover, we are not aware of published evidence identifying sex or age as modifiers of structural axon preservation in axotomized peripheral nerve stumps in mouse models of delayed Wallerian degeneration. Furthermore, in the original publications describing the phenotypes of transgenic Nmnat2 and Wld<sup>S</sup> mice, as well as Sarm1 or Phr1 knockout mice, sex and age of the animals used in the Wallerian degeneration assays were not reported (PMIDs 23995269, 12106171, 22678360, 23665224). Although, to our knowledge, no large-scale systematic studies have been conducted, over the last 15 years we have never observed any sex-based differences in Wallerian degeneration phenotypes in these mutants exhibiting prominent axon protection. This topic was discussed informally at conferences, and we are also not aware of other investigators having observed such effects.

      In response to the reviewer’s comment regarding “tone”, we made sure that our data and interpretations are presented in a professional, balanced, and objective manner, including a detailed discussion of potential alternative explanations for the discrepant findings.

      (2) The DR6 knockout strain reported in Gamage et al. (2017) was on a C57BL/6.129S segregating background. Gamage et al. reported that loss of DR6 protected axons from Wallerian degeneration for up to 4 weeks, but importantly, only in 38.5% (5 out of 13) of the mice they examined. In the present paper, the authors speculate on possible causes for differences between the lack of effect seen here and the effects reported in Gamage et al., including possible spontaneous background mutations, epigenetic changes, genetic modifiers, neuroinflammation, and environmental differences. A likely explanation of the incomplete penetrance reported by Gamage et al. is the segregating genetic background and the presence of modifier loci between C57BL/6 and 129S. The authors do not report the genetic background of the mice used in this study, other than to note that the knockout strain was provided by the group in Gamage et al. However, if, for example, that mutation has been made congenic on C57BL/6 in the intervening years, this would be important to know. One could also argue that the results presented here are consistent with 8 out of 13 mice presented in Gamage et al.

      (Response 2) As noted above, we now provide information on the genetic background of the mice in the revised manuscript, where available. We have not backcrossed the constitutive DR6 knockout mice obtained from the Deppmann laboratory (Gamage et al.) to a C57BL/6 background; our colony was maintained primarily through intercrosses of heterozygous animals. Similarly, the conditional DR6 mutant mice used in this study were also not backcrossed to C57BL/6 mice.

      We respectfully hold a different view regarding the reviewer’s final point. We understand it is not appropriate to infer consistency between two datasets by disregarding the subset of results that do not align. By the same logic, it would be flawed to draw conclusions from the Gamage et al. study based solely on the single Wld<sup>S</sup> mouse out of five that did not show axon preservation after nerve injury. Selectively omitting conflicting data does not provide a valid basis for establishing phenotype concordance across studies.

      To further strengthen our study, we note that we performed additional analyses on three more nerve samples from constitutive DR6 null mice during the revision process and have incorporated the resulting data in Fig. 1.

      (3) Age is also an important variable. The protective effects of the spontaneous WldS mutation decrease with age, for example. It is unclear whether the possible protective effects of DR6 also change with age; perhaps this could explain the variable response seen in Gamage et al. and the lack of response seen here.

      (Response 3) As discussed above, we now provide the age information for the mice used for the Wallerian degeneration assessments in the respective figure legends. To our knowledge, there are no prior reports suggesting that age is a significant determinant of structural axon preservation in the indicated mutants. Electrophysiological function and neuromuscular junction preservation decrease with age in axotomized Wld<sup>S</sup> mice (e.g., PMIDs 12231635, 19158292, 15654865), but these parameters are not the subject of our study, and we have not studied them. Unfortunately, a direct comparison of ages between our DR6 mutant mice and those used in Gamage et al. (2017) is not possible, as the earlier study from the Deppmann laboratory did not report this information.

      (4) It is unclear if sex is a factor, but this is part of why it should be reported.

      (Response 4) We now report the requested sex information for our axon preservation analyses during nerve injury-induced Wallerian degeneration in the DR6 mouse models in Figs. 1 and 2.

      (5) The authors also state that they do not see differences in the Schwann cell response to injury in the absence of DR6 that were reported in Gamage et al., but this is not an accurate comparison. In Gamage et al., they examined Schwann cells around axons that were protected from degeneration 2 and 4 weeks post-injury. Those axons had much thinner myelin, in contrast to axons protected by WldS or loss of Sarm1, where the myelin thickness remained relatively normal. Thus, Gamage et al. concluded that the protection of axons from degeneration and the preservation of Schwann cell myelin thickness are separate processes. Here, since no axon protection was seen, the same analysis cannot be done, and we can only say that when axons degenerate, the Schwann cells respond the same whether DR6 is expressed or not.

      (Response 5) We appreciate the reviewer’s detailed comments. Our intention was not to directly compare our findings with those of Gamage et al. regarding the myelin behavior at these time points (because we never observed axon protection), but rather to note that we did not observe any DR6-dependent alterations in Schwann cell responses under conditions where axons undergo normal Wallerian degeneration. As the reviewer correctly points out, Gamage et al. analyzed Schwann cell myelin surrounding axons that were protected from degeneration for extended periods, a context fundamentally different from the complete lack of axon protection observed in our DR6-deficient models. Therefore, the specific dissociation between axon preservation and myelin maintenance claimed by Gamage et al. cannot be evaluated in our study. A statement to make this point clearer has been incorporated in the revised manuscript.

      We fully agree with the reviewer’s concluding point: in our experiments, once axons degenerate, Schwann cell responses proceed similarly regardless of DR6 expression. This agreement reinforces one of the central conclusions of our work.

      (6) The authors also take issue with Colombo et al. (2018), where it was reported that there is an increase in axon diameter and a change in the g-ratio (axon diameter to fiber diameter - the axon + myelin) in peripheral nerves in DR6 knockout mice. This change resulted in a small population of abnormally large axons that had thinner myelin than one would expect for their size. The change in g-ratio was specific to these axons and driven by the increased axon diameter, not decreased myelin thickness, although those two factors are normally loosely correlated. Here, the authors report no changes in axon size or g-ratio, but this could also be due to how the distribution of axon sizes was binned for analysis, and looking at individual data points in supplemental figure 3A, there are axons in the DR6 knockout mice that are larger than any axons in wild type. Thus, this discrepancy may be down to specifics and how statistics were performed or how histograms were binned, but it is unclear if the results presented here are dramatically at odds with the results in Colombo et al. (2018).

      (Response 6) Several points raised by the reviewer appear to reflect differences in interpretation of the findings reported in Colombo et al. (2018). That study did not report altered myelination in DR6 null mice at stages when myelination is largely complete (P21). Instead, modest changes were observed at P1, which were reduced by P7, and P21 mutants were reported to be indistinguishable from controls. No analyses of peripheral nerves in older animals were presented, and the authors concluded in the discussion that myelination in young adult DR6 null mice appears normal. In contrast, our analysis of constitutive DR6 null mice at P1 does not reproduce the increase in the number of myelinated fibers per unit area reported by Colombo et al. We obtained similar results in the independent conditional DR6 knockout mouse line. Differences in nerve tissue processing, embedding, staining, or in the microscopic imaging and quantification of thinly myelinated axons in P1 sciatic nerve cross-sections may have contributed to the observed discrepancy. However, because the relevant methodological details were not described in Colombo et al., the underlying reasons for these differences cannot be determined and remain speculative.
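The g-ratio discussed in comment (6) is the axon diameter divided by the fiber diameter (axon plus myelin). A minimal Python sketch of that definition follows, assuming the fiber diameter equals the axon diameter plus twice the myelin thickness. The function name `g_ratio` and the numeric values are hypothetical illustrations and are not taken from either study.

```python
def g_ratio(axon_diameter_um: float, myelin_thickness_um: float) -> float:
    """Return axon diameter divided by fiber diameter.

    Assumption: the fiber diameter is the axon diameter plus the myelin
    sheath on both sides of the axon (2 x myelin thickness).
    """
    if axon_diameter_um <= 0 or myelin_thickness_um < 0:
        raise ValueError("diameter must be positive and thickness non-negative")
    fiber_diameter_um = axon_diameter_um + 2.0 * myelin_thickness_um
    return axon_diameter_um / fiber_diameter_um


# Hypothetical values, chosen only to illustrate the arithmetic.
typical_axon = g_ratio(axon_diameter_um=4.0, myelin_thickness_um=1.0)
enlarged_axon = g_ratio(axon_diameter_um=8.0, myelin_thickness_um=1.0)

# With myelin thickness held constant, a larger axon yields a higher g-ratio.
print(f"typical: {typical_axon:.3f}, enlarged: {enlarged_axon:.3f}")
```

Under these assumptions, an increase in axon diameter alone raises the g-ratio even when myelin thickness is unchanged, which is the distinction the reviewer draws between enlarged axons and thinner myelin.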

      (7) Finally, it is important to note that previously reported effects of DR6 inhibition, such as protection of cultured cortical neurons from beta-amyloid toxicity, are not necessarily the same as Wallerian degeneration of axons distal to an injury studied here. The negative results presented here, showing that loss of DR6 is not protective against Wallerian degeneration induced by injury, are important given the interest in DR6 as a therapeutic target, but they are specific to these mice and this mechanism of induced axon degeneration. The extent to which these findings contradict previous work is difficult to assess due to the lack of detail in describing the mouse experiments, and care should be taken in attempting to extrapolate these results to other disease contexts, such as ALS or Alzheimer's disease.

      (Response 7) We agree with the reviewer’s point and emphasize that our manuscript carefully differentiates our data regarding the function of DR6 in Wallerian degeneration from the potential involvement of DR6 in other forms of axon degeneration. Our findings do not conflict with previous work on DR6 in the context of in vitro beta-amyloid and prion toxicity as well as in vitro models of ALS and multiple sclerosis. We believe these distinctions are explicitly and appropriately articulated throughout the entire manuscript and in more detail in the discussion section.

      Reviewer #1 (Recommendations for the authors):

      (1) The authors should include additional information about the mice used, including strain background for both the DR6 mice and the Cre transgenes crossed into the DR6 conditional knockout, the age of the mice when the nerve crush experiments were performed, and the sex distributions of the experimental cohorts. This information is critical for reproducibility in animal experiments, and that point is compounded here, where the major focus of this paper is taking issue with the reproducibility of previous work.

      (Response 8) This information has been included in the revision. See above responses.

      (2) In the abstract, reference 5 is cited as a study on the response of Schwann cells to injury in a DR6 background, but this probably should be reference 10.

      (Response 9) This typo has been corrected.

      (3) "Site-by-site comparison" in line 201 should be side-by-side?

      (Response 10) This typo has been corrected.

      (4) The paper contains a lot of self-evaluative wording, "surprising contrast," "compelling evidence," "robust results." Whether those adjectives apply should be for the reader to decide, and a drier, more objective tone in the presentation would improve the paper.

      (Response 11) We agree that excessive self-evaluative wording can weaken objectivity. In the manuscript, such phrasing is used sparingly and intentionally to highlight differences from previously published studies, guide the reader, and convey scholarly judgment. We do not consider this limited use to be counterproductive. The adjectives “surprising,” “compelling,” and “robust” each appear only one to three times across the entire manuscript, and the specific phrase “robust results” does not appear at all.

      (5) In Figure 2A, DR6-/-, there is no significant difference, but there is also a lot of variability, and one could argue the authors are seeing axon protection comparable to WldS in 40% of their samples (2/5), which is very similar to Gamage et al.

      (Response 12) We respectfully disagree with this reasoning as it relies on selectively emphasizing only a subset of the data. Please also see our response #2 for more detailed discussion.
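The two proportions debated in this exchange (2 of 5 samples here, 5 of 13 mice in Gamage et al.) can be checked with a short Python sketch. It also computes a Wilson score interval for each proportion, which is an illustrative addition and not an analysis reported by the reviewer, the authors, or Gamage et al.; the helper name `wilson_interval` and the 95% level (z = 1.96) are assumptions.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1.0 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Counts quoted in the review: 5 of 13 (Gamage et al.) and 2 of 5 (Figure 2A).
for label, k, n in [("Gamage et al.", 5, 13), ("Figure 2A", 2, 5)]:
    low, high = wilson_interval(k, n)
    print(f"{label}: {k}/{n} = {100 * k / n:.1f}% "
          f"(approx. 95% interval {100 * low:.0f}% to {100 * high:.0f}%)")
```

With counts this small, both intervals are wide, so the sketch only illustrates the uncertainty attached to such proportions and does not settle the disagreement in either direction.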

      (6) Overall, the data presented here are convincing and support the conclusions drawn, but the paper needs to focus more on the negative results at hand and less on bashing previous studies, particularly when the results presented here do not definitively show that the previous studies were incorrect and plausible explanations for differences in outcome exist.

      (Response 13) We have carefully revisited the wording of the manuscript and are confident that our emphasis remains on the central negative finding that DR6 does not regulate axon degeneration and Schwann cell injury responses during Wallerian degeneration. We do not believe the manuscript “bashes” previous studies; nonetheless, we thoroughly re-examined all relevant sections to ensure that our language is neutral, accurate, and non-inflammatory. We believe the current phrasing presents our interpretations in an appropriately balanced, objective, and professional manner.

      Reviewer #2 (Public review):

      Summary:

      This manuscript by Beirowski, Huang, and Babetto revisits the proposed role of Death Receptor 6 (DR6/Tnfrsf21) in Wallerian degeneration (WD). A prior study (Gamage et al., 2017) suggested that DR6 deletion delays axon degeneration and alters Schwann cell responses following peripheral nerve injury. Here, the authors comprehensively test this claim using two DR6 knockout mouse models (the line used in the earlier report plus a CMV-Cre derived floxed ko line) and multiple WD assays in vivo and in vitro, aligned with three positive controls: Sarm1, WldS, and Phr1/Mycbp2 mutants. Contrary to the prior findings, they find no evidence that DR6 deletion affects axon degeneration kinetics or Schwann cell dynamics (assessed by cJun expression or [intact+degenerating] myelin abundance after injury) during WD. Importantly, in DRG explant assays, neurites from DR6-deficient mice degenerated at rates indistinguishable from controls. The authors conclude that DR6 is dispensable for WD, and that previously reported protective effects may have been due to confounding factors such as genetic background or spontaneous mutations.

      Strengths:

      The authors employ two independently generated DR6 knockout models, one overlapping with the previously published study, and confirm loss of DR6 expression by qPCR and Western blotting. Multiple complementary readouts of WD are applied (structural, ultrastructural, molecular, and functional), providing a robust test of the hypothesis.

      Comparisons are drawn with established positive controls (WldS, SARM1, Phr1/Mycbp2 mutants), reinforcing the validity of the assays.

      By directly addressing an influential but inconsistent prior report, the manuscript clarifies the role of DR6 and prevents potential misdirection of therapeutic strategies aimed at modulating WD in the PNS. The discussion thoughtfully considers possible explanations for the earlier results, including colony-specific second-site mutations that could explain the incomplete penetrance of the earlier reported phenotype of only 36%.

      Weaknesses:

      (1) The study focuses on peripheral nerves. The manuscript frequently refers to CNS studies to argue for consistency with their findings. It would be more accurate to frame PNS/CNS similarities as reminiscences rather than as consistencies (e.g., line 205ff in the Discussion).

      (Response 14) Axon protection in all key genetic models of delayed axon degeneration, including Wld<sup>S</sup>, SARM1, and Phr1/Mycbp2 mutants, has been demonstrated in both the peripheral and central nervous systems. This observation supports the view that core molecular mechanisms regulating axon degeneration are conserved across neuronal populations throughout the entire nervous system. We have scrutinized the wording in our manuscript and are not aware that we frequently refer to CNS studies with regard to axon degeneration. Nevertheless, we have replaced the term “consistent” to avoid potential ambiguity when we discuss the earlier study showing normal Wallerian degeneration in the optic nerves from DR6 knockout mice.

      (2) The DRG explant assays are convincing, though the slight acceleration of degeneration in the DR6 floxed/Cre condition is intriguing (Figure 4E). Could the authors clarify whether this is statistically robust or biologically meaningful?

      (Response 15) We thank the reviewer for noting this aspect of our in vitro data in Fig. 4. The difference observed in the DR6 floxed/Cre condition is statistically significant at the 6 h time point following disconnection, as indicated by the p value shown in Fig. 4E. However, a similarly statistically significant acceleration of axon degeneration was not observed in DRG axotomy experiments using constitutive DR6 knockout preparations, although a trend toward more rapid axon breakdown is apparent at 6 h post-axotomy (Fig. 4B). These observations may suggest reduced stability of DR6-deficient axons in this specific neuron-only in vitro context. Further investigation would be required to determine the biological significance of this effect. In contrast, our in vivo quantitative analyses of the initiation and early phases of Wallerian degeneration (Fig. 2) revealed no evidence of accelerated axon disintegration in the DR6 mutant mouse models, highlighting potential differences between in vivo and in vitro systems.

      (3) In the summary (line 43), the authors refer to Hu et al. (2013) (reference 5) as the study that previously reported AxD delay and SC response alteration after injury. However, this study did not investigate the PNS, and I believe the authors intended to reference Gamage et al. (2017) (reference 10) at this point.

      (Response 16) Thanks for pointing this out. We have corrected this typo in the revised manuscript.

      (4) In line 74ff of the results section, the authors claim that developmental myelination is not altered in DR6 mutants at postnatal day 1. However, the variability in Figure S2 appears substantial, and the group size seems underpowered to support this claim. Colombo et al. (2018) (reference 11) reported accelerated myelination at P1, but this study likewise appears underpowered. Possible reasons for these discrepancies and the large variability could be that only a defined cross-sectional area was quantified, rather than the entire nerve cross-section.

      (Response 17) We confirm that the quantification of thinly myelinated axons was performed on entire sciatic nerves from P1 mouse pups, as described in the methods section in our original manuscript. The data shown in Fig. S2 were obtained from 5-9 pups per experimental group. Sample sizes were determined based on a priori power analyses using pilot data, which indicated that a minimum of five biological replicates was sufficient to detect statistically significant differences with acceptable confidence. Comparable sample sizes have been used in our previous studies and by other groups to assess early postnatal myelination (e.g., PMIDs 21949390, 28484008). Several published studies have reported analyses using 3-4 animals per group (e.g., PMIDs 28484008, 25310982, 29367382). For comparison, the study by Colombo et al. used 3-8 pups for the analysis presented in their Fig. 3. We note that the apparent variability in Fig. S2 may be accentuated by the scaling of the y-axis, which was chosen to ensure that individual data points are clearly resolved and visible.

      (5) The authors stress the data of Gamage et al. (2017) on altered SC responses in DR6 mutants after injury. They employed cJun quantification to show that SC reprogramming after injury is not altered in DR6 mutants. This approach is valid and the conclusion trustworthy. Here, the addition of data showing the combined abundance of intact and degenerated myelin does not add much insight. However, Gamage et al. (2017) reported altered myelin thickness in a subset of axons at 14 days after injury, which is considerably later than the time points analyzed in the present study. While, in the Reviewer's view, the thin myelin observed by Gamage et al. in fact resembles remyelination, the authors may wish to highlight the difference in the time points analyzed.

      (Response 18) We consider the additional quantification of the area occupied by intact myelin and myelin debris to provide complementary information that supports the c-Jun-based conclusion that Schwann cell injury responses are normal in DR6-deficient nerves following lesion. We agree with this reviewer that the thin myelin observed by Gamage et al. resembles remyelination, raising the possibility that axon regeneration occurred into the distal nerve stump at the studied 14 d post-injury time point (see their Fig. 3). This may have been interpreted as axon protection in that study. In our study, it was impossible to examine such myelin effects since axon protection was never observed in any of the DR6 mutant models at any of the time points we investigated. We have incorporated appropriate additional text to highlight this difference. See also response #5 above.

      Reviewer #3 (Public review):

      Summary:

      The authors revisit the role of DR6 in axon degeneration following physical injury (Wallerian degeneration), examining both its effects on axons and its role in regulating the Schwann cell response to injury. Surprisingly, and in contrast to previous studies, they find that DR6 deletion does not delay the rate of axon degeneration after injury, suggesting that DR6 is not a mediator of this process.

      Overall, this is a valuable study. As the authors note, the current literature on DR6 is inconsistent, and these results provide useful new data and clarification. This work will help other researchers interpret their own data and re-evaluate studies related to DR6 and axon degeneration.

      Strengths:

      (1) The use of two independent DR6 knockout mouse models strengthens the conclusions, particularly when reporting the absence of a phenotype.

      (2) The focus on early time points after injury addresses a key limitation of previous studies. This approach reduces the risk of missing subtle protective phenotypes and avoids confounding results with regenerating axons at later time points after axotomy.

      Weaknesses:

      (1) The study would benefit from including an additional experimental paradigm in which DR6 deficiency is expected to have a protective effect, to increase confidence in the experimental models, and to better contextualize the findings within different pathways of axon degeneration. For example, DR6 deletion has been shown in more than one study to be partially axon protective in the NGF deprivation model in DRGs in vitro. Incorporating such an experiment could be straightforward and would strengthen the paper, especially if some of the neuroprotective effects previously reported are confirmed.

      (Response 19) We thank the reviewer for these suggestions. We would like to highlight that our study addresses the role of DR6 in Wallerian degeneration, whereas in vitro NGF deprivation has been used to model developmental axon pruning. Previous work indicates fundamental biological differences between these regressive pathways regulating the stereotyped removal of axon segments. We feel that studying this alternative form of axon degeneration is beyond the scope of the current work and could be addressed in a separate manuscript. Although additional tests will be needed, we note that our preliminary data using samples from both DR6 knockout mouse models suggest no axon protection after NGF-deprivation in DRG neuron preparations in our hands (deprivation of the growth factor and administration of anti-NGF antibody).

      (2) The quality of some figures could be improved, particularly the EM images in Figure 2. As presented, they make it difficult to discern subtle differences.

      (Response 20) We have pseudocolored intact (turquoise) and degenerated (magenta) myelinated fibers on the high-resolution semithin micrographs (not electron micrographs) in the new Fig. 2 to make the distinction between the two fiber categories clearer.

      Reviewer #3 (Recommendations for the authors):

      (1) Line 121: The authors mention toluidine blue staining, but it does not appear to be shown in Figure S5.

      (Response 21) This appears to be a misunderstanding. Fig. S5A shows the ultrastructure of dedifferentiated Schwann cells in transmission electron micrographs, while Figs. S5B and C show quantification of the area occupied by myelin sheaths and myelin debris profiles on osmium tetroxide and toluidine blue stained nerve sections from the two DR6 mutant models, based on semithin light microscopy. These are two different aspects of the analysis. The text has been modified in the revised manuscript to make the distinction clearer.

      (2) Line 175: The authors should add NMNAT2 to the list of enzymes implicated in the regulation of Wallerian degeneration in mammals.

      (Response 22) Nmnat2 and a literature reference (Milde et al., 2013) have been incorporated into the discussion of the revised manuscript to address this point.

      (3) Line 201: Please correct the typo "site-by-site" to "side-by-side."

      (Response 23) This typo has been corrected.