  1. Aug 2024
    1. Reviewer #2 (Public Review):

      Summary:<br /> The authors aim to challenge the relevance of cell populations with characteristic selectivity for specific aspects of navigation (e.g. place cells, head direction and border cells) in the processing of spatial information. Their claim is that such cells naturally emerge in any system dealing with the estimation of position in an environment, without the need for a special involvement of these cells in the computations. In particular, the work shows how, when provided with spatial error signals, networks designed for invariant object recognition spontaneously organize the activity in their hidden layers into a mixture of spatially selective cells, some of them passing classification criteria for place, head direction or border cells. Crucially, these cells are not necessary for position decoding, nor are they the most informative when it comes to the performance of the network in reconstructing spatial position from visual scenes. These results lead the authors to claim that focusing on the classification of specific cell types is hindering rather than helping advancement in the understanding of spatial cognition. In fact, they claim that attention should instead be directed at understanding high-dimensional population coding, regardless of its direct interpretability or its appeal to human observers.

      Strengths:<br /> Methodologically the paper is consistent and convincingly supports the authors' claims regarding the role of cell types in coding for spatial aspects of cognition. It is also interesting how the authors leverage established machine learning systems to provide a sort of counter-argument to the use of such techniques to establish a parallel between artificial and biological neural representations. In the recent past, similar applications of artificial neural networks to spatial navigation have been directed at proving the importance of specific neural substrates (take for example Banino et al. 2018 for grid cells), while in this case the same procedure is used to unveil them as epiphenomena, so general and unspecific as to be of very limited use in understanding the actual functioning of the neural system. I am quite confident that this stance regarding the role of place cells and co. could gather large sympathy and support in the greater part of the neuroscience community, or at least among the majority of theoretical neuroscientists with some interest in the hippocampus and higher cognition.

      Weaknesses:<br /> My criticism of the paper can be articulated in three main points:<br /> - What about grid cells? Grid cells are notably not showing up in the analyses of the paper. But they surely can be considered as the 'mother' of all tailored spatial cells of the hippocampal formation. Are they falling outside the authors' assessment of the importance of this kind of cell? Some discussion of the place grid cells occupy in the vision of the authors would greatly help.<br /> - The network used in the paper is still guided by a spatial error signal, and the network is trained to minimize spatial decoding error. In a sense, although object classification networks are not designed for spatial navigation, one could say that the authors are in some way hacking this architecture and turning it into a spatial navigation one through learning. I wonder if their case could be strengthened by devising a version of their experiment based on some form of self-supervised or unsupervised learning.<br /> - The last point is more about my perception of the community studying hippocampal functions, rather than being directed at the merits of the paper itself. My question is whether the paper is fighting an already won battle. That is, whether the focus on the minute classification of response profiles of cells in the hippocampus is in fact already considered an 'old' approach, very useful for some initial qualitative assessments but of limited power when asked to provide deeper insight into the functioning of hippocampal computations (or computations of any other brain circuit).

    2. Reviewer #3 (Public Review):

      Summary:<br /> In this paper, the authors demonstrate the inevitability of the emergence of some degree of spatial information in sufficiently complex systems, even those that are only trained on object recognition (i.e. not "spatial" systems). As such, they present an important null hypothesis that should be taken into consideration for experimental design and data analysis of spatial tuning and its relevance for behavior.

      Strengths:<br /> The paper's strengths include the use of a large multi-layer network trained in a detailed visual environment. This illustrates an important message for the field: that spatial tuning can be a result of sensory processing. While this is a historically recognized and often-studied fact in experimental neuroscience, it is made more concrete with the use of a complex sensory network. Indeed, the manuscript is a cautionary tale for experimentalists and computational researchers alike against blindly applying and interpreting metrics without adequate controls.

      Weaknesses:<br /> However, the work has a number of significant weaknesses. Most notably: the degree and quality of spatial tuning is not analyzed to the standards of evidence historically used in studies of spatial tuning in the brain, and the authors do not critically engage with past work that studies the sensory influences of these cells; there are significant issues in the authors' interpretation of their results and its impact on neuroscientific research; the ability to linearly decode position from a large number of units is not a strong test of spatial information, nor is it a measure of spatial cognition; and the authors make strong but unjustified claims as to the implications of their results in opposition to, as opposed to contributing to, work being done in the field.

      The first weakness is that the degree and quality of spatial tuning that emerges in the network is not analyzed to the standards of evidence that have been used in studies of spatial tuning in the brain. Specifically, the authors identify place cells, head direction cells, and border cells in their network and their conjunctive combinations. However, these forms of tuning are the most easily confounded by visual responses, and it's unclear if their results will extend to forms of spatial tuning that are not. Further, in each case, previous experimental work to further elucidate the influence of sensory information on these cells has not been acknowledged or engaged with.

      For example, consider the head direction cells in Figure 3C. In addition to increased activity in some directions, these cells also have a high degree of spatial nonuniformity, suggesting they are responding to specific visual features of the environment. In contrast, the majority of HD cells in the brain are only very weakly spatially selective, if at all, once an animal's spatial occupancy is accounted for (Taube et al 1990, JNeurosci). While the preferred orientations of these cells are anchored to prominent visual cues, when they rotate with changing visual cues the entire head direction system rotates together (cells' relative orientation relationships are maintained, including those that encode directions facing AWAY from the moved cue), and thus these responses cannot simply be those of independent sensory-tuned cells responding to the sensory change (Taube et al 1990 JNeurosci, Zugaro et al 2003 JNeurosci, Ajabi et al 2023).

      As another example, the joint selectivity of detected border cells with head direction in Figure 3D suggests that they are "view of a wall from a specific angle" cells. In contrast, experimental work on border cells in the brain has demonstrated that these are robust to changes in the sensory input from the wall (e.g. van Wijngaarden et al 2020), or that many of them are not directionally selective (Solstad et al 2008).

      The most convincing evidence of "spurious" spatial tuning would be the emergence of HD-independent place cells in the network. However, these cells are a small minority (in contrast to hippocampal data, Thompson and Best 1984 JNeurosci, Rich et al 2014 Science), the examples provided in Figure 3 are significantly more weakly tuned than those observed in the brain, and the metrics used by the authors to quantify place cell tuning are not clearly defined in the methods, but do not seem to be as stringent as those commonly used in real data (e.g. spatial information, Skaggs et al 1992 NeurIPS).
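      For reference, the spatial information measure mentioned above is commonly computed as I = Σ p_i (r_i / r̄) log2(r_i / r̄), in bits per spike, where p_i is the occupancy probability of spatial bin i, r_i the mean firing rate in that bin, and r̄ the occupancy-weighted mean rate. The sketch below is only an illustration of that formula under these assumptions; the function name and inputs are hypothetical, it is not the metric used in the manuscript, and in practice the value is usually compared against a shuffled null distribution.

```python
import math


def spatial_information(occupancy, rates):
    """Skaggs-style spatial information in bits per spike.

    occupancy: time spent in each spatial bin (any consistent unit)
    rates:     mean firing rate of the unit in each bin
    Both arguments are illustrative inputs, not data from the manuscript.
    """
    total_time = sum(occupancy)
    if total_time <= 0:
        raise ValueError("occupancy must sum to a positive value")

    # Occupancy probability of each bin.
    p = [t / total_time for t in occupancy]

    # Occupancy-weighted mean firing rate.
    mean_rate = sum(pi * ri for pi, ri in zip(p, rates))
    if mean_rate <= 0:
        return 0.0  # a silent unit carries no information

    info = 0.0
    for pi, ri in zip(p, rates):
        # Bins with zero rate or zero occupancy contribute nothing.
        if pi > 0 and ri > 0:
            ratio = ri / mean_rate
            info += pi * ratio * math.log2(ratio)
    return info
```

      A unit that fires in only one of four equally visited bins yields 2 bits per spike, whereas a unit with uniform firing yields 0, which is why weakly tuned units score low on this measure.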

      Indeed, the vast majority of tuned cells in the network are conjunctively selective for HD (Figure 3A). While this conjunctive tuning has been reported, many units in the hippocampus/entorhinal system are *not* strongly HD selective (Muller et al 1994 JNeurosci, Sargolini et al 2006 Science, Carpenter et al 2023 bioRxiv). Further, many studies have been done to test and understand the nature of sensory influence (e.g. Acharya et al 2016 Cell), and these cells tend to have a complex relationship with a variety of sensory cues, which cannot readily be explained by straightforward sensory processing (rev: Poucet et al 2000 Rev Neurosci, Plitt and Giocomo 2021 Nat Neuro). E.g. while some place cells are sometimes reported to be directionally selective, this directional selectivity is dependent on behavioral context (Markus et al 1995, JNeurosci), and emerges over time with familiarity to the environment (Navratilova et al 2012 Front. Neural Circuits). Thus, the question is not whether spatially tuned cells are influenced by sensory information, but whether feed-forward sensory processing alone is sufficient to account for their observed tuning properties and responses to sensory manipulations.

      These issues indicate a more significant underlying issue of scientific methodology relating to the interpretation of their result and its impact on neuroscientific research. Specifically, in order to make strong claims about experimental data, it is not enough to show that a control (i.e. a null hypothesis) exists, one needs to demonstrate that experimental observations are quantitatively no better than that control.

      Where the authors state that "In summary, complex networks that are not spatial systems, coupled with environmental input, appear sufficient to decode spatial information." what they have really shown is that it is possible to decode *some degree* of spatial information. This is a null hypothesis (that observations of spatial tuning do not reflect a "spatial system"), and the comparison must be made to experimental data to test if the so-called "spatial" networks in the brain have more cells with more reliable spatial info than a complex-visual control.

      Further, the authors state that "Consistent with our view, we found no clear relationship between cell type distribution and spatial information in each layer. This raises the possibility that "spatial cells" do not play a pivotal role in spatial tasks as is broadly assumed." Indeed, this would raise such a possibility, if 1) the observations of their network were indeed quantitatively similar to the brain, and 2) the presence of these cells in the brain were the only evidence for their role in spatial tasks. However, 1) the authors have not shown this result in neural data, they've only noticed it in a network and mentioned the POSSIBILITY of a similar thing in the brain, and 2) the "assumption" of the role of spatially tuned cells in spatial tasks is not just from the observation of a few spatially tuned cells, but from many other experiments including causal manipulations (e.g. Robinson et al 2020 Cell, de Lavilléon et al 2015 Nat Neuro), which the authors conveniently ignore. Thus, I do not find their argument, as strongly stated as it is, to be well-supported.

      An additional weakness is that linear decoding of position is not a strong test, nor is it a measure of spatial cognition. The ability to decode position from a large number of weakly tuned cells is not surprising. However, based on this ability to decode, the authors claim that "'spatial' cells do not play a privileged role in spatial cognition". To justify this claim, the authors would need to use the network to perform e.g. spatial navigation tasks, then investigate the network's ability to perform these tasks when tuned cells were lesioned.

      Finally, I find a major weakness of the paper to be the framing of the results in opposition to, as opposed to contributing to, the study of spatially tuned cells. For example, the authors state that "If a perception system devoid of a spatial component demonstrates classically spatially-tuned unit representations, such as place, head-direction, and border cells, can "spatial cells" truly be regarded as 'spatial'?" Setting aside the issue of whether the perception system in question does indeed demonstrate spatially-tuned unit representations comparable to those in the brain, I ask "Why not?" This seems to be a semantic game of reading more into a name than is necessarily there. The names (place cells, grid cells, border cells, etc) describe an observation (that cells are observed to fire in certain areas of an animal's environment). They need not be a mechanistic claim (that space "causes" these cells to fire) or even, necessarily, a normative one (these cells are "for" spatial computation). This is evidenced by the fact that even within e.g. the place cell community, there is debate about these cells' mechanisms and function (e.g. memory, navigation, etc), or if they can even be said to serve only a single function. However, they are still referred to as place cells, not as a statement of their function but as a history-dependent label that refers to their observed correlates with experimental variables. Thus, the observation that spatially tuned cells are "inevitable derivatives of any complex system" is itself an interesting finding which *contributes to*, rather than contradicts, the study of these cells. It seems that the authors have a specific definition in mind when they say that a cell is "truly" "spatial" or that a biological or artificial neural network is a "spatial system", but this definition is not stated, and it is not clear that the terminology used in the field presupposes their definition.

      In sum, the authors have demonstrated the existence of a control/null hypothesis for observations of spatially-tuned cells. However, 1) It is not enough to show that a control (null hypothesis) exists, one needs to test if experimental observations are no better than control, in order to make strong claims about experimental data, 2) the authors do not acknowledge the work that has been done in many cases specifically to control for this null hypothesis in experimental work or to test the sensory influences on these cells, and 3) the authors do not rigorously test the degree or source of spatial tuning of their units.

    1. eLife assessment

      This important study offers convincing evidence that fmo-4 plays essential roles in established lifespan interventions and downstream of its paralog fmo-2, a beneficial advancement in our understanding of this enzyme family that underscores their importance in longevity and stress resistance. The study also suggests a connection between fmo-4 and dysregulation of calcium signalling. The authors' conclusions and interpretations were generally based on solid genetic methodology and evidence.

    2. Reviewer #1 (Public Review):<br /> Summary:<br /> This interesting and well written article by Tuckowski et al. summarizes work connecting the flavin-containing monooxygenase FMO-4 with increased lifespan through a mechanism involving calcium signaling in the nematode Caenorhabditis elegans.

      The authors have previously studied another fmo in worms, FMO-2, prompting them to look at additional members of this family of proteins. They show that fmo-4 is upregulated in dietary restricted worms and necessary for the increased lifespan of these animals as well as of rsks-1 (S6 kinase) knockdown animals. They then show that overexpression of fmo-4 is sufficient to significantly increase lifespan, as well as healthspan and paraquat resistance. Further, they demonstrate that overexpression of fmo-4 solely in the hypodermis of the animal recapitulates the entire effect of fmo-4 OE.

      In terms of interactions between fmo-2 and fmo-4 they show that fmo-4 is necessary for the previously reported effects of fmo-2 on lifespan, while the effects of fmo-4 do not depend on fmo-2.

      Next the authors use RNASeq to compare fmo-4 OE animals to wild type. Their analyses suggested the possibility that FMO-4 was modulating calcium signaling, and through additional experiments specifically identified the calcium signaling genes crt-1, itr-1, and mcu-1 as important fmo-4 interactors in this context. As previously published work has shown that loss of the worm transcription factor atf-6 can extend lifespan through crt-1, itr-1 and mcu-1, the authors asked about interactions between fmo-4 and atf-6. They showed that fmo-4 is necessary for both lifespan extension and increased paraquat resistance upon RNAi knockdown of atf-6.

      Overall this clearly written manuscript summarizes interesting and novel findings of great interest in the biology of aging and suggests promising avenues for future work in this area.

      Strengths:<br /> This paper contains a large number of careful, well executed and analysed experiments in support of its existing conclusions, and which also point toward significant future directions for this work. In addition it is clear and very well written.

      Weaknesses:<br /> Within the scope of the current work there are no major weaknesses. That said, the authors themselves note pressing questions beyond the scope of this study that remain unanswered. For instance, the mechanistic nature of the interactions between FMO-4 and the other players in this story, for example in terms of direct protein-protein interactions, is not at all understood yet. Further, powerful tools such as GCaMP expressing animals will enable a much more detailed understanding of what exactly is happening to calcium levels, and where and when it is happening, in these animals.

    3. Reviewer #2 (Public Review):

      Summary:<br /> Members of a conserved family of flavin-containing monooxygenases (FMOs) are necessary and at least partly sufficient for lifespan extension induced by diet restriction and hypoxia. Of 5 FMOs in C. elegans, fmo-2 has received the majority of attention, but this study identifies that fmo-4 is also an important, positive modulator of lifespan. Based on differential requirements of fmo-2 and fmo-4 in stress resistance and lifespan extension paradigms, the authors conclude that fmo-4 acts through mechanisms that are overlapping, but distinct from fmo-2. Ultimately, the authors place fmo-4 genetically within a pathway involving atf-6, calreticulin, the IP3 receptor, and mitochondrial calcium uniporter, which was previously shown to link ER calcium homeostasis to mitochondrial homeostasis and longevity. Because the known enzymatic activity of FMOs involves oxygenating xenobiotic and endogenous metabolites, these findings highlight a potential new link between redox/metabolic homeostasis and ER-mitochondrial calcium signaling, while revealing that different FMO family members regulate stress resistance and lifespan through distinct mechanisms.

      Strengths:<br /> The authors have used genetics to discover an interesting and unanticipated new link between conserved FMOs and ER calcium pathways known to regulate lifespan.

      The genetic epistasis patterns for lifespan and stress resistance phenotypes are generally clean and compelling.

      Weaknesses:<br /> The effects of carbachol and EDTA on intracellular calcium levels are inferred, especially in the tissues where fmo-4 is acting. Validating that these agents and fmo-4 itself have an impact on calcium in relevant subcellular compartments is important to support conclusions on how fmo-4 regulates and responds to calcium.

      Experiments are generally reliant on RNAi. While in most cases experiments reveal positive results, indicating RNAi efficacy, key conclusions could be strengthened with the incorporation of mutants.

      While FMO-4 is clearly placed in the ER calcium pathway genetically, a putative molecular mechanism by which FMO-4 would alter ER calcium remains unclear. Notably, Tuckowski et al. highlight this gap in the discussion as well.

    4. Reviewer #3 (Public Review):

      Summary:<br /> The authors assessed the potential involvement of fmo-4 in a diverse set of longevity interventions, showing that this gene is required for DR and S6 kinase knockdown related lifespan extension. Using comprehensive epistasis experiments they find this gene to be a required downstream player in the longevity and stress resistance provided by fmo-2 overexpression. They further showed that fmo-4 ubiquitous overexpression is sufficient to provide longevity and paraquat (mitochondrial) stress resistance, and that overexpression specifically in the hypodermis is sufficient to recapitulate most of these effects.

      Interestingly, they find that fmo-4 overexpression sensitizes worms to thapsigargin during development, an effect that they link with a potential dysregulation in calcium signalling. They go on to show that fmo-4 expression is sensitive to drugs that either increase or decrease calcium levels, and that these drugs differentially affect the lifespan of fmo-4 mutants compared to wild-type worms. Similarly, knockdown of genes involved in calcium binding and signalling also differentially affects lifespan and paraquat resistance of fmo-4 mutants.

      Finally, they suggest that atf-6 limits the expression of fmo-4, and that fmo-4 is also acting downstream of benefits produced by atf-6 knockdown.

      Strengths:<br /> • comprehensive lifespan experiments: clear placement of fmo-4 within established longevity interventions.<br /> • clear distinction in functions and epistatic interactions between fmo-2 and fmo-4, which lays a strong foundation for a longevity pathway regulated by this enzyme family.

      Weaknesses:<br /> • no obvious transcriptomic evidence supporting a link between fmo-4 and calcium signalling: either for knockout worms or fmo-4 overexpressing strains.<br /> • no direct measures of alterations in calcium flux, signalling or binding that strongly support a connection with fmo-4.<br /> • no measures of mitochondrial morphology or activity that strongly support a connection with fmo-4.<br /> • lack of a complete model that places fmo-4 function downstream of DR and mTOR signalling (first Results section), fmo-2 (second Results section) and at the same time explains connection with calcium signalling.

    1. eLife assessment

      The study by Kleinman and Foster identifies a role for VTA dopamine signaling in modulating hippocampal replay and sharp-wave ripples, specifically highlighting how VTA inactivation leads to aberrant replay activities in scenarios without reward changes and during exposure to novel environments. This valuable work contributes to our understanding of the neurobiological mechanisms underlying spatial memory and learning, suggesting that dopamine plays a pivotal role in linking reward context and novelty to memory consolidation processes. However, the evidence as currently presented is incomplete. More rigorous statistical reporting and histological verification of the experimental approach, and a more consistent approach to experimental dosing and timing, which are crucial for confirming the reproducibility and reliability of the observed effects, are needed.

    2. Reviewer #1 (Public Review):

      This manuscript by Kleinman & Foster investigates the dependence of hippocampal replay on VTA activity. They recorded neural activity from the dorsal CA1 region of the hippocampus while chemogenetically silencing VTA dopamine neurons as rats completed laps on a linear track with reward delivery at each end. Reward amount changed across task epochs within a session on one end of the track. The authors report that VTA activity is necessary for an increase in sharp-wave rate to remain localized to the feeder that undergoes a change in reward magnitude, an effect that was especially pronounced in a novel environment. They follow up on this result with a second experiment in which reward magnitude varies unpredictably at one end of the linear track and report that changes in sharp-wave rate at the variable location reflect both the amount of reward rats just received there and a smaller modulation that is reminiscent of reward prediction error coding, in which the previous reward rats received at the variable location affects the magnitude of the subsequent change in sharp-wave rate that occurs on the present visit.

      This work is technically innovative, combining neural recordings with chemogenetic inactivation. The question of how VTA activity affects replay in the hippocampus is interesting and important given that much of the work implicating hippocampal replay in memory consolidation and planning comes from reward-motivated behavioral tasks. Enthusiasm for the manuscript is dampened by some technical considerations about the chemogenetic portion of the experiments. Additionally, there are some interpretational issues related to whether changes in reward magnitude affected sharp-wave rate directly, or whether the reported changes in sharp-wave rate alter behavior and these behavioral changes affect sharp-wave rate.

      Major issues:

      Chemogenetics validation

      Little validation is provided for the chemogenetic manipulations. The authors report that animals were excluded due to lack of expression but do not quantify/document the extent of expression in the animals that were included in the study. There's no independent verification that VTA was actually inhibited by the chemogenetic manipulation besides the experimental effects of interest.

      The authors report a range of CNO doses. What determined the dose that each rat received? Was it constant for an individual rat? If not, how was the dose determined? The authors may wish to examine whether any of their CNO effects were dependent on dose.

      The authors tested the same animal multiple times per day with relatively little time between recording sessions. Can they be certain that the effect of CNO wore off between sessions? Might successive CNO injections in the same day have impacted neural activity in the VTA differently? Could the chemogenetic manipulation have grown stronger with each successive injection (or maybe weaker due to something like receptor desensitization)? The authors could test statistically whether the effects of CNO that they report do not depend on the number of CNO injections a rat received over a short period of time.

      Motivational considerations

      In a similar vein, running multiple sessions per day raises the possibility that rats' motivation was not constant across all data collection time points. The authors could test whether any measures of motivation (laps completed, running speed) changed across the sessions conducted within the same day. This is a particularly tricky issue, because my read of the methods is that saline sessions were only conducted as the first session of any recording day, which means there's a session order/time of day and potential motivational confound in comparing saline to CNO sessions.

      Statistics, statistical power, and effect sizes

      Throughout the manuscript, the authors employ a mixture of t-tests, ANOVAs, and mixed-effects models. Only the mixed effects models appropriately account for the fact that all of this data involves repeated measurements from the same subject. The t-tests are frequently doubly inappropriate because they both treat repeated measures as independent and are not corrected for multiple comparisons.
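      As an illustration of the multiple-comparisons point only, the sketch below applies a Holm-Bonferroni step-down adjustment to a set of p-values. It is a minimal example with a hypothetical function name and made-up p-values; it does not address the separate repeated-measures problem, for which a mixed-effects model with a per-animal random effect remains the appropriate tool.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order.

    p_values: raw p-values from a family of tests (illustrative input).
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted
```

      With three hypothetical tests at p = 0.01, 0.04 and 0.03, the adjusted values are 0.03, 0.06 and 0.06, so only the first would survive a 0.05 threshold.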

      The number of animals in these studies is on the lower end for this sort of work, raising questions about whether all of these results are statistically reliable and likely to generalize. This is particularly pronounced in the reward volatility experiment, where the number of rats in the experimental group is halved to just two. The results of this experiment are potentially very exciting, but the sample size makes this feel more like pilot data than a finished product.

      The effect sizes of the various manipulations appear to be relatively modest, and I wonder if the authors could help readers by contextualizing the magnitude of these results further. For instance, when VTA inactivation increases mis-localization of SWRs to the unchanged end of the track, roughly how many misplaced sharp-waves are occurring within a session, and what would their consequence be? On this particular behavioral task, it's not clear that the animals are doing worse in any way despite the mislocalization of sharp-waves. And it seems like the absolute number of extra sharp-waves that occur in some of these conditions would be quite small over the course of a session, so it would be helpful if the authors could speculate on how these differences might translate to meaningful changes in processes like consolidation, for instance.

      How directly is reward affecting sharp-wave rate?

      Changes in reward magnitude on the authors' task cause rats to reallocate how much time they spent at each end. Coincident with this behavioral change, the authors identify changes in the sharp-wave rate, and the assumption is that changing reward is altering the sharp-wave rate. But it also seems possible that by inducing longer pauses, increased reward magnitude is affecting the hippocampal network state and creating an occasion for more sharp-waves to occur. It's possible that any manipulation so altering rats' behavior would similarly affect the sharp-wave rate.

      For instance, in the volatility experiment, on trials when no reward is given sharp-wave rate looks like it is effectively zero. But this rate is somewhat hard to interpret. If rats hardly stopped moving on trials when no reward was given, and the hippocampus remained in a strong theta network state for the full duration of the rat's visit to the feeder, the lack of sharp-waves might not reflect something about reward processing so much as the fact that the rat's hippocampus didn't have the occasion to emit a sharp-wave. A better way to compute the sharp-wave rate might be to use not the entire visit duration in the denominator, but rather the total amount of time the hippocampus spends in a non-theta state during each visit. Another approach might be to include visit duration as a covariate with reward magnitude in some of the analyses. Increasing reward magnitude seems to increase visit duration, but these probably aren't perfectly correlated, so the authors might gain some leverage by showing that on the rare long visit to a low-reward end sharp-wave rate remains reliably low. This would help exclude the explanation that sharp-wave rate follows increases in reward magnitude simply because longer pauses allow a greater opportunity for the hippocampus to settle into a non-theta state.

      The authors seem to acknowledge this issue to some extent, as a few analyses have the moments just after the rat's arrival at a feeder and just before departure trimmed out of consideration. But that assumes these sorts of non-theta states are only occurring at the very beginning and very end of visits when in fact rats might be doing all sorts of other things during visits that could affect the hippocampus network state and the propensity to observe sharp-waves.
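      The alternative denominator suggested above can be sketched as follows. This is a hypothetical illustration, not the authors' pipeline: it assumes sharp-wave times and non-overlapping non-theta intervals (in seconds) have already been detected by some upstream procedure, and all names are invented for the example.

```python
def event_rate(n_events, duration_s):
    """Events per second, or None when there was no time in which to observe events."""
    if duration_s <= 0:
        return None
    return n_events / duration_s


def visit_swr_rates(swr_times, visit_start, visit_end, non_theta_intervals):
    """Compare two sharp-wave rate definitions for a single feeder visit.

    swr_times:           detected sharp-wave times in seconds (assumed input)
    non_theta_intervals: non-overlapping (start, end) pairs in seconds (assumed input)
    Returns (rate over the whole visit, rate over non-theta time only).
    """
    # Clip each non-theta interval to the visit window.
    clipped = []
    for start, end in non_theta_intervals:
        s, e = max(start, visit_start), min(end, visit_end)
        if e > s:
            clipped.append((s, e))

    visit_duration = visit_end - visit_start
    non_theta_duration = sum(e - s for s, e in clipped)

    # Count events within the visit and within non-theta periods of the visit.
    n_visit = sum(1 for t in swr_times if visit_start <= t < visit_end)
    n_non_theta = sum(1 for t in swr_times if any(s <= t < e for s, e in clipped))

    return event_rate(n_visit, visit_duration), event_rate(n_non_theta, non_theta_duration)
```

      Under the second definition, a visit during which the hippocampus never leaves theta yields an undefined rate (None) rather than a rate of zero, which separates "no opportunity for sharp-waves" from "opportunity but no sharp-waves".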

      Minor issues

      The title/abstract should reflect that only male animals were used in this study.

      The title refers to hippocampal replay, but for much of the paper the authors are measuring sharp-wave rate and not replay directly, so I would favor a more nuanced title.

      Relatedly, the interpretation of the mislocalization of sharp-waves following VTA inactivation suggests that the hippocampus is perhaps representing information inappropriately/incorrectly for consolidation, as the increased rate is observed both for a location that has undergone a change in reward and one that has not. However, the authors are measuring replay rate, not replay content. It's entirely possible that the "mislocalized" replays at the unchanged end are, in fact, replaying information about the changed end of the track. A bit more nuance in the discussion of this effect would be helpful.

      The authors use decoding accuracy during movement to determine which sessions should be included for decoding of replay direction. Details on cross-validation are omitted and would be appreciated. Also, the authors assume that sessions failed to meet inclusion criteria because of ensemble size, but this information is not reported anywhere directly. More info on the ensemble size of included/excluded sessions would be helpful.

      For most of the paper, the authors detect sharp-waves using ripple power in the LFP, but for the analysis of replay direction, they use a different detection procedure based on the population firing rate of recorded neurons. Was there a reason for this switch? It's somewhat difficult to compare reported sharp-wave/replay rates of the analyses given that different approaches were used.

    3. Reviewer #2 (Public Review):

      (1) Summary<br /> Kleinman and Foster's study investigates the role of dopamine signaling in the ventral tegmental area (VTA) on hippocampal replay and sharp-wave ripples (SWR) in rats exposed to changes in reward magnitude and environmental novelty. The authors utilize chemogenetic silencing techniques to modulate dopamine neuron activity in the VTA while conducting simultaneous electrophysiological recordings from the hippocampal CA1 region. Their findings suggest that VTA dopamine signaling is critical for modulating hippocampal replay in response to changes in reward context and novelty, with specific disruptions observed in replay dynamics when VTA is inhibited, particularly in novel environments.

      (2) Strengths<br /> The research addresses a significant gap in our understanding of the neurobiological underpinnings of memory and spatial learning, highlighting the importance of dopamine-mediated processes. The methodological approach is robust, combining chemogenetic silencing with precise electrophysiological measurements, which allows for a detailed examination of the neural circuits involved. The study provides important insights into how hippocampal replay and SWR are influenced by reward prediction errors, as well as the role of dopamine in these processes. Specifically, the authors note that VTA silencing unexpectedly did not prevent increases in ripple activities where reward was increased, but induced significant aberrant increases in environments where reward levels were unchanged, highlighting a novel dependency of hippocampal replay on dopamine and a VTA-independent reward prediction error signal in familiar environments. These findings are critical for understanding the consolidation of episodic memory and the neural basis of learning.

      (3) Weaknesses<br /> Despite the strengths in methodology and conceptual framework, the study has several weaknesses that could affect the interpretation of the results. There is a need for more rigorous histological validation to confirm the extent and specificity of viral expression and electrode placements, which is crucial for ensuring the accuracy of the findings. Variability in the dosing and timing of chemogenetic interventions could also lead to inconsistencies in the data, suggesting a need for more standardized experimental protocols.

    4. Reviewer #3 (Public Review):

      Summary:<br /> The authors of this work are trying to understand the role dopaminergic terminals coming from VTA have on hippocampal mechanisms of memory consolidation, with emphasis on the replay of hippocampal patterns of activity during periods of consummatory behavior in reward locations. Previous work suggested that replay of relevant spatial trajectories supports reward localization and influences behavior.

      The authors then tried to separate two conditions that were known to cause an increase in replay activity - spatial novelty encoding and variation of reward magnitude - and evaluate how these changed when VTA dopamine neurons were inactivated by a chemogenetic tool. They found that the rate of reverse replay (trajectory going away from the goal location) is increased with reward only in novel, but not in familiar environments. Overall this suggests that the VTA dopamine signal is critical during learning of novel locations, but not during explorations of already familiar environments.

      Strengths:<br /> The inactivation of VTA projections during goal-oriented behavior and in-vivo analysis of patterns of hippocampal activity during both novelty and reward variability. This work also adds to the body of evidence that reverse replay constitutes an important mechanism in learning spatial goal locations. It also points to the role of VTA in reward prediction errors with consequences for spatial navigation.

      Weaknesses:<br /> It remains to be determined whether novelty and larger rewards are associated with longer ripple duration, not just rate, and larger content/trajectories of replay sequences as previously described (Fernández-Ruiz, 2019), and whether the dopamine signal from the VTA has a role in this.

    1. eLife assessment

      This useful study reports a reanalysis of one experiment of a previously published report to characterize the dynamics of neural population codes during visual working memory in the presence of distracting information. The evidence supporting the claims of dynamic codes is incomplete, as only a subset of the original data is analyzed, there is only modest evidence for dynamic coding in the results, and the result might be affected by the signal-to-noise ratio. This research will be of interest to cognitive neuroscientists working on the neural bases of visual perception and memory.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study, the authors re-analyzed Experiment 1 of a public dataset (Rademaker et al, 2019, Nature Neuroscience) which includes fMRI and behavioral data recorded while participants held an oriented grating in visual working memory (WM) and performed a delayed recall task at the end of an extended delay period. In that experiment, participants were pre-cued on each trial as to whether there would be a distracting visual stimulus presented during the delay period (filtered noise or randomly oriented grating). In this manuscript, the authors focused on identifying whether the neural code in the retinotopic cortex for remembered orientation was 'stable' over the delay period, such that the format of the code remained the same, or whether the code was dynamic, such that information was present, but encoded in an alternative format. They identify some time points - especially towards the beginning/end of the delay - where the multivariate activation pattern fails to generalize to other time points and interpret this as evidence for a dynamic code. Additionally, the authors compare the representational format of remembered orientation in the presence vs absence of a distracting stimulus, averaged over the delay period. This analysis suggested a 'rotation' of the representational subspace between distracting orientations and remembered orientations, which may help preserve simultaneous representations of both remembered and viewed stimuli.

      Strengths:

      (1) Direct comparisons of coding subspaces/manifolds between time points and task conditions is an innovative and useful approach for understanding how neural representations are transformed to support cognition.

      (2) Re-use of existing datasets substantially goes beyond the authors' previous findings by comparing the geometry of representational spaces between conditions and time points, and by looking explicitly for dynamic neural representations.

      Weaknesses:

      (1) Only Experiment 1 of Rademaker et al (2019) is reanalyzed. The previous study included another experiment (Expt 2) using different types of distractors which did result in distractor-related costs to neural and behavioral measures of working memory. The Rademaker et al (2019) study uses these two results to conclude that neural WM representations are protected from distraction when distraction does not impact behavior, but conditions that do impact behavior also impact neural WM representations. Because this previous result is critical for relating the present manuscript's results to the previous findings, it seems necessary to address Experiment 2's data in the present work.

      (2) Primary evidence for 'dynamic coding', especially in the early visual cortex, appears to be related to the transition between encoding/maintenance and maintenance/recall, but the delay period representations seem overall stable, consistent with previous findings.

      (3) Dynamicism index used in Figure 1f quantifies the proportion of off-diagonal cells with significant differences in decoding performance from the diagonal cell. It's unclear why the proportion of time points is the best metric, rather than something like a change in decoding accuracy. This is addressed in the subsequent analysis considering coding subspaces, but the utility of the Figure 1f analysis remains weakly justified.

      (4) There is no report of how much total variance is explained by the two PCs defining the subspaces of interest in each condition, and timepoint. It could be the case that the first two principal components in one condition (e.g., sensory distractor) explain less variance than the first two principal components of another condition.

      (5) Converting a continuous decoding metric (angular error) to "% decoding accuracy" serves to obfuscate the units of the actual results. Decoding precision (e.g., sd of decoding error histogram) would be more interpretable and better related to both the previous study and behavioral measures of WM performance.

      (6) This report does not make use of behavioral performance data in the Rademaker et al (2019) dataset.

      (7) Given there were observed differences between individual retinotopic ROIs in the temporal cross-decoding analyses shown in Figure 1, the lack of data presented for the subspace analyses for the corresponding individual ROIs is a weakness.
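
      Points (4) and (5) above ask for two quantities that are straightforward to compute. The following is a minimal sketch, not the authors' pipeline: the function names are hypothetical, the data layout (trials by voxels) is an assumption, and the precision measure assumes that orientation errors are periodic over 180 degrees, so the errors are doubled before the circular statistics are applied and the result is halved afterwards.

```python
import numpy as np


def variance_explained(X, n_components=2):
    """Fraction of total variance captured by the first n_components PCs.

    X is assumed to be a (samples x features) array, e.g. trials x voxels.
    """
    Xc = X - X.mean(axis=0, keepdims=True)      # center each feature
    s = np.linalg.svd(Xc, compute_uv=False)     # singular values, descending
    var = s ** 2                                # proportional to PC variances
    return var[:n_components].sum() / var.sum()


def orientation_error_sd_deg(errors_deg):
    """Circular SD (in degrees) of orientation decoding errors.

    Errors are assumed to live on a 180-degree orientation space: they are
    doubled to map onto a full circle, and the resulting SD is halved.
    """
    doubled = np.radians(2.0 * np.asarray(errors_deg, dtype=float))
    # Mean resultant length; capped at 1 to guard against rounding error.
    R = min(np.abs(np.mean(np.exp(1j * doubled))), 1.0)
    return np.degrees(np.sqrt(-2.0 * np.log(R))) / 2.0


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))                  # hypothetical trials x voxels
print(variance_explained(X, n_components=2))
print(orientation_error_sd_deg([-5.0, 5.0, -5.0, 5.0]))
```

      Reporting the first quantity per condition and time point would show whether the two-dimensional subspaces are comparably representative, and the second keeps the decoding result in degrees rather than in a derived percentage.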

    3. Reviewer #2 (Public Review):

      Summary:

      In this work, Degutis and colleagues addressed an interesting issue related to the concurrent coding of sensory percepts and visual working memory contents in visual cortices. They used generalization analyses to test whether working memory representations change over time, diverge from sensory percepts, and vary across distraction conditions. Temporal generalization analysis demonstrated that off-diagonal decoding accuracies were lower than on-diagonal decoding accuracies, regardless of the presence of intervening distractions, implying that working memory representations can change over time. They further showed that the coding space for working memory contents showed subtle but statistically significant changes over time, potentially explaining the impaired off-diagonal decoding performance. The neural coding of sensory distractions instead remained largely stable. Generalization analyses between target and distractor codes showed overlaps but were not identical. Cross-condition decodings had lower accuracies compared to within-condition decodings. Finally, within-condition decoding revealed more reliable working memory representations in the condition with intervening random noises compared to cross-condition decoding using a trained classifier on data from the no-distraction condition, indicating a change in the VWM format between the noise distractor and no-distractor trials.

      Strengths:

      This paper demonstrates a clever use of generalization analysis to show changes in the neural codes of working memory contents across time and distraction conditions. It provides some insights into the differences between representations of working memory and sensory percepts, and how they can potentially coexist in overlapping brain regions.

      Weaknesses:

      (1) An alternative interpretation of the temporal dynamic pattern is that working memory representations become less reliable over time. As shown by the authors in Figure 1c and Figure 4a, the on-diagonal decoding accuracy generally decreased over time. This implies that the signal-to-noise ratio was decreasing over time. Classifiers trained with data of relatively higher SNR and lower SNR may rely on different features, leading to poor generalization performance. This issue should be addressed in the paper.

      (2) The paper tests against a strong version of stable coding, where neural spaces representing WM contents must remain identical over time. In this version, any changes in the neural space will be evidence of dynamic coding. As the paper acknowledges, there is already ample evidence arguing against this possibility. However, the evidence provided here (dynamic coding cluster, angle between coding spaces) is not as strong as what prior studies have shown for meaningful transformations in neural coding. For instance, the principal angle between coding spaces over time was smaller than 8 degrees, and around 7 degrees between sensory distractors and WM contents. This suggests that the coding space for WM was largely overlapping across time and with that for sensory distractors. Therefore, the major conclusion that working memory contents are dynamically coded is not well-supported by the presented results.

      (3) Relatedly, the main conclusions, such as "VWM code in several visual regions did not generalize well between different time points" and "VWM and feature-matching sensory distractors are encoded in separable coding spaces" are somewhat subjective given that cross-condition generalization analyses consistently showed above chance-level performance. These results could be interpreted as evidence of stable coding. The authors should use more objective descriptions, such as 'temporal generalization decoding showed reduced decoding accuracy in off-diagonals compared to on-diagonals'.
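
      For readers unfamiliar with the principal-angle measure discussed in point (2), the following minimal sketch shows one standard way to compute it and what an angle of about 8 degrees looks like. It is an illustration under stated assumptions, not the authors' implementation: the function name is hypothetical and the two subspaces are toy planes in three dimensions.

```python
import numpy as np


def principal_angles_deg(A, B):
    """Principal angles (degrees, ascending) between the column spaces of A and B."""
    Qa, _ = np.linalg.qr(A)                     # orthonormal basis for span(A)
    Qb, _ = np.linalg.qr(B)                     # orthonormal basis for span(B)
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.degrees(np.arccos(np.clip(cosines, -1.0, 1.0)))


# Two toy planes sharing one axis, with the second axis tilted by 8 degrees.
t = np.radians(8.0)
plane_a = np.array([[1.0, 0.0],
                    [0.0, 1.0],
                    [0.0, 0.0]])
plane_b = np.array([[1.0, 0.0],
                    [0.0, np.cos(t)],
                    [0.0, np.sin(t)]])

print(principal_angles_deg(plane_a, plane_b))
```

      Because the cosine of 8 degrees is about 0.99, a vector in one of these planes projects almost entirely onto the other, which is the sense in which such coding spaces can be described as largely overlapping.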

    1. eLife assessment

      This study provides an in-depth exploration of the impact of X-linked ZDHHC9 gene mutations on cognitive deficits and epilepsy, with a particular focus on the expression and function of ZDHHC9 in myelin-forming oligodendrocytes (OLs). These valuable findings offer insights into ZDHHC9-related X-linked intellectual disability (XLID) and shed light on the regulatory mechanisms of palmitoylation in myelination. The experimental design and analysis of results are solid, providing a reference for further research in this field.

    2. Reviewer #1 (Public Review):

      In this work, Jeong and colleagues explore the role of the acyltransferase ZDHHC9 in myelinating OLs, in particular in the palmitoylation of several myelin proteins. After confirming the specific enrichment of the Zdhhc9 transcript in mouse and human OLs, the authors examine the subcellular localization of the protein in vitro and observe that, in comparison with other isoforms, ZDHHC9 localizes to OL cell bodies and to discrete puncta in the processes. These observations (Figures 1 and 2) led the authors to hypothesize that ZDHHC9 plays an important role in myelination. No gross changes were detected in OL development in Zdhhc9 KO mice, and analyses from P28 Zdhhc9 KO mice crossed with Mobp-EGFP reporter mice did not show changes in EGFP+ OL differentiation (Figure 3). However, given the observed subcellular localization of ZDHHC9 in OL processes (Figure 2) and the observation that the percentage of unmyelinated axons is increased in Zdhhc9 KO (Figure 6), early time points need to be considered to examine the differentiated pools of OLs and their capacity to extend processes and contact axons.

      Maturation of OLs in Zdhhc9 KO was examined by crossing Zdhhc9 KO with Pdgfra-CreER; R26-EGFP and tracking the newly EGFP-labelled OPCs after tamoxifen administration. No changes in the numbers of EGFP+ OLs were detected. The authors concluded that the loss of ZDHHC9 does not alter oligodendrogenesis in either the young or mature CNS. The authors observed defects in Zdhhc9 KO OL protrusions that they attributed to abnormal OL membrane expansion (Fig 4 and 5). Can they show evidence for this?

      The authors report that Zdhhc9 KO primary and secondary branches in OLs were longer, some contained spheroid-like swellings, and the OL protrusion complexity was higher. However, these data are partially contradictory to what they show in OL differentiation experiments in vitro (Fig 7). There is also no evidence for increased membrane expansion in Zdhhc9 knockdown myelin-forming cells in culture. How can this be reconciled?

    3. Reviewer #2 (Public Review):

      This study provides an in-depth exploration of the impact of X-linked ZDHHC9 gene mutations on cognitive deficits and epilepsy, with a particular focus on the expression and function of ZDHHC9 in myelin-forming oligodendrocytes (OLs). These findings offer crucial insights into understanding ZDHHC9-related X-linked intellectual disability (XLID) and shed light on the regulatory mechanisms of palmitoylation in myelination. The experimental design and analysis of results are convincing, providing a valuable reference for further research in this field. However, upon careful review, I believe the article still needs further improvement and supplementation in the following aspects:

      (1) The subcellular localization experiment with ZDHHC9 mutants is currently limited to OLs cultured in vitro and lacks validation in OLs or myelin sheaths in vivo. Additionally, it is necessary to investigate whether the abnormal subcellular localization of ZDHHC9 mutants affects their enzyme activity and the palmitoylation of substrate proteins.

      (2) The experimental period (P21+21 days) using genetic labeling to track the development of myelinating cells may not be long enough. It is recommended to extend the observation time and analyze at more time points to more comprehensively reflect the impact of Zdhhc9 KO.

      (3) The author speculates that Zdhhc9 may regulate myelination by affecting the membrane localization of specific myelin proteins, but lacks direct experimental evidence to support this. It is suggested to detect the expression and distribution of relevant proteins in the myelin of Zdhhc9 KO mice.

      (4) Although the article mentions the association of Zdhhc9 with intellectual disabilities, it does not involve behavioral analysis of Zdhhc9 KO mice. It is recommended to supplement some behavioral experimental data to support the important role of Zdhhc9 in maintaining normal cognitive function, enhancing the clinical relevance of the article.

      (5) For the abnormal myelination observed in Zdhhc9 KO mice, including unmyelinated large-diameter axons and excessively myelinated small-diameter axons, the article lacks in-depth research and explanation on the exact mechanism and mode of action of ZDHHC9 in regulating myelination.

      (6) The function of ZDHHC9 in OL may be related to the Golgi apparatus, but its exact role in these structures is still unclear. It is suggested to discuss in more detail the role of ZDHHC9 in the Golgi apparatus in the discussion section.

      (7) More experimental support and in-depth research are needed on the detailed mechanism of how ZDHHC9 and Golga7 cooperatively regulate MBP palmitoylation, and how this decrease in palmitoylation level leads to myelination defects.

      In summary, it is recommended that the authors address the above issues through additional experiments and improved discussions to further strengthen the credibility and clinical relevance of the article.

    1. eLife assessment

      This valuable study investigates the development of high-level visual responses in infants, finding that neural responses specific to faces are present by 4-6 months, and those to other object categories later. The study is methodologically solid, using state-of-the-art experimental design and analysis approaches. The findings should be of interest to researchers in the fields of cognitive psychology and neuroscience.

    2. Reviewer #1 (Public Review):

      Summary:

      In the paper, Yan and her colleagues investigate at which stage of development different categorical signals can be detected with EEG using a steady-state visual evoked potential paradigm. The study reports the development trajectory of selective responses to five categories (i.e., faces, limbs, corridors, characters, and cars) over the first 1.5 years of life. It reveals that while responses to faces show significant early development, responses to other categories (i.e., characters and limbs) develop more gradually and emerge later in infancy. The paper is well-written and enjoyable, and the content is well-motivated and solid.

      Strengths:

      (1) This study contains a rich dataset with a substantial amount of effort. It covers a large sample of infants across ages (N=45) and asks an interesting question about when visual category representations emerge during the first year of life.

      (2) The chosen category stimuli are appropriate and well-controlled. These categories are classic and important for situating the study within a well-established theoretical framework.

      (3) The brain measurements are solid. Visual periodicity allows for the dissociation of selective responses to image categories within the same rapid image stream, which appears at different intervals. This is important for the infant field, as it provides a robust measure of ERPs with good interpretability.

      Weaknesses:

      The study would benefit from a more detailed explanation of analysis choices, limitations, and broader interpretations of the findings. This includes:<br /> a) improving the treatment of bias from specific categories (e.g., faces) towards others;<br /> b) justifying the specific experimental and data analysis choices;<br /> c) expanding the interpretation and discussion of the results.

      I believe that giving more attention to these aspects would improve the study and contribute positively to the field.

    3. Reviewer #2 (Public Review):

      Summary:

      The current work investigates the neural signature of category representation in infancy. Neural responses during steady-state visually-evoked potentials (ssVEPs) were recorded in four age groups of infants between 3 and 15 months. Stimuli (i.e., faces, limbs, corridors, characters, and cars) were presented at 4.286 Hz with category changes occurring at a frequency of 0.857 Hz. The results of the category frequency analyses showed that reliable responses to faces emerge around 4-6 months, whereas responses to limbs, corridors, and characters emerge at around 6-8 months. Additionally, the authors trained a classifier for each category to assess how consistent the responses were across participants (leave-one-out approach). Spatiotemporal responses to faces were more consistent than the responses to the remaining categories and increased with increasing age. Faces showed an advantage over other categories in two additional measures (i.e., representation similarity and distinctiveness). Together, these results suggest a different developmental timing of category representation.
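
      The relationship between the two stimulation frequencies quoted above can be checked with simple arithmetic. This is a sketch using only the two reported rates; the variable names are hypothetical, and reading the ratio as "one category event per five images" is an inference from those rates rather than a detail stated in this summary.

```python
image_rate_hz = 4.286      # image presentation rate reported in the summary
category_rate_hz = 0.857   # category rate reported in the summary

# How many images are shown per category cycle (ratio of the two rates).
images_per_category_cycle = round(image_rate_hz / category_rate_hz)

# Duration of a single image and of one full category cycle.
image_duration_ms = 1000.0 / image_rate_hz
category_period_s = 1.0 / category_rate_hz

print(images_per_category_cycle, image_duration_ms, category_period_s)
```

      Under this reading, each image is on screen for roughly 233 ms and the category signal repeats about every 1.17 s, which is why the two responses can be separated in the frequency domain.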

      Strengths:

      The study design is well organized. The authors described and performed analyses on several measures of neural categorization, including innovative approaches to assess the organization of neural responses. Results are in support of one of the two main hypotheses on the development of category representation described in the introduction. Specifically, the results suggest a different timing in the formation of category representations, with earlier and more robust responses emerging for faces over the remaining categories. Graphic representations and figures are very useful when reading the results.

      Weaknesses:

      The role of the adult dataset in the goal of the current work is unclear. All results are reported in the supplementary materials and minimally discussed in the main text. The unique contribution of the results of the adult samples is unclear and may be superfluous.

      It would be useful to report the electrodes included in the analyses and how they have been selected.

    4. Reviewer #3 (Public Review):

      Yan et al. present an EEG study of category-specific visual responses in infancy from 3 to 15 months of age. In their experiment, infants viewed visually controlled images of faces and several non-face categories in a steady state evoked potential paradigm. The authors find visual responses at all ages, but face responses only at 4-6 months and older, and other category-selective responses at later ages. They find that spatiotemporal patterns of response can discriminate faces from other categories at later ages.

      Overall, I found the study well-executed and a useful contribution to the literature. The study advances prior work by using well-controlled stimuli, subgroups of different ages, and new analytic approaches.

      I have two main reservations about the manuscript: (1) limited statistical evidence for the category by age interaction that is emphasized in the interpretation; and (2) conclusions about the role of learning and experience in age-related change that are not strongly supported by the correlational evidence presented.

      (1) The overall argument of the paper is that selective responses to various categories develop at different trajectories in infants, with responses to faces developing earlier. Statistically, this would be most clearly demonstrated by a category-by-age interaction effect. However, the statistical evidence presented for a category-by-age interaction effect is relatively weak, and no interaction effect is tested for frequency domain analyses. The clearest evidence for a significant interaction comes from the spatiotemporal decoding analysis (p. 10). In the analysis of peak amplitude and latency, an age x category interaction is only found in one of four tests, and is not significant for latency or left-hemisphere amplitude (Supp Table 8). For the frequency domain effects, no test for category by age interaction is presented. The authors find that the effects of a category are significant in some age ranges and not others, but differences in significance don't imply significant differences. I would recommend adding category by age interaction analysis for the frequency domain results, and ensuring that the interpretation of the results is aligned with the presence or lack of interaction effects.

      (2) The authors argue that their results support the claim that category-selective visual responses require experience or learning to develop. However, the results don't bear strongly on the question of experience. Age-related changes in visual responses could result from experience or experience-independent maturational processes. Finding age-related change with a correlational measure does not favor either of these hypotheses. The results do constrain the question of experience, in that they suggest against the possibility that category-selectivity is present in the first few months of development, which would in turn suggest against a role of experience. However, the results are still entirely consistent with the possibility of age effects driven by experience-independent processes. The manner in which the results constrain theories of development could be more clearly articulated in the manuscript, with care taken to avoid overly strong claims that the results demonstrate a role of experience.

    1. eLife assessment

      This study presents important findings indicating that tinnitus patients have abnormal auditory prediction signals. The results are based on well-controlled experiments for a large cohort of patients. The reported observations constitute a new set of convincing evidence for the strong link between tinnitus and central auditory processing disorders and will be of interest to clinicians, auditory scientists, and neuroscientists studying prediction mechanisms.

    2. Reviewer #1 (Public Review):

      This work presents a replicable difference in predictive processing between subjects with and without tinnitus. In two independent MEG studies and using a passive listening paradigm, the authors identify an enhanced prediction score in tinnitus subjects compared to control subjects. In the second study, individuals with and without tinnitus were carefully matched for hearing levels (next to age and sex), increasing the probability that the identified differences could truly be attributed to the presence of tinnitus. Results from the first study could successfully be replicated in the second, although the effect size was notably smaller.

      Throughout the manuscript, the authors provide a thoughtful interpretation of their key findings and offer several interesting directions for future studies. Their conclusions are fully supported by their findings. Moreover, the authors are sufficiently aware of the inherent limitations of cross-sectional studies.

      Strengths:

      The robustness of the identified differences in prediction scores between individuals with and without tinnitus is remarkable, especially as successful replication studies are rare in the tinnitus field. Moreover, the authors provide several plausible explanations for the decline of the effect size observed in the second study.

      The rigorous matching for hearing loss, in addition to age and sex, in the second study is an important strength. This ensures that the identified differences cannot be attributed to differences in hearing levels between the groups.

      The methodology is explained clearly and in detail, ensuring that the paradigms may be employed by other researchers in future studies. Moreover, the registration of the data collection and analysis methods for Study 2 as a Registered Report should be commended, as the authors have clearly adhered to the methods as registered.

      Weaknesses:

      Although the authors have been careful to match their experimental groups for age, sex, and hearing loss, there are other factors that may confound the current results. For example, subjects with tinnitus might present with psychological comorbidities such as anxiety and depression. The authors' exclusion of distress as a candidate for explaining the found effects is based solely on an assessment of tinnitus-related distress, while it is currently not possible to exclude the effects of elevated anxiety or depression levels on the results. Additionally, as the authors address in the discussion, the presence of hyperacusis may also play a role in predictive processing in this population.

      The authors write that sound intensity was individually determined by presenting a short audio sequence to the participants and adjusting the loudness according to an individual pleasant volume. Neural measurements made during listening paradigms might be influenced by sound intensity levels. The intensity levels chosen by the participants might therefore also have an effect on the outcomes. The authors currently do not provide information on the sound intensity levels in the experimental groups, making it impossible to assess whether sound intensity levels might have played a role.

    3. Reviewer #2 (Public Review):

      Summary:

      This study aimed to test experimentally a theoretical framework that aims to explain the perception of tinnitus, i.e., the perception of a phantom sound in the absence of external stimuli, through differences in auditory predictive coding patterns. To this aim, the researchers compared the neural activity preceding and following the perception of a sound using MEG in two different studies. The sounds could be highly predictable or random, depending on the experimental condition. They revealed that individuals with tinnitus and controls had different anticipatory predictions. This finding is a major step in characterizing the top-down mechanisms underlying sound perception in individuals with tinnitus.

      Strengths:

      This article uses an elegant, well-constructed paradigm to assess the neural dynamics underlying auditory prediction. The findings presented in the first experiment were partially replicated in the second experiment, which included 80 participants. This large number of participants for an MEG study ensures very good statistical power and a strong level of evidence. The authors used advanced analysis techniques - Multivariate Pattern Analysis (MVPA) and classifier weights projection - to determine the neural patterns underlying the anticipation and perception of a sound for individuals with or without tinnitus. The authors evidenced different auditory prediction patterns associated with tinnitus. Overall, the conclusions of this paper are well supported, and the limitations of the study are clearly addressed and discussed.

      Weaknesses:

      Even though the authors took care of matching the participants in age and sex, the control could be more precise. Tinnitus is associated with various comorbidities, such as hearing loss, anxiety, depression, or sleep disorders. The authors assessed individuals' hearing thresholds with a pure tone audiogram, but they did not take into account the high frequencies (6 kHz to 16 kHz) in the patient/control matching. Moreover, other hearing dysfunctions, such as speech-in-noise deficits or hyperacusis, could have been taken into account to reinforce their claim that the observed predictive pattern was not linked to hearing deficits. Mental health and sleep disorders could also have been considered more precisely, as they were accounted for only indirectly with the score of the 10-item mini-TQ questionnaire evaluating tinnitus distress. Lastly, testing the links between the individuals' scores in auditory prediction and tinnitus characteristics, such as pitch, loudness, duration, and occurrence (how often it is perceived during the day), would have been highly informative.

    1. Reviewer #2 (Public Review):

      Summary:

      Here the authors show a novel direct neuronal reprogramming model using a very pure culture system of oligodendrocyte progenitor cells and demonstrate hallmarks of corticospinal neurons to be induced when using Neurogenin2, a dominant-negative form of Olig2 in combination with the CSN master regulator Fezf2.

      Strengths:

      This is a major achievement as the specification of reprogrammed neurons towards adequate neuronal subtypes is crucial for repair and still largely missing. The work is carefully done and the comparison of the neurons induced only by Neurogenin 2 versus the NVOF cocktail is very interesting and convincingly demonstrates a further subtype specification by the cocktail.

      Weaknesses:

      As carefully as it is done in vitro, the identity of projection neurons can best be assessed in vivo. If this is not possible, it could be interesting to co-culture different brain regions and see whether these neurons, reprogrammed with the cocktail, indeed preferentially send out axons to innervate a co-cultured spinal cord versus other brain region tissue.

    2. eLife assessment

      This study presents fundamental new findings introducing a new approach for the reprogramming of brain glial cells to corticospinal neurons. The data is highly compelling, with multiple lines of evidence demonstrating the success of this new assay. These exciting findings set the stage for future studies of the potential of these reprogrammed cells to form functional connections in vivo and their utility in clinical conditions where corticospinal neurons are compromised.

    3. Reviewer #1 (Public Review):

      Summary:

      The manuscript by Ozkan et al. presents compelling evidence demonstrating the latent potential of glial precursors of the adult cerebral cortex for neuronal reprogramming. The findings substantially advance our understanding of the potential of endogenous cells in the adult brain to be reprogrammed. Moreover, they describe a molecular cocktail that directs reprogramming toward corticospinal neurons (CSN).

      Strengths:

      Experimentally, the work is compelling and beautifully designed, with no major caveats. The main conclusions are fully supported by the experiments. The work provides a characterization of endogenous progenitors, genetic strategies to isolate them, and proof of concept of exploiting these progenitors' potential to produce a specific desired neuronal type with "a la carte" combination of transcription factors.

      Weaknesses:

      Some issues need to be addressed or clarified before publication. The manuscript requires editing. It is dense and rich in details while in other parts there are a few mistakes.

    4. Reviewer #3 (Public Review):

      Summary:

      Ozkan, Padmanabhan, and colleagues aim to develop a lineage reprogramming strategy towards generating subcerebral projection neurons from endogenous glia with the specificity needed for disease modelling and brain repair. They set out by targeting specifically Sox6-positive NG2 glia. This choice is motivated by the authors' observation that the early postnatal forebrain of Sox6 knockout mice displays marked ectopic expression of the proneural transcription factor (TF) Neurog2, suggesting a latent neurogenic program may be derepressed in NG2 cells, which normally express Sox6. Cultured NG2 glia transfected with a construct ("NVOF") encoding Neurog2, the corticofugal neuron-specifying TF Fezf2, and a constitutive repressor form of Olig2 are efficiently reprogrammed to neurons. These acquire complex morphologies resembling those of mature endogenous neurons and are characterized by fewer abnormalities when compared to neurons induced by Neurog2 alone. NVOF-induced neurons, as a population, also express a narrower range of cortical neuron subtype-specific markers, suggesting narrowed subtype specification, a potential step forward for Neurog2-driven neuronal reprogramming. Comparison of NVOF- and Neurog2-induced neurons to endogenous subcerebral projection neurons (SCPN) also indicates Fezf2 may aid Neurog2 in directing the generation of SCPN-like neurons at the expense of other cortical neuronal subtypes.

      Strengths:

      The report describes a novel, highly homogeneous in vitro system amenable to efficient reprogramming. The authors provide evidence that Fezf2 shapes the outcome of Neurog2-driven reprogramming towards a subcerebral projection neuron identity, consistent with its known developmental roles. Also, the use of the modified RNA for transient expression of Neurog2 is very elegant.

      Weaknesses:

      The molecular characterization of NVOF-induced neurons is carried out at the bulk level, which does not allow heterogeneity among NVOF-induced neurons to be fully assessed. The suggestion of a latent neurogenic potential in postnatal cortical glia is only partially supported by the data from the Sox6 knockout. Finally, some of the many exciting implications of the study remain untested.

      Discussion:

      The study has many exciting implications that could be further tested. For example, an ultimate proof of the subcerebral projection neuron identity would be to graft NVOF cells into neonatal mice and study their projections. Another important implication is that Sox6-deficient NG2 glia may not only express Neurog2 but activate a more complete neurogenic programme, a possibility that remains untested here. Also, is the subcerebral projection neuron identity dependent on the starting cell population? Could other NG2 glia, not expressing Sox6, also be coaxed by the NVOF cocktail into subcerebral projection neurons? And if not, do they express other (Sox) transcription factors that render them more amenable to reprogramming into other cortical neuron subtypes? The authors state that Sox6-positive NG2 glia are a quiescent progenitor population. Given that NG2 glia are believed to undergo proliferation as a whole, are Sox6-positive NG2 glia an exception to this rule? Finally, the authors seem to imply that subcerebral projection neurons and Sox6-positive NG2 glia are lineage-related. However, direct evidence for this conjecture seems missing.

    1. eLife assessment

      This important study using engineered mouse models provides a first and compelling demonstration of a pathogenic phenotype associated with lack of expression of p53AS, an isoform of the p53 protein with a different C-terminus than canonical p53. The role of this isoform has been elusive so far and this first demonstration represents a substantial advance in our understanding of the complex role(s) of p53 isoforms. The revised manuscript adequately addresses previous concerns.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors originally investigated the function of p53 isoforms with an alternative C-terminus encoded by the Alternatively Spliced (AS) exon in place of exon 11 encoding the canonical "α" C-terminal domain. For this purpose, the authors create a mouse model with a specific deletion of the AS exon.

      Strengths:

      Interestingly, wt or p53ΔAS/ΔAS mouse embryonic fibroblasts did not differ in cell cycle control, expression of well-known p53 target genes, proliferation under hyperoxic conditions, or the growth of tumor xenografts. However, p53-AS isoforms were shown to confer male-specific protection against lymphomagenesis in Eμ-Myc transgenic mice, prone to highly penetrant B-cell lymphomas. In fact, p53ΔAS/ΔAS Eμ-Myc mice were less protected from developing B-cell lymphomas compared to WT counterparts. The important difference that the authors find between WT and p53ΔAS/ΔAS Eμ-Myc males is a higher number of immature B cells in p53ΔAS/ΔAS vs WT mice. Higher expression of Ackr4 and lower expression of Mt2 was found in p53+/+ Eμ-Myc males compared to p53ΔAS/ΔAS counterparts, suggesting that these two transcripts are in part regulators of B-cell lymphomagenesis and enrichment for immature B cells.

      The manuscript integrates an elegant genetic approach with in vivo analyses providing a robust set of data which strengthens the role of p53 isoforms in leukemogenesis.

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript provides a detailed analysis of B-cell lymphomagenesis in mice lacking an alternative exon in the region encoding the C-terminal (regulatory) domain of the p53 protein and thus unable to assemble the so-called p53AS isoform. This isoform differs from canonical p53 by the replacement of roughly 30 C-terminal residues by about 10 residues encoded by the alternative exon. There is biochemical and biological evidence that p53AS retains strong transcriptional and somewhat enhanced suppressive activities, with mouse models expressing protein constructs similar to p53AS showing signs of increased p53 activity leading to rapid and lethal anemia. However, the precise role of the alternative p53AS variant has not been addressed so far in a mouse model aimed at demonstrating whether the lack of this particular p53 isoform (trp53ΔAS/ΔAS mice) may cause a specific pathological phenotype.

      Results show that lack of AS expression does not noticeably affect the patterns of p53 protein expression and transcriptional activity but reveals a subtle pathogenic phenotype, with trp53ΔAS/ΔAS males, but not females, tending to develop B-cell lymphoma more frequently and earlier than WT. Next, the authors introduced ΔAS in transgenic Eμ-Myc mice that show accelerated lymphomagenesis. They show that lack of AS caused increased lethality and larger tumor lymph nodes in p53ΔAS Eμ-Myc males compared to their p53WT Eμ-Myc male counterparts, but not in females. Comparative transcriptomics identified a small set of candidate differentially expressed genes, including Ackr4 (atypical chemokine receptor 4), which was expressed at significantly lower levels in the spleens of ΔAS compared to WT controls. Ackr4 encodes a dummy receptor acting as an interceptor for multiple chemokines and thus may negatively regulate a chemokine/cytokine signalling axis involved in lymphomagenesis, which is down-regulated by estrogen signalling. Using in vitro cell models, the authors provide evidence that Ackr4 is a transcriptional target for p53 and that its p53-dependent activation is repressed by 17β-oestradiol. Finally, seeking evidence for the relevance of this gene in human lymphomagenesis, the authors analyse Burkitt lymphoma transcriptomic datasets and show that high ACKR4 expression correlated with better survival in males, but not in females.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      (1) In the first paragraph of the Results section it is not clear why the authors introduce the function of p53ΔAS/ΔAS in thymocytes and then mention fibroblasts. The authors should clarify this point. The authors should also explain the rationale for using doxorubicin and nutlin to analyze p53 activity (Figure 1 and Figure S1).

      We thank the reviewer for this comment. In the revised manuscript, we corrected this by mentioning, at the beginning of the Results section: “We analyzed cellular stress responses in thymocytes, known to undergo a p53-dependent apoptosis upon irradiation (Lowe et al., 1993), and in primary fibroblasts, known to undergo a p53-dependent cell cycle arrest in response to various stresses - e.g. DNA damage caused by irradiation or doxorubicin (Kastan et al., 1992), and the Nutlin-mediated inhibition of Mdm2, a negative regulator of p53 (Vassilev et al., 2004).”

      (2) The authors should provide quantification for the western blot in figure 2D because the reduction of p53 protein level in mutant vs wt tumors is not striking. 

      In the previous version of the manuscript, the quantification of p53 bands had been included, but quantification results were mentioned below the actin bands, rather than the p53 bands, and this was probably confusing. We have corrected this in the revised version of the manuscript. The quantification results are now provided just below the p53 bands in Figs. 1B and 2D, which should clarify this point. For Figure 2D, the quantifications show a strong decrease in p53 levels for 3 out of 4 analyzed mutant tumors. For consistency purposes, in the revised manuscript the quantification results also appear below Myc bands in Fig. 2C.

      (3) In the discussion section, the authors propose that a difference in Ackr4 expression may have prognostic value and that measuring ACKR4 gene expression in male patients with Burkitt lymphoma could be useful to identify the patients at higher risk. However, the authors perform many correlative analyses, both in mice and in patients, but the manuscript lacks functional experiments that could help to functionally characterize Ackr4 and Mt2 in the etiology of B-cell lymphomas in males (both in mouse and in human models).

      In the previous version of the manuscript, we proposed that Ackr4 might act as a suppressor of B-cell lymphomagenesis by attenuating Myc signaling. This hypothesis relied on studies showing that Ackr4 impairs the Ccr7 signaling cascade, which may lead to decreased Myc activity (Ulvmar et al., 2014; Shi et al., 2015; Bastow et al., 2021) and that the loss of Ccr7 may delay Myc-driven lymphomagenesis (Rehm et al., 2011). Furthermore, we proposed that the increased expression of Mt2 in p53ΔAS/ΔAS Eμ-Myc male splenic cells reflected an increase in Myc activity, because Mt2 is known to be regulated by Myc (Qin et al., 2021) and because the Mt2 promoter is bound by Myc in B cells according to experiments reported in the ChIP-Atlas database. However, in the first version of the manuscript this hypothesis might have appeared only partially supported by our data because an increase in Myc activity could be expected to have a more general impact, i.e. an impact not only on the expression of Mt2, but also on the expression of many canonical Myc target genes. In the revised manuscript, we show that this is indeed the case. We performed a gene set enrichment analysis (GSEA) comparing the RNAseq data from p53ΔAS/ΔAS Eμ-Myc and p53+/+ Eμ-Myc male splenic cells and found an enrichment of hallmark Myc targets in p53ΔAS/ΔAS Eμ-Myc cells. These new data, which strengthen our hypothesis of differences in Myc signaling intensity, are presented in Fig. 3K and Table S2.

      Importantly, we now go beyond correlative analyses by providing direct experimental evidence that ACKR4 impacts on the behavior of Burkitt lymphoma cells. We used a CRISPR-Cas9 approach to knock-out ACKR4 in Raji Burkitt lymphoma cells and found that ACKR4 KO cells exhibited a 4-fold increase in chemokine-guided cell migration. These new data are presented in Figure 4F and the supplemental Figures S5-S7.  

      Finally, following a suggestion of Reviewer#2, we now also point out that “Ackr4 regulates B cell differentiation (Kara et al., 2018), which raises the possibility that an altered p53-Ackr4 pathway in p53ΔAS/ΔAS Eμ-Myc male splenic cells might contribute to increase the pools of pre-B and immature B cells that may be prone to lymphomagenesis.”

      In sum, we now mention in the Discussion that a decrease in Ackr4 expression might promote B-cell lymphomagenesis through three non-exclusive mechanisms.

      Reviewer #2 (Recommendations For The Authors): 

      (1) A great addition would be to demonstrate how p53AS specifically contributes to the regulation of Ackr4. In particular, is there evidence that p53AS might be preferentially recruited on p53 RE within that gene as compared to WT? The availability of specific antibodies that distinguish between AS and WT p53 might help to address this (experimentally complex) question. As a note, usage of such antibodies would also strengthen Fig 1B, in which the AS isoform appears as a mere faint shadow under p53, thus making its "disappearance" in trp53ΔAS/ΔAS difficult to evaluate. 

      We agree with the referee that efficient antibodies against p53-AS isoforms would have been useful. In fact, we tried a non-commercial antibody developed for that purpose, but it led to many nonspecific bands in western blots and did not appear reliable. Importantly, however, our luciferase assays clearly show that both p53-α and p53-AS can transactivate Ackr4, a result that might be expected because these isoforms share the same DNA binding domain. Furthermore, because p53-α isoforms appear more abundant than p53-AS isoforms at the protein and RNA levels (Figs. 1B and S1A), and because the loss of p53-AS isoforms leads to a significant decrease in p53-α protein levels (Figs. 1B and 2D), we think that in p53ΔAS/ΔAS cells the reduction in p53-α levels might be the main reason for a decreased transactivation of Ackr4. This is now more clearly discussed in the revised manuscript.

      (2) A most interesting observation is in Fig 3A and Fig S3, showing that spleen cells of p53ΔAS Eμ-Myc males (but not females) were enriched in pre-B and immature B cells as compared to WT counterparts. This observation points to a possible defect in the B cell maturation process. It would be most interesting to determine whether this particular defect is directly mediated by a p53AS-Ackr4 axis. The hypothesis raised by the authors in the Discussion section is that increased Ackr4 expression may delay lymphomagenesis, but data in Fig 3A and S3 actually suggest that ΔAS increases the pool of immature B cells that may be prone to lymphomagenesis.

      We thank the reviewer for this useful comment, which we integrated in the Discussion of the revised manuscript. Ackr4 was shown to regulate B cell differentiation (Kara et al. (2018) J Exp Med 215, 801–813), so this is indeed one of the possible mechanisms by which a deregulation of the p53-Ackr4 axis might promote lymphomagenesis. We now mention: “Ackr4 regulates B cell differentiation (Kara et al., 2018), which raises the possibility that an altered p53-Ackr4 pathway in p53ΔAS/ΔAS Eμ-Myc male splenic cells might contribute to increase the pools of pre-B and immature B cells that may be prone to lymphomagenesis.” This is presented as one of three possible mechanisms by which decreased Ackr4 levels may promote tumorigenesis, the two others being the impact of Ackr4 on the chemokine-guided migration of lymphoma cells and its apparent effect on Myc signalling.

      (3) The concordance with a male-specific prognostic effect of Ackr4 is most interesting in itself but is only of correlative evidence with respect to the study. Is there any information on whether p53AS expression is also a prognostic factor in BL? And is there evidence that Ackr4 may also be a male-specific prognostic factor in other B-cell malignancies, e.g. Multiple Myeloma?

      We have now performed the CRISPR-mediated knock-out of ACKR4 in Burkitt lymphoma cells and found that it leads to a dramatic increase in chemokine-guided cell migration, which goes beyond correlation. This significant new result is mentioned in the revised abstract and presented in detail in Figures 4F and S5-S7.

      Regarding p53-AS isoforms, they are murine-specific isoforms (Marcel et al. (2011) Cell Death Diff 18, 1815-1824), so there is no information on p53-AS expression in Burkitt lymphoma. Human p53 isoforms with alternative C-terminal domains are the p53β and p53γ isoforms, but the datasets we analyzed did not provide any information on the relative levels of p53α (the canonical isoform), p53β or p53γ isoforms. We agree with the referee that this is an interesting question, but one that cannot be answered with currently available datasets.

      Regarding the different types of B-cell malignancies, we had already shown that Ackr4 is a male-specific prognostic factor in Burkitt lymphomas but not in Diffuse Large B cell lymphomas, which indicated that it is not a prognostic factor in all types of B cell lymphomas. For this revision, we also searched for its potential prognostic value in multiple myeloma, and found that, as for DLBCL, it is not a prognostic factor in this cancer type. This new analysis is presented in Figure S4C.

    1. eLife assessment

      This study presents a dataset obtained through single-cell RNA sequencing of sea cucumber regenerating intestine 9 days post evisceration. The data were collected and analyzed using standard single-cell analysis from n=2 adult sea cucumbers captured from the wild, which represents a useful resource for future studies. Although cell type validation is attempted, it is performed on samples from the same 2 animals (and not independent samples), rendering the validation incomplete. Further, the RNA localization images provided in the paper could benefit from improved spatial context, and many strong statements in the discussion should be better justified and supported by the presented data. With the validation part strengthened, this paper would be of interest to development and regeneration fields.

    2. Reviewer #1 (Public Review):

      Summary:<br /> Medina-Feliciano et al. investigated the single cell transcriptomic profile of holothurian regenerating intestine following evisceration, a process used to expel their viscera in response to predation. Using single cell RNA-sequencing and standard analysis such as "Find cluster markers", "Enrichment analysis of Gene Ontology" and "RNA velocity", they identify 13 cell clusters and their potential identities. Based merely on bioinformatic analysis they identified potentially proliferating clusters and potential trajectories of cell differentiation. This manuscript represents a useful dataset that can provide candidate cell types and cell markers for more in-depth functional analysis for gaining a better understanding of the holothurian intestine regeneration. The conclusions of this paper are supported only by bioinformatic analyses, since the in vivo validation through HCR does not sufficiently support them.

      Strengths:<br /> - The Authors are providing a single cell dataset obtained from sea cucumber regenerating their intestine. This represents a first fundamental step to an unbiased approach to better understand this regeneration process and the cellular dynamics taking part in it.<br /> - The Authors run all the standard analyses providing the reader with a well digested set of information about cell clusters, potential cell types, potential functions and potential cell differentiation trajectories.

      Weaknesses:<br /> - The entire study is based on only 2 adult animals, that were used for both the single cell dataset and the HCR. Additionally, the animals were caught from the ocean preventing information about their age or their life history. This makes the n extremely small and reduces the confidence of the conclusions.<br /> - All the fluorescent pictures present in this manuscript present red nuclei and green signals being not color-blind friendly. Additionally, many of the images lack sufficient quality to determine if the signal is real. Additional images of a control animal (not eviscerated) and of a negative control would help data interpretation. Finally, on many occasions a zoomed out image would help to provide the reader with context and a better understanding of where the signal is localized.<br /> - The Authors frequently report the percentage of cells with a specific feature (either labelled or expressing a certain gene or belonging to a certain cluster). This number can be misleading since that is calculated after cell dissociation and additional procedures (such as staining or sequencing and dataset cleanup) that can heavily bias the ratio between cell types. Similarly, the Authors cannot compare cell percentage between anlage and mesentery samples since that can be affected by technical aspects related to cell dissociation, tissue composition and sequencing depth.<br /> - The Authors decided to validate only a few clusters and in many cases there are no positive controls (such as specific localization, specific function, changes between control and regenerating animals, co-stain) that could actually validate the cluster identity and the specificity of the selected marker. There is no validation of the trajectory analysis and there is no validation of the proliferating cluster with H3P or BrdU stainings.<br /> - It is not clear what is already known about holothurian intestine regeneration and what are the new findings in this manuscript. The Authors reference several papers throughout the whole Results section, mentioning how the steps of regeneration, the proliferating cells, some of the markers and some of the cell composition of mesenteries and anlages were already known.

    3. Reviewer #2 (Public Review):

      Summary:<br /> This research offers a comprehensive analysis of the regenerative process in sea cucumbers and builds upon decades of previous research. The approach involves a detailed examination using single-cell sequencing, making it a crucial reference paper while shedding new light on regeneration in this organism.

      Strengths:<br /> Detailed analysis of single-cell sequencing data and high-quality RNA localization images provide significant new insights into regeneration in sea cucumbers and, more broadly, in animals.

      Weaknesses:<br /> The spatial context of the RNA localization images is not well represented, making it difficult to understand how the schematic model was generated from the data. In addition, multiple strong statements in the conclusion should be better justified and connected to the data provided.

    4. Reviewer #3 (Public Review):

      Summary:<br /> The authors have done a good job of creating a "resource" paper for the study of gut regeneration in sea cucumbers. They present a single-cell RNAseq atlas for the reconstitution of Holothuria glaberrima gut following self-evisceration in response to a potassium chloride injection. The authors provide data characterizing cellular populations and precursors of the regenerating anlage at 9 days post evisceration. As a "Tools and Resources" contribution to eLife, this work, with some revisions, could be appropriate. It will be impactful in the fields of regeneration, particularly in invertebrates, but also in comparative studies in other species, including evolutionary studies. Some of these comparative studies could extend to vertebrates and could therefore impact regenerative medicine in the future.

      Strengths:<br /> • Novel and useful information for a model organism and question for which this type of data has not yet been reported<br /> • Single-cell gene expression data will be valuable for developing testable hypotheses in the future<br /> • Marker genes for cell types provided to the field<br /> • Interesting predictions about possible lineage relationships between cells during sea cucumber gut regeneration

      Weaknesses:<br /> • Possible theoretical advances regarding lineage trajectories of cells during sea cucumber gut regeneration, but the claims that can be made with this data alone are still predictive<br /> • Better microscopy is needed for many figures to be convincing<br /> • Some minor additions to the figures will help readers understand the data more clearly

    5. Author response:

      Reviewer #1

      - The entire study is based on only 2 adult animals, that were used for both the single cell dataset and the HCR. Additionally, the animals were caught from the ocean preventing information about their age or their life history. This makes the n extremely small and reduces the confidence of the conclusions. 

      This statement is incorrect. While the scRNAseq was indeed performed in two animals (n=2), the HCR-FISH was performed in 3-5 animals (depending on the probe used). These were different animals from those used for the scRNAseq. We are partly responsible for this confusion, since we did not state the number of animals used for the HCR-FISH in the manuscript.

      - All the fluorescent pictures present in this manuscript present red nuclei and green signals being not color-blind friendly. Additionally, many of the images lack sufficient quality to determine if the signal is real. Additional images of a control animal (not eviscerated) and of a negative control would help data interpretation. Finally, in many occasions a zoomed out image would help the reader to provide context and have a better understanding of where the signal is localized. 

      Fluorescent photos will be changed to color-blind friendly colors. 

      Diagrams, arrows and new photos will be included to guide readers to the signal or labeling in cells. In the original manuscript 6 out of 7 cluster validations included a photo of a normal, non-eviscerated control. We will make certain that this is highlighted in the resubmission and that ALL figures with HCR-FISH labeling will include data from control animals.

      - The Authors frequently report the percentage of cells with a specific feature (either labelled or expressing a certain gene or belonging to a certain cluster). This number can be misleading since that is calculated after cell dissociation and additional procedures (such as staining or sequencing and dataset cleanup) that can heavily bias the ratio between cell types. Similarly, the Authors cannot compare cell percentage between anlage and mesentery samples since that can be affected by technical aspects related to cell dissociation, tissue composition and sequencing depth. 

      The Reviewer has correctly identified the limitations of using cell percentages in scRNA-seq analyses. However, these percentages do offer a general overview of the sequenced cell populations and highlight potential differences between samples. In addition, these percentages, as addressed by the Reviewer, not only emphasize the shortcomings of the dissociation methods but also provide some explanation for the absence of particular cell populations, as we describe in the manuscript. In our future resubmission, we will acknowledge these limitations and inform readers of any potential biases introduced by relying on these numbers.

      - The Authors decided to validate only a few clusters and in many cases there are no positive controls (such as specific localization, specific function, changes between control and regenerating animals, co-stain) that could actually validate the cluster identity and the specificity of the selected marker. There is no validation of the trajectory analysis and there is no validation of the proliferating cluster with H3P or BrdU stainings. 

      We validated the seven clusters that were important to reach our conclusions. Six of these had controls of normal (uneviscerated) intestine.  Nonetheless we will increase the number of cluster validations and include the dividing cell cluster using BrdU.

      - It is not clear what is already known about holothurian intestine regeneration and what are the new findings in this manuscript. The Authors reference several papers throughout the Results section, mentioning that the steps of regeneration, the proliferating cells, some of the markers and some of the cell composition of mesenteries and anlages were already known.

      The manuscript presents several novel findings on holothurian intestine regeneration, including:

      - The integration of multiple cellular processes, reported for the first time within a single species, along with the identification of the specific mRNAs expressed by each involved cell population.

      - A comparative analysis of the sea cucumber anlage structure, highlighting its similarities to previously described blastemal structures.

      - The identification of the potential dedifferentiated cell populations that form the foundation of the anlage, serving as the epicenter for proliferating and differentiating cells.

      We will ensure that these and other significant findings are prominently emphasized in the resubmitted manuscript.

      Reviewer #2

      - The spatial context of the RNA localization images is not well represented, making it difficult to understand how the schematic model was generated from the data. In addition, multiple strong statements in the conclusion should be better justified and connected to the data provided.

      As explained above we will make an effort to provide a better understanding of the cellular/tissue localization of the labeled cells. Similarly, we will revise the conclusions so that the statements made are well justified.

      Reviewer #3

      - Possible theoretical advances regarding lineage trajectories of cells during sea cucumber gut regeneration, but the claims that can be made with this data alone are still predictive.

      We are conscious that the results from these lineage trajectories are still predictive and will emphasize this in the text. Nonetheless, they are an important part of our analyses and provide the theoretical basis for future experiments.

      - Better microscopy is needed for many figures to be convincing. Some minor additions to the figures will help readers understand the data more clearly.

      As explained above we will make an effort to provide a better understanding of the cellular/tissue localization of the labeled cells. Similarly, we will revise the conclusions so that the statements made are well justified.

    1. eLife assessment

      Transient receptor potential mucolipin 1 (TRPML1) functions as a lysosomal ion channel whose variants are associated with lysosomal storage disorder mucolipidosis type IV. This important report describes local and global structural changes driven by the binding of regulatory phospholipids and by mutations that allosterically cause gain or loss of channel function. Most of the claims related to the allosteric regulation of TRPML1 have solid support from two new cryo-EM structures, that of the gain of function Y404W mutant and that of the wild-type channel bound to the inhibitor PI(4,5)P2. The new cryo-EM findings are evaluated within the context of previously reported TRPML1 structures, and a proposed allosteric gating mechanism is partially supported by functional electrophysiology results.

    2. Reviewer #1 (Public Review):

      In their manuscript, Gan and colleagues identified a functional critical residue, Tyr404, which when mutated to W or A results in GOF and LOF of TRPML1 activity, respectively. In addition, the authors provide a high-resolution structure of TRPML1 with PI(4,5)P2 inhibitor. This high-resolution structure also revealed a bound phospholipid likely sphingomyelin at the agonist/antagonist site, providing a plausible explanation for sphingomyelin inhibition of TRPML1.

      This is an interesting study, revealing valuable additional information on TRPML1 gating mechanisms including effects on endogenous phospholipids on channel activity. The provided data are convincing. Some major open questions remain. The work will be of interest to a wide audience including industry researchers occupied with TRPML1 exploration as a drug target.

    3. Reviewer #2 (Public Review):

      The transient receptor potential mucolipin 1 (TRPML1) functions as a lysosomal organelle ion channel whose variants are associated with lysosomal storage disorder mucolipidosis type IV. Understanding sites that allosterically control the TRPML1 channel function may provide new molecular moieties to target with prototypic drugs.

      Gan et al provide the first high-resolution cryo-EM structures of the TRPML1 channel (Y404W) in the open state without any activating ligands. This new structure demonstrates how a mutation at a site some distance away from the pore can influence the channel's conducting state. However, the authors do not provide a structural analysis of the Y404W pore which would validate their open-state claims. Nonetheless, Gan et al provide compelling electrophysiology evidence which supports the proposed Y404W gain of function effect. The authors propose an allosteric mechanism with the following molecular details: the Y404 to W sidechain substitution provides extra van der Waals contacts within the pocket surrounded by helices of the VSD-like domain and causes S4 bending, which in turn opens the pore through the S4-S5 linker. Conversely, the authors functionally demonstrate that an alanine mutation at this site causes a loss of function. Although the authors do not provide a structure of the Y404A mutation, they propose that the alanine substitution disrupts the sidechain packing and likely destabilizes the open conformation. TRPML1 channels are regulated by PIP2 species, which is related to their cell function. In the membrane of lysosomes, PI(3,5)P2 activates the channel, whereas PI(4,5)P2 found in the plasma membrane has inhibitory effects. To understand its lipid regulation, the authors solved a cryo-EM structure of TRPML1 bound to PI(4,5)P2 in its presumed closed state. Again, while the provided functional evidence suggests that PI(4,5)P2 occupancy inhibits TRPML1 current, the authors do not provide analysis of the pore which would support their closed state assertion. Within this same structure, the authors observe a density that may be attributed to sphingomyelin (or possibly phosphocholine). Using electrophysiology of WT and the Y404W channels, the authors report sphingomyelin's antagonist effect on TRPML1 currents under low luminal (external) pH.
Taken together, the results described in Gan et al provide compelling evidence for a gating (open, closed) mechanism of the TRPML1 pore which can be allosterically regulated by altered packing and lipid interactions within the VSDL.

    1. eLife assessment

      The study addresses a central question in systems neuroscience (validation of active inference models of exploration) using a combination of behavior, neuroimaging, and modelling. The data provided are useful but incomplete due to issues with multiple comparisons and lack of model validation.

    2. Reviewer #1 (Public Review):

      Summary:

      This paper presents a compelling and comprehensive study of decision-making under uncertainty. It addresses a fundamental distinction between belief-based (cognitive neuroscience) formulations of choice behavior with reward-based (behavioral psychology) accounts. Specifically, it asks whether active inference provides a better account of planning and decision making, relative to reinforcement learning. To do this, the authors use a simple but elegant paradigm that includes choices about whether to seek both information and rewards. They then assess the evidence for active inference and reinforcement learning models of choice behavior, respectively. After demonstrating that active inference provides a better explanation of behavioral responses, the neuronal correlates of epistemic and instrumental value (under an optimized active inference model) are characterized using EEG. Significant neuronal correlates of both kinds of value were found in sensor and source space. The source space correlates are then discussed sensibly, in relation to the existing literature on the functional anatomy of perceptual and instrumental decision-making under uncertainty.

    3. Reviewer #2 (Public Review):

      Summary:

      Zhang and colleagues use a combination of behavioral, neural, and computational analyses to test an active inference model of exploration in a novel reinforcement learning task.

      Strengths:

      The paper addresses an important question (validation of active inference models of exploration). The combination of behavior, neuroimaging, and modeling is potentially powerful for answering this question.

      I appreciate the addition of details about model fitting, comparison, and recovery, as well as the change in some of the methods.

      Weaknesses:

      The authors do not cite what is probably the most relevant contextual bandit study, by Collins & Frank (2018, PNAS), which uses EEG.

      The authors cite Collins & Molinaro as a form of contextual bandit, but that's not the case (what they call "context" is just the choice set). They should look at the earlier work from Collins, starting with Collins & Frank (2012, EJN).

      Placing statistical information in a GitHub repository is not appropriate. This needs to be in the main text of the paper. I don't understand why the authors refer to space limitations; there are none for eLife, as far as I'm aware.

      In answer to my question about multiple comparisons, the authors have added the following: "Note that we did not attempt to correct for multiple comparisons; largely, because the correlations observed were sustained over considerable time periods, which would be almost impossible under the null hypothesis of no correlations." I'm sorry, but this does not make sense. Either the authors are doing multiple comparisons, in which case multiple comparison correction is relevant, or they are doing a single test on the extended timeseries, in which case they need to report that. There exist tools for this kind of analysis (e.g., Gershman et al., 2014, NeuroImage). I'm not suggesting that the authors should necessarily do this, only that their statistical approach should be coherent. As a reference point, the authors might look at the aforementioned Collins & Frank (2018) study.

      I asked the authors to show more descriptive comparison between the model and the data. Their response was that this is not possible, which I find odd given that they are able to use the model to define a probability distribution on choices. All I'm asking about here is to show predictive checks which build confidence in the model fit. The additional simulations do not address this. The authors refer to figures 3 and 4, but these do not show any direct comparison between human data and the model beyond model comparison metrics.

    4. Reviewer #3 (Public Review):

      Summary:

      This paper aims to investigate how the human brain represents different forms of value and uncertainty that participate in active inference within a free-energy framework, in a two-stage decision task involving contextual information sampling, and choices between safe and risky rewards, which promotes shifting between exploration and exploitation. They examine neural correlates by recording EEG and comparing activity in the first vs second half of trials and between trials in which subjects did and did not sample contextual information, and perform a regression with free-energy-related regressors against data "mapped to source space."

      Strengths:

      This two-stage paradigm is cleverly designed to incorporate several important processes of learning, exploration/exploitation and information sampling that pertain to active inference. Although scalp/brain regions showing sensitivity to the active-inference related quantities do not necessarily suggest what role they play, they are illuminating and useful as candidate regions for further investigation. The aims are ambitious, and the methodologies impressive. The paper lays out an extensive introduction to the free energy principle and active inference to make the findings accessible to a broad readership.

      Weaknesses:

      In its revised form the paper is complete in providing the important details. Though not a serious weakness, it is important to note that the high lower-cutoff of 1 Hz in the bandpass filter, included to reduce the impact of EEG noise, would remove from the EEG any sustained, iteratively updated representation that evolves with learning across trials, or choice-related processes that unfold slowly over the course of the 2-second task windows.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This paper presents a compelling and comprehensive study of decision-making under uncertainty. It addresses a fundamental distinction between belief-based (cognitive neuroscience) formulations of choice behaviour with reward-based (behavioural psychology) accounts. Specifically, it asks whether active inference provides a better account of planning and decision-making, relative to reinforcement learning. To do this, the authors use a simple but elegant paradigm that includes choices about whether to seek both information and rewards. They then assess the evidence for active inference and reinforcement learning models of choice behaviour, respectively. After demonstrating that active inference provides a better explanation of behavioural responses, the neuronal correlates of epistemic and instrumental value (under an optimised active inference model) are characterised using EEG. Significant neuronal correlates of both kinds of value were found in sensor and source space. The source space correlates are then discussed sensibly, in relation to the existing literature on the functional anatomy of perceptual and instrumental decision-making under uncertainty.

      Strengths:

      The strengths of this work rest upon the theoretical underpinnings and careful deconstruction of the various determinants of choice behaviour using active inference. A particular strength here is that the experimental paradigm is designed carefully to elicit both information-seeking and reward-seeking behaviour; where the information-seeking is itself separated into resolving uncertainty about the context (i.e., latent states) and the contingencies (i.e., latent parameters), under which choices are made. In other words, the paradigm - and its subsequent modelling - addresses both inference and learning as necessary belief and knowledge-updating processes that underwrite decisions.

      The authors were then able to model belief updating using active inference and then look for the neuronal correlates of the implicit planning or policy selection. This speaks to a further strength of this study; it provides some construct validity for the modelling of belief updating and decision-making; in terms of the functional anatomy as revealed by EEG. Empirically, the source space analysis of the neuronal correlates licences some discussion of functional specialisation and integration at various stages in the choices and decision-making.

      In short, the strengths of this work rest upon a (first) principles account of decision-making under uncertainty in terms of belief updating that allows them to model or fit choice behaviour in terms of Bayesian belief updating - and then use relatively state-of-the-art source reconstruction to examine the neuronal correlates of the implicit cognitive processing.

      Response: We are deeply grateful for your careful review of our work and for the thoughtful feedback you have provided. Your dedication to ensuring the quality and clarity of the work is truly admirable. Your comments have been invaluable in guiding us towards improving the paper, and we appreciate your time and effort in not just offering suggestions but also providing specific revisions that we can implement. Your insights have helped us identify areas where we can strengthen the arguments and clarify the methodology.

      Comment 1:

      The main weakness of this report lies in the communication of the ideas and procedures. Although the language is generally excellent, there are some grammatical lapses that make the text difficult to read. More importantly, the authors are not consistent in their use of some terms; for example, uncertainty and information gain are sometimes conflated in a way that might confuse readers. Furthermore, the descriptions of the modelling and data analysis are incomplete. These shortcomings could be addressed in the following way.

      First, it would be useful to unpack the various interpretations of information and goal-seeking offered in the (active inference) framework examined in this study. For example, it will be good to include the following paragraph:

      "In contrast to behaviourist approaches to planning and decision-making, active inference formulates the requisite cognitive processing in terms of belief updating in which choices are made based upon their expected free energy. Expected free energy can be regarded as a universal objective function, specifying the relative likelihood of alternative choices. In brief, expected free energy can be regarded as the surprise expected following some action, where the expected surprise comes in two flavours. First, the expected surprise is uncertainty, which means that policies with a low expected free energy resolve uncertainty and promote information seeking. However, one can also minimise expected surprise by avoiding surprising, aversive outcomes. This leads to goal-seeking behaviour, where the goals can be regarded as prior preferences or rewarding outcomes.

      Technically, expected free energy can be expressed in terms of risk plus ambiguity - or rearranged to be expressed in terms of expected information gain plus expected value, where value corresponds to (log) prior preferences. We will refer to both decompositions in what follows; noting that both decompositions accommodate information and goal-seeking imperatives. That is, resolving ambiguity and maximising information gain have epistemic value, while minimising risk or maximising expected value have pragmatic or instrumental value. These two kinds of values are sometimes referred to in terms of intrinsic and extrinsic value, respectively [1-4]."

      Response 1: We deeply thank you for your comments and corresponding suggestions about our interpretations of active inference. In response to your identified weaknesses and suggestions, we have added corresponding paragraphs in the Methods section (The free energy principle and active inference, line 95-106):

      “Active inference formulates the necessary cognitive processing as a process of belief updating, where choices depend on agents' expected free energy. Expected free energy serves as a universal objective function, guiding both perception and action. In brief, expected free energy can be seen as the expected surprise following some policies. The expected surprise can be reduced by resolving uncertainty, and one can select policies with lower expected free energy which can encourage information-seeking and resolve uncertainty. Additionally, one can minimize expected surprise by avoiding surprising or aversive outcomes (Oudeyer et al., 2007; Schmidhuber et al., 2010). This leads to goal-seeking behavior, where goals can be viewed as prior preferences or rewarding outcomes.

      Technically, expected free energy can also be expressed as expected information gain plus expected value, where the value corresponds to (log) prior preferences. We will refer to both formulations in what follows. Resolving ambiguity, minimizing risk, and maximizing information gain have epistemic value, while maximizing expected value has pragmatic or instrumental value. These two types of values can be referred to in terms of intrinsic and extrinsic value, respectively (Barto et al., 2013; Schwartenbeck et al., 2019).”

      Oudeyer, P. Y., & Kaplan, F. (2007). What is intrinsic motivation? A typology of computational approaches. Frontiers in neurorobotics, 1, 108.

      Schmidhuber, J. (2010). Formal theory of creativity, fun, and intrinsic motivation (1990–2010). IEEE transactions on autonomous mental development, 2(3), 230-247.

      Barto, A., Mirolli, M., & Baldassarre, G. (2013). Novelty or surprise?. Frontiers in psychology, 4, 61898.

      Schwartenbeck, P., Passecker, J., Hauser, T. U., FitzGerald, T. H., Kronbichler, M., & Friston, K. J. (2019). Computational mechanisms of curiosity and goal-directed exploration. elife, 8, e41703.
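
      The two decompositions referred to in the passages above can be written out as follows. This is a sketch of the standard form found in the active inference literature, not a transcription of the manuscript's Eq. 9, which the response describes as having three weighted terms. The notation is assumed here: $\pi$ is a policy, $s$ hidden states, $o$ outcomes, $Q$ the agent's approximate posterior, $P(o)$ the prior preferences over outcomes, and the equality between the two lines assumes $Q(o, s \mid \pi) = P(o \mid s)\, Q(s \mid \pi)$.

```latex
\begin{aligned}
G(\pi)
  &= \underbrace{D_{\mathrm{KL}}\big[\,Q(o \mid \pi)\,\|\,P(o)\,\big]}_{\text{risk}}
   \;+\; \underbrace{\mathbb{E}_{Q(s \mid \pi)}\big[\,\mathrm{H}[P(o \mid s)]\,\big]}_{\text{ambiguity}} \\
  &= -\,\underbrace{\mathbb{E}_{Q(o \mid \pi)}\Big[\,D_{\mathrm{KL}}\big[\,Q(s \mid o, \pi)\,\|\,Q(s \mid \pi)\,\big]\Big]}_{\text{expected information gain}}
   \;-\; \underbrace{\mathbb{E}_{Q(o \mid \pi)}\big[\,\ln P(o)\,\big]}_{\text{expected value}}
\end{aligned}
```

      Read this way, a policy with low expected free energy either resolves uncertainty (large information gain, low ambiguity) or leads to preferred outcomes (high expected value, low risk), which matches the epistemic and pragmatic readings given in the text.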

      Comment 2:

      The description of the modelling of choice behaviour needs to be unpacked and motivated more carefully. Perhaps along the following lines:

      "To assess the evidence for active inference over reinforcement learning, we fit active inference and reinforcement learning models to the choice behaviour of each subject. Effectively, this involved optimising the free parameters of active inference and reinforcement learning models to maximise the likelihood of empirical choices. The resulting (marginal) likelihood was then used as the evidence for each model. The free parameters for the active inference model scaled the contribution of the three terms that constitute the expected free energy (in Equation 6). These coefficients can be regarded as precisions that characterise each subjects' prior beliefs about contingencies and rewards. For example, increasing the precision or the epistemic value associated with model parameters means the subject would update her beliefs about reward contingencies more quickly than a subject who has precise prior beliefs about reward distributions. Similarly, subjects with a high precision over prior preferences or extrinsic value can be read as having more precise beliefs that she will be rewarded. The free parameters for the reinforcement learning model included..."

      Response 2: We deeply thank you for your comments and corresponding suggestions about our description of the behavioral modelling. In response to your identified weaknesses and suggestions, we have added corresponding content in the Results section (Behavioral results, line 279-293):

      “To assess the evidence for active inference over reinforcement learning, we fit active inference (Eq.9), model-free reinforcement learning, and model-based reinforcement learning models to the behavioral data of each participant. This involved optimizing the free parameters of active inference and reinforcement learning models. The resulting likelihood was used to calculate the Bayesian Information Criterion (BIC) (Vrieze 2012) as the evidence for each model. The free parameters for the active inference model (AL, AI, EX, prior, and α) scaled the contribution of the three terms that constitute the expected free energy in Eq.9. These coefficients can be regarded as precisions that characterize each participant's prior beliefs about contingencies and rewards. For example, increasing α means participants would update their beliefs about reward contingencies more quickly, increasing AL means participants would like to reduce ambiguity more, and increasing AI means participants would like to learn the hidden state of the environment and avoid risk more. The free parameters for the model-free reinforcement learning model are the learning rate α and the temperature parameter γ, and the free parameters for the model-based reinforcement learning model are the learning rate α, the temperature parameter γ and prior (the details for the model-free reinforcement learning model can be seen in Eq.S1-11 and the details for the model-based reinforcement learning model can be seen in Eq.S12-23 in the Supplementary Method). The parameter fitting for these three models was conducted using the 'BayesianOptimization' package in Python (Frazier 2018), first randomly sampling 1000 times and then iterating for an additional 1000 times.”

      Vrieze, S. I. (2012). Model selection and psychological theory: a discussion of the differences between the Akaike information criterion (AIC) and the Bayesian information criterion (BIC). Psychological methods, 17(2), 228.

      Frazier, P. I. (2018). A tutorial on Bayesian optimization. arXiv preprint arXiv:1807.02811.
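
      As a rough illustration of the model comparison step described above, the sketch below computes BIC from a maximised log-likelihood using the textbook formula BIC = k ln(n) - 2 ln(L). The parameter counts follow the response (five for active inference, two for model-free and three for model-based reinforcement learning); the log-likelihood values and the trial count are hypothetical placeholders, not results from the study, and the function name `bic` is our own.

```python
import math


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical maximised log-likelihoods for a single participant.
# "k" is the number of free parameters listed in the response:
#   active inference: AL, AI, EX, prior, alpha   -> 5
#   model-free RL:    alpha, gamma               -> 2
#   model-based RL:   alpha, gamma, prior        -> 3
models = {
    "active_inference": {"log_lik": -95.0, "k": 5},
    "model_free_rl": {"log_lik": -108.0, "k": 2},
    "model_based_rl": {"log_lik": -106.0, "k": 3},
}
n_trials = 200  # hypothetical number of choices entering the likelihood

scores = {name: bic(m["log_lik"], m["k"], n_trials) for name, m in models.items()}
best = min(scores, key=scores.get)  # model with the lowest BIC

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: BIC = {score:.2f}")
print("preferred model:", best)
```

      Because the penalty term grows with the number of parameters, a five-parameter model is preferred only when its likelihood gain outweighs the extra k ln(n) cost, which is why the parameter counts matter when comparing models of different complexity.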

      Comment 3:

      In terms of the time-dependent correlations with expected free energy - and its constituent terms - I think the report would benefit from overviewing these analyses with something like the following:

      "In the final analysis of the neuronal correlates of belief updating - as quantified by the epistemic and intrinsic values of expected free energy - we present a series of analyses in source space. These analyses tested for correlations between constituent terms in expected free energy and neuronal responses in source space. These correlations were over trials (and subjects). Because we were dealing with two-second timeseries, we were able to identify the periods of time during decision-making when the correlates were expressed.

      In these analyses, we focused on the induced power of neuronal activity at each point in time, at each brain source. To illustrate the functional specialisation of these neuronal correlates, we present whole-brain maps of correlation coefficients and pick out the most significant correlation for reporting fluctuations in selected correlations over two-second periods. These analyses are presented in a descriptive fashion to highlight the nature and variety of the neuronal correlates, which we unpack in relation to the existing EEG literature in the discussion. Note that we did not attempt to correct for multiple comparisons; largely, because the correlations observed were sustained over considerable time periods, which would be almost impossible under the null hypothesis of no correlations."

      Response 3: We deeply thank you for your comments and corresponding suggestions about our description of the regression analysis in the source space. In response to your suggestions, we have added corresponding content in the Results section (EEG results at source level, line 331-347):

      “In the final analysis of the neural correlates of the decision-making process, as quantified by the epistemic and intrinsic values of expected free energy, we presented a series of linear regressions in source space. These analyses tested for correlations over trials between constituent terms in expected free energy (the value of avoiding risk, the value of reducing ambiguity, extrinsic value, and expected free energy itself) and neural responses in source space. Additionally, we also investigated the neural correlate of (the degree of) risk, (the degree of) ambiguity, and prediction error. Because we were dealing with a two-second time series, we were able to identify the periods of time during decision-making when the correlates were expressed. The linear regression was run with the "mne.stats.linear_regression" function in the MNE package (Activity ~ Regressor + Intercept). Activity is the activity amplitude of the EEG signal in the source space and regressor is one of the regressors that we mentioned (e.g., expected free energy, the value of reducing ambiguity, etc.).

      In these analyses, we focused on the induced power of neural activity at each time point, in the brain source space. To illustrate the functional specialization of these neural correlates, we presented whole-brain maps of correlation coefficients and picked out the brain region with the most significant correlation for reporting fluctuations in selected correlations over two-second periods. These analyses were presented in a descriptive fashion to highlight the nature and variety of the neural correlates, which we unpacked in relation to the existing EEG literature in the discussion. Note that we did not attempt to correct for multiple comparisons; largely, because the correlations observed were sustained over considerable time periods, which would be almost impossible under the null hypothesis of no correlations.”
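
      To make the regression and the multiple-comparisons question concrete, the sketch below fits Activity ~ Regressor + Intercept at every time point and then controls the family-wise error rate with a max-statistic permutation test. This is only one coherent option among several and is not the authors' pipeline: it uses plain NumPy rather than MNE, the data are simulated, and the function names `slope_t_values` and `max_stat_permutation` are our own.

```python
import numpy as np


def slope_t_values(activity, regressor):
    """t-statistic of the slope in Activity ~ Regressor + Intercept, per time point.

    activity: array of shape (n_trials, n_times); regressor: shape (n_trials,).
    """
    n_trials = activity.shape[0]
    x = regressor - regressor.mean()        # centring absorbs the intercept
    y = activity - activity.mean(axis=0)
    sxx = np.sum(x ** 2)
    beta = x @ y / sxx                      # least-squares slope, shape (n_times,)
    resid = y - np.outer(x, beta)
    sigma2 = np.sum(resid ** 2, axis=0) / (n_trials - 2)
    return beta / np.sqrt(sigma2 / sxx)


def max_stat_permutation(activity, regressor, n_perm=1000, seed=0):
    """Family-wise corrected p-values from the null distribution of max |t|."""
    rng = np.random.default_rng(seed)
    observed = slope_t_values(activity, regressor)
    null_max = np.empty(n_perm)
    for i in range(n_perm):
        shuffled = rng.permutation(regressor)   # break the trial-wise pairing
        null_max[i] = np.max(np.abs(slope_t_values(activity, shuffled)))
    exceed = null_max[:, None] >= np.abs(observed)[None, :]
    p_corrected = (1 + exceed.sum(axis=0)) / (n_perm + 1)
    return observed, p_corrected


# Simulated data: 120 trials, 50 time points, a true effect in samples 20-29.
rng = np.random.default_rng(1)
n_trials, n_times = 120, 50
regressor = rng.normal(size=n_trials)
activity = rng.normal(size=(n_trials, n_times))
activity[:, 20:30] += regressor[:, None]

t_obs, p_corr = max_stat_permutation(activity, regressor, n_perm=500)
significant = np.flatnonzero(p_corr < 0.05)
print("time points surviving correction:", significant)
```

      Comparing each observed statistic against the permutation distribution of the maximum across all time points yields a single, coherent test over the whole time series, which is the kind of consistency Reviewer #2 asks for. Cluster-based permutation tests follow the same logic but use cluster mass instead of the pointwise maximum.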

      Comment 4:

      There was a slight misdirection in the discussion of priors in the active inference framework. The notion that active inference requires a pre-specification of priors is a common misconception. Furthermore, it misses the point that the utility of Bayesian modelling is to identify the priors that each subject brings to the table. This could be easily addressed with something like the following in the discussion:

      "It is a common misconception that Bayesian approaches to choice behaviour (including active inference) are limited by a particular choice of priors. As illustrated in our fitting of choice behaviour above, priors are a strength of Bayesian approaches in the following sense: under the complete class theorem [5, 6], any pair of choice behaviours and reward functions can be described in terms of ideal Bayesian decision-making with particular priors. In other words, there always exists a description of choice behaviour in terms of some priors. This means that one can, in principle, characterise any given behaviour in terms of the priors that explain that behaviour. In our example, these were effectively priors over the precision of various preferences or beliefs about contingencies that underwrite expected free energy."

      Response 4: We deeply thank you for your comments and corresponding suggestions about the prior of Bayesian methods. In response to your suggestions, we have added corresponding content in the Discussion section (The strength of the active inference framework in decision-making, line 447-453):

      “However, it may be the opposite. As illustrated in our fitting results, priors can be a strength of Bayesian approaches. Under the complete class theorem (Wald 1947; Brown 1981), any pair of behavioral data and reward functions can be described in terms of ideal Bayesian decision-making with particular priors. In other words, there always exists a description of behavioral data in terms of some priors. This means that one can, in principle, characterize any given behavioral data in terms of the priors that explain that behavior. In our example, these were effectively priors over the precision of various preferences or beliefs about contingencies that underwrite expected free energy.”

      Wald, A. (1947). An essentially complete class of admissible decision functions. The Annals of Mathematical Statistics, 549-555.

      Brown, L. D. (1981). A complete class theorem for statistical problems with finite sample spaces. The Annals of Statistics, 1289-1300.

      Reviewer #2 (Public Review):

      Summary:

      Zhang and colleagues use a combination of behavioral, neural, and computational analyses to test an active inference model of exploration in a novel reinforcement learning task.

      Strengths:

      The paper addresses an important question (validation of active inference models of exploration). The combination of behavior, neuroimaging, and modeling is potentially powerful for answering this question.

      Response: We want to express our sincere gratitude for your thorough review of our work and for the valuable comments you have provided. Your attention to detail and dedication to improving the quality of the work are truly commendable. Your feedback has been invaluable in guiding us towards revisions that will strengthen the work. We have made targeted modifications based on most of the comments. However, due to factors such as time and energy constraints, we have not added corresponding analyses for several comments.

      Comment 1:

      The paper does not discuss relevant work on contextual bandits by Schulz, Collins, and others. It also does not mention the neuroimaging study of Tomov et al. (2020) using a risky/safe bandit task.

      Response 1:

      We deeply thank you for your suggestions about the relevant work. We now discuss and cite these representative papers in the Introduction section (line 42-55):

      “The decision-making process frequently involves grappling with varying forms of uncertainty, such as ambiguity - the kind of uncertainty that can be reduced through sampling, and risk - the inherent uncertainty (variance) presented by a stable environment. Studies have investigated these different forms of uncertainty in decision-making, focusing on their neural correlates (Daw et al., 2006; Badre et al., 2012; Cavanagh et al., 2012).

      These studies utilized different forms of multi-armed bandit tasks, e.g., the restless multi-armed bandit tasks (Daw et al., 2006; Guha et al., 2010), risky/safe bandit tasks (Tomov et al., 2020; Fan et al., 2023; Payzan-LeNestour et al., 2013), and contextual multi-armed bandit tasks (Schulz et al., 2015; Schulz et al., 2015; Molinaro et al., 2023). However, these tasks either separate risk from ambiguity in uncertainty, or separate action from state (perception). In our work, we develop a contextual multi-armed bandit task to enable participants to actively reduce ambiguity, avoid risk, and maximize rewards using various policies (see Section 2.2 and Figure 4(a)). Our task makes it possible to study whether the brain represents these different types of uncertainty distinctly (Levy et al., 2010) and whether the brain represents both the value of reducing uncertainty and the degree of uncertainty. The active inference framework presents a theoretical approach to investigate these questions. Within this framework, uncertainties can be reduced to ambiguity and risk. Ambiguity is represented by the uncertainty about model parameters associated with choosing a particular action, while risk is signified by the variance of the environment's hidden states. The value of reducing ambiguity, the value of avoiding risk, and extrinsic value together constitute expected free energy (see Section 2.1).”

      Daw, N. D., O'doherty, J. P., Dayan, P., Seymour, B., & Dolan, R. J. (2006). Cortical substrates for exploratory decisions in humans. Nature, 441(7095), 876-879.

      Badre, D., Doll, B. B., Long, N. M., & Frank, M. J. (2012). Rostrolateral prefrontal cortex and individual differences in uncertainty-driven exploration. Neuron, 73(3), 595-607.

      Cavanagh, J. F., Figueroa, C. M., Cohen, M. X., & Frank, M. J. (2012). Frontal theta reflects uncertainty and unexpectedness during exploration and exploitation. Cerebral cortex, 22(11), 2575-2586.

      Guha, S., Munagala, K., & Shi, P. (2010). Approximation algorithms for restless bandit problems. Journal of the ACM (JACM), 58(1), 1-50.

      Tomov, M. S., Truong, V. Q., Hundia, R. A., & Gershman, S. J. (2020). Dissociable neural correlates of uncertainty underlie different exploration strategies. Nature communications, 11(1), 2371.

      Fan, H., Gershman, S. J., & Phelps, E. A. (2023). Trait somatic anxiety is associated with reduced directed exploration and underestimation of uncertainty. Nature Human Behaviour, 7(1), 102-113.

      Payzan-LeNestour, E., Dunne, S., Bossaerts, P., & O’Doherty, J. P. (2013). The neural representation of unexpected uncertainty during value-based decision making. Neuron, 79(1), 191-201.

      Schulz, E., Konstantinidis, E., & Speekenbrink, M. (2015, April). Exploration-exploitation in a contextual multi-armed bandit task. In International conference on cognitive modeling (pp. 118-123).

      Schulz, E., Konstantinidis, E., & Speekenbrink, M. (2015, November). Learning and decisions in contextual multi-armed bandit tasks. In CogSci.

      Molinaro, G., & Collins, A. G. (2023). Intrinsic rewards explain context-sensitive valuation in reinforcement learning. PLoS Biology, 21(7), e3002201.

      Levy, I., Snell, J., Nelson, A. J., Rustichini, A., & Glimcher, P. W. (2010). Neural representation of subjective value under risk and ambiguity. Journal of neurophysiology, 103(2), 1036-1047.

      Comment 2:

      The statistical reporting is inadequate. In most cases, only p-values are reported, not the relevant statistics, degrees of freedom, etc. It was also not clear if any corrections for multiple comparisons were applied. Many of the EEG results are described as "strong" or "robust" with significance levels of p<0.05; I am skeptical in the absence of more details, particularly given the fact that the corresponding plots do not seem particularly strong to me.

      Response 2: We deeply thank you for your comments about our statistical reporting. We have optimized the fitting model and rerun all the statistical analyses. As can be seen (Figure 6, 7, 8, S3, S4, S5), the new regression results are significantly improved compared to the previous ones. Due to the limitation of space, we place the other relevant statistical results, including t-values, std err, etc., on our GitHub (https://github.com/andlab-um/FreeEnergyEEG). Currently, we have not conducted multiple comparison corrections based on Reviewer 1’s comments (Comments 3) “Note that we did not attempt to correct for multiple comparisons; largely, because the correlations observed were sustained over considerable time periods, which would be almost impossible under the null hypothesis of no correlations”.

      Author response image 1.

      Comment 3:

      The authors compare their active inference model to a "model-free RL" model. This model is not described anywhere, as far as I can tell. Thus, I have no idea how it was fit, how many parameters it has, etc. The active inference model fitting is also not described anywhere. Moreover, you cannot compare models based on log-likelihood, unless you are talking about held-out data. You need to penalize for model complexity. Finally, even if active inference outperforms a model-free RL model (doubtful given the error bars in Fig. 4c), I don't see how this is strong evidence for active inference per se. I would want to see a much more extensive model comparison, including model-based RL algorithms which are not based on active inference, as well as model recovery analyses confirming that the models can actually be distinguished on the basis of the experimental data.

      Response 3: We deeply thank you for your comments about the model comparison details. We previously omitted some information about the comparison model, as classical reinforcement learning is not the focus of our work, so we put the specific details in the supplementary materials. Now we have placed relevant information in the main text (see the part we have highlighted in yellow). We have now added the relevant information regarding the model comparison in the Results section (Behavioral results, line 279-293):

      “To assess the evidence for active inference over reinforcement learning, we fit active inference (Eq.9), model-free reinforcement learning, and model-based reinforcement learning models to the behavioral data of each participant. This involved optimizing the free parameters of active inference and reinforcement learning models. The resulting likelihood was used to calculate the Bayesian Information Criterion (BIC) as the evidence for each model. The free parameters for the active inference model (AL, AI, EX, prior, and α) scaled the contribution of the three terms that constitute the expected free energy in Eq.9. These coefficients can be regarded as precisions that characterize each participant's prior beliefs about contingencies and rewards. For example, increasing α means participants would update their beliefs about reward contingencies more quickly, increasing AL means participants would like to reduce ambiguity more, and increasing AI means participants would like to learn the hidden state of the environment and avoid risk more. The free parameters for the model-free reinforcement learning model are the learning rate α and the temperature parameter γ and the free parameters for the model-based are the learning rate α, the temperature parameter γ and prior (the details for the model-free reinforcement learning model can be found in Eq.S1-11 and the details for the model-based reinforcement learning model can be found in Eq.S12-23 in the Supplementary Method). The parameter fitting for these three models was conducted using the `BayesianOptimization' package in Python, first randomly sampling 1000 times and then iterating for an additional 1000 times.”
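
      As a minimal sketch of the BIC comparison described above: the parameter counts follow the response text (five for active inference, two for model-free, three for model-based), while the log-likelihood values are invented placeholders rather than results from the study. Taking the number of observations to be the 120 trials is also a simplifying assumption, since each trial contains two choices.

```python
import math

def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

# Hypothetical fits for one participant: (maximized log-likelihood, free parameters).
# The log-likelihoods are illustrative placeholders, not values from the paper.
fits = {
    "active_inference": (-60.0, 5),  # AL, AI, EX, prior, alpha
    "model_free_rl": (-75.0, 2),     # learning rate, temperature
    "model_based_rl": (-70.0, 3),    # learning rate, temperature, prior
}
n_trials = 120  # assumed number of observations per participant

scores = {name: bic(ll, k, n_trials) for name, (ll, k) in fits.items()}
best_model = min(scores, key=scores.get)  # the model with the lowest BIC
```

      Because the penalty term grows with the number of parameters, a five-parameter model must improve the log-likelihood by enough to offset its extra complexity before it is preferred.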

      We have now incorporated model-based reinforcement learning into our comparison models and placed the descriptions of both model-free and model-based reinforcement learning algorithms in the supplementary materials. We have also changed the criterion for model comparison to Bayesian Information Criterion. As indicated by the results, the performance of the active inference model significantly outperforms both comparison models.

      Sorry, we didn't do model recovery before, but now we have placed the relevant results in the supplementary materials. From the result figures, we can see that each model fits its own generated simulated data well:

      “To demonstrate how reliable our models are (the active inference model, model-free reinforcement learning model, and model-based reinforcement learning model), we run some simulation experiments for model recovery. We use these three models, with their own fitting parameters, to generate some simulated data. Then we will fit all three sets of data using these three models.

      The model recovery results are shown in Fig.S6. This is the confusion matrix of models: the percentage of all subjects simulated based on a certain model that is fitted best by a certain model. The goodness-of-fit was compared using the Bayesian Information Criterion. We can see that the result of model recovery is very good, and the simulated data generated by a model can be best explained by this model.”
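
      The confusion matrix described above could be assembled along the following lines. This is a sketch only: the function name, the model labels, and the BIC values are hypothetical, and the authors' actual pipeline may differ.

```python
from collections import Counter

def recovery_matrix(records, model_names):
    """records: iterable of (generating_model, {fitted_model: BIC}) pairs.
    Returns {generating: {fitted: fraction of simulated subjects best fit}}."""
    counts = {g: Counter() for g in model_names}
    for generating, bic_by_model in records:
        winner = min(bic_by_model, key=bic_by_model.get)  # lowest BIC wins
        counts[generating][winner] += 1
    matrix = {}
    for g in model_names:
        total = sum(counts[g].values())
        matrix[g] = {f: (counts[g][f] / total if total else 0.0) for f in model_names}
    return matrix

models = ["AI", "MF", "MB"]
# Made-up BIC values for four simulated subjects (illustration only).
records = [
    ("AI", {"AI": 140.0, "MF": 160.0, "MB": 155.0}),
    ("AI", {"AI": 150.0, "MF": 149.0, "MB": 158.0}),
    ("MF", {"AI": 170.0, "MF": 150.0, "MB": 156.0}),
    ("MB", {"AI": 165.0, "MF": 160.0, "MB": 152.0}),
]
recovery = recovery_matrix(records, models)
```

      Good recovery corresponds to rows whose mass sits on the diagonal; off-diagonal mass (as in the first row of this toy example) indicates simulated subjects that a different model explains better.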

      Author response image 2.

      Comment 4:

      Another aspect of the behavioral modeling that's missing is a direct descriptive comparison between model and human behavior, beyond just plotting log-likelihoods (which are a very impoverished measure of what's going on).

      Response 4: We deeply thank you for your comments about the comparison between the model and human behavior. Due to the slight differences between our simulation experiments and real behavioral experiments (the "you can ask" stage), we cannot directly compare the model and participants' behaviors. However, we can observe that in the main text's simulation experiment (Figure 3), the active inference agent's behavior is highly consistent with humans (Figure 4), exhibiting an effective exploration strategy and a desire to reduce uncertainty. Moreover, we have included two additional simulation experiments in the supplementary materials, which demonstrate that active inference may potentially fit a wide range of participants' behavioral strategies.

      Author response image 3.

      (An active inference agent with AL=AI=EX=0. It can accomplish tasks efficiently like a human being, reducing the uncertainty of the environment and maximizing the reward.)

      Author response image 4.

      (An active inference agent with AL=AI=0, EX=10. It will only pursue immediate rewards (not choosing the "Cue" option due to additional costs), but it can also gradually optimize its strategy due to random effects.)

      Author response image 5.

      (An active inference agent with EX=0, AI=AL=10. It will only pursue environmental information to reduce the uncertainty of the environment. Even in "Context 2" where immediate rewards are scarce, it will continue to explore.)

      Figure (a) shows the decision-making of active inference agents in the Stay-Cue choice. Blue corresponds to agents choosing the "Cue" option and acquiring "Context 1"; orange corresponds to agents choosing the "Cue" option and acquiring "Context 2"; purple corresponds to agents choosing the "Stay" option and not knowing the information about the hidden state of the environment. The shaded areas below correspond to the probability of the agents making the respective choices.

      Figure (b) shows the decision-making of active inference agents in the Stay-Cue choice. The shaded areas below correspond to the probability of the agents making the respective choices.

      Figure (c) shows the rewards obtained by active inference agents.

      Figure (d) shows the reward prediction errors of active inference agents.

      Figure (e) shows the reward predictions of active inference agents for the "Risky" path in "Context 1" and "Context 2".

      Comment 5:

      The EEG results are intriguing, but it wasn't clear that these provide strong evidence specifically for the active inference model. No alternative models of the EEG data are evaluated.

      Overall, the central claim in the Discussion ("we demonstrated that the active inference model framework effectively describes real-world decision-making") remains unvalidated in my opinion.

      Response 5: We deeply thank you for your comments. We applied the active inference model to analyze EEG results because it best fit the participants' behavioral data among our models, including the newly added results. Further, our EEG results serve only to verify that the active inference model can be used to analyze the neural mechanisms of decision-making in uncertain environments (if possible, we could certainly design a better reinforcement learning model with a similar exploration strategy). We aim to emphasize the consistency between active inference and human decision-making in uncertain environments, as we have discussed in the article. Active inference emphasizes both perception and action, which is also what we wish to highlight: during the decision-making process, participants not only passively receive information, but also actively adopt different strategies to reduce uncertainty and maximize rewards.

      Reviewer #3 (Public Review):

      Summary:

      This paper aims to investigate how the human brain represents different forms of value and uncertainty that participate in active inference within a free-energy framework, in a two-stage decision task involving contextual information sampling, and choices between safe and risky rewards, which promotes a shift from exploration to exploitation. They examine neural correlates by recording EEG and comparing activity in the first vs second half of trials and between trials in which subjects did and did not sample contextual information, and perform a regression with free-energy-related regressors against data "mapped to source space." Their results show effects in various regions, which they take to indicate that the brain does perform this task through the theorised active inference scheme.

      Strengths:

      This is an interesting two-stage paradigm that incorporates several interesting processes of learning, exploration/exploitation, and information sampling. Although scalp/brain regions showing sensitivity to the active-inference-related quantities do not necessarily suggest what role they play, it can be illuminating and useful to search for such effects as candidates for further investigation. The aims are ambitious, and methodologically it is impressive to include extensive free-energy theory, behavioural modelling, and EEG source-level analysis in one paper.

      Response: We would like to express our heartfelt thanks to you for carefully reviewing our work and offering insightful feedback. Your attention to detail and commitment to enhancing the overall quality of our work are deeply admirable. Your input has been extremely helpful in guiding us through the necessary revisions to enhance the work. We have implemented focused changes based on a majority of your comments. Nevertheless, owing to limitations such as time and resources, we have not included corresponding analyses for a few comments.

      Comment 1:

      Though I could surmise the above general aims, I could not follow the important details of what quantities were being distinguished and sought in the EEG and why. Some of this is down to theoretical complexity - the dizzying array of constructs and terms with complex interrelationships, which may simply be part and parcel of free-energy-based theories of active inference - but much of it is down to missing or ambiguous details.

      Response 1: We deeply thank you for your comments about our work’s readability. We have significantly revised the descriptions of active inference, models, research questions, etc. Focusing on active inference and the free energy principle, we have added relevant basic descriptions and unified the terminology. We have added information related to model comparison in the main text and supplementary materials. We presented our regression results in clearer language. Our research focused on the brain's representation of decision-making in uncertain environments, including expected free energy, the value of reducing ambiguity, the value of avoiding risk, extrinsic value, ambiguity, and risk.

      Comment 2:

      In general, an insufficient effort has been made to make the paper accessible to readers not steeped in the free energy principle and active inference. There are critical inconsistencies in key terminology; for example, the introduction states that aim 1 is to distinguish the EEG correlates of three different types of uncertainty: ambiguity, risk, and unexpected uncertainty. But the abstract instead highlights distinctions in EEG correlates between "uncertainty... and... risk" and between "expected free energy .. and ... uncertainty." There are also inconsistencies in mathematical labelling (e.g. in one place 'p(s|o)' and 'q(s)' swap their meanings from one sentence to the very next).

      Response 2: We deeply thank you for your comments about the problem of inconsistent terminology. First, we have unified the symbols and letters (P, Q, s, o, etc.) that appeared in the article and described their respective meanings more clearly. We have also revised the relevant expressions of "uncertainty" throughout the text. In our work, uncertainty refers to ambiguity and risk. Ambiguity can be reduced through continuous sampling and is referred to as uncertainty about model parameters in our work. Risk, on the other hand, is the inherent variance of the environment and cannot be reduced through sampling, which is referred to as uncertainty about hidden states in our work. In the analysis of the results, we focused on how the brain encodes the value of reducing ambiguity (Figure 8), the value of avoiding risk (Figure 6), and (the degree of) ambiguity (Figure S5) during action selection. We also analyzed how the brain encodes reducing ambiguity and avoiding risk during belief update (Figure 7).

      Comment 3:

      Some basic but important task information is missing, and makes a huge difference to how decision quantities can be decoded from EEG. For example:

      - How do the subjects press the left/right buttons - with different hands or different fingers on the same hand?

      Response 3: We deeply thank you for your comments about the missing task information. We have added the relevant content in the Methods section (Contextual two-armed bandit task and Data collection, line 251-253):

      “Each stage was separated by a jitter ranging from 0.6 to 1.0 seconds. The entire experiment consists of a single block with a total of 120 trials. The participants are required to use any two fingers of one hand to press the buttons (left arrow and right arrow on the keyboard).”

      Comment 4:

      - Was the presentation of the Stay/cue and safe/risky options on the left/right sides counterbalanced? If not, decisions can be formed well in advance especially once a policy is in place.

      Response 4: The presentation of the Stay/cue and safe/risky options on the left/right sides was not counterbalanced. It is true that participants may have made decisions ahead of time. However, to better study the state of participants during decision-making, our choice stages consist of two parts. In the first two seconds, we ask participants to consider which option they would choose, and after these two seconds, participants are allowed to make their choice (by pressing the button).

      We also updated the figure of the experiment procedure as below (We circled the time that the participants spent on making decisions).

      Author response image 6.

      Comment 5:

      - What were the actual reward distributions ("magnitude X with probability p, magnitude y with probability 1-p") in the risky option?

      Response 5: We deeply thank you for your comments about the missing task information. We have placed the relevant content in the Methods section (Contextual two-armed bandit task and Data collection, line 188-191):

      “The actual reward distribution of the risky path in "Context 1" was [+12 (55%), +9 (25%), +6 (10%), +3 (5%), +0 (5%)] and the actual reward distribution of the risky path in "Context 2" was [+12 (5%), +9 (5%), +6 (10%), +3 (25%), +0 (55%)].”
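
      For orientation, the mean and variance implied by these two distributions can be computed directly. The short sketch below uses only the probabilities quoted above; the variable and function names are illustrative.

```python
rewards = [12, 9, 6, 3, 0]
context_probs = {
    "Context 1": [0.55, 0.25, 0.10, 0.05, 0.05],
    "Context 2": [0.05, 0.05, 0.10, 0.25, 0.55],
}

def mean_and_variance(values, probs):
    """Expected value and variance of a discrete reward distribution."""
    mean = sum(v * p for v, p in zip(values, probs))
    variance = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
    return mean, variance

summary = {name: mean_and_variance(rewards, probs)
           for name, probs in context_probs.items()}
```

      Up to floating-point rounding, the risky path has an expected reward of 9.6 in "Context 1" and 2.4 in "Context 2", against the fixed 6 of the safe path, and the two contexts share the same variance (11.34) because one distribution is the mirror image of the other.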

      Comment 6:

      The EEG analysis is not sufficiently detailed and motivated.

      For example,

      - why the high lower-filter cutoff of 1 Hz, and shouldn't it be acknowledged that this removes from the EEG any sustained, iteratively updated representation that evolves with learning across trials?

      Response 6: We deeply thank you for your comments about our EEG analysis. The 1Hz high-pass filter may indeed filter out some useful information. We chose a 1Hz high-pass filter to filter out most of the noise and prevent the noise from affecting our results analysis. Additionally, there are also many decision-related works that have applied 1Hz high-pass filtering in EEG data preprocessing (Yau et al., 2021; Cortes et al., 2021; Wischnewski et al., 2022; Schutte et al., 2017; Mennella et al., 2020; Giustiniani et al., 2020).

      Yau, Y., Hinault, T., Taylor, M., Cisek, P., Fellows, L. K., & Dagher, A. (2021). Evidence and urgency related EEG signals during dynamic decision-making in humans. Journal of Neuroscience, 41(26), 5711-5722.

      Cortes, P. M., García-Hernández, J. P., Iribe-Burgos, F. A., Hernández-González, M., Sotelo-Tapia, C., & Guevara, M. A. (2021). Temporal division of the decision-making process: An EEG study. Brain Research, 1769, 147592.

      Wischnewski, M., & Compen, B. (2022). Effects of theta transcranial alternating current stimulation (tACS) on exploration and exploitation during uncertain decision-making. Behavioural Brain Research, 426, 113840.

      Schutte, I., Kenemans, J. L., & Schutter, D. J. (2017). Resting-state theta/beta EEG ratio is associated with reward-and punishment-related reversal learning. Cognitive, Affective, & Behavioral Neuroscience, 17, 754-763.

      Mennella, R., Vilarem, E., & Grèzes, J. (2020). Rapid approach-avoidance responses to emotional displays reflect value-based decisions: Neural evidence from an EEG study. NeuroImage, 222, 117253.

      Giustiniani, J., Nicolier, M., Teti Mayer, J., Chabin, T., Masse, C., Galmès, N., ... & Gabriel, D. (2020). Behavioral and neural arguments of motivational influence on decision making during uncertainty. Frontiers in Neuroscience, 14, 583.

      Comment 7:

      - Since the EEG analysis was done using an array of free-energy-related variables in a regression, was multicollinearity checked between these variables?

      Response 7: We deeply thank you for your comments about our regression. Indeed, we didn't specify our regression formula in the main text. We conducted regression on one variable each time, so there was no need for a multicollinearity check. We have now added the relevant content in the Results section (“EEG results at source level” section, line 337-340):

      “The linear regression was run by the "mne.stats.linear_regression" function in the MNE package (Activity ~ Regressor + Intercept). Activity is the activity amplitude of the EEG signal in the source space and regressor is one of the regressors that we mentioned (e.g., expected free energy, the value of reducing ambiguity, etc.).”
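
      The formula Activity ~ Regressor + Intercept amounts to an ordinary least-squares fit with a single predictor, repeated for each source location and time point. The sketch below is a plain-Python stand-in for one such fit, not the MNE implementation; the function name and the data are hypothetical.

```python
def fit_single_regressor(activity, regressor):
    """Ordinary least squares for activity ~ regressor + intercept.
    Both arguments are equal-length sequences with one value per trial."""
    n = len(regressor)
    mean_x = sum(regressor) / n
    mean_y = sum(activity) / n
    sxx = sum((x - mean_x) ** 2 for x in regressor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(regressor, activity))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy data: per-trial regressor values and a noiseless linear "activity".
regressor = [0.0, 1.0, 2.0, 3.0, 4.0]
activity = [1.0 + 2.0 * x for x in regressor]
slope, intercept = fit_single_regressor(activity, regressor)
```

      With only one regressor in each fit, the design matrix contains just that regressor and a constant column, which is why no multicollinearity arises within any single regression.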

      Comment 8:

      - In the initial comparison of the first/second half, why just 5 clusters of electrodes, and why these particular clusters?

      Response 8: We deeply thank you for your comments about our sensor-level analysis. These five clusters are relatively common scalp EEG regions to analyze (left frontal, right frontal, central, left parietal, and right parietal), and we referred to previous work that analyzed these five clusters of electrodes (Laufs et al., 2006; Ray et al., 1985; Cole et al., 1985). In addition, our work pays more attention to the analysis in source space, exploring the corresponding functions of specific brain regions based on active inference models.

      Laufs, H., Holt, J. L., Elfont, R., Krams, M., Paul, J. S., Krakow, K., & Kleinschmidt, A. (2006). Where the BOLD signal goes when alpha EEG leaves. Neuroimage, 31(4), 1408-1418.

      Ray, W. J., & Cole, H. W. (1985). EEG activity during cognitive processing: influence of attentional factors. International Journal of Psychophysiology, 3(1), 43-48.

      Cole, H. W., & Ray, W. J. (1985). EEG correlates of emotional tasks related to attentional demands. International Journal of Psychophysiology, 3(1), 33-41.

      Comment 9:

      How many different variables are systematically different in the first vs second half, and how do you rule out less interesting time-on-task effects such as engagement or alertness? In what time windows are these amplitudes being measured?

      Response 9 (and the Response for Weaknesses 11): There were no systematic differences between the first half and the second half of the trials, with the only difference being the participants' experience. In the second half, participants had a better understanding of the reward distribution of the task (less ambiguity). The simulation results illustrate this well.

      Author response image 7.

      As shown in Figure (a), agents can only learn about the hidden state of the environment ("Context 1" (green) or "Context 2" (orange)) by choosing the "Cue" option. If agents choose the "Stay" option, they will not be able to know the hidden state of the environment (purple). The risk of agents is only related to whether they choose the "Cue" option, not the number of rounds. Figure (b) shows the Safe-Risky choices of agents, and Figure (e) is the reward prediction of agents for the "Risky" path in "Context 1" and "Context 2". We can see that agents update the expected reward and reduce ambiguity by sampling the "Risky" path. The ambiguity of agents is not related to the "Cue" option, but to the number of times they sample the "Risky" path (rounds).

      In our choosing stages, participants were required to think about their choices for the first two seconds (during which they could not press buttons). Then, they were asked to make their choices (press buttons) within the next two seconds. This setup effectively kept participants' attention focused on the task. The two seconds of the “Second choice” stage during which participants decide which option to choose (and cannot press buttons) form the time window measured for the sensor-level analysis.

      Comment 10:

      In the comparison of asked and not-asked trials, what trial stage and time window is being measured?

      Response 10: We have added relevant descriptions in the main text. The two seconds of the “Second choice” stage during which participants decide which option to choose (and cannot press buttons) form the time window measured for the sensor-level analysis.

      Author response image 8.

      Comment 11:

      Again, how many different variables, of the many estimated per trial in the active inference model, are different in the asked and not-asked trials, and how can you know which of these differences is the one reflected in the EEG effects?

      Response 11: The difference between asked trials and not-asked trials lies only in whether participants know the specific context of the risky path (the level of risk for the participants). A simple comparison indeed cannot tell us which of these differences is reflected in the EEG effects. Therefore, we subsequently conducted model-based regression analysis in the source space.

      Comment 12:

      The authors choose to interpret that on not-asked trials the subjects are more uncertain because the cue doesn't give them the context, but you could equally argue that they don't ask because they are more certain of the possible hidden states.

      Response 12: Our task design involves randomly varying the context of the risky path. Only by choosing to inquire can participants learn about the context. Participants can only become increasingly certain about the reward distribution of different contexts of the risky path, but cannot determine which specific context it is. Here are the instructions for the task that we will tell the participants (line 226-231).

      "You are on a quest for apples in a forest, beginning with 5 apples. You encounter two paths: 1) The left path offers a fixed yield of 6 apples per excursion. 2) The right path offers a probabilistic reward of 0/3/6/9/12 apples, and it has two distinct contexts, labeled "Context 1" and "Context 2," each with a different reward distribution. Note that the context associated with the right path will randomly change in each trial. Before selecting a path, a ranger will provide information about the context of the right path ("Context 1" or "Context 2") in exchange for an apple. The more apples you collect, the greater your monetary reward will be."

      Comment 13:

      - The EEG regressors are not fully explained. For example, an "active learning" regressor is listed as one of the 4 at the beginning of section 3.3, but it is the first mention of this term in the paper and the term does not arise once in the methods.

      Response 13: We have accordingly revised the relevant content in the main text (as in Eq.8). Our regressors now include expected free energy, the value of reducing ambiguity, the value of avoiding risk, extrinsic value, prediction error, (the degree of) ambiguity, reducing ambiguity, and avoiding risk.

      Comment 14:

      - In general, it is not clear how one can know that the EEG results reflect that the brain is purposefully encoding these very parameters while implementing this very mechanism, and not other, possibly simpler, factors that correlate with them since there is no engagement with such potential confounds or alternative models. For example, a model-free reinforcement learning model is fit to behaviour for comparison. Why not the EEG?

      Response 14: We deeply thank you for your comments. Due to factors such as time and effort, and because the active inference model best fits the behavioral data of the participants, we did not use other models to analyze the EEG data. At both the sensor and source level, we observed the EEG signal and brain regions that can encode different levels of uncertainties (risk and ambiguity). The brain's uncertainty driven exploration mechanism cannot be explained solely by a simple model-free reinforcement learning approach.

      Recommendations for the authors:

      Response: We have made point-to-point revisions according to the reviewer's recommendations, and as these revisions are relatively minor, we have only responded to the longer recommendations here.

      Reviewer #1 (Recommendations For The Authors)

      I enjoyed reading this sophisticated study of decision-making. I thought your implementation of active inference and the subsequent fitting to choice behaviour - and study of the neuronal (EEG) correlates - was impressive. As noted in my comments on strengths and weaknesses, some parts of your manuscript were difficult to read because of slight collapses in grammar and an inconsistent use of terms when referring to the mathematical quantities. In addition to the paragraphs I have suggested, I would recommend the following minor revisions to your text. In addition, you will have to fill in some of the details that were missing from the current version of the manuscript. For example:

      Recommendation 1:

      Which RL model did you use to fit the behavioural data? What were its free parameters?

      Response 1: We have now added information related to the comparison models in the behavioral results and supplementary materials. We applied both simple model-free reinforcement learning and model-based reinforcement learning. The free parameters for the model-free reinforcement learning model are the learning rate α and the temperature parameter γ, while the free parameters for the model-based approach are the learning rate α, the temperature parameter γ, and the prior.
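
For readers unfamiliar with this class of model, a minimal sketch of a model-free learner with the two free parameters named above (learning rate α, temperature γ) is given below. It assumes a delta-rule value update and a softmax choice rule, which are common choices but are not confirmed by the response; the function names are hypothetical.

```python
import math

def softmax(values, temperature):
    """Convert action values into choice probabilities."""
    scaled = [v / temperature for v in values]
    m = max(scaled)                      # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def update_value(q, reward, alpha):
    """Delta rule: move the value estimate towards the obtained reward."""
    return q + alpha * (reward - q)

def negative_log_likelihood(choices, rewards, alpha, gamma, n_actions=2, q0=0.0):
    """Negative log-likelihood of a choice sequence under the model.

    choices -- action index chosen on each trial
    rewards -- reward obtained on each trial
    alpha   -- learning rate
    gamma   -- softmax temperature (symbol as used in the response)
    """
    q = [q0] * n_actions
    nll = 0.0
    for action, reward in zip(choices, rewards):
        probs = softmax(q, gamma)
        nll -= math.log(probs[action])
        q[action] = update_value(q[action], reward, alpha)
    return nll
```

Fitting such a model would amount to minimising `negative_log_likelihood` over α and γ for each participant; the optimiser and any parameter bounds are left open here.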

      Recommendation 2:

      When you talk about neuronal activity in the final analyses (of time-dependent correlations) what was used to measure the neuronal activity? Was this global power over frequencies? Was it at a particular frequency band? Was it the maximum amplitude within some small window et cetera? In other words, you need to provide the details of your analysis that would enable somebody to reproduce your study at a certain level of detail.

      Response 2: In the final analyses, we used the activity amplitude at each point in the source space for our analysis. Previously, we had planned to make our data and models available on GitHub to facilitate easier replication of our work.

      Reviewer #3 (Recommendations For The Authors)

      Recommendation 1:

      It might help to explain the complex concepts up front, to use the concrete example of the task itself - presumably, it was designed so that the crucial elements of the active inference framework come to the fore. One could use hypothetical choice patterns in this task to exemplify different factors such as expected free energy and unexpected uncertainty at work. It would also be illuminating to explain why behaviour on this task is fit better by the active inference model than a model-free reinforcement learning model.

      Response 1: Thank you for your suggestions. We have given clearer explanations to the three terms in the active inference formula: the value of reducing ambiguity, the value of avoiding risk, and the extrinsic value (Eq.8), which makes it easier for readers to understand active inference.

      In addition, we can simply view active inference as a computational model similar to model-based reinforcement learning, where the expected free energy represents a subjective value, without needing to understand its underlying computational principles or neurobiological background. In our discussion, we have argued why the active inference model fits the participants' behavior better than our reinforcement learning model, as the active inference model has an inherent exploration mechanism that is consistent with humans, who instinctively want to reduce environmental uncertainty (line 435-442).

      “Active inference offers a superior exploration mechanism compared with basic model-free reinforcement learning  (Figure 4 (c)). Since traditional reinforcement learning models determine their policies solely on the state, this setting leads to difficulty in extracting temporal information (Laskin et al., 2020) and increases the likelihood of entrapment within local minima. In contrast, the policies in active inference are determined by both time and state. This dependence on time (Wang et al., 2016) enables policies to adapt efficiently, such as emphasizing exploration in the initial stages and exploitation later on. Moreover, this mechanism prompts more exploratory behavior in instances of state ambiguity. A further advantage of active inference lies in its adaptability to different task environments (Friston et al., 2017). It can configure different generative models to address distinct tasks, and compute varied forms of free energy and expected free energy.”

      Laskin, M., Lee, K., Stooke, A., Pinto, L., Abbeel, P., & Srinivas, A. (2020). Reinforcement learning with augmented data. Advances in neural information processing systems, 33, 19884-19895.

      Wang, J. X., Kurth-Nelson, Z., Tirumala, D., Soyer, H., Leibo, J. Z., Munos, R., ... & Botvinick, M. (2016). Learning to reinforcement learn. arXiv preprint arXiv:1611.05763.

      Friston, K., FitzGerald, T., Rigoli, F., Schwartenbeck, P., & Pezzulo, G. (2017). Active inference: a process theory. Neural computation, 29(1), 1-49.

      Recommendation 2:

      Figure 1A provides a key example of the lack of effort to help the reader understand. It suggests the possibility of a concrete example but falls short of providing one. From the caption and text, applied to the figure, I gather that by choosing either to run or to raise one's arms, one can control whether it is daytime or nighttime. This is clearly wrong but it is what I am led to think by the paper.

      Response 2: Thank you for your suggestion, which we had not considered before. In this figure, we aim to illustrate that "the agent receives observations and optimizes his cognitive model by minimizing variational free energy → the agent makes the optimal action by minimizing expected free energy → the action changes the environment → the environment generates new observations for the agent." We have now modified the image to be simpler to prevent any possible confusion for readers. Correspondingly, we removed the figure of a person raising their hand and the shadowed house in Figure a.

      Author response image 9.

      Recommendation 3:

      I recommend an overhaul in the labelling and methodological explanations for consistency and full reporting. For example, line 73 says sensory input is 's' and the cognitive model is 'q(s),' and the cause of the sensory input is 'p(s|o)' but on the very next line, the cognitive model is 'p(s|o)' and the causes of sensory input are 'q(s).' How this sensory input s relates to 'observations' or 'o' is unclear, and meanwhile, capital S is the set of environmental states. P seems to refer to the generative distribution, but it also means probability.

      Response 3: Thank you for your advice. Now we have revised the corresponding labeling and methodological explanations in our work to make them consistent. However, we are not sure how to make a good modification to P here. In many works, P can refer to a certain probability distribution or some specific probabilities.

      Recommendation 4:

      Even the conception of a "policy" is unclear (Figure 2B). They list 4 possible policies, which are simply the 4 possible sequences of steps, stay-safe, cue-risky, etc, but with no contingencies in them. Surely a complete policy that lists 'cue' as the first step would entail a specification of how they would choose the safe or risky option BASED on the information in that cue.

      Response 4: Thank you for your suggestion. In active inference, a policy actually corresponds to a sequence of actions. The policy of "first choosing 'Cue' and then making the next decision based on specific information" differs from the meaning of policy in active inference.
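
The distinction between the two notions of policy can be made concrete with a short sketch. It assumes two decision steps with the action labels used in the reviewer's comment ("stay"/"cue", then "safe"/"risky"); the variable names are illustrative only.

```python
from itertools import product

FIRST_STEP = ("stay", "cue")
SECOND_STEP = ("safe", "risky")
CONTEXTS = ("Context 1", "Context 2")

# Policies in the active-inference sense used in the response:
# fixed sequences of actions, with no dependence on what is observed.
sequence_policies = list(product(FIRST_STEP, SECOND_STEP))

# Contingent plans in the reviewer's sense: after choosing "cue",
# the second action is a function of the context that was revealed.
contingent_plans = [
    dict(zip(CONTEXTS, actions))
    for actions in product(SECOND_STEP, repeat=len(CONTEXTS))
]
```

Under these assumptions there are four action sequences, and separately four ways of mapping the revealed context onto a second action; only the former are called policies in the response.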

      Recommendation 5:

      I assume that the heavy high pass filtering of the EEG (1 Hz) is to avoid having to baseline-correct the epochs (of which there is no mention), but the authors should directly acknowledge that this eradicates any component of decision formation that may evolve in any way gradually within or across the stages of the trial. To take an extreme example, as Figure 3E shows, the expected rewards for the risky path evolve slowly over the course of 60 trials. The filter would eliminate this.

      Response 5: Thank you for your suggestion. The heavy high pass filtering of the EEG (1 Hz) is to minimize the noise in the EEG data as much as possible.

      Recommendation 6:

      There is no mention of the regression itself in the Methods section - the section is incomplete.

      Response 6: Thank you for your suggestion. We have now added the relevant content in the Results section (EEG results at source level, line 337-340):

      “The linear regression was run by the "mne.stats.linear_regression" function in the MNE package (Activity ∼ Regressor + Intercept, where Activity is the activity amplitude of the EEG signal in the source space and Regressor is one of the regressors that we mentioned).”
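
The model Activity ∼ Regressor + Intercept is an ordinary least-squares fit with a single predictor. The sketch below shows that computation for one source point at one time sample in plain Python; it is not the MNE call itself, and the function name `fit_simple_regression` is hypothetical.

```python
def fit_simple_regression(regressor, activity):
    """Least-squares fit of activity = slope * regressor + intercept.

    regressor -- one value per trial (e.g. a model-derived quantity)
    activity  -- source amplitude per trial at one location and time point
    """
    n = len(regressor)
    mean_x = sum(regressor) / n
    mean_y = sum(activity) / n
    # Sum of squares of the regressor and its cross-product with the activity.
    sxx = sum((x - mean_x) ** 2 for x in regressor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(regressor, activity))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

Repeating this fit for every source point and time sample yields a map of regression coefficients for the chosen regressor, which is presumably what the source-level analysis reports.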

      Recommendation 7:

      On Lines 260-270 the same results are given twice.

      Response 7: Thank you for your suggestion. We have now deleted redundant content.

      Recommendation 8:

      Frequency bands are displayed in Figure 5 but there is no mention of those in the Methods. In Figure 5b Theta in the 2nd half is compared to Delta in the 1st half- is this an error?

      Response 8: Thank you for your suggestion. It indeed was an error (they should all be Theta) and now we have corrected it.

      Author response image 10.

    1. eLife assessment

      The study presents a valuable finding that the Endothelin B receptor (ETBR) expressed by the satellite glial cells (SGCs) in the dorsal root ganglions (DRG) inhibited sensory axon regeneration in both adult and aged mice. The evidence supporting most of the conclusions was solid, and the work will be of interest to neuroscientists working on axon regeneration and the involvement of non-neuronal cell types in regulating axon regeneration. Although the proposed mechanism is intriguing and the methodology is robust, the molecular mechanisms by which ETBR regulates axon regeneration are not fully elucidated.

    2. Reviewer #1 (Public Review):

      The manuscript by Feng et al. reported that the Endothelin B receptor (ETBR) expressed by the satellite glial cells (SGCs) in the dorsal root ganglia (DRG) acted to inhibit sensory axon regeneration in both adult and aged mice. Thus, pharmacological inhibition of ETBR with specific inhibitors resulted in enhanced sensory axon regeneration in vitro and in vivo. In addition, sensory axon regeneration is significantly reduced in aged mice, and inhibition of ETBR could rescue this defect. Moreover, the study provided some evidence that the reduced level of gap junction protein connexin 43 might act downstream of ETBR to suppress axon regeneration in aged mice. Overall, the study revealed an interesting SGC-derived signal in the DRG microenvironment to regulate sensory axon regeneration. It provided additional evidence that non-neuronal cell types in the microenvironment function to regulate axon regeneration via cell-cell interaction.

      However, the molecular mechanisms by which ETBR regulates axon regeneration are unclear, and the manuscript's structure is not well organized, especially in the last section. Some discussion and explanation about the data interpretation are needed to improve the manuscript.

      (1) The result showed that the level of ETBR did not change after the peripheral nerve injury. Does this mean that its endogenous function is to limit spontaneous sensory axon regeneration? In other words, the results suggest that SGCs expressing ETBR or vascular endothelial cells expressing its ligand ET-1 act to suppress sensory axon regeneration. Some explanation or discussion about this is necessary. Moreover, does the protein level of ETBR or its ligand change during aging?

      (2) In ex vivo experiments, NGF was added to the culture medium. Previous studies have shown that adult sensory neurons could initiate fast axon growth in response to NGF within 24 hours. In addition, dissociated sensory neurons could also initiate spontaneous regenerative axon growth without NGF after 48 hours. Some discussion or rationale is needed to explain the difference between NGF-induced and spontaneous axon growth of cultured adult sensory neurons and the roles of ETBR and SGCs.

      (3) In cultured dissociated sensory neurons, inhibiting ETBR also enhanced axon growth, which implies that SGCs were present and surrounding the sensory neurons. Some direct evidence is needed to show the cellular relationship between them in culture.

      (4) In Figure 3, the in vivo regeneration experiments first showed enhanced axon regeneration either 1 day or 3 days after the nerve injury. The study then showed that inhibiting ETBR could enhance sensory axon growth in vitro from uninjured naïve neurons or conditioning lesioned neurons. To my knowledge, in vivo sensory axon regeneration is relatively slow during the first 2 days after the nerve injury and then enters the fast regeneration mode on the 3rd day, representing the conditioning lesion effect in vivo. Some discussion is needed to compare the in vitro and the in vivo model of axon regeneration.

      (5) In Figure 5, the study showed that the level of connexin 43 increased after ETBR inhibition in either adult or aged mice, proposing an important role of connexin 43 in mediating the enhancing effect of ETBR inhibition on axon regeneration. However, in the study, there was no direct evidence supporting that ETBR directly regulates connexin 43 expression in SGCs. Moreover, there was no functional evidence that connexin 43 acted downstream of ETBR to regulate axon regeneration.

    3. Reviewer #2 (Public Review):

      Summary:

      In this interesting and original study, Feng and colleagues set out to address the effect of manipulating endothelin signaling on nerve regeneration, focusing on the crosstalk between endothelial cells (ECs) in dorsal root ganglia (DRG), which secrete ET-1, and satellite glial cells (SGCs) expressing ETBR. The main finding is that ETBR signaling is a default brake on axon growth, and inhibiting this pathway promotes axon regeneration after nerve injury and counters the decline in regenerative capacity that occurs during aging. ET-1 and ETBR are mapped in ECs and SGCs, respectively, using scRNA-seq of DRGs from adult or aged mice. Although their expression does not change upon injury, it is modulated during aging, with a reported increase in plasma levels of ET-1 (a potent vasoconstrictive signal). Using in vitro explant assays coupled with pharmacological inhibition in mouse models of nerve injury, the authors demonstrate that ET-1/ETBR curbs axonal growth, and the ETAR/ETBR antagonist Bosentan boosts regrowth during the early phase of repair. In addition, Bosentan restores the ability of aged DRG neurons to regrow after nerve lesions. Despite Bosentan inhibiting both endothelin receptors A and B, comparison with an ETAR-specific antagonist indicates that the effects can be attributed to the ET-1/ETBR pathway. In the DRGs, ETBR is mostly expressed by SGCs (and a subset of Schwann cells), a cell type that previous studies, including work from this group, have implicated in nerve regeneration. SGCs ensheath and couple with DRG neurons through gap junctions formed by Cx43. Based on their own findings and evidence from the literature, the pro-regenerative effects of ETBR inhibition are in part attributed to an increase in Cx43 levels, which are expected to enhance neuron-SGC coupling. Finally, gene expression analysis in adult vs aged DRGs predicts a decrease in fatty acid and cholesterol metabolism, for which previous work by the authors has shown a requirement in SGCs to promote axon regeneration.

      Strengths:

      The study is well-executed and the main conclusion that "ETBR signaling inhibits axon regeneration after nerve injury and plays a role in age-related decline in regenerative capacity" (line 77) is supported by the data. Given that Bosentan is an FDA-approved drug, the findings may have therapeutic value in clinical settings where peripheral nerve regeneration is suboptimal or largely impaired, as it often happens in aged individuals. In addition, the study highlights the importance of vascular signals in nerve regeneration, a topic that has gained traction in recent years. Importantly, these results further emphasize the contribution of long-neglected SGCs to nerve tissue homeostasis and repair. Although the study does not reach a complete mechanistic understanding, the results are robust and are expected to attract the interest of a broader readership.

      Weaknesses:

      Despite the positive comments above, the following points should be considered:

      (1) This study examines the contribution of the ET-1 pathway in the ganglia, and in vitro assays are consistent with the idea that important signaling events take place there. Nevertheless, it remains to be determined whether the accelerated axon regrowth observed in vivo depends also on cellular crosstalk mediated by ET-1 at the lesion site. Are ECs along the nerve secreting ET-1? What cells are present in the nerve stroma that could respond and participate in the repair process? Would these interactions be sensitive to Bosentan? It may be difficult to dissect this contribution, but it should at least be discussed.

      (2) It is suggested that the permeability of DRG vessels may facilitate the release of "vascular-derived signals" (lines 82-84). Is it possible that the ET-1/ETBR pathway modulates vascular permeability, and that this, in turn, contributes to the observed effects on regeneration?

      (3) Is the affinity of ET-3 for ETBR similar to that of ET-1? Can it be excluded that ET-3 expressed by fibroblasts is relevant for controlling SGC responses upon injury/aging?

      (4) ETBR inhibition in dissociated (mixed) cultures uncovers the restraining activity of endothelin signaling on axon growth (Figure 2C). Since neurons do not express ET-1 receptors, based on scRNA-seq analysis, these results are interpreted as an indication that basal ETBR signaling in SGC curbs the axon growth potential of sensory neurons. For this to occur in dissociated cultures, however, one should assume that SGC-neuron association is present, similar to in vivo, or to whole DRG cultures (Figure 2C). Has this been tested? In both in vitro experimental settings (dissociated and whole DRG cultures) how is ETBR stimulated over up to 7 days of culture? In other words, where does endothelin come from in these cultures (which are unlikely to support EC/blood vessel growth)? Is it possible that the relevant ligand here derives from fibroblasts (see point #3)? Or does it suggest that ETBR can be constitutively active (i.e., endothelin-independent signaling)? Is there any chance that endothelin is present in the culture media or Matrigel?

      (5) The discovery that ET-1/ETBR signaling in SGC curtails the growth capacity of axons at baseline raises questions about the physiological role of this pathway. What happens when ETBR signaling is prevented over a longer period of time? This could be addressed with pharmacological inhibitors, or better, with cell-specific knock-out mice. The experiments would certainly be of general interest, although not within the scope of this story. Nevertheless, it could be worth discussing the possibilities.

      (6) Assessing Cx43 levels by measuring the immunofluorescence signal (Figure 5E-F) is acceptable, particularly when the aim is to restrict the analysis to SGCs. The modulation of Cx43 expression by ET-1/ETBR plays an important part in the proposed model. Therefore, a complementary analysis of Cx43 expression by quantitative RT-PCR on sorted SGCs would be a valuable addition to the immunofluorescence data. Is this attainable?

      (7) The conclusions "We thus hypothesize that ETBR inhibition in SGCs contributes to axonal regeneration by increasing Cx43 levels, gap junction coupling or hemichannels and facilitating SGC-neuron communication" (lines 303-305) are consistent with the findings but seem in contrast with the effect of aging on gap junction coupling reported by others and cited in line 210: "the number of gap junctions and the dye coupling between these cells increases (Huang et al., 2006)". I am confused by what distinguishes a potential, and supposedly beneficial, increase in coupling after ETBR inhibition, from what is observed in aging.

      (8) I find it difficult to reconcile the results in Figure 5F with the proposed model since (1) injury increases Cx43 levels in both adult and aged mice, (2) the injured aged/vehicle group has a similar level to the uninjured adult group, (3) upon injury, aged+Bosentan is much lower than adult+Bosentan (significance not tested). It seems hard to explain the effect of Bosentan only through the modulation of Cx43 levels. Whether the increase in Cx43 levels following ETBR inhibition actually results in higher SGC-neuron coupling has not been assessed experimentally.

    4. Reviewer #3 (Public Review):

      Summary:

      This manuscript suggests that inhibiting ETBR via the FDA-approved compound Bosentan can disrupt ET-1-ETBR signalling that they found detrimental to nerve regeneration, thus promoting repair after nerve injury in adult and aged mice.

      Strengths:

      (1) The clinical need to identify molecular and cellular mechanisms that can be targeted to improve repair after nerve injury.

      (2) The proposed mechanism is interesting.

      (3) The methodology is sound.

      Weaknesses:

      (1) The data appear preliminary and the story appears incomplete.

      (2) Lack of causality and clear cellular and molecular mechanism. There are also some loose ends such as the role of connexin 43 in SGCs: how is it related to ET-1- ETBR signalling?

    1. Author response:

      Reviewer #1 (Public Review):

      Summary:

      This work sets out to elucidate mechanistic intricacies in inflammatory responses in pneumonia in the context of the aging process (Terc deficiency - telomerase functionality).

      Strengths:

      Very interesting, conceptually speaking, approach that is by all means worth pursuing. An overall proper approach to the posited aim.

      We want to thank the reviewer for taking the time to review our manuscript and for providing positive feedback regarding our research question.

      Weaknesses:

      The work is heavily underpowered and may have statistical deficits. This precludes it in its current state from drawing unequivocal conclusions.

      Thank you for this essential and valuable comment. We fully accept that the small sample size of the Tercko/ko mice is a major limitation of our study and transparently discuss this in our manuscript.

      However, due to Animal Welfare regulations, only a reduced number of mice were approved because of the strong burden of disease. Consequently, only three non-infected and five infected mice were available to us. This reduced number of mice presents a clear limitation to our study. However, due to ethical considerations related to animal welfare and sustainability, as well as compliance with German animal welfare regulations, it is not possible to obtain additional Tercko/ko mice to increase the dataset. The animal studies are an important aspect of our study; however, our hypothesis was also investigated at multiple levels, including in an in vitro co-culture model (Figure 5), to ensure comprehensive analysis.

      Thus, we clearly demonstrated that S. aureus pneumonia in Tercko/ko mice leads to a more severe phenotype, orchestrated by the dysregulation of both innate and adaptive immune response.

      Reviewer #2 (Public Review):

      Summary:

      The authors demonstrate heightened susceptibility of Terc-KO mice to S. aureus-induced pneumonia, perform gene expression analysis from the infected lungs, find an elevated inflammatory (NLRP3) signature in some Terc-KO but not control mice, and some reduction in T cell signatures. Based on that, they conclude that dysregulated inflammation and T-cell dysfunction play a major role in these phenomena.

      Strengths:

      The strengths of the work include a problem not previously addressed (the role of the Terc component of the telomerase complex) in certain aspects of resistance to bacterial infection and innate (and maybe adaptive) immune function.

      We would like to thank the reviewer for the positive feedback regarding our aim to investigate the impact of Terc deletion on the pulmonary immune response to S. aureus.

      Weaknesses:

      The weaknesses outweigh the strengths, dominantly because conclusions are plagued by flaws in experimental design, by lack of rigorous controls, and by incomplete and inadequate approaches to testing immune function. These weaknesses are as follows:

      (1)  Terc-KO mice are a genomic knockout model, and therefore the authors need to carefully consider the impact of this KO on a wide range of tissues. This, however, is not the case. There are no attempts to perform cell transfers or use irradiation chimera or crosses that would be informative.

      We thank the reviewer for bringing up this important point. The aim of our study, however, was to investigate the impact of Terc deletion in the lung and on the response to bacterial pneumonia, rather than to provide a comprehensive characterization of the Tercko/ko model itself. This characterization of different tissues and cell types has already been conducted by previous studies. These include studies that characterize the general phenotype of the model (Herrera et al., 1999; Lee et al., 1998; Rudolph et al., 1999) as well as investigations that shed light on the impact of Terc deletion on specific cell types such as microglia (Khan et al., 2015) or T cells (Matthe et al., 2022). The impact of Terc deletion on T cells is also discussed in our manuscript in lines 89 to 105. Furthermore, a section about the general phenotype of the Terc deletion model is included in the introduction in lines 126 to 138. Thus, we discussed the relevant literature regarding Tercko/ko mice in our manuscript and attempted to provide a more in-depth characterization of the lung by investigating the inflammatory response to infection as well as changes in the gene expression (Figure 2-4).

      (2)  Throughout the manuscript the authors invoke the role of telomere shortening in aging, and according to them, their Terc-KO mice should be one potential model for aging. Yet the authors consistently describe major differences between young Terc-KO and naturally aging old mice, with no discussion of the implications. This further confuses the biological significance of this work as presented.

      Thank you for mentioning this relevant point. We want to apologize for the confusion regarding this matter. While Tercko/ko mice are a well-established model for premature aging, these effects become more apparent with increasing generations (G) and thus, G5 and 6 mice are the most affected by Terc deletion (Lee et al., 1998; Wong et al., 2008).

      Since the aim of this study was to analyze the impact of Terc deletion on the lung and its immune response to bacterial infections instead of the impact of telomere shortening and telomerase dysfunction, young G3 Tercko/ko mice (8 weeks) were used in this study. This is also mentioned in lines 131-134. In this study, Tercko/ko mice were used not as a model of aging, but rather as a model specifically for Terc deletion. The old WT mice function as a control cohort to observe possible common but also deviating effects between aging and Terc deletion. In our sequencing data, we observe that uninfected young WT mice are very similar to uninfected Tercko/ko mice. Other studies have also reported this lack of major differences between uninfected WT and Tercko/ko mice in the G3 knockout mice (Kang et al., 2018). Conversely, uninfected old WT and Tercko/ko mice exhibited great differences, for instance, regarding the numbers of differentially expressed genes (Supplemental Figure 1H). Thus, differences between naturally aged mice and young G3 Tercko/ko mice are not surprising. To clarify this aspect, we reconstructed the paragraph discussing the Tercko/ko mice (lines 126-134). Additionally, we added a paragraph explaining the purpose of the naturally aged mice to lines 134 to 138:

      “As control cohort age-matched young WT mice were utilized. To investigate whether Terc deletion, beyond critical telomere shortening, impacts the pulmonary immune response, we used young Tercko/ko mice. Additionally, naturally aged mice (2 years old) were infected to explore the potential link to a fully developed aging phenotype.”

      (3)  Related to #2, group design for comparisons lacks a clear rationale. The authors stipulate that Terc-KO will mimic natural aging, but in fact, the only significant differences seen between groups in susceptibility to S. aureus are, contrary to the authors' expectation, between young Terc-KO and naturally old mice (Figures 1A and B, no difference between young Terc-KO and young wt); or there are no significant differences at all between groups (Figures 1C, D).

      We thank the reviewer for this essential comment. As mentioned above, the Tercko/ko mice in this study were not selected to model natural aging. To model telomerase dysfunction and accelerated aging, later-generation or aged Tercko/ko mice would have been more suitable.

      The lack of statistical significance in some figures is likely due to the heterogeneity of the disease phenotype of S. aureus infection in mice, which is a limitation of our study that we address in our discussion section in lines 577-583. The phenotype of S. aureus infection can vary greatly within a mouse population, highlighting the limitations of mice as a model for S. aureus infections. To account for this heterogeneity, we divided the infected Tercko/ko mice cohort into different degrees of severity based on the clinical score and the presence of bacteria in organs other than the lung (mice with systemic infection).

      Despite the heterogeneity, especially within the Tercko/ko mice cohort, the differences between the knockout and young as well as old WT mice were striking. Including the fatal infections, 80% of the Tercko/ko mice had a severe course of disease, while none of the WT mice displayed a severe course (Figure 1A, B and Supplemental Figure 1A, B). This hints towards a clear role of Terc in the response to S. aureus infection in mice. Thus, while the differences are not significant in some figures, strong trends towards a more severe phenotype of S. aureus infection in the Tercko/ko mice regarding bacterial load, score and inflammatory response could be observed in our study.

      Another example of inadequate group design is when the authors begin dividing their Terc-KO groups by clinical score into animals with or without "systemic infection" (the condition where a bacterium spreads uncontrollably across the many organs and via blood, which should be properly called sepsis), and then compare this sepsis group to other groups (Supplementary Figures 1G; Figure 2; lines 374-376 and 389-391). This gives them significant differences in several figures, but because they did not clearly indicate where they applied this stratification in the figure legends, the data are somewhat confusing. Most importantly, methodologically it is highly inappropriate to compare one mouse with sepsis to another one without. If Terc-KO mice with sepsis are a comparator group, then their controls have to be wild-type mice with sepsis, who are dealing with the same high bacterial load across the body and are presumably forced to deploy the same set of immune defenses.

      We sincerely appreciate the significant time and effort you have invested in reviewing our manuscript. However, with all due respect, we must point out that the definition of sepsis you have referenced is considered outdated. According to the Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3), sepsis is defined as "a life-threatening organ dysfunction caused by a dysregulated host response to infection" (Singer et al., 2016, JAMA). Given this fundamental misunderstanding of our findings, we find the comment regarding the inadequacy of our groups to be both dismissive and lacking in scientific merit. We would like to emphasize that the group size used in our study is consistent with accepted standards in infection research. We strongly reject any insinuations of inadequacy that have been repeatedly mentioned throughout the review.

      In order to provide a nuanced investigation of disease severity in Tercko/ko mice, we added the term “systemic infection” to the figures whenever the mice were divided into groups of mice with and without systemic infection. This is the case for Figure 2A and Supplemental Figure 1C-E. The division into mice with and without systemic infection is also mentioned in the figure legend of Figure 2A in lines 933 to 936 and for Supplemental Figure 1 in lines 1053-1054. We agree that Supplemental Figure 1G is somewhat confusing as the mice with systemic infection are highlighted in this graph but not included as a separate group within our sequencing analysis. We added a sentence to the figure legend clarifying this (lines 1042-1045):

      “Nevertheless, the infected Tercko/ko mice were considered one group for the expression analysis and not split into separate groups for the subsequent analysis.”

      Additionally, we revised the section regarding this grouping in different degrees of severity in our Material and Methods section to clarify that this division was only performed for specific analysis (line 191):

      “…for the indicated analysis.”

      Furthermore, the mice classified as systemically infected were not septic, as mentioned above. We classified those mice as systemically infected based on their clinical score and the presence of bacteria in organs other than the lung, as stated in lines 188-191 and 377-382.

      Bacteremia is a symptom of very severe cases of hospital-acquired pneumonia with a very high mortality (De la Calle et al., 2016).

      Therefore, the systemically infected mice, or rather mice with bacteremia, display an especially severe pneumonia phenotype, which is distinct from sepsis. The presence of this symptom in our Tercko/ko mice further highlights the clinical relevance of our study. This aspect was added to the manuscript in lines 569-571:

      “The detection of bacteria in extra pulmonary organs is of particular interest, as bacteremia is a symptom of severe pneumonia and is associated with high mortality (De la Calle et al., 2016).”

      (4)  The authors conclude that dysregulated inflammation and T-cell dysfunction play a major role in S. aureus susceptibility. This may or may not be an important observation, because many KO mice are abnormal for a variety of reasons, and until such reasons are mechanistically dissected, the physiological importance of the observation will remain unclear.

      Two points are important here. First, there is no natural counterpart to a Terc-KO, which is a complete loss of a key non-enzymatic component of the telomerase complex starting in utero.

      Second, the authors truly did not examine the key basic features of their model, including the features of basic and induced inflammatory and immune responses. This analysis could be done either using model antigens in adjuvants, defined innate immune stimuli (e.g. TLR, RLR, or NLR agonists), or microbial challenge. The only data provided along these lines are the baseline frequencies of total T cells in the spleen of the three groups of mice examined (not statistically significant, Figure 4B). We do not know if the composition of naïve to memory T cell subsets may have been different, and more importantly, we have no data to evaluate whether recruitment of the immune response (including T cells) to the lung upon microbial challenge is similar or different. So, what are the numbers and percentages of T cells and alveolar macrophages in the lung following S. aureus challenge and are they even comparable or are there issues in mobilizing the T cell response to the site of infection? If, for example, Terc-KO mice do not mobilize enough T cells to the lung during infection, that would explain the paucity in many T-cell-associated genes in their transcriptomic set that the authors report. That in turn may not mean dysfunction of T cells but potentially a whole different set of defects in coordinating the response in Terc-KO mice.

      We thank the reviewer for highlighting these important aspects. Regarding the first point, indeed there is no naturally occurring deletion of Terc in humans. However, studies reported reduced expression of Terc and Tert in the tissues of aged mice and rats (Tarry-Adkins et al., 2021; Zhang et al., 2018). Terc itself has been found to have several important immunomodulatory functions such as the activation of the NF-κB or PI3-kinase pathway (Liu et al., 2019; Wu et al., 2022). As these pathways are relevant for the immune response to S. aureus infections, we were interested in exploring the impact of Terc deletion on the pulmonary immune response. The potential immunomodulatory functions of Terc are discussed in lines 106-121. To further clarify our rationale we added a sentence to the introduction in lines 121-125:

      “Interestingly, downregulation of Terc and Tert expression in tissues of aged mice and rats has been found (Tarry-Adkins, Aiken, Dearden, Fernandez-Twinn, & Ozanne, 2021; Zhang et al., 2018).

      Therefore, as a potential immunomodulatory factor reduced Terc expression could be connected to age-related pathologies.”

      Regarding the second point, as we focused on the effect of Terc deletion in the lung and its role in S. aureus infection, we investigated inflammatory and immune response parameters relevant to this setting. For instance, inflammation parameters in the lungs of all three mice cohorts were measured to investigate differences in the inflammatory response in the non-infected and infected mice (Figure 2A). Those measurements showed no baseline difference in key inflammatory parameters between young WT and Tercko/ko mice, which is consistent with previous findings (Kang et al., 2018). The inflammatory response to infection with S. aureus in the Tercko/ko mice cohort differed significantly from the other cohorts (Figure 2A), hinting towards a dysregulated inflammatory response due to Terc deletion. Furthermore, we investigated general immune cell frequencies such as dendritic cells, macrophages, and B cells in the spleen of all three mice cohorts to gather a baseline understanding of the general immune cell populations. In our manuscript only total T cell frequencies were included due to their relevance for our data regarding T cells (Figure 4B). These data showed no difference in total T cell frequency in the spleen between the three mice cohorts. For a more detailed insight into our analysis we added the frequencies of the other immune cell populations analyzed in the spleen as Supplemental Figure 3B-F. Additionally, a figure legend for the graphs was added.

      Therefore, while we did not analyze baseline frequencies of specific populations of T cells, we analyzed and characterized the inflammatory and immune response of our model in a way relevant to our research question.

      The differences observed in T cell marker and TCR gene expression were also partly present between the uninfected and infected Tercko/ko mice. For example, CD247 expression was completely absent in infected Tercko/ko mice but present in uninfected mice of this cohort (Figure 4A, C and D). Thus, this effect cannot be solely attributed to an inadequate mobilization of T cells to the lung after infectious challenge. However, we agree that a more detailed insight into recruited immune cells to the lung or frequencies of different T cell populations could contribute to a better understanding of the proposed mechanism and would be an interesting experiment to conduct in further studies. We accept this as a limitation of our study and included it in our discussion section in lines 720-724:

      “As total CD4+ T cells were analyzed in this study, it would be useful to investigate specific T cell populations such as memory and effector T cells to elucidate the potential mechanism leading to T cell dysfunctionality in further detail. Additionally, analysis of differences in immune cell recruitment to the lungs between young WT and Tercko/ko mice would be relevant.”

      (5)  Related to that, immunological analysis is also inadequate. First, the authors pull signatures from the total lung tissue, which is both imprecise and potentially skewed by differences, not in gene expression but in types of cells present and/or their abundance, a feature known to be affected by aging and perhaps by Terc deficiency during infection. Second, to draw any conclusions about immune responses, the authors would have to track antigen-specific T cells, which is possible for a wide range of microbial pathogens using peptide-MHC multimers. This would allow highly precise analysis of the phenomena about which the authors are trying to draw conclusions. Moreover, it would allow them to confirm their gene expression data in populations of physiological interest.

      We thank the reviewer for highlighting this important and relevant point. In our study, we aimed to investigate the role of Terc expression in modulating inflammation and the immune response to S. aureus infection in the lung. To address this, we examined the overall impact of age, genotype, and infection on lung inflammation and gene expression. Therefore, sequencing of total lung tissue was essential for addressing the research question posed. Our findings demonstrate that Tercko/ko mice exhibit a more severe phenotype following S. aureus infection, characterized by an increased bacterial load and heightened lung inflammation (Figures 1 and 2). Furthermore, our data suggest that Terc plays a role in regulating inflammation through activation of the NLRP3 inflammasome, along with the dysregulation of several T cell marker genes (Figures 2, 4, and 5). However, this study lacks a detailed analysis of distinct T cell populations, including antigen-specific T cells, as noted earlier. Investigating these aspects in future studies would be valuable to validate and expand upon our findings. We have incorporated these suggestions into the discussion section (lines 720-724):

      “As total CD4+ T cells were analyzed in this study, it would be useful to investigate specific T cell populations such as memory and effector T cells to elucidate the potential mechanism leading to T cell dysfunctionality in further detail. Additionally, analysis of differences in immune cell recruitment to the lungs between young WT and Tercko/ko mice would be relevant.”

      Nevertheless, our study provides first evidence of a potential connection between T cell functionality and Terc expression.

      Third, the authors co-incubate AM and T cells with S. aureus. There is no information here about the phenotype of T cells used. Were they naïve, and how many S. aureus-specific T cells did they contain? Or were they a mix of different cell types, which we know will change with aging (fewer naïve and many more memory cells of different flavors), and maybe even with a Terc-KO? Naïve T cells do not interact with AM; only effector and memory cells would be able to do so, once they have been primed by contact with dendritic cells bringing antigen into the lymphoid tissues, so it is unclear what the authors are modeling here. Mature primed effector T cells would go to the lung and would interact with AM, but it is almost certain that the authors did not generate these cells for their experiment (or at least nothing like that was described in the methods or the text).

      Thank you for bringing up this important question. For the co-cultivation experiment of T cells and alveolar macrophages, total CD4+ T cells of both young WT and Tercko/ko were used. We did not select for a specific population of T cells. Our sequencing data indicated the complete downregulation of CD247 expression, which is an important part of the T cell receptor, in the lungs of infected Tercko/ko mice (Figure 4A, C and D). Given that this factor is downregulated under chronic inflammatory conditions, we investigated the impact of the inflammatory response in alveolar macrophages on the expression of various T cell-derived cytokines, as well as CD247 expression (Figure 5D, E) (Dexiu et al., 2022). This aspect is also highlighted in the discussion in lines 623-637. Therefore, a co-cultivation model of T cells and alveolar macrophages was established and confronted with heat-killed S. aureus to elicit an inflammatory response of the macrophages. To emphasize this purpose, we have revised our statement about the model setup in lines 517-519 of the manuscript:

      “An overactive inflammatory response could be a potential explanation for the dysregulated TCR signaling.”

      The authors hope this will clarify the intent behind the model setup.

      (6)  Overall, the authors began to address the role of Terc in bacterial susceptibility, but to what extent that specifically involves inflammation and macrophages, T cell immunity, or aging remains unclear at present.

      We thank the reviewer for the helpful and relevant comments. We accept the limitations of the presented study, such as the reduced number of Tercko/ko mice and the limitations of murine models for S. aureus infection itself, and discuss them in the discussion section in lines 559-561, 577-583, 690-692 and 720-726. However, we hope that our responses have provided sufficient evidence to convince the reviewer that our data support a clear role for Terc expression in regulating the immune response to bacterial infections, particularly with respect to inflammation and its potential connection to T cell functionality.

    2. eLife assessment

      This is a very interesting study that links inflammatory reactivity and T-cell immunity in pathologies associated with pneumonia in the context of the aging process (telomerase functionality). The authors have relied on results from experiments using a mouse model (Terc-deletion), that is used in studies on aging. The questions are relevant, the methodology is appropriate, and the results represent a set of useful findings. However, on the whole, the evidence is not very strong owing to the low power of the study, some flaws in experimental design, lack of rigorous controls, and inadequate approaches to analyzing immune function, thus making the study incomplete in support of its claims.

    3. Reviewer #1 (Public Review):

      Summary:

      This work sets out to elucidate mechanistic intricacies in inflammatory responses in pneumonia in the context of the aging process (Terc deficiency - telomerase functionality).

      Strengths:

      Very interesting, conceptually speaking, approach that is by all means worth pursuing. An overall proper approach to the posited aim.

      Weaknesses:

      The work is heavily underpowered and may have statistical deficits. This precludes it in its current state from drawing unequivocal conclusions.

    4. Reviewer #2 (Public Review):

      Summary:

      The authors demonstrate heightened susceptibility of Terc-KO mice to S. aureus-induced pneumonia, perform gene expression analysis from the infected lungs, find an elevated inflammatory (NLRP3) signature in some Terc-KO but not control mice, and some reduction in T cell signatures. Based on that, they conclude that dysregulated inflammation and T-cell dysfunction play a major role in these phenomena.

      Strengths:

      The strengths of the work include a problem not previously addressed (the role of the Terc component of the telomerase complex) in certain aspects of resistance to bacterial infection and innate (and maybe adaptive) immune function.

      Weaknesses:

      The weaknesses outweigh the strengths, chiefly because the conclusions are plagued by flaws in experimental design, by lack of rigorous controls, and by incomplete and inadequate approaches to testing immune function. These weaknesses are as follows:

      (1) Terc-KO mice are a genomic knockout model, and therefore the authors need to carefully consider the impact of this KO on a wide range of tissues. This, however, is not the case. There are no attempts to perform cell transfers or use irradiation chimera or crosses that would be informative.

      (2) Throughout the manuscript the authors invoke the role of telomere shortening in aging, and according to them, their Terc-KO mice should be one potential model for aging. Yet the authors consistently describe major differences between young Terc-KO and naturally aging old mice, with no discussion of the implications. This further confuses the biological significance of this work as presented.

      (3) Related to #2, group design for comparisons lacks a clear rationale. The authors stipulate that Terc-KO will mimic natural aging, but in fact, the only significant differences seen between groups in susceptibility to S. aureus are, contrary to the authors' expectation, between young Terc-KO and naturally old mice (Figures 1A and B, no difference between young Terc-KO and young wt); or there are no significant differences at all between groups (Figures 1C, D).<br /> Another example of inadequate group design is when the authors begin dividing their Terc-KO groups by clinical score into animals with or without "systemic infection" (the condition where a bacterium spreads uncontrollably across the many organs and via blood, which should be properly called sepsis), and then compare this sepsis group to other groups (Supplementary Figures 1G; Figure 2; lines 374-376 and 389-391). This gives them significant differences in several figures, but because they did not clearly indicate where they applied this stratification in the figure legends, the data are somewhat confusing. Most importantly, methodologically it is highly inappropriate to compare one mouse with sepsis to another one without. If Terc-KO mice with sepsis are a comparator group, then their controls have to be wild-type mice with sepsis, who are dealing with the same high bacterial load across the body and are presumably forced to deploy the same set of immune defenses.

      (4) The authors conclude that dysregulated inflammation and T-cell dysfunction play a major role in S. aureus susceptibility. This may or may not be an important observation, because many KO mice are abnormal for a variety of reasons, and until such reasons are mechanistically dissected, the physiological importance of the observation will remain unclear. Two points are important here. First, there is no natural counterpart to a Terc-KO, which is a complete loss of a key non-enzymatic component of the telomerase complex starting in utero. Second, the authors truly did not examine the key basic features of their model, including the features of basic and induced inflammatory and immune responses. This analysis could be done either using model antigens in adjuvants, defined innate immune stimuli (e.g. TLR, RLR, or NLR agonists), or microbial challenge. The only data provided along these lines are the baseline frequencies of total T cells in the spleen of the three groups of mice examined (not statistically significant, Figure 4B). We do not know if the composition of naïve to memory T cell subsets may have been different, and more importantly, we have no data to evaluate whether recruitment of the immune response (including T cells) to the lung upon microbial challenge is similar or different. So, what are the numbers and percentages of T cells and alveolar macrophages in the lung following S. aureus challenge and are they even comparable or are there issues in mobilizing the T cell response to the site of infection? If, for example, Terc-KO mice do not mobilize enough T cells to the lung during infection, that would explain the paucity in many T-cell-associated genes in their transcriptomic set that the authors report. That in turn may not mean dysfunction of T cells but potentially a whole different set of defects in coordinating the response in Terc-KO mice.

      (5) Related to that, immunological analysis is also inadequate. First, the authors pull signatures from the total lung tissue, which is both imprecise and potentially skewed by differences, not in gene expression but in types of cells present and/or their abundance, a feature known to be affected by aging and perhaps by Terc deficiency during infection. Second, to draw any conclusions about immune responses, the authors would have to track antigen-specific T cells, which is possible for a wide range of microbial pathogens using peptide-MHC multimers. This would allow highly precise analysis of the phenomena about which the authors are trying to draw conclusions. Moreover, it would allow them to confirm their gene expression data in populations of physiological interest.

      Third, the authors co-incubate AM and T cells with S. aureus. There is no information here about the phenotype of T cells used. Were they naïve, and how many S. aureus-specific T cells did they contain? Or were they a mix of different cell types, which we know will change with aging (fewer naïve and many more memory cells of different flavors), and maybe even with a Terc-KO? Naïve T cells do not interact with AM; only effector and memory cells would be able to do so, once they have been primed by contact with dendritic cells bringing antigen into the lymphoid tissues, so it is unclear what the authors are modeling here. Mature primed effector T cells would go to the lung and would interact with AM, but it is almost certain that the authors did not generate these cells for their experiment (or at least nothing like that was described in the methods or the text).

      (6) Overall, the authors began to address the role of Terc in bacterial susceptibility, but to what extent that specifically involves inflammation and macrophages, T cell immunity, or aging remains unclear at present.

    1. eLife assessment

      This valuable report describes the control of the activity of the RNA-activated protein kinase, PKR, by the Vaccinia virus K3 protein. A strength of the manuscript is the powerful combination of a yeast-based assay with high-throughput sequencing and its convincing experimental use to characterize large numbers of PKR variants. A minor weakness is that the scope of the screen conducted could still be extended, for example in terms of the segments of PKR included.

    2. Reviewer #1 (Public Review):

      Summary:

      The report describes the control of the activity of the RNA-activated protein kinase, PKR, by the Vaccinia virus K3 protein. Repressive binding of K3 to the kinase prevents phosphorylation of its recognised substrate, EIF2α (the α subunit of the Eukaryotic Initiation Factor 2). The interaction with K3 is probed by saturation mutagenesis within four regions of PKR chosen by modelling the molecules' interaction. The authors identify K3-resistant PKR variants, which show that the K3/EIF2α-binding surface of the kinase is malleable. This is reasonably interpreted as indicating the potential adaptability of this antiviral protein to combat viral virulence factors.

      Strengths:

      This is a well-conducted study that probes the versatility of the antiviral response to escape a viral inhibitor. The experimentation is very diligent, generating and screening a large number of variants to reveal the malleability of residues at the interface between PKR and K3.

      Weaknesses:

      These are minor. The protein interaction between PKR and K3 has been previously well-explored through phylogenetic and functional analyses and molecular dynamics studies, as well as with more limited site-directed mutational studies using the same experimental assays. Accordingly, these findings largely reinforce what had been established rather than making major discoveries.

      There are some presumptions:

      It isn't established that the different PKR constructs are expressed equivalently, so it is possible that expression differences account for some of the functional differences.

      Details about the conformation of PKR used to model the interaction aren't given, so it isn't clear how accurately the model captures the active kinase state. This is important for the interaction with K3/EIF2α.

      Not all regions identified to form the interface between PKR and K3 were assessed in the experimentation. It isn't clear why residues between positions 332-358 weren't examined, particularly as this would have made this report more complete than preceding studies of this protein interaction.

    3. Reviewer #2 (Public Review):

      Chambers et al. (2024) present a systematic and unbiased approach to explore the evolutionary potential of the human antiviral protein kinase R (PKR) to evade inhibition by a poxviral antagonist while maintaining one of its essential functions.

      The authors generated a library of 426 single-nucleotide polymorphism (SNP)-accessible non-synonymous variants of PKR kinase domain and used a yeast-based heterologous virus-host system to assess PKR variants' ability to escape antagonism by the vaccinia virus pseudo-substrate inhibitor K3. The study identified determinant sites in the PKR kinase domain that harbor K3-resistant variants, as well as sites where variation leads to PKR loss of function. The authors found that multiple K3-resistant variants are readily available throughout the domain interface and are enriched at sites under positive selection. They further found some evidence of PKR resilience to viral antagonist diversification. These findings highlight the remarkable adaptability of PKR in response to viral antagonism by mimicry.

      Significance of the findings:

      The findings are important with implications for various fields, including evolutionary biology, virus-host interfaces, genetic conflicts, and antiviral immunity.

      Strength of the evidence:

      Convincing methodology using state-of-the-art mutational scanning approach in an elegant and simple setup to address important challenges in virus-host molecular conflicts and protein adaptations.

      Strengths:

      ● Systematic and Unbiased Approach:<br /> The study's comprehensive approach to generating and characterizing a large library of PKR variants provides valuable insights into the evolutionary landscape of the PKR kinase domain. By focusing on SNP-accessible variants, the authors ensure the relevance of their findings to naturally occurring mutations.

      ● Identification of Key Sites:<br /> The identification of specific sites in the PKR kinase domain that confer resistance or susceptibility to a poxvirus pseudosubstrate inhibition is a significant contribution.

      ● Evolutionary Implications:<br /> The authors performed meticulous comparative analyses throughout the study between the functional variants from their mutagenesis screen ("prospective") and the evolutionarily-relevant past adaptations ("retrospective").

      ● Experimental Design:<br /> The use of a yeast-based assay to simultaneously assess PKR capacity to induce cell growth arrest and susceptibility/resistance to various VACV K3 alleles is an efficient approach. The combination of this assay with high-throughput sequencing allows for the rapid characterization of a large number of PKR variants.

      Areas for Improvement:

      ● Validation of the screen:<br /> The results would be strengthened by validating results from the screen on a handful of candidate PKR variants, either using a similar yeast heterologous assay, or - even more powerfully - in another experimental system assaying for similar function (cell translation arrest) or protein-protein interaction.

      ● Evolutionary Data:<br /> Beyond residues under positive selection, the screen would allow the authors to also perform a comparative analysis with PKR residues under purifying selection. Because they are assessing one of the most conserved ancestral functions of PKR (i.e. cell translation arrest), it may also be of interest to discuss these highly conserved sites.

      ● Mechanistic Insights:<br /> While the study identifies key sites and residues involved in vaccinia K3 resistance, it could benefit from further investigation into the underlying molecular mechanisms. The study's reliance on a single experimental approach, deep mutational scanning, may introduce biases and limit the scope of the findings. The authors may acknowledge these limitations in the Discussion.

      ● Viral Diversity:<br /> The study focuses on the viral inhibitor K3 from vaccinia. Expanding the analysis to include other viral inhibitors, or exploring the effects of PKR variants on a range of viruses, would strengthen and expand the study's conclusions. Would the identified VACV K3-resistant variants also be effective against other viral inhibitors (from pox or other viruses), or in the context of infection with different viruses? Without such evidence, the authors should check that the manuscript's conclusions are stated with appropriate specificity.

      Overall Assessment:

      The systematic approach, identification of key sites, and evolutionary implications are all notable strengths. While there is room for further investigation into the mechanistic details and broader viral diversity, the findings are robust and already provide important advancements. The manuscript is well-written and clear, and the figures are informative. Specific minor comments are further shared below.

      Minor revisions addressing the areas for improvement mentioned above are recommended.

    4. Reviewer #3 (Public Review):

      Summary:

      - This study investigated how genetic variation in the human protein PKR can enable sensitivity or resistance to a viral inhibitor from the vaccinia virus called K3.

      - The authors generated a collection of PKR mutants and characterized their activity in a high-throughput yeast assay to identify 1) which mutations alter PKR's intrinsic biochemical activity, 2) which mutations allow for PKR to escape from viral K3, and 3) which mutations allow for escape from a mutant version of K3 that was previously known to inhibit PKR more efficiently.

      - As a result of this work, the authors generated a detailed map of residues at the PKR-K3 binding surface and the functional impacts of single mutation changes at these sites.

      Strengths:

      - Experiments assessed each PKR variant against three different alleles of the K3 antagonist, allowing for a combinatorial view of how each PKR mutant performs in different settings.

      - Nice development of a useful, high-throughput yeast assay to assess PKR activity, with highly detailed methods to facilitate open science and reproducibility.

      - The authors generated a very clean, high-quality, and well-replicated dataset.

      Weaknesses:

      - The authors chose to focus solely on testing residues in or near the PKR-K3 predicted binding interface. As a result, there was only a moderately complex library of PKR mutants tested. The residues selected for investigation were logical, but this limited the potential for observing allosteric interactions or other less-expected results.

      - For residues of interest, some kind of independent validation assay would have been useful to demonstrate that this yeast fitness-based assay is a reliable and quantitative readout of PKR activity.

      - As written, the current version of the manuscript could use more context to help a general reader understand 1) what was previously known about these PKR and K3 variants, 2) what was known about how other genes involved in arms races evolve, or 3) what predictions or goals the authors had at the beginning of their experiment. As a result, this paper mostly provides a detailed catalog of variants and their effects. This will be a useful reference for those carrying out detailed, biochemical studies of PKR or K3, but any broader lessons are limited.

      I felt there was a missed opportunity to connect the study's findings to outside evolutionary genetic information, beyond asking if there was overlap with PKR sites that a single previous study had identified as positively selected. For example, are there any signals of balancing selection for PKR? How much allelic diversity is there within humans, and are people typically heterozygous for PKR variants? Relatedly, although PKR variants were tested in isolation here, would the authors expect their functional impacts to be recessive or dominant, and would this alter their interpretations? On the viral diversity side, how much variation is there among K3 sequences? Is there an elevated evolutionary rate, for example, in K3 at residues that contact PKR sites that can confer resistance? None of these additions are essential, but some kind of discussion or analysis like this would help to connect the yeast-based PKR phenotypic assay presented here back to the real-world context for these genes.

    1. eLife Assessment:

      This fundamental study substantially advances our understanding of the role of different-sized soil invertebrates in shaping the rates of leaf litter decomposition, using an experiment across seasons along an aridity gradient. The authors provide compelling evidence that the summed effects of all invertebrates (with large-sized invertebrates being more active in summer and small-sized invertebrates in winter) on decomposition rates result in similar levels of leaf litter decomposition across seasons. The work will be of broad relevance to ecosystem ecologists interested in soil food webs, and researchers interested in modeling carbon cycles to understand global warming.

    2. Reviewer #1 (Public Review):

      Summary:

      Torsekar et al. use a leaf litter decomposition experiment across seasons, and in an aridity gradient, to provide a careful test of the role of different-sized soil invertebrates in shaping the rates of leaf litter decomposition. The authors found that large-sized invertebrates are more active in the summer and small-sized invertebrates in the winter. The summed effects of all invertebrates then translated into similar levels of decomposition across seasons. The system breaks down in hyper-arid sites.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:<br /> I really enjoyed this manuscript from Torsekar et al on "Contrasting responses to aridity by different-sized decomposers cause similar decomposition rates across a precipitation gradient". The authors aimed to examine how climate interacts with decomposers of different size categories to influence litter decomposition. They proposed a new hypothesis: "The opposing climatic dependencies of macrofauna and that of microorganisms and mesofauna should lead to similar overall decomposition rates across precipitation gradients".

      This study emphasizes the importance as well as the contribution of different groups of organisms (micro, meso, macro, and whole community) across different seasons (summer with the following characteristics: hot with no precipitation, and winter with the following characteristics: cooler and wetter) along a precipitation gradient. The authors made use of 1050 litter baskets with different mesh sizes to capture the decomposers' contributions. They proposed a new hypothesis that aimed to understand the "dryland decomposition conundrum". They combined their decomposition experiment with the sampling of decomposers by using pitfall traps across both experimental seasons. This study was carried out in Israel and based on a single litter species that is native to all seven sites. The authors found that microorganism contribution dominated in winter while macrofauna decomposition dominated the overall decomposition in summer. These seasonal differences, combined with the differences in how the decomposer groups fluctuate along the precipitation gradient, resulted in similar overall decomposition rates across sites.<br /> I believe this manuscript has the potential to advance our knowledge on litter decomposition.

      Strengths:

      Well-designed study that combines different approaches (methods) and considers seasonality to generalize the pattern.

      The study expands the current understanding of litter decomposition and of the interaction between factors affecting the process (here climate and decomposers).

      Weaknesses:

      The study was only based on a single litter species.

      We now discuss the advantages and limitations of this approach in the methods and devote a completely new paragraph to this important point in the discussion (lines 394-401).

      Reviewer #2 (Public Review):

      Summary: Torsekar et al. use a leaf litter decomposition experiment across seasons, and in an aridity gradient, to provide a careful test of the role of different-sized soil invertebrates in shaping the rates of leaf litter decomposition. The authors found that large-sized invertebrates are more active in the summer and small-sized invertebrates in the winter. The summed effects of all invertebrates then translated into similar levels of decomposition across seasons. The system breaks down in hyper-arid sites.

      Strengths: This is a well-written manuscript that provides a complete statistical analysis of a nice dataset. The authors provide a complete discussion of their results in the current literature.

      Weaknesses:

      I have only three minor comments. Please standardize the color across ALL figures (use the same color always for the same thing, and be friendly to color-blind people).

      Thank you for this important suggestion. We have now changed all figures to standardize all colors and chose a more color-blind-friendly palette.

      Fig 1 may benefit from separating the orange line (micro and meso) into two lines that reflect your experimental setup and results. I would mention the dryland decomposition conundrum earlier in the Introduction.

      We based our novel hypotheses on a thorough literature search. Accordingly, decomposition is expected to be positively associated with moisture, regardless of the decomposer body size. Our contribution to theory was to suggest that macro-detritivores may respond very differently to climatic conditions and dominate litter decomposition in warm arid-lands (we listed the reasons in the text). Consequently, we did not distinguish between microorganisms and mesofauna. We assumed that both groups inhabit the litter substrate and have limited adaptation to dry conditions. Our results provide strong evidence that this presumption is likely wrong and that mesofauna respond to climate very differently from micro-decomposers. Yet, we cannot use hindsight understanding to improve our original hypothesis. We now emphasize this point in the discussion as an important future direction.

      Although we are very appreciative of, and pleased with, the reviewer's enthusiasm to highlight the importance of our work as a possible solution to the longstanding dryland decomposition conundrum, we decided not to move it to the introduction. This is because we think that our work is not centred on resolving the DDC but provides more general principles that may lead to a paradigm shift in the way ecologists study nutrient cycling across ecosystems.

      And the manuscript is full of minor grammatical errors. Some careful reading and fixing of all these minor mistakes here and there would be needed.

      We apologize and have done our best to find and fix those mistakes.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I really enjoyed this manuscript from Torsekar et al on "Contrasting responses to aridity by different-sized decomposers cause similar decomposition rates across a precipitation gradient". The authors aimed to examine how climate interacts with decomposers of different size categories to influence litter decomposition. They proposed a new hypothesis: "The opposing climatic dependencies of macrofauna and that of microorganisms and mesofauna should lead to similar overall decomposition rates across precipitation gradients".

      This study emphasizes the importance as well as the contribution of different groups of organisms (micro, meso, macro, and whole community) across different seasons (summer with the following characteristics: hot with no precipitation, and winter with the following characteristics: cooler and wetter) along a precipitation gradient. The authors made use of 1050 litter baskets with different mesh sizes to capture the decomposers' contributions. They proposed a new hypothesis that aimed to understand the "dryland decomposition conundrum". They combined their decomposition experiment with the sampling of decomposers by using pitfall traps across both experimental seasons. This study was carried out in Israel and based on a single litter species that is native to all seven sites. The authors found that microorganism contribution dominated in winter while macrofauna decomposition dominated the overall decomposition in summer. These seasonal differences, combined with the differences in how the decomposer groups fluctuate along the precipitation gradient, resulted in similar overall decomposition rates across sites.

      I believe this manuscript has the potential to advance our knowledge on litter decomposition. Below I provide my general and specific comments.

      General comments:

      (1) The study in general is well designed and well thought out beforehand.

      (2) The study aims to expand the current understanding of the dryland decomposition conundrum.

      (3) The authors should add a caveat about the fact that they used only one litter species and call for examining litter mixtures along the same gradient.

      (4) Please check the way you reduce the random effects from your initial model; I have provided a better way to do so in my specific comments.

      (5) For Figure 1, the authors can check my comment on this and see if they could revise the figure.

      Thank you for the positive feedback and your valuable comments. We have tried our best to address all comments and suggestions for improvement and clarification.

      Specific comments

      Line # 57 Please write "Theory suggests" instead of "Theory suggest"

      We changed the text as suggested

      Line # 70, please write "Indeed, handful evidence shows" instead of "Indeed, handful evidence show"

      We changed the text as suggested

      Figure 1: I like this conceptual framework. I have a silly question: why is the slope of the whole community at the beginning (between Hyperarid and Arid) the same as that of the macrofauna? I would think the slope should be higher, as this is adding up, right? The same goes for the decomposition of the whole community later on. For me this should reflect the adding or summing up (if I am right), so the authors should think about how this could be reflected in the figure.

      We agree with your interpretation that the whole community decomposition reflects the addition by constituent decomposers. The slope of the whole community decomposition between hyper-arid and arid is slightly higher than the one of macro decomposition to reflect the additive effect of macro with meso+micro decomposition. We have now changed the figure slightly to make this point more visible (Line 106).

      Line # 111 Please make "Methods" bold as well to be consistent with others headings.

      We changed the formatting as suggested

      Line #125 and in other lines as well please replace "X" by "x" to denote multiplication.

      We changed the formatting as suggested

      Table 1 Please add "*" to climate like this "Climate*" so that the end note of the table could make sense

      Thank you for this suggestion. We have now added the asterisk referring to the note below the Table.

      Figure 2: please consider spelling out mean annual precipitation (MAP) at line # 133, so that at line # 135 you can directly say "The precipitation map ....".

      We made both changes as suggested.

      Line # 138 I would not use different units for the same values. I do understand that you want to emphasize the accuracy, but I would instead write 3 ± 0.001 g

      We changed the units as suggested.

      Line # 145, how is the litter basket customized to rest at 1 cm above ground level?

      We have now clarified that we cut open windows one centimeter above the cage floor. The cages were positioned on the soil (line 144).

      Lines # 181-183, I like the approach of checking the necessity of having the random effects. However, it has been reported that likelihood ratio tests (LRT) are not really reliable for testing random effects. I would suggest you use permutations instead. I think the function is confint(MODEL); you need to specify the number of permutations. The higher the better, but you should start with 99 first and see how the results look; if they are promising, you can even go to 9999. But it will need computation power and time.

      Thank you for the suggestion. We now used a simulation-based exact test, instead of an LRT, to examine the random effect, as recommended by the authors of the “lme4” package. As recommended, we used 9999 simulations. The simulation test yielded results similar to those originally reported (see lines 181-183).
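
      To make the general idea behind such resampling tests concrete, here is a minimal Python sketch of a permutation test for a grouping (for example, block or site) effect. It is not the authors' analysis, which was run in R with a simulation-based exact test; the function names, the test statistic (variance of the group means), and the toy data are all assumptions chosen for illustration only.

```python
import random
from statistics import mean, variance


def group_mean_variance(values, groups):
    """Sample variance of the per-group means (illustrative test statistic)."""
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    return variance([mean(vals) for vals in by_group.values()])


def permutation_test_group_effect(values, groups, n_perm=9999, seed=0):
    """Return (observed statistic, permutation p-value) for a grouping effect.

    Shuffling the group labels breaks any link between group and response,
    which yields a null distribution for the statistic.
    """
    rng = random.Random(seed)
    observed = group_mean_variance(values, groups)
    shuffled = list(groups)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if group_mean_variance(values, shuffled) >= observed:
            exceed += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical mass-loss values for three made-up blocks.
    mass_loss = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9, 9.0, 9.1, 8.9]
    blocks = [0, 0, 0, 1, 1, 1, 2, 2, 2]
    stat, p_value = permutation_test_group_effect(mass_loss, blocks, n_perm=999)
    print(f"statistic={stat:.3f}, p={p_value:.4f}")
```

      Starting with a small `n_perm` and increasing it, as the reviewer suggests, only changes the resolution of the p-value: with 99 permutations the smallest attainable value is 0.01, with 9999 it is 0.0001.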

      Line # 187, 188, 188, please do not use capital letter to start mesofauna, macrofauna and whole-community

      We changed the formatting as suggested

      Line # 205 Please add the version number of R in the text.

      We now included the version number as suggested.

      Line # 209-211, could you please check whether "then" is the word you want to use or "than"

      Our bad: we indeed meant “than” and have made the appropriate changes.

      Line # 227 and in other places as well please provide the second degree of freedom of the F test.

      Thank you for this important comment. We have now added the second degree of freedom to the relevant results (lines 229, 232).

      Figure 3 and Figure 4 show some results that are negative, can you please explain what might be the reasons behind this?

      We now explain this important point in the figures’ captions.

      Figure 5 Please add label to the x-axis.

      Thank you. We have now included a label.

      Line # 357, the sentence "... meso-decomposition, like microbial decomposition,...", I don't understand which criteria authors used to classify microbial decomposition as "meso-decomposition"?

      We now remove this potential cause of confusion by using the term ‘meso-decomposition’ to distinguish from microbial decomposition (Line 366).

      Line # 380 Kindly put "per se" in italic.

      We changed the formatting as suggested

      References

      The reference format is not consistent. For example, for the same journal (say Trends in Ecology and Evolution) the authors sometimes wrote the full name, as at line # 36 (note also that "vol" should not be written as such), but wrote the abbreviation at line # 42.

      Our bad: we apologize and have carefully checked all references to make sure the style is consistent.

    1. eLife assessment

      This valuable study identifies biallelic variants of DNAH3 in unrelated infertile men and reports infertility in DNAH3 knockout mice. The authors demonstrate that compromised DNAH3 activity decreases the expression of IDA-associated proteins in the spermatozoa of human patients and knockout mice, providing convincing evidence that DNAH3 is a novel pathogenic gene for asthenoteratozoospermia and male infertility. The study will be of substantial interest to clinicians, reproductive counselors, embryologists, and basic researchers working on infertility and assisted reproductive technology.

    2. Reviewer #1 (Public Review):

      Summary:

      Wang and colleagues identify biallelic variants of DNAH3 in four unrelated Han Chinese infertile men through whole-exome sequencing, which contribute to abnormal sperm flagellar morphology and ultrastructure. To investigate the importance of DNAH3 in male infertility, the authors generated crispant Dnah3 knockout (KO) male mice. They observed that KO mice are also infertile, showing a severe reduction in sperm movement with abnormal IDA (inner dynein arms) and mitochondrion structure. Moreover, nonfunctional DNAH3 expression decreased the expression of IDA-associated proteins in the spermatozoa of patients and KO mice, which are involved in the disruption of sperm motility. Interestingly, the infertility of patients and KO mice is rescued by intracytoplasmic sperm injection (ICSI). Taken together, the authors propose that DNAH3 is a novel pathogenic gene for asthenoteratozoospermia and male infertility.

      Strengths:

      This work investigates the role of DNAH3 in sperm mobility and male infertility. By using gold-standard molecular biology techniques, the authors demonstrate with exquisite resolution the importance of DNAH3 in sperm morphology, showing strong evidence of its role in male infertility. Overall, this is a very interesting, well-written, and appealing article. All aspects of the study design and methods are well described and appropriate to address the main question of the manuscript. The conclusions drawn are consistent with the analyses conducted and supported by the data.

      Weaknesses:

      The paper is solid, and in its current form, I have not detected relevant weaknesses.

    3. Reviewer #2 (Public Review):

      Wang et al. investigated the role of dynein axonemal heavy chain 3 (DNAH3) in male infertility. They found that variants of DNAH3 were present in four infertile men, and the deficiency of DNAH3 in sperm affects sperm mobility. Additionally, they showed that Dnah3 knockout male mice are infertile. Furthermore, they demonstrated that DNAH3 influences inner dynein arms by regulating several DNAH proteins. Importantly, they showed that intracytoplasmic sperm injection (ICSI) can rescue the infertility in Dnah3 knockout mice and two patients with DNAH3 variants.

      Strengths:

      The conclusions of this paper are well-supported by data.

      Weaknesses:

      The sample/patient size is small; however, the findings are consistent with those of a recent study on DNAH3 in male infertility involving 432 patients.

    4. Reviewer #3 (Public Review):

      Summary:

      (1) To further explore the genetic basis of asthenoteratozoospermia, the authors performed whole-exome sequencing analyses among infertile males affected by asthenoteratozoospermia. Four unrelated Han Chinese patients were found to carry biallelic variations of DNAH3, a gene encoding an IDA-associated protein.<br /> (2) To verify the function of the IDA-associated protein DNAH3, the authors generated a Dnah3-KO mouse model and revealed that the loss of DNAH3 leads to severe male infertility as a result of the severe reduction in sperm movement with abnormal IDA and mitochondrion structures.<br /> (3) Mechanistically, they confirmed decreased expression of IDA-associated proteins (including DNAH1, DNAH6 and DNALI1) in the spermatozoa from patients with DNAH3 mutations and Dnah3-KO male mice.<br /> (4) Then, they also found that male infertility caused by DNAH3 deficiency could be rescued by intracytoplasmic sperm injection (ICSI) treatment in humans and mice.

      Strengths:

      (1) In addition to existing research, the authors provided novel variants of DNAH3 as important factors leading to asthenoteratozoospermia. This further expands the spectrum of pathogenic variants in asthenoteratozoospermia.<br /> (2) By mechanistic studies, they found that DNAH3 deficiency led to decreased expression of IDA-associated proteins, which may be used to explain the disruption of sperm motility and reduced fertility caused by DNAH3 deficiency.<br /> (3) Then, successful ICSI outcomes were observed in patients with DNAH3 mutations and Dnah3 KO mice, which will provide an important reference for genetic counselling and clinical treatment of male infertility.

    5. Author response:

      The following is the authors’ response to the original reviews.

      (1) Combined Public Reviews:

      Strengths:

      This work investigates the role of DNAH3 in sperm mobility and male infertility and utilised gold-standard molecular biology techniques, showing strong evidence of its role in male infertility. All aspects of the study design and methods are well described and appropriate to address the main question of the manuscript. The conclusions drawn are consistent with the analyses conducted and supported by the data.

      We extend our sincere gratitude to the expert reviewers for their valuable comments and insightful suggestions.

      Weaknesses:

      (1.1) The manuscript lacks a comparison with previous studies on DNAH3 in the Discussion section.

      We thank the reviewers for their comments.

      Recently, Meng et al. identified bi-allelic variants in DNAH3 from patients diagnosed with asthenoteratozoospermia, revealing multiple morphological defects and a disrupted "9+2" arrangement in the patients' sperm (https://doi.org/10.1093/hropen/hoae003, PMID: 38312775). Furthermore, they generated Dnah3 KO mice, which were infertile and exhibited moderate morphological abnormalities with a normally structured “9 + 2” microtubule arrangement. In our study, we observed similar phenotypic differences between DNAH3-deficient patients and Dnah3 KO mice. These findings indicate that DNAH3 may play crucial yet distinct roles in human and mouse male reproduction. Additionally, our TEM analysis demonstrated a notable absence of IDAs in sperm from both DNAH3-deficient patients and Dnah3 KO mice, resembling the findings of Meng et al. To further investigate, we conducted immunofluorescent staining and western blotting to assess the levels of IDA-associated proteins (DNAH1, DNAH6 and DNALI1) and ODA-associated proteins (DNAH8, DNAH17 and DNAI1) in sperm samples from both our DNAH3-deficient patients and Dnah3 KO mice. Our data revealed a reduction in IDA-associated protein levels and comparable ODA-associated protein levels in comparison to normal controls and WT mice, respectively, thus corroborating the TEM observations. These results suggest that DNAH3 is involved in sperm flagellar development in humans and mice, specifically through its role in the assembly of IDAs.

      Intriguingly, in our study, none of the patients with DNAH3 deficiency reported experiencing any of the principal symptoms associated with PCD. Additionally, our Dnah3 KO mice exhibited normal ciliary development in the lung, brain, eye, and oviduct. Similarly, Meng et al. did not mention any PCD symptoms in their DNAH3-deficient patients, and their Dnah3 KO mice also demonstrated normal ciliary morphology in the trachea and brain. These combined observations suggest that DNAH3 may play a more significant role in sperm flagellar development than in other motile cilia functions. Given that DNAH3 is expressed in ciliary tissues, its role in these tissues remains intriguing and could be elucidated through sequencing of larger cohorts of individuals with PCD.

      We have added these discussions in lines 267 to 283 and lines 300 to 303.

      (1.2) The variants of DNAH3 in four infertile men were identified through whole-exome sequencing. Providing an overview of the WES data would be beneficial to offer additional insights into whether other variants may contribute to the infertility. This could also help explain why ICSI only works for two out of four patients with DNAH3 variants.

      We thank the reviewer for the helpful suggestions.

      We have deposited the raw whole-exome sequencing data in the National Genomics Data Center (NGDC) (https://ngdc.cncb.ac.cn/, accession number: HRA007467). The clean reads, sequencing depth, sequencing coverage, and mapping quality of the WES on the patients are listed below (Table R1). A summary of WES has been presented in Table S1.

      Author response table 1.

      Quality of whole exome sequencing on infertile men.

      The variants identified through WES were annotated and filtered using Exomiser. Next, the variants were screened to obtain candidate variants based on the following criteria: (1) the allele frequency in the East Asian population was less than 1% in any database, including the ExAC Browser, gnomAD, and the 1000 Genomes Project; (2) the variants affected coding exons or canonical splice sites; (3) the variants were predicted to be possibly pathogenic or damaging.

      Following filtering and screening, the numbers of candidate variants obtained were as follows: Patient 1: 98, Patient 2: 101, Patient 3: 67, and Patient 4: 91 (Table S1). Subsequently, we utilized the Human Protein Atlas (HPA) database (https://www.proteinatlas.org/) and Mouse Genome Informatics (MGI) database (https://informatics.jax.org/) to analyze the expression patterns of corresponding genes. Variants whose corresponding genes were not expressed in the human or mouse testis were excluded from further consideration. We also consulted the OMIM database and reviewed relevant literature to exclude variants associated with diseases unrelated to male infertility. Additionally, considering the assumption of a recessive inheritance pattern, we excluded all monoallelic variants. Ultimately, only bi-allelic variants in DNAH3 (NG_052617.1, NM_017539.2, NP_060009.1) remained, suggesting that they are the pathogenic variants responsible for the infertility of the patients (Table S1). These DNAH3 variants were verified by Sanger sequencing on DNA from the patients' families.

      We have added the overview of the WES in Table S1 and supplemented the analysis process of WES data in lines 100 to 106 and lines 348 to 360.

      Additionally, we did not identify any pathogenic variants associated with fertilization failure or early embryonic development in the two patients with failed ICSI outcomes. Therefore, these different ICSI outcomes might be attributed to additional unexplained factors from the female partners.
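
      The filtering logic described above can be sketched in Python as follows. This is only an illustrative outline under stated assumptions, not the authors' pipeline (which used Exomiser plus manual database and literature review): the `Variant` fields, the function names, the placeholder gene names, and the choice to compare the highest allele frequency across databases against the 1% cutoff are all hypothetical. The OMIM and literature-based exclusion step is not modeled, and two heterozygous variants in one gene are counted as bi-allelic without checking phase.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Variant:
    patient: str
    gene: str
    max_east_asian_af: float   # highest East Asian allele frequency across databases
    in_coding_or_splice: bool  # affects a coding exon or canonical splice site
    predicted_damaging: bool   # predicted possibly pathogenic or damaging
    testis_expressed: bool     # gene expressed in human or mouse testis
    zygosity: str              # "hom" or "het"


def passes_site_filters(variant, af_cutoff=0.01):
    """Apply the frequency, location, prediction, and expression criteria."""
    return (
        variant.max_east_asian_af < af_cutoff
        and variant.in_coding_or_splice
        and variant.predicted_damaging
        and variant.testis_expressed
    )


def biallelic_candidate_genes(variants):
    """Return {patient: genes} carrying at least two filtered alleles.

    Assumes a recessive model: one homozygous variant, or two or more
    heterozygous variants in the same gene (phase is not checked here).
    """
    allele_count = defaultdict(int)
    for variant in variants:
        if passes_site_filters(variant):
            key = (variant.patient, variant.gene)
            allele_count[key] += 2 if variant.zygosity == "hom" else 1
    candidates = defaultdict(set)
    for (patient, gene), count in allele_count.items():
        if count >= 2:
            candidates[patient].add(gene)
    return dict(candidates)


if __name__ == "__main__":
    # Entirely made-up records with placeholder gene names.
    demo = [
        Variant("P1", "GENE_A", 0.001, True, True, True, "hom"),
        Variant("P1", "GENE_B", 0.001, True, True, True, "het"),
        Variant("P1", "GENE_C", 0.050, True, True, True, "hom"),
    ]
    print(biallelic_candidate_genes(demo))
```

      In this toy input, the single heterozygous variant and the common (5%) variant are dropped, mirroring the exclusion of monoallelic and high-frequency variants described in the response.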

      (1.3) Quantification of images would help substantiate the conclusions, particularly in Figures 2, 3, 4, and 6. Improved images in Figures 3A, 4B, and 4C, would help increase confidence in the claims made.

      We thank the reviewer for the valuable suggestions. We presume that the reviewer means quantification of the images in Figure S6 rather than Figure 6.

      We have compiled statistics for results shown in Figures 2, 3, 4, and S6. Specifically:

      - The percentages of abnormal flagellar morphology in the normal control and patients, associated with the observations in Figure 2A, have been shown in Figure S1A.

      - The percentages of aberrant axonemal ultrastructure in different cross-sections of sperm from the normal control and patients, corresponding to the findings in Figure 3A, have been presented in Figure S1B.

      - The percentages of abnormal flagellar morphology in WT mice and Dnah3 KO mice have been shown in Figure S7A.

      - The percentages of aberrant axonemal arrangement in different cross-sections of sperm from WT mice and Dnah3 KO mice, corresponding to the findings in Figure 4B, have been presented in Figure S7C.

      - The percentages of microtubule doublets presenting IDAs in sperm from WT mice and Dnah3 KO mice, related to Figure 4B, have been detailed in Figure S7D.

      - The percentages of malformed mitochondria in the midpiece of sperm from WT mice and Dnah3 KO mice, associated with the observations in Figure 4C, have been presented in Figure S7E.

      Moreover, we have revised Figures 3A, 4B, and 4C by replacing the unclear TEM images.

      (2) Reviewer #1 (Recommendations for The Authors):

      (2.1) Please add reference(s) that support what is claimed in lines 83-84.

      We are very grateful for the reviewer's careful comments. We have added a reference that describes the homology and expression of DNAH3.

      (2.2) In line 286, change "suggested" to "suggest".

      Thanks for the reviewer's comments. We have corrected the grammar.

      (2.3) Please add reference(s) that support what is claimed in lines 359-360.

      According to the reviewer’s suggestions, we have included references detailing the STA-PUT velocity sedimentation for isolation of single human and mouse testicular cells.

      (2.4) In line 365, change "in" to "into".

      Thanks for the reviewer’s careful comments. We have corrected this word.

      (2.5) In Figure 7, I suggest changing "patients" to "wife or partners of patient". Given that the results are indeed from the spouses of the infertile men, I suggest making this small change to keep the consistency and clarity of what the authors did.

      In response to the reviewer’s kind suggestions, we have replaced “Patient” with “partners of Patient” and revised Figure 7.

      (3) Reviewer #2 (Recommendations for The Authors):

      (3.1) A summary of the WES data would be needed (i.e. number of reads, mapping quality, etc). As mentioned in the public review, it would be beneficial to present a summary of all variants identified in the data and clarify whether DNAH3 is the only gene that contains variants and whether these variants have been validated.

      Many thanks for the reviewer’s kind suggestions.

      The clean reads, sequencing depth, sequencing coverage, and mapping quality of the WES on the patients are listed (see author response table 1). A summary of WES has been presented in Table S1.

      The variants identified through WES were annotated and filtered using Exomiser. Next, the variants were screened to obtain candidate variants based on the following criteria: (1) the allele frequency in the East Asian population was less than 1% in any database, including the ExAC Browser, gnomAD, and the 1000 Genomes Project; (2) the variants affected coding exons or canonical splice sites; (3) the variants were predicted to be possibly pathogenic or damaging.

      Following filtering and screening, the numbers of candidate variants obtained were as follows: Patient 1: 98, Patient 2: 101, Patient 3: 67, and Patient 4: 91 (Table S1). Subsequently, we utilized the Human Protein Atlas (HPA) database (https://www.proteinatlas.org/) and Mouse Genome Informatics (MGI) database (https://informatics.jax.org/) to analyze the expression patterns of corresponding genes. Variants whose corresponding genes were not expressed in the human or mouse testis were excluded from further consideration. We also consulted the OMIM database and reviewed relevant literature to exclude variants associated with diseases unrelated to male infertility. Additionally, considering the assumption of a recessive inheritance pattern, we excluded all monoallelic variants. Ultimately, only bi-allelic variants in DNAH3 (NG_052617.1, NM_017539.2, NP_060009.1) remained, suggesting that they are the pathogenic variants responsible for the infertility of the patients (Table S1). These DNAH3 variants were verified by Sanger sequencing on DNA from the patients' families.
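      The filtering cascade described above can be sketched in code. This is only an illustration under stated assumptions, not the authors' pipeline: the `Variant` record, its field names, and the gene labels `GENE_A` to `GENE_D` are hypothetical, and bi-allelic status is reduced to a per-variant allele count (a real analysis would also have to pair compound heterozygous variants within a gene).

```python
from dataclasses import dataclass


@dataclass
class Variant:
    # All field names are hypothetical; they mirror the criteria in the text.
    gene: str
    max_east_asian_af: float   # highest allele frequency across the consulted databases
    in_coding_or_splice: bool  # affects a coding exon or canonical splice site
    predicted_damaging: bool   # predicted possibly pathogenic or damaging
    testis_expressed: bool     # gene expressed in human or mouse testis
    unrelated_disease: bool    # linked to a disease unrelated to male infertility
    alleles_affected: int      # 1 = monoallelic, 2 = bi-allelic (simplification)


def is_candidate(v: Variant, af_cutoff: float = 0.01) -> bool:
    """Screening criteria (1) to (3): rare, coding/splice, predicted damaging."""
    return (
        v.max_east_asian_af < af_cutoff
        and v.in_coding_or_splice
        and v.predicted_damaging
    )


def passes_followup(v: Variant) -> bool:
    """Expression, disease-relevance and recessive-inheritance filters."""
    return v.testis_expressed and not v.unrelated_disease and v.alleles_affected == 2


def prioritise(variants: list[Variant]) -> list[Variant]:
    """Keep only variants that survive every filter."""
    return [v for v in variants if is_candidate(v) and passes_followup(v)]


# Toy input with placeholder genes; not data from the study.
toy = [
    Variant("GENE_A", 0.0002, True, True, True, False, 2),   # survives all filters
    Variant("GENE_B", 0.0300, True, True, True, False, 2),   # too common
    Variant("GENE_C", 0.0001, True, True, False, False, 2),  # not testis-expressed
    Variant("GENE_D", 0.0001, True, True, True, False, 1),   # monoallelic
]
survivors = [v.gene for v in prioritise(toy)]
```

      Keeping the screening criteria and the follow-up filters in separate functions makes it possible to report the intermediate candidate counts per patient, as the response does.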

      We have added the overview of the WES in Table S1 and supplemented the analysis process of WES data in line 100 to 106, and line 348 to 360.

      (3.2) It would be beneficial to the scientific community if the raw data of WES could be uploaded to a public data repository, such as GEO.

      According to the reviewer's suggestion, we have deposited the raw whole-exome sequencing data in the National Genomics Data Center (NGDC) (https://ngdc.cncb.ac.cn/, accession number: HRA007467) and described its availability in the "Data Availability" section.

      (3.3) In line 115, it is not clear how the prediction was made. Clarifying them by adding citations or describing methods that predict these pathways/functions would help strengthen it.

      Thanks for the reviewer's comments.

      SIFT, PolyPhen-2, MutationTaster and CADD assess the deleteriousness of genetic variants by considering genomic features and evolutionary constraint of the surrounding sequence, or the structural and chemical property alterations caused by the amino acid substitutions. We have added websites and references of these tools in the manuscript (line 116 to 118).

      Here are the principles of these tools.

      - The SIFT considers the position at which the change occurred and the type of amino acid change, and then predicts whether an amino acid substitution in a protein will affect protein function [https://sift.bii.a-star.edu.sg/, PMID: 12824425].

      - The PolyPhen-2 predicts the impact of an amino acid substitution on a human protein by considering several features, including sequence, phylogenetic, and structural information [http://genetics.bwh.harvard.edu/pph2/, PMID: 20354512].

      - The MutationTaster utilizes a Bayes classifier to predict the functional consequences of amino acid substitutions, intronic and synonymous changes, short insertions/deletions (indels), etc. [https://www.mutationtaster.org/, PMID: 24681721].

      - The CADD scores are based on diverse genomic features derived from surrounding sequence context, gene model annotations, evolutionary constraint, epigenetic measurements, and functional predictions [https://cadd.gs.washington.edu/, PMID: 30371827].

      (4) Reviewer #3 (Recommendations for The Authors):

      (4.1) Please ensure that all gene names used in your manuscript have been approved by the HUGO nomenclature committee. For example, "c.3590C>T (p.P1197L)" should be described as "c.3590C>T (Pro1197Leu)".

      In response to the reviewer's suggestion, we have improved all the names of gene and variants according to the HUGO nomenclature committee and HGVS Variant Nomenclature Committee, respectively.

      (4.2) For Table 1, the authors should provide the rates of abnormal sperm morphologies using the sperm cells from normal male controls.

      Thanks for the reviewer’s careful comments. Consistent with the WHO laboratory manual (World Health Organization. WHO laboratory manual for the examination and processing of human semen. World Health Organization, 2021.), our routine semen analysis establishes 4% as the minimum rate of sperm with normal morphology but does not define the maximum rate of various tail defects. However, we reviewed the routine semen analysis on the normal controls in our study, and the approximate distribution of sperm with various flagellar morphologies in the normal controls was as follows: normal flagella, 78.6%; absent flagella, 1.7%; short flagella, 0.6%; coiled flagella, 12.5%; bent flagella, 7.9%; irregular flagella, 1.8%.

      (4.3) In Table 2, "Mutation Tester" or "Mutation Taster"?

      We thank the reviewer for the comment. It should be "MutationTaster", and we have corrected this mistake in Table 2 and the manuscript.

      (4.4) In Figure 2B, the bars for patient 1 should be aligned. 

      Following the reviewer's valuable suggestion, we have ensured consistent scale bar alignment in Figure 2B and implemented this alignment throughout all other figures.

      (4.5) In Figure 3A, what about the ultrastructure for sperm heads in DNAH3 deficient sperm cell? The authors previously mentioned abnormalities in sperm head morphologies (Figure 2B) in patients with DNAH3 mutations.

      We thank the reviewers for their kind comments. A small fraction of abnormal sperm heads from our patients was captured under TEM, manifested as round heads with loose chromatin (Author response image 1).

      Author response image 1.

      Ultrastructure of sperm head from DNAH3-deficient infertile men. TEM analysis revealed a fraction of round head with loose chromatin in patients harboring DNAH3 variants. Scale bars, 200 nm.

      (4.6) In Figure S6, the authors should provide the rates of abnormal sperm morphologies for Dnah3 KO male mice.

      In response to the reviewer's valuable suggestion, we have quantified morphological defects in spermatozoa from both Dnah3 KO and WT mice. Compared to about 17% morphological abnormalities in sperm from WT mice, the morphological abnormalities in sperm from Dnah3 KO mice were about 37%. The results are presented in the revised Figure S7.

    1. eLife assessment

      This important study provides convincing evidence that both psychiatric dimensions (e.g. anhedonia, apathy, or depression) and chronotype (i.e., being a morning or evening person) influence effort-based decision-making. This is of importance to researchers and clinicians alike, who may make inferences about behaviour and cognition without taking into account whether the individual may be tested or observed out-of-sync with their phenotype. The current study can serve as a starting point for more targeted investigation of the relationship between chronotype, altered decision making and psychiatric illness.

    2. Reviewer #1 (Public Review):

      Summary:

      This study uses an online cognitive task to assess how reward and effort are integrated in a motivated decision-making task. In particular the authors were looking to explore how neuropsychiatric symptoms, in particular, apathy and anhedonia, and circadian rhythms affect behavior in this task. Amongst many results, they found that choice bias (the degree to which integrated reward and effort affect decisions) is reduced in individuals with greater neuropsychiatric symptoms, and late chronotypes (being an 'evening person').

      Strengths:

      The authors recruited participants to perform the cognitive task both in and out of sync with their chronotypes, allowing for the important insight that individuals with late chronotypes show a more reduced choice bias when tested in the morning. Overall, this is a well-designed and controlled online experimental study. The modelling approach is robust, with care being taken to both perform and explain to the readers the various tests used to ensure the models allow the authors to sufficiently test their hypotheses.

      Weaknesses:

      This study was not designed to test the interactions of neuropsychiatric symptoms and chronotypes on decision making, and thus can only make preliminary suggestions regarding how symptoms, chronotypes and time-of-assessment interact.

    3. Reviewer #2 (Public Review):

      Summary:

      The study combines computational modeling of choice behavior with an economic, effort-based decision-making task to assess how willingness to exert physical effort for a reward varies as a function of individual differences in apathy and anhedonia, or depression, as well as chronotype. They find an overall reduction in effort selection that scales with apathy, anhedonia and depression. They also find that later chronotypes are less likely to choose effort than earlier chronotypes and, interestingly, an interaction whereby later chronotypes are especially unwilling to exert effort in the morning versus the evening.

      Strengths:

      This study uses state-of-the-art tools for model fitting and validation and regression methods which rule out multicollinearity among symptom measures and Bayesian methods which estimate effects and uncertainty about those estimates. The replication of results across two different kinds of samples is another strength. Finally, the study provides new information about the effects not only of chronotype but also chronotype by timepoint interactions which are previously unknown in the subfield of effort-based decision-making.

      Weaknesses:

      The study has few weaknesses. The biggest drawback is that it does not provide evidence for the idea that a match between chronotype and delay is especially relevant for people with depression or for continuous measures like anhedonia and apathy. It is unclear whether disorders further interact with chronotype and time of day to determine a bias against effort. On the other hand, the study does provide evidence that future studies should consider such interactions when examining questions about effort expenditure in psychiatric disorders.

    4. Reviewer #3 (Public Review):

      Summary:

      In this manuscript, Mehrhof and Nord study a large dataset of participants collected online (n=958 after exclusions) who performed a simple effort-based choice task. They report that the level of effort and reward influence choices in a way that is expected from prior work. They then relate choice preferences to neuropsychiatric syndromes and, in a smaller sample (n<200), to people's circadian preferences, i.e., whether they are a morning-preferring or evening-preferring chronotype. They find relationships between the choice bias (a model parameter capturing the likelihood to accept effort-reward challenges, like an intercept) and anhedonia and apathy, as well as chronotype. People with higher anhedonia and apathy and an evening chronotype are less likely to accept challenges (more negative choice bias). People with an evening chronotype are also more reward sensitive and more likely to accept challenges in the evening, compared to the morning.

      Strengths:

      This is an interesting and well-written manuscript which replicates some known results and introduces a new consideration related to chronotype relationships which have not been explored before. It uses a large sample size and includes analyses related to transdiagnostic as well as diagnostic criteria.

      Weaknesses:

      The authors do not explore how chronotype and depression are related (does one mediate the effect of the other etc). Both variables are included in the same model in the revised article now which is a great improvement, but it also means psychopathology and circadian rhythms are treated as distinct phenomena and their relationship in predicting effort-reward preferences is not examined.

    5. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study provides solid evidence that both psychiatric dimensions (e.g. anhedonia, apathy, or depression) and chronotype (i.e., being a morning or evening person) influence effort-based decision-making. Notably, the current study does not elucidate whether there may be interactive effects of chronotype and psychiatric dimensions on decision-making. This work is of importance to researchers and clinicians alike, who may make inferences about behaviour and cognition without taking into account whether the individual may be tested or observed out-of-sync with their phenotype.

      We thank the three reviewers for their comments, and the Editors at eLife. We have taken the opportunity to revise our manuscript considerably from its original form, not least because we feel a number of the reviewers’ suggested analyses strengthen our manuscript considerably (in one instance even clarifying our conclusions, leading us to change our title)—for which we are very appreciative indeed. 

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This study uses an online cognitive task to assess how reward and effort are integrated in a motivated decision-making task. In particular the authors were looking to explore how neuropsychiatric symptoms, in particular apathy and anhedonia, and circadian rhythms affect behavior in this task. Amongst many results, they found that choice bias (the degree to which integrated reward and effort affects decisions) is reduced in individuals with greater neuropsychiatric symptoms, and late chronotypes (being an 'evening person').

      Strengths:

      The authors recruited participants to perform the cognitive task both in and out of sync with their chronotypes, allowing for the important insight that individuals with late chronotypes show a more reduced choice bias when tested in the morning. Overall, this is a well-designed and controlled online experimental study. The modelling approach is robust, with care being taken to both perform and explain to the readers the various tests used to ensure the models allow the authors to sufficiently test their hypotheses.

      Weaknesses:

      This study was not designed to test the interactions of neuropsychiatric symptoms and chronotypes on decision making, and thus can only make preliminary suggestions regarding how symptoms, chronotypes and time-of-assessment interact.

      We appreciate the Reviewer’s positive view of our research and agree with their assessment of its weaknesses; the study was not designed to assess chronotype-mental health interactions. We hope that our new title and contextualisation makes this clearer. We respond in more detail point-by-point below.

      Reviewer #2 (Public Review):

      Summary:

      The study combines computational modeling of choice behavior with an economic, effort-based decision-making task to assess how willingness to exert physical effort for a reward varies as a function of individual differences in apathy and anhedonia, or depression, as well as chronotype. They find an overall reduction in effort selection that scales with apathy and anhedonia and depression. They also find that later chronotypes are less likely to choose effort than earlier chronotypes and, interestingly, an interaction whereby later chronotypes are especially unwilling to exert effort in the morning versus the evening.

      Strengths:

      This study uses state-of-the-art tools for model fitting and validation and regression methods which rule out multicollinearity among symptom measures and Bayesian methods which estimate effects and uncertainty about those estimates. The replication of results across two different kinds of samples is another strength. Finally, the study provides new information about the effects not only of chronotype but also chronotype by timepoint interactions which are previously unknown in the subfield of effort-based decision-making.

      Weaknesses:

      The study has few weaknesses. One potential concern is that the range of models which were tested was narrow, and other models might have been considered. For example, the Authors might have also tried to fit models with an overall inverse temperature parameter to capture decision noise. One reason for doing so is that some variance in the bias parameter might be attributed to noise, which was not modeled here. Another concern is that the manuscripts discuss effort-based choice as a transdiagnostic feature - and there is evidence in other studies that effort deficits are a transdiagnostic feature of multiple disorders. However, because the present study does not investigate multiple diagnostic categories, it doesn't provide evidence for transdiagnosticity, per se.

      We appreciate Reviewer 2’s assessment of our research and agree generally with its weaknesses. We have now addressed the Reviewer’s comments regarding transdiagnosticity in the discussion of our revised version and have addressed their detailed recommendations below (see point-by-point responses).

      In addition to the below specific changes, in our Discussion section, we now have also added the following (lines 538 – 540):

      “Finally, we would like to note that our study is based on a general population sample, rather than a clinical one. Hence, we cannot speak to transdiagnosticity on the level of multiple diagnostic categories.”

      Reviewer #3 (Public Review):

      Summary:

      In this manuscript, Mehrhof and Nord study a large dataset of participants collected online (n=958 after exclusions) who performed a simple effort-based choice task. They report that the level of effort and reward influence choices in a way that is expected from prior work. They then relate choice preferences to neuropsychiatric syndromes and, in a smaller sample (n<200), to people's circadian preferences, i.e., whether they are a morning-preferring or evening-preferring chronotype. They find relationships between the choice bias (a model parameter capturing the likelihood to accept effort-reward challenges, like an intercept) and anhedonia and apathy, as well as chronotype. People with higher anhedonia and apathy and an evening chronotype are less likely to accept challenges (more negative choice bias). People with an evening chronotype are also more reward sensitive and more likely to accept challenges in the evening, compared to the morning.

      Strengths:

      This is an interesting and well-written manuscript which replicates some known results and introduces a new consideration related to potential chronotype relationships which have not been explored before. It uses a large sample size and includes analyses related to transdiagnostic as well as diagnostic criteria. I have some suggestions for improvements.

      Weaknesses:

      (1) The novel findings in this manuscript are those pertaining to transdiagnostic and circadian phenotypes. The authors report two separate but "overlapping" effects: individuals high on anhedonia/apathy are less willing to accept offers in the task, and similarly, individuals tested off their chronotype are less willing to accept offers in the task. The authors claim that the latter has implications for studying the former. In other words, because individuals high on anhedonia/apathy predominantly have a late chronotype (but might be tested early in the day), they might accept fewer offers, which could spuriously look like a link between anhedonia/apathy and choices but might in fact be an effect of the interaction between chronotype and time-of-testing. The authors therefore argue that chronotype needs to be accounted for when studying links between depression and effort tasks.

      The authors argue that, if X is associated with Y and Z is associated with Y, X and Z might confound each other. That is possible, but not necessarily true. It would need to be tested explicitly by having X (anhedonia/apathy) and Z (chronotype) in the same regression model. Does the effect of anhedonia/apathy on choices disappear when accounting for chronotype (and time-of-testing)? Similarly, when adding the interaction between anhedonia/apathy, chronotype, and time-of-testing, within the subsample of people tested off their chronotype, is there a residual effect of anhedonia/apathy on choices or not?

      If the effect of anhedonia/apathy disappeared (or got weaker) while accounting for chronotype, this result would suggest that chronotype mediates the effect of anhedonia/apathy on effort choices. However, I am not sure it renders the direct effect of anhedonia/apathy on choices entirely spurious. Late chronotype might be a feature (induced by other symptoms) of depression (such as fatigue and insomnia), and the association between anhedonia/apathy and effort choices might be a true and meaningful one. For example, if the effect of anhedonia/apathy on effort choices was mediated by altered connectivity of the dorsal ACC, we would not say that ACC connectivity renders the link between depression and effort choices "spurious", but we would speak of a mechanism that explains this effect. The authors should discuss in a more nuanced way what a significant mediation by the chronotype/time-of-testing congruency means for interpreting effects of depression in computational psychiatry.

      We thank the Reviewer for pointing out this crucial weakness in the original version of our manuscript. We have now thought deeply about this and agree with the Reviewer that our original results did not warrant our interpretation that reported effects of anhedonia and apathy on measures of effort-based decision-making could potentially be spurious. At the Reviewer’s suggestion, we decided to test this explicitly in our revised version—a decision that has now deepened our understanding of our results, and changed our interpretation thereof.  

      To investigate how the effects of neuropsychiatric symptoms and the effects of circadian measures relate to each other, we have followed the Reviewer’s advice and conducted an additional series of analyses (see below). Surprisingly (to us, but perhaps not the Reviewer) we discovered that all three symptom measures (two of anhedonia, one of apathy) have separable effects from circadian measures on the decision to expend effort (note we have also re-named our key parameter ‘motivational tendency’ to address this Reviewer’s next comment that the term ‘choice bias’ was unclear). In model comparisons (based on the leave-one-out information criterion, which penalises model complexity) the models including both circadian and psychiatric measures always win against the models including either circadian or psychiatric measures alone. In essence, this strengthens our claims about the importance of measuring circadian rhythm in effort-based tasks generally, as circadian rhythm clearly plays an important role even when considering neuropsychiatric symptoms, but crucially does not support the idea of spurious effects: statistically, circadian measures contribute separably from neuropsychiatric symptoms to the variance in effort-based decision-making. We think this is very interesting indeed, and certainly clarifies (and corrects the inaccuracy in) our original interpretation, and we can only express our thanks to the Reviewer for helping us understand our effect more fully.

      In response to these new insights, we have made numerous edits to our manuscript. First, we changed the title from “Overlapping effects of neuropsychiatric symptoms and circadian rhythm on effort-based decision-making” to “Both neuropsychiatric symptoms and circadian rhythm alter effort-based decision-making”. In the remaining manuscript we now refrain from using the word ‘overlapping’ (which could be interpreted as overlapping in explained variance), and instead opted to describe the effects as parallel. We hope our new analyses, title, and clarified/improved interpretations together address the Reviewer’s valid concern about our manuscript’s main weakness.

      We detail these new analyses in the Methods section as follows (lines 800 – 814):

      “4.5.2. Differentiating between the effects of neuropsychiatric symptoms and circadian measures on motivational tendency

      To investigate how the effects of neuropsychiatric symptoms on motivational tendency (2.3.1) relate to effects of chronotype and time-of-day on motivational tendency we conducted exploratory analyses. In the subsamples of participants with an early or late chronotype (including additionally collected data), we first ran Bayesian GLMs with neuropsychiatric questionnaire scores (SHAPS, DARS, AES respectively) predicting motivational tendency, controlling for age and gender. We next added an interaction term of chronotype and time-of-day into the GLMs, testing how this changes previously observed neuropsychiatric and circadian effects on motivational tendency. Finally, we conducted a model comparison using LOO, comparing between motivational tendency predicted by a neuropsychiatric questionnaire, motivational tendency predicted by chronotype and time-of-day, and motivational tendency predicted by a neuropsychiatric questionnaire and time-of-day (for each neuropsychiatric questionnaire, and controlling for age and gender).”

      Results of the outlined analyses are reported in the results section as follows (lines 356 – 383):

      “2.5.2.1 Neuropsychiatric symptoms and circadian measures have separable effects on motivational tendency

      Exploratory analyses testing for the effects of neuropsychiatric questionnaires on motivational tendency in the subsamples of early and late chronotypes confirmed the predictive value of the SHAPS (M=-0.24, 95% HDI=[-0.42,-0.06]), the DARS (M=-0.16, 95% HDI=[-0.31,-0.01]), and the AES (M=-0.18, 95% HDI=[-0.32,-0.02]) on motivational tendency.

      For the SHAPS, we find that when adding the measures of chronotype and time-of-day back into the GLMs, the main effect of the SHAPS (M=-0.26, 95% HDI=[-0.43,-0.07]), the main effect of chronotype (M=-0.11, 95% HDI=[-0.22,-0.01]), and the interaction effect of chronotype and time-of-day (M=0.20, 95% HDI=[0.07,0.34]) on motivational tendency remain. Model comparison by LOOIC reveals motivational tendency is best predicted by the model including the SHAPS, chronotype and time-of-day as predictors, followed by the model including only the SHAPS. Note that this approach to model comparison penalizes models for increasing complexity.

      Repeating these steps with the DARS, the main effect of the DARS is found numerically, but the 95% HDI just includes 0 (M=-0.15, 95% HDI=[-0.30,0.002]). The main effect of chronotype (M=-0.11, 95% HDI=[-0.21,-0.01]), and the interaction effect of chronotype and time-of-day (M=0.18, 95% HDI=[0.05,0.33]) on motivational tendency remain. Model comparison identifies the model including the DARS and circadian measures as the best model, followed by the model including only the DARS.

      For the AES, the main effect of the AES is found (M=-0.19, 95% HDI=[-0.35,-0.04]). For the main effect of chronotype, the 95% HDI narrowly includes 0 (M=-0.10, 95% HDI=[-0.21,0.002]), while the interaction effect of chronotype and time-of-day (M=0.20, 95% HDI=[0.07,0.34]) on motivational tendency remains. Model comparison identifies the model including the AES and circadian measures as the best model, followed by the model including only the AES.”

      We have now edited parts of our Discussion to discuss and reflect these new insights, including the following.

      Lines 399 – 402:

      “Various neuropsychiatric disorders are marked by disruptions in circadian rhythm, such as a late chronotype. However, research has rarely investigated how transdiagnostic mechanisms underlying neuropsychiatric conditions may relate to inter-individual differences in circadian rhythm.”

      Lines 475 – 480:

      “It is striking that the effects of neuropsychiatric symptoms on effort-based decision-making largely are paralleled by circadian effects on the same neurocomputational parameter. Exploratory analyses predicting motivational tendency by neuropsychiatric symptoms and circadian measures simultaneously indicate the effects go beyond recapitulating each other, but rather explain separable parts of the variance in motivational tendency.”

      Lines 528 – 532:

      “Our reported analyses investigating neuropsychiatric and circadian effects on effort-based decision-making simultaneously are exploratory, as our study design was not ideally set out to examine this. Further work is needed to disentangle separable effects of neuropsychiatric and circadian measures on effort-based decision-making.”

      Lines 543 – 550:

      “We demonstrate that neuropsychiatric effects on effort-based decision-making are paralleled by effects of circadian rhythm and time-of-day. Exploratory analyses suggest these effects account for separable parts of the variance in effort-based decision-making. It is unlikely that the neuropsychiatric effects on effort-based decision-making reported here and in previous literature are a spurious result due to multicollinearity with chronotype. Yet, not accounting for chronotype and time of testing, which is the predominant practice in the field, could affect results.”

      (2) It seems that all key results relate to the choice bias in the model (as opposed to reward or effort sensitivity). It would therefore be helpful to understand what fundamental process the choice bias is really capturing in this task. This is not discussed, and the direction of effects is not discussed either, but potentially quite important. It seems that the choice bias captures how many effortful reward challenges are accepted overall which maybe captures general motivation or task engagement. Maybe it is then quite expected that this could be linked with questionnaires measuring general motivation/pleasure/task engagement. Formally, the choice bias is the constant term or intercept in the model for p(accept), but the authors never comment on what its sign means. If I'm not mistaken, people with higher anhedonia but also higher apathy are less likely to accept challenges and thus engage in the task (more negative choice bias). I could not find any discussion or even mention of what these results mean. This similarly pertains to the results on chronotype. In general, "choice bias" may not be the most intuitive term and the authors may want to consider renaming it. Also, given that the meaning of the choice bias's sign could be reversed by a simple sign flip in the model equation (i.e., equating to accepting more vs accepting fewer offers), it would be helpful to show some basic plots to illustrate the identified differences (e.g., plotting the % accepted for people in the upper and lower tertile for the SHAPS score etc).

      We apologise that this was not made clear previously: the meaning and directionality of “choice bias” is indeed central to our results. We also thank the Reviewer for pointing out that the previously used term “choice bias” itself might not be intuitive. We have now changed this to ‘motivational tendency’ (see below) as well as added substantial details on this parameter to the manuscript, including additional explanations and visualisations of the model as suggested by the Reviewer (new Figure 3) and model-agnostic results to aid interpretation (new Figure S3). Note the latter is complex due to our staircasing procedure (see new figure panel D further detailing our staircasing procedure in Figure 2). This shows that participants with more pronounced anhedonia are less likely to accept offers than those with low anhedonia (Fig. S3A), a model-agnostic version of our central result.

      Our changes are detailed below:

      After careful evaluation we have decided to term the parameter “motivational tendency”, hoping that this will present a more intuitive description of the parameter.

      To aid with the understanding and interpretation of the model parameters, and motivational tendency in particular, we have added the following explanation to the main text:

      Lines 149 – 155:

      “The models posit efforts and rewards are joined into a subjective value (SV), weighed by individual effort and reward sensitivity parameters. The subjective value is then integrated with an individual motivational tendency (𝛼) parameter to guide decision-making. Specifically, the motivational tendency parameter determines the range at which subjective values are translated to acceptance probabilities: the same subjective value will translate to a higher acceptance probability the higher the motivational tendency.”

      Further, we have included a new figure visualizing the model. This demonstrates how the different model parameters contribute to the model (A), and how different values of each parameter affect the model (B-D).

      We agree that plotting model agnostic effects in our data may help the reader gain intuition of what our task results mean. We hope to address this with our added section on “Model agnostic task measures relating to questionnaires”. We first followed the reviewer’s suggestion of extracting subsamples with higher and low anhedonia (as measured with the SHAPS, highest and lowest quantile) and plotted the acceptance proportion across effort and reward levels (panel A in figure below). However, due to our implemented task design, this only shows part of the picture: the staircasing procedure individualises which effort-reward combination a participant is presented with. Therefore, group differences in choice behaviour will lead to differences in the development of the staircases implemented in our task. Thus, we plotted the count of offered effort-reward combinations for the subsamples of participants with high vs. low SHAPS scores by the end of the task, averaged across staircases and participants.

      As the aspect of task development due to the implemented staircasing may not have been explained sufficiently in the main text, we have included panel (D) in figure 2.

      Further, we have added the following figure reference to the main text (lines 189 – 193):

      “The development of offered effort and reward levels across trials is shown in figure 2D; this shows that as participants generally tend to accept challenges rather than reject them, the implemented staircasing procedure develops toward higher effort and lower reward challenges.”

      To statistically test effects of model-agnostic task measures on the neuropsychiatric questionnaires, we performed Bayesian GLMs with the proportion of accepted trials predicted by SHAPS and AES. This is reported in the text as follows.

      Supplement, lines 172 – 189:

      “To explore the relationship between model agnostic task measures to questionnaire measures of neuropsychiatric symptoms, we conducted Bayesian GLMs, with the proportion of accepted trials predicted by SHAPS scores, controlling for age and gender. The proportion of accepted trials averaged across effort and reward levels was predicted by the Snaith-Hamilton Pleasure Scale (SHAPS) sum scores (M=-0.07; 95%HDI=[-0.12,-0.03]) and the Apathy Evaluation Scale (AES) sum scores (M=-0.05; 95%HDI=[-0.10,-0.002]). Note that this was not driven only by higher effort levels; even confining data to the lowest two effort levels, SHAPS has a predictive value for the proportion of accepted trials: M=-0.05; 95%HDI=[-0.07,-0.02].<br /> A visualisation of model agnostic task measures relating to symptoms is given in Fig. S4, comparing subgroups of participants scoring in the highest and lowest quartile on the SHAPS. This shows that participants with a high SHAPS score (i.e., more pronounced anhedonia) are less likely to accept offers than those with a low SHAPS score (Fig. S4A). Due to the implemented staircasing procedure, group differences can also be seen in the effort-reward combinations offered per trial. While for both groups, the staircasing procedure seems to devolve towards high effort – low reward offers, this is more pronounced in the subgroup of participants with a lower SHAPS score (Fig S4B).”

      (3) None of the key effects relate to effort or reward sensitivity which is somewhat surprising given the previous literature and also means that it is hard to know if choice bias results would be equally found in tasks without any effort component. (The only analysis related to effort sensitivity is exploratory and in a subsample of N=56 per group looking at people meeting criteria for MDD vs matched controls.) Were stimuli constructed such that effort and reward sensitivity could be separated (i.e., are uncorrelated/orthogonal)? Maybe it would be worth looking at the % accepted in the largest or two largest effort value bins in an exploratory analysis. It seems the lowest and 2nd lowest effort level generally lead to accepting the challenge pretty much all the time, so including those effort levels might not be sensitive to individual difference analyses?

      We too were initially surprised by the lack of effect of neuropsychiatric symptoms on reward and effort sensitivity. To address the Reviewer’s first comment, the nature of the ‘choice bias’ parameter (now motivational tendency) is its critical importance in the context of effort-based decision-making: it is not modelled or measured explicitly in tasks without effort (such as typical reward tasks), so it would be impossible to test this in tasks without an effort component. 

      For the Reviewer’s second comment, the exploratory MDD analysis is not our only one related to effort sensitivity: the effort sensitivity parameter is included in all of our central analyses, and (like reward sensitivity), does not relate to our measured neuropsychiatric symptoms (e.g., see page 15). Note most previous effort tasks do not include a ‘choice bias’/motivational tendency parameter, potentially explaining this discrepancy. However, our model was quantitatively superior to models without this parameter, for example with only effort- and reward-sensitivity (page 11, Fig. 3).

      Our three model parameters (reward sensitivity, effort sensitivity, and choice bias/motivational tendency) were indeed uncorrelated/orthogonal to one another (see parameter orthogonality analyses below), making it unlikely that the variance and effect captured by our motivational tendency parameter (previously termed “choice bias”) should really be attributed to reward sensitivity. As per the Reviewer’s suggestion, we also examined whether the lowest two effort levels might not be sensitive to individual differences; in fact, we found that the proportion of accepted trials on the lowest effort levels alone was nevertheless predicted by anhedonia (see ceiling effect analyses below).

      Specifically, in terms of parameter orthogonality:

      When developing our task design and computational modelling approach we were careful to ensure that meaningful neurocomputational parameters could be estimated and that no spurious correlations between parameters would be introduced by modelling. By conducting parameter recoveries for all models, we showed that our modelling approach could reliably estimate parameters, and that estimated parameters are orthogonal to the other underlying parameters (as can be seen in Figure S1 in the supplement). It is thus unlikely that the variance and effect captured by our motivational tendency parameter (previously termed “choice bias”) should really be attributed to reward sensitivity.

      And finally, regarding the possibility of a ceiling effect for low effort levels:

      We agree that visual inspection of the proportion of accepted trials across effort and reward values can lead to the belief that a ceiling effect prevents the two lowest effort levels from capturing any inter-individual differences. To test whether this is the case, we ran a Bayesian GLM with the SHAPS sum score predicting the proportion of accepted trials (controlling for age and gender), in a subset of the data including only trials with an effort level of 1 or 2. We found the SHAPS has a predictive value for the proportion of accepted trials in the lowest two effort levels: M=-0.05; 95%HDI=[-0.07,-0.02]. This is noted in the text as follows.

      Supplement, lines 175 – 180:

      “The proportion of accepted trials averaged across effort and reward levels was predicted by the Snaith-Hamilton Pleasure Scale (SHAPS) sum scores (M=-0.07; 95%HDI=[-0.12,-0.03]) and the Apathy Evaluation Scale (AES) sum scores (M=-0.05; 95%HDI=[-0.10,-0.002]). Note that this was not driven only by higher effort levels; even confining data to the lowest two effort levels, SHAPS has a predictive value for the proportion of accepted trials: M=-0.05; 95%HDI=[-0.07,-0.02].”

      (4) The abstract and discussion seem overstated (implications for the school system and statements on circadian rhythms which were not measured here). They should be toned down to reflect conclusions supported by the data.

      We thank the Reviewer for pointing this out, and have now removed these claims from the abstract and Discussion; we hope they now better reflect conclusions supported by these data directly.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Suggestions for improved or additional experiments, data or analyses.

      - For a non-computational audience, it would be useful to unpack the influence of the choice bias on behavior, as it is less clear how this would affect decision-making than sensitivity to effort or reward. Perhaps a figure showing accept/reject decisions when sensitivities are held and choice bias is high would be beneficial.

      We thank the Reviewer for suggesting additional explanations of the choice bias parameter to aid interpretation for non-computational readers; as per the Reviewer’s suggestion, we have now included additional explanations and visualisations (Figure 3) to make this as clear as possible. Please note also that, in response to one of the other Reviewers and after careful considerations, we have decided to rename the “choice bias” parameter to “motivational tendency”, hoping this will prove more intuitive.

      To aid with the understanding and interpretation of this and the other model parameters, we have added the following explanation to the main text.

      Lines 149 – 155:

      “The models posit efforts and rewards are joined into a subjective value (SV), weighed by individual effort and reward sensitivity parameters. The subjective value is then integrated with an individual motivational tendency (𝛼) parameter to guide decision-making. Specifically, the motivational tendency parameter determines the range at which subjective values are translated to acceptance probabilities: the same subjective value will translate to a higher acceptance probability the higher the motivational tendency.”

      Additionally, we add the following explanation to the Methods section.

      Lines 698 – 709:

      First, a cost function transforms costs and rewards associated with an action into a subjective value (SV):

      with and for reward and effort sensitivity, and ℛ and 𝐸 for reward and effort. Higher effort and reward sensitivity mean the SV is more strongly influenced by changes in effort and reward, respectively (Fig. 3B-C). Hence, low effort and reward sensitivity mean the SV, and with that decision-making, is less guided by effort and reward offers, as would be in random decision-making.

      This SV is then transformed to an acceptance probability by a softmax function:

      with for the predicted acceptance probability and 𝛼 for the intercept representing motivational tendency. A high motivational tendency means a subject has a tendency, or bias, to accept rather than reject offers (Fig. 3D).
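
      As a rough illustration of the two steps described above, the following Python sketch maps an offer to an acceptance probability. The equations themselves are not reproduced in this response, so the linear form of the cost function is an assumption made purely for illustration, and the names `beta_reward`, `beta_effort`, and `alpha` are hypothetical stand-ins for the reward sensitivity, effort sensitivity, and motivational tendency parameters.

```python
import math


def subjective_value(reward, effort, beta_reward, beta_effort):
    """Assumed linear cost function: reward raises SV, effort lowers it."""
    return beta_reward * reward - beta_effort * effort


def p_accept(reward, effort, beta_reward, beta_effort, alpha):
    """Two-option softmax (logistic) from SV to acceptance probability.

    alpha is the intercept, i.e. the motivational tendency.
    """
    sv = subjective_value(reward, effort, beta_reward, beta_effort)
    return 1.0 / (1.0 + math.exp(-(alpha + sv)))


# Same offer and same sensitivities, different motivational tendency.
low = p_accept(3, 2, 1.0, 1.0, alpha=-1.0)
high = p_accept(3, 2, 1.0, 1.0, alpha=1.0)
print(f"p(accept) with low alpha:  {low:.2f}")
print(f"p(accept) with high alpha: {high:.2f}")
```

      Under these assumptions, a higher intercept raises the acceptance probability for the same subjective value, and with both sensitivities at zero the acceptance probability no longer depends on the offer at all, only on the intercept.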

      Our new figure (panels A-D in figure 3) visualizes the model. This demonstrates how the different model parameters come into play in the model (A), and how different values of each parameter affect the model (B-D).

      - The early and late chronotype groups have significant differences in ages and gender. Additional supplementary analysis here may mitigate any concerns from readers.

      The Reviewer is right to notice that our subsamples of early and late chronotypes differ significantly in age and gender, but it is important to note that all our analyses comparing these two groups take this into account, statistically controlling for age and gender. We regret that this was previously only mentioned in the Methods section, so this information was not accessible where most relevant. To remedy this, we have amended the Results section as follows.

      Lines 317 – 323:

      “Bayesian GLMs, controlling for age and gender, predicting task parameters by time-of-day and chronotype showed effects of chronotype on reward sensitivity (i.e. those with a late chronotype had a higher reward sensitivity; M= 0.325, 95% HDI=[0.19,0.46]) and motivational tendency (higher in early chronotypes; M=-0.248, 95% HDI=[-0.37,-0.11]), as well as an interaction between chronotype and time-of-day on motivational tendency (M=0.309, 95% HDI=[0.15,0.48]).”

      (2) Recommendations for improving the writing and presentation.

      - I found the term 'overlapping' a little jarring. I think the authors use it to mean both neuropsychiatric symptoms and chronotypes affect task parameters, but they are not tested to be 'separable', nor is an interaction tested. Perhaps being upfront about how interactions are not being tested here (in the introduction, and not waiting until the discussion) would give an opportunity to operationalize this term.

      We agree with the Reviewer that our previously-used term “overlapping” was not ideal: it may have been misleading, and was not necessarily reflective of the nature of our findings. We now state explicitly that we are not testing an interaction between neuropsychiatric symptoms and chronotypes in our primary analyses. Additionally, following suggestions made by Reviewer 3, we ran new exploratory analyses to investigate how the effects of neuropsychiatric symptoms and circadian measures on motivational tendency relate to one another. These results in fact show that all three symptom measures have separable effects from circadian measures on motivational tendency. This supports the Reviewer’s view that ‘overlapping’ was entirely the wrong word—although it nevertheless shows the important contribution of circadian rhythm as well as neuropsychiatric symptoms in effort-based decision-making. We have changed the manuscript throughout to better describe this important, more accurate interpretation of our findings, including replacing the term “overlapping”. We changed the title from “Overlapping effects of neuropsychiatric symptoms and circadian rhythm on effort-based decision-making” to “Both neuropsychiatric symptoms and circadian rhythm alter effort-based decision-making”.

      To clarify the intention of our primary analyses, we have added the following to the last paragraph of the introduction.

      Lines 107 – 112:

      “Next, we pre-registered a follow-up experiment to directly investigate how circadian preference interacts with time-of-day on motivational decision-making, using the same task and computational modelling approach. While this allows us to test how circadian effects on motivational decision-making compare to neuropsychiatric effects, we do not test for possible interactions between neuropsychiatric symptoms and chronobiology.”

      We detail our new analyses in the Methods section as follows.

      Lines 800 – 814:

      “4.5.2 Differentiating between the effects of neuropsychiatric symptoms and circadian measures on motivational tendency

      To investigate how the effects of neuropsychiatric symptoms on motivational tendency (2.3.1) relate to effects of chronotype and time-of-day on motivational tendency we conducted exploratory analyses. In the subsamples of participants with an early or late chronotype (including additionally collected data), we first ran Bayesian GLMs with neuropsychiatric questionnaire scores (SHAPS, DARS, AES respectively) predicting motivational tendency, controlling for age and gender. We next added an interaction term of chronotype and time-of-day into the GLMs, testing how this changes previously observed neuropsychiatric and circadian effects on motivational tendency. Finally, we conducted a model comparison using LOO, comparing between motivational tendency predicted by a neuropsychiatric questionnaire, motivational tendency predicted by chronotype and time-of-day, and motivational tendency predicted by a neuropsychiatric questionnaire and time-of-day (for each neuropsychiatric questionnaire, and controlling for age and gender).”

      Results of the outlined analyses are reported in the Results section as follows.

      Lines 356 – 383:

      “2.5.2.1 Neuropsychiatric symptoms and circadian measures have separable effects on motivational tendency

      Exploratory analyses testing for the effects of neuropsychiatric questionnaires on motivational tendency in the subsamples of early and late chronotypes confirmed the predictive value of the SHAPS (M=-0.24, 95% HDI=[-0.42,-0.06]), the DARS (M=-0.16, 95% HDI=[-0.31,-0.01]), and the AES (M=-0.18, 95% HDI=[-0.32,-0.02]) on motivational tendency.

      For the SHAPS, we find that when adding the measures of chronotype and time-of-day back into the GLMs, the main effect of the SHAPS (M=-0.26, 95% HDI=[-0.43,-0.07]), the main effect of chronotype (M=-0.11, 95% HDI=[-0.22,-0.01]), and the interaction effect of chronotype and time-of-day (M=0.20, 95% HDI=[0.07,0.34]) on motivational tendency remain. Model comparison by LOOIC reveals motivational tendency is best predicted by the model including the SHAPS, chronotype and time-of-day as predictors, followed by the model including only the SHAPS. Note that this approach to model comparison penalizes models for increasing complexity.

      Repeating these steps with the DARS, the main effect of the DARS is found numerically, but the 95% HDI just includes 0 (M=-0.15, 95% HDI=[-0.30,0.002]). The main effect of chronotype (M=-0.11, 95% HDI=[-0.21,-0.01]), and the interaction effect of chronotype and time-of-day (M=0.18, 95% HDI=[0.05,0.33]) on motivational tendency remain. Model comparison identifies the model including the DARS and circadian measures as the best model, followed by the model including only the DARS.

      For the AES, the main effect of the AES is found (M=-0.19, 95% HDI=[-0.35,-0.04]). For the main effect of chronotype, the 95% HDI narrowly includes 0 (M=-0.10, 95% HDI=[-0.21,0.002]), while the interaction effect of chronotype and time-of-day (M=0.20, 95% HDI=[0.07,0.34]) on motivational tendency remains. Model comparison identifies the model including the AES and circadian measures as the best model, followed by the model including only the AES.”

      In addition to the title change, we edited our Discussion to discuss and reflect these new insights, including the following.

      Lines 399 – 402:

      “Various neuropsychiatric disorders are marked by disruptions in circadian rhythm, such as a late chronotype. However, research has rarely investigated how transdiagnostic mechanisms underlying neuropsychiatric conditions may relate to inter-individual differences in circadian rhythm.”

      Lines 475 – 480:

      “It is striking that the effects of neuropsychiatric symptoms on effort-based decision-making largely are paralleled by circadian effects on the same neurocomputational parameter. Exploratory analyses predicting motivational tendency by neuropsychiatric symptoms and circadian measures simultaneously indicate the effects go beyond recapitulating each other, but rather explain separable parts of the variance in motivational tendency.”

      Lines 528 – 532:

      “Our reported analyses investigating neuropsychiatric and circadian effects on effort-based decision-making simultaneously are exploratory, as our study design was not ideally set out to examine this. Further work is needed to disentangle separable effects of neuropsychiatric and circadian measures on effort-based decision-making.”

      Lines 543 – 550:

      “We demonstrate that neuropsychiatric effects on effort-based decision-making are paralleled by effects of circadian rhythm and time-of-day. Exploratory analyses suggest these effects account for separable parts of the variance in effort-based decision-making. It is unlikely that the neuropsychiatric effects on effort-based decision-making reported here and in previous literature are a spurious result due to multicollinearity with chronotype. Yet, not accounting for chronotype and time of testing, which is the predominant practice in the field, could affect results.”

      - A minor point, but it could be made clearer that many neurotransmitters have circadian rhythms (and not just dopamine).

      We agree this should have been made clearer, and have added the following to the Introduction.

      Lines 83 – 84:

      “Bi-directional links between chronobiology and several neurotransmitter systems have been reported, including dopamine47.

      (47) Kiehn, J.-T., Faltraco, F., Palm, D., Thome, J. & Oster, H. Circadian Clocks in the Regulation of Neurotransmitter Systems. Pharmacopsychiatry 56, 108–117 (2023).”

      - Making reference to other studies which have explored circadian rhythms in cognitive tasks would allow interested readers to explore the broader field. One such paper is: Bedder, R. L., Vaghi, M. M., Dolan, R. J., & Rutledge, R. B. (2023). Risk taking for potential losses but not gains increases with time of day. Scientific reports, 13(1), 5534, which also includes references to other similar studies in the discussion.

      We thank the Reviewer for pointing out that we failed to cite this relevant work. We have now included it in the Introduction as follows.

      Lines 97 – 98:

      “A circadian effect on decision-making under risk is reported, with the sensitivity to losses decreasing with time-of-day66.

      (66) Bedder, R. L., Vaghi, M. M., Dolan, R. J. & Rutledge, R. B. Risk taking for potential losses but not gains increases with time of day. Sci Rep 13, 5534 (2023).”

      (3) Minor corrections to the text and figures.

      None, clearly written and structured. Figures are high quality and significantly aid understanding.

      Reviewer #2 (Recommendations For The Authors):

      I did have a few more minor comments:

      - The manuscript doesn't clarify whether trials had time limits - so that participants might fail to earn points - or instead they did not and participants had to continue exerting effort until they were done. This is important to know since it impacts on decision-strategies and behavioral outcomes that might be analyzed. For example, if there is no time limit, it might be useful to examine the amount of time it took participants to complete their effort - and whether that had any relationship to choice patterns or symptomatology. Or, if they did, it might be interesting to test whether the relationship between choices and exerted effort depended on symptoms. For example, someone with depression might be less willing to choose effort, but just as, if not more likely to successfully complete a trial once it is selected.

      We thank the Reviewer for pointing out this important detail in the task design, which we should have made clearer. The trials did indeed have a time limit which was dependent on the effort level. To clarify this in the manuscript, we have made changes to Figure 2 and the Methods section. We agree it would be interesting to explore whether the exerted effort in the task related to symptoms. We explored this in our data by predicting the participant average proportion of accepted but failed trials by SHAPS score (controlling for age and gender). We found no relationship: M=0.01, 95% HDI=[-0.001,0.02]. However, it should be noted that the measure of proportion of failed trials may not be suitable here, as there are only a few accepted but failed trials (M = 1.3% trials failed, SD = 3.50). This results from several task design characteristics aimed at preventing subjects from failing accepted trials, to avoid confounding of effort discounting with risk discounting. As an alternative measure, we explored the extent to which participants went “above and beyond” the target in accepted trials. Specifically, considering only accepted and succeeded trials, we computed the factor by which the required number of clicks was exceeded (i.e., if a subject clicked 15 times when 10 clicks were required the factor would be 1.5), averaging across effort and reward level. We then conducted a Bayesian GLM to test whether this subject-wise click-exceedance measure can be predicted by apathy or anhedonia, controlling for age and gender. We found neither the SHAPS (M=-0.14, 95% HDI=[-0.43,0.17]) nor the AES (M=0.07, 95% HDI=[-0.26,0.41]) had a predictive value for the extent to which subjects exert “extra effort”. We have now added this to the manuscript.

      In Figure 2, which explains the task design in the results section, we have added the following to the figure description.

      Lines 161 – 165:

      “Each trial consists of an offer with a reward (2,3,4, or 5 points) and an effort level (1,2,3, or 4, scaled to the required clicking speed and time the clicking must be sustained for) that subjects accept or reject. If accepted, a challenge at the respective effort level must be fulfilled for the required time to win the points.”

      In the Methods section, we have added the following.

      Lines 617 – 622:

      “We used four effort-levels, corresponding to a clicking speed at 30% of a participant’s maximal capacity for 8 seconds (level 1), 50% for 11 seconds (level 2), 70% for 14 seconds (level 3), and 90% for 17 seconds (level 4). Therefore, in each trial, participants had to fulfil a certain number of mouse clicks (dependent on their capacity and the effort level) in a specific time (dependent on the effort level).”
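
      To make the mapping from effort level to click target concrete, here is a minimal Python sketch. The percentages and durations are taken from the text above; expressing a participant's maximal capacity as clicks per second (`max_clicks_per_second`) and rounding the target up to a whole click are assumptions made for illustration, and the function name is hypothetical.

```python
import math

# Effort level -> (percent of maximal clicking capacity, duration in seconds),
# as listed in the Methods text.
EFFORT_LEVELS = {1: (30, 8), 2: (50, 11), 3: (70, 14), 4: (90, 17)}


def required_clicks(max_clicks_per_second, effort_level):
    """Clicks needed to complete a challenge at the given effort level.

    Assumes capacity is measured in clicks per second and that the
    target is rounded up to a whole click.
    """
    percent, duration = EFFORT_LEVELS[effort_level]
    return math.ceil(percent * max_clicks_per_second * duration / 100)


for level in EFFORT_LEVELS:
    print(level, required_clicks(10, level))
```

      Because both the required speed and the duration grow with effort level, the click target rises steeply from level 1 to level 4 for the same participant.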

      In the Supplement, we have added the additional analyses suggested by the Reviewer.

      Lines 195 – 213:

      “3.2 Proportion of accepted but failed trials

      For each participant, we computed the proportion of trials in which an offer was accepted, but the required effort then not fulfilled (i.e., failed trials). There was no relationship between average proportion of accepted but failed trials and SHAPS score (controlling for age and gender): M=0.01, 95% HDI=[-0.001,0.02]. However, there are intentionally few accepted but failed trials (M = 1.3% trials failed, SD = 3.50). This results from several task design characteristics aimed at preventing subjects from failing accepted trials, to avoid confounding of effort discounting with risk discounting.”

      “3.3 Exertion of “extra effort”

      We also explored the extent to which participants went “above and beyond” the target in accepted trials. Specifically, considering only accepted and succeeded trials, we computed the factor by which the required number of clicks was exceeded (i.e., if a subject clicked 15 times when 10 clicks were required the factor would be 1.5), averaging across effort and reward level. We then conducted a Bayesian GLM to test whether this subject-wise click-exceedance measure can be predicted by apathy or anhedonia, controlling for age and gender. We found neither the SHAPS (M=-0.14, 95% HDI=[-0.43,0.17]) nor the AES (M=0.07, 95% HDI=[-0.26,0.41]) had a predictive value for the extent to which subjects exert “extra effort”.”

      - Perhaps relatedly, there is evidence that people with depression show less of an optimism bias in their predictions about future outcomes. As such, they show more "rational" choices in probabilistic decision tasks. I'm curious whether the Authors think that a weaker choice bias among those with stronger depression/anhedonia/apathy might be related. Also, are choices better matched with actual effort production among those with depression?

      We think this is a very interesting comment, but unfortunately feel our manuscript cannot properly speak to it: as in our response to the previous comment, our exploratory analysis linking the proportion of accepted but failed trials to anhedonia symptoms (i.e. less anhedonic people making more optimistic judgments of their likelihood of success) did not show a relationship between the two. However, this null finding may be the result of our task design, which is not laid out to capture such an effect (and in fact minimizes trials of this nature). We have added the following to the Discussion section.

      Lines 442 – 445:

      “It is possible that a higher motivational tendency reflects a more optimistic assessment of future task success, in line with work on the optimism bias95; however our task intentionally minimized unsuccessful trials by titrating effort and reward; future studies should explore this more directly.

      (95) Korn, C. W., Sharot, T., Walter, H., Heekeren, H. R. & Dolan, R. J. Depression is related to an absence of optimistically biased belief updating about future life events. Psychological Medicine 44, 579–592 (2014).”

      - The manuscript does not clarify: How did the Authors ensure that each subject received each effort-reward combination at least once if a given subject always accepted or always rejected offers?

      We have made the following edit to the Methods section to better explain this aspect of our task design.

      Lines 642 – 655:

      “For each subject, trial-by-trial presentation of effort-reward combinations was made semi-adaptively by 16 randomly interleaved staircases. Each of the 16 possible offers (4 effort-levels x 4 reward-levels) served as the starting point of one of the 16 staircases. Within each staircase, after a subject accepted a challenge, the next trial’s offer on that staircase was adjusted (by increasing effort or decreasing reward). After a subject rejected a challenge, the next offer on that staircase was adjusted by decreasing effort or increasing reward. This ensured subjects received each effort-reward combination at least once (as each participant completed all 16 staircases), while individualizing trial presentation to maximize the trials’ informative value. Therefore, in practice, even in the case of a subject rejecting all offers (and hence the staircasing procedures always adapting by decreasing effort or increasing reward), the full range of effort-reward combinations will be represented in the task across the starting points of all staircases (and therefore before adaptation takes place).”
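
      The staircasing logic described above can be sketched in Python as follows. This is a simplified illustration rather than the task code: choosing at random whether effort or reward is adjusted, clipping values at the ends of the offered range, and the number of trials per staircase (`trials_per_staircase`) are all assumptions, and the function names are hypothetical.

```python
import random

EFFORTS = [1, 2, 3, 4]
REWARDS = [2, 3, 4, 5]


def next_offer(effort, reward, accepted, rng):
    """Adjust one staircase after a choice.

    Which dimension changes is picked at random (an assumption);
    values are clipped to the offered range.
    """
    change_effort = rng.random() < 0.5
    if accepted:  # make the next offer less attractive
        if change_effort:
            effort = min(effort + 1, EFFORTS[-1])
        else:
            reward = max(reward - 1, REWARDS[0])
    else:  # make the next offer more attractive
        if change_effort:
            effort = max(effort - 1, EFFORTS[0])
        else:
            reward = min(reward + 1, REWARDS[-1])
    return effort, reward


def run_task(choose, trials_per_staircase=4, seed=0):
    """Return the sequence of offers shown to one simulated subject."""
    rng = random.Random(seed)
    # One staircase per effort-reward combination (16 starting points).
    staircases = [(e, r) for e in EFFORTS for r in REWARDS]
    schedule = [i for i in range(len(staircases))
                for _ in range(trials_per_staircase)]
    rng.shuffle(schedule)  # random interleaving of the staircases
    offered = []
    for i in schedule:
        effort, reward = staircases[i]
        offered.append((effort, reward))
        accepted = choose(effort, reward)
        staircases[i] = next_offer(effort, reward, accepted, rng)
    return offered


# A subject who rejects everything still sees all 16 combinations,
# because each staircase presents its starting point before adapting.
offers = run_task(lambda effort, reward: False)
print(len(set(offers)) == 16)
```

      The first visit to each staircase always presents its starting point, which is why full coverage of the 16 combinations does not depend on the subject's choices.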

      - The word "metabolic" is misspelled in Table 1

      - Figure 2 is missing panel label "C"

      - The word "effort" is repeated on line 448.

      We thank the Reviewer for their attentive reading of our manuscript and have corrected the mistakes mentioned.

      Reviewer #3 (Recommendations For The Authors):

      It is a bit difficult to get a sense of people's discounting from the plots provided. Could the authors show a few example individuals and their fits (i.e., how steep was effort discounting on average and how much variance was there across individuals; maybe they could show the mean discount function or some examples etc)

      We appreciate very much the Reviewer's suggestion to visualise our parameter estimates within and across individuals. We have implemented this in Figure S2.

      It would be helpful if correlations between the various markers used as dependent variables (SHAPS, DARS, AES, chronotype etc) could plotted as part of each related figure (e.g., next to the relevant effects shown).

      We agree with the Reviewer that a visual representation of the various correlations between dependent variables would be a better and more accessible communication than our current paragraph listing the correlations. We have implemented this by adding a new figure plotting all correlations in a heat map, with asterisks indicating significance.

      The authors use the term "meaningful relationship" - how is this defined? If undefined, maybe consider changing (do they mean significant?)

      We understand how our use of the term “(no) meaningful relationship” was confusing here. As we conducted most analyses in a Bayesian fashion, this is a formal definition of ‘meaningful’: the 95% highest density interval does not span across 0. However, we do not want this to be misunderstood as frequentist “significance” and agree clarity can be improved here. To avoid confusion, we have amended the manuscript where relevant (i.e., we now state “we found a (/no) relationship / effect” rather than “we found a meaningful relationship”).
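
      To make the criterion concrete, a minimal sketch of a sample-based 95% highest density interval and the "does not span 0" check is given here. It assumes a unimodal posterior represented by MCMC draws; the function names are hypothetical and this is not the analysis code used in the manuscript.

```python
import numpy as np


def hdi(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the posterior samples."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    k = int(np.ceil(mass * n))            # number of samples inside the interval
    widths = x[k - 1:] - x[: n - k + 1]   # width of every candidate interval
    i = int(np.argmin(widths))            # index of the narrowest candidate
    return x[i], x[i + k - 1]


def excludes_zero(samples, mass=0.95):
    """True if the HDI lies entirely above or entirely below zero."""
    lo, hi = hdi(samples, mass)
    return lo > 0 or hi < 0
```

      Under this definition, an interval such as [-0.17,-0.04] counts as evidence for an effect, whereas [-0.19,0.01] does not, however narrowly it includes zero.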

      The authors do not include an inverse temperature parameter in their discounting models-can they motivate why? If a participant chose nearly randomly, which set of parameter values would they get assigned?

      Our decision to not include an inverse temperature parameter was made after an extensive simulation-based investigation of different models and task designs. A series of parameter recovery studies including models with an inverse temperature parameter revealed the inverse temperature parameter could not be distinguished from the reward sensitivity parameter. Specifically, inverse temperature seemed to capture the variance of the true underlying reward sensitivity parameter, leading to confounding between the two. Hence, including both reward sensitivity and inverse temperature would not have allowed us to reliably estimate either parameter. As our pre-registered hypotheses related to the reward sensitivity parameter, we opted to include models with the reward sensitivity parameter rather than the inverse temperature parameter in our model space. We have now added these simulations to our supplement.

      Nevertheless, we believe our models can capture random decision-making. The parameters of effort and reward sensitivity capture how sensitive one is to changes in effort/reward level. Hence, random decision-making can be interpreted as low effort and reward sensitivity, such that one’s decision-making is not guided by changes in effort and reward magnitude. With low effort/reward sensitivity, the motivational tendency parameter (previously “choice bias”) would capture to what extent this random decision-making is biased toward accepting or rejecting offers.

      The simulation results are now detailed in the Supplement.

      Lines 25 – 46:

      “1.2.1 Parameter recoveries including inverse temperature

      In the process of task and model space development, we also considered models incorporating an inverse temperature parameter. To this end, we conducted parameter recoveries for four models, defined in Table S3.

      Parameter recoveries indicated that parameters can be recovered reliably in model 1, which includes only effort sensitivity ( ) and inverse temperature as free parameters (on-diagonal correlations: .98 > r > .89, off-diagonal correlations: .04 > |r| > .004). However, as a reward sensitivity parameter is added to the model (model 2), parameter recovery seems to be compromised, as parameters are estimated less accurately (on-diagonal correlations: .80 > r > .68), and spurious correlations between parameters emerge (off-diagonal correlations: .40 > |r| > .17). This issue remains when motivational tendency is added to the model (model 4; on-diagonal correlations: .90 > r > .65; off-diagonal correlations: .28 > |r| > .03), but not when inverse temperature is modelled with effort sensitivity and motivational tendency, but not reward sensitivity (model 3; on-diagonal correlations: .96 > r > .73; off-diagonal correlations: .05 > |r| > .003).

      As our pre-registered hypotheses related to the reward sensitivity parameter, we opted to include models with the reward sensitivity parameter rather than the inverse temperature parameter in our model space.”
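
      For readers unfamiliar with the on-diagonal and off-diagonal correlations reported in this passage, a minimal sketch of how such a recovery matrix can be computed follows. It assumes the simulated ("true") and re-estimated parameters are available as arrays with one row per simulated subject; the function name is hypothetical and the simulation and model-fitting steps are not shown.

```python
import numpy as np


def recovery_correlations(true_params, recovered_params):
    """Correlate each simulated ("true") parameter with each recovered one.

    Both inputs have shape (n_simulated_subjects, n_parameters).
    Entry [i, j] is the Pearson r between true parameter i and recovered
    parameter j: the diagonal measures recovery accuracy, the off-diagonal
    entries measure trade-offs (confounding) between parameters.
    """
    true_params = np.asarray(true_params, dtype=float)
    recovered_params = np.asarray(recovered_params, dtype=float)
    p = true_params.shape[1]
    # corrcoef stacks both sets of columns into one (2p x 2p) matrix;
    # the upper-right block holds the true-vs-recovered correlations.
    full = np.corrcoef(true_params, recovered_params, rowvar=False)
    return full[:p, p:]
```

      Good recovery shows up as diagonal values near 1 with off-diagonal values near 0; large off-diagonal values are what indicates that two parameters cannot be distinguished.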

      And we now discuss random decision-making specifically in the Methods section.

      Lines 698 – 709:

      “First, a cost function transforms costs and rewards associated with an action into a subjective value (SV):

      with and for reward and effort sensitivity, and  and  for reward and effort. Higher effort and reward sensitivity mean the SV is more strongly influenced by changes in effort and reward, respectively (Fig. 3B-C). Hence, low effort and reward sensitivity mean the SV, and with that decision-making, is less guided by effort and reward offers, as would be in random decision-making.

      This SV is then transformed to an acceptance probability by a softmax function:

      with for the predicted acceptance probability and  for the intercept representing motivational tendency. A high motivational tendency means a subject has a tendency, or bias, to accept rather than reject offers (Fig. 3D).”
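
      The equations are not reproduced in this response, so the sketch below only illustrates the two steps described. It assumes the parabolic cost function of the winning model (reward sensitivity times reward, minus effort sensitivity times squared effort) and a logistic choice rule with the motivational tendency added as an intercept. The exact functional forms and the names `beta_r`, `beta_e` and `alpha` are assumptions for illustration.

```python
import math


def subjective_value(reward, effort, beta_r, beta_e):
    """Parabolic cost function (assumed form): SV = beta_r * R - beta_e * E**2."""
    return beta_r * reward - beta_e * effort ** 2


def p_accept(reward, effort, beta_r, beta_e, alpha):
    """Logistic (two-option softmax) mapping from SV to acceptance probability.

    alpha is the intercept (motivational tendency); adding it to SV inside
    the logistic function is an assumption of this sketch.
    """
    sv = subjective_value(reward, effort, beta_r, beta_e)
    return 1.0 / (1.0 + math.exp(-(alpha + sv)))
```

      With both sensitivities at zero, the acceptance probability no longer depends on the offer and is set by the intercept alone, which is the sense in which random but possibly biased responding can be captured without an inverse temperature parameter.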

      The pre-registration mentions effects of BMI and risk of metabolic disease-those are briefly reported the in factor loadings, but not discussed afterwards-although the authors stated hypotheses regarding these measures in their preregistration. Were those hypotheses supported?

      We reported these results (albeit only briefly) in the factor loadings resulting from our PLS regression and results from follow-up GLMs (see below). We have now amended the Discussion to enable further elaboration on whether they confirmed our hypotheses (this evidence was unclear, but we have subsequently followed up in a sample with type-2 diabetes, who also show reduced motivational tendency).

      Lines 258 – 261:

      “For the MEQ (95%HDI=[-0.09,0.06]), MCTQ (95%HDI=[-0.17,0.05]), BMI (95%HDI=[-0.19,0.01]), and FINDRISC (95%HDI=[-0.09,0.03]) no relationship with motivational tendency was found, consistent with the smaller magnitude of reported component loadings from the PLS regression.”

      We have added the following paragraph to our discussion.

      Lines 491 – 502:

      “To our surprise, we did not find statistical evidence for a relationship between effort-based decision-making and measures of metabolic health (BMI and risk for type-2 diabetes). Our analyses linking BMI to motivational tendency reveal a numeric effect in line with our hypothesis: a higher BMI relating to a lower motivational tendency. However, the 95% HDI for this effect narrowly included zero (95%HDI=[-0.19,0.01]). Possibly, our sample did not have sufficient variance in metabolic health to detect dimensional metabolic effects in a current general population sample. A recent study by our group investigates the same neurocomputational parameters of effort-based decision-making in participants with type-2 diabetes and non-diabetic controls matched by age, gender, and physical activity105. We report a group effect on the motivational tendency parameter, with type-2 diabetic patients showing a lower tendency to exert effort for reward.”

      “(105) Mehrhof, S. Z., Fleming, H. A. & Nord, C. A cognitive signature of metabolic health in effort-based decision-making. Preprint at https://doi.org/10.31234/osf.io/4bkm9 (2024).”

      R-values are indicated as a range (e.g., from 0.07-0.72 for the last one in 2.1 which is a large range). As mentioned above, the full correlation matrix should be reported in figures as heatmaps.

      We agree with the Reviewer that a heatmap is a better way of conveying this information – see Figure 1 in response to their previous comment.  

      The answer on whether data was already collected is missing on the second preregistration link. Maybe this is worth commenting on somewhere in the manuscript.

      This question appears missing because, as detailed in the manuscript, we felt that technically some data *was* already collected by the time our second pre-registration was posted. This is because the second pre-registration detailed an additional data collection, with the goal of extending data from the original dataset to include extreme chronotypes and increase precision of analyses. To avoid any confusion regarding the lack of reply to this question in the pre-registration, we have added the following disclaimer to the description of the second pre-registration:

      “Please note the lack of response to the question regarding already collected data. This is because the data collection in the current pre-registration extends data from the original dataset to increase the precision of analyses. While this original data is already collected, none of the data collection described here has taken place.”

      Some referencing is not reflective of the current state of the field (e.g., for effort discounting: Sugiwaka et al., 2004 is cited). There are multiple labs that have published on this since then including Philippe Tobler's and Sven Bestmann's groups (e.g., Hartmann et al., 2013; Klein-Flügge et al., Plos CB, 2015).

      We agree absolutely, and have added additional, more recent references on effort discounting.

      Lines 67 – 68:

      “Higher costs devalue associated rewards, an effect referred to as effort-discounting33–37.”

      (33) Sugiwaka, H. & Okouchi, H. Reformative self-control and discounting of reward value by delay or effort1. Japanese Psychological Research 46, 1–9 (2004).

      (34) Hartmann, M. N., Hager, O. M., Tobler, P. N. & Kaiser, S. Parabolic discounting of monetary rewards by physical effort. Behavioural Processes 100, 192–196 (2013).

      (35) Klein-Flügge, M. C., Kennerley, S. W., Saraiva, A. C., Penny, W. D. & Bestmann, S. Behavioral Modeling of Human Choices Reveals Dissociable Effects of Physical Effort and Temporal Delay on Reward Devaluation. PLOS Computational Biology 11, e1004116 (2015).

      (36) Białaszek, W., Marcowski, P. & Ostaszewski, P. Physical and cognitive effort discounting across different reward magnitudes: Tests of discounting models. PLOS ONE 12, e0182353 (2017).

      (37) Ostaszewski, P., Bąbel, P. & Swebodziński, B. Physical and cognitive effort discounting of hypothetical monetary rewards. Japanese Psychological Research 55, 329–337 (2013).

      There are lots of typos throughout (e.g., Supplementary martial, Mornignness etc)

      We thank the Reviewer for their attentive reading of our manuscript and have corrected our mistakes.

      In Table 1, it is not clear what the numbers given in parentheses are. The figure note mentions SD, IQR, and those are explicitly specified for some rows, but not all.

      After reviewing Table 1 we understand the comment regarding the clarity of the number in parentheses. In our original manuscript, for some variables, numbers were given per category (e.g. for gender and ethnicity), rather than per row, in which case the parenthetical statistic was indicated in the header row only. However, we now see that the clarity of the table would have been improved by adding the reported statistic for each row—we have corrected this.

      In Figure 1C, it would be much more helpful if the different panels were combined into one single panel (using differently coloured dots/lines instead of bars).

      We agree visualizing the proportion of accepted trials across effort and reward levels in one single panel aids interpretability. We have implemented it in the following plot (now Figure 2C).

      In Sections 2.2.1 and 4.2.1, the authors mention "mixed-effects analysis of variance (ANOVA) of repeated measures" (same in the preregistration). It is not clear if this is a standard RM-ANOVA (aggregating data per participant per condition) or a mixed-effects model (analysing data on a trial-by-trial level). This model seems to only include within-subjects variable, so it isn't a "mixed ANOVA" mixing within and between subjects effects.

      We apologise: the term "mixed-effects analysis of variance (ANOVA) of repeated measures" was indeed incorrectly applied here. We aggregate data per participant and effort-by-reward combination, meaning there are no between-subject effects tested. We have corrected this to “repeated measures ANOVA”.

      In Section 2.2.2, the authors write "R-hats>1.002" but probably mean "R-hats < 1.002". ESS is hard to evaluate unless the total number of samples is given.

      We thank the Reviewer for noticing this mistake and have corrected it in the manuscript.

      In Section 2.3, the inference criterion is unclear. The authors first report "factor loadings" and then perform a permutation test that is not further explained. Which of these factors are actually needed for predicting choice bias out of chance? The permutation test suggests that the null hypothesis is just "none of these measures contributes anything to predicting choice bias", which is already falsified if only one of them shows an association with choice bias. It would be relevant to know for which measures this is the case. Specifically, it would be relevant to know whether adding circadian measures into a model that already contains apathy/anhedonia improves predictive performance.

      We understand the Reviewer’s concerns regarding the detail of explanation we have provided for this part of our analysis, but we believe there may have been a misunderstanding regarding the partial least squares (PLS) regression. Rather than identifying a number of factors to predict the outcome variable, a PLS regression identifies a model with one or multiple components, with various factor loadings of differing magnitude. In our case, the PLS regression identified a model with one component to best predict our outcome variable (motivational tendency, which in our previous version we called choice bias). This one component had factor loadings of our questionnaire-based measures, with measures of apathy and anhedonia having highest weights, followed by lesser weighted factor loadings by measures of circadian rhythm and metabolic health. The permutation test tests whether this component (consisting of the combination of factor loadings) can predict the outcome variable out of sample.

      We hope we have improved clarity on this in the manuscript by making the following edits to the Results section.

      Lines 248 – 251:

      “Permutation testing indicated the predictive value of the resulting component (with factor loadings described above) was significant out-of-sample (root-mean-squared error [RMSE]=0.203, p=.001).”

      Further, we hope to provide a more in-depth explanation of these results in the Methods section.

      Lines 755 – 759:

      “Statistical significance of obtained effects (i.e., the predictive accuracy of the identified component and factor loadings) was assessed by permutation tests, probing the proportion of root-mean-squared errors (RMSEs) indicating stronger or equally strong predictive accuracy under the null hypothesis.”
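
      The logic of this permutation test can be sketched as follows. This is an illustration only: `score_fn` is a hypothetical callable standing in for the full pipeline (fitting the PLS component and returning its out-of-sample RMSE), and shuffling the outcome vector is assumed to be how the null distribution is generated.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root-mean-squared error between observed and predicted outcomes."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def null_rmses(score_fn, X, y, n_permutations=1000, seed=0):
    """Re-score the model on outcomes whose link to the predictors is shuffled.

    score_fn(X, y) is a hypothetical callable returning an out-of-sample RMSE.
    """
    rng = np.random.default_rng(seed)
    return np.array([score_fn(X, rng.permutation(y)) for _ in range(n_permutations)])


def permutation_p_value(observed_rmse, null_distribution):
    """Proportion of null RMSEs that are as good as or better than the observed one."""
    null_distribution = np.asarray(null_distribution, dtype=float)
    return float(np.mean(null_distribution <= observed_rmse))
```

      A small p-value means that few shuffled datasets are predicted as accurately as the real one, i.e. the component as a whole carries predictive information; it does not by itself say which individual measures contribute.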

      In Section 2.5, the authors simply report "that chronotype showed effects of chronotype on reward sensitivity", but the direction of the effect (higher reward sensitivity in early vs. late chronotype) remains unclear.

      We thank the Reviewer for pointing this out. While we did report the direction of effect, this was only presented in the subsequent parentheticals and could have been made much clearer. To assist with this, we have made the following addition to the text.

      Lines 317 – 320:

      “Bayesian GLMs, controlling for age and gender, predicting task parameters by time-of-day and chronotype showed effects of chronotype on reward sensitivity (i.e. those with a late chronotype had a higher reward sensitivity; M= 0.325, 95% HDI=[0.19,0.46])”

      In Section 4.2, the authors write that they "implemented a previously-described procedure using Prolific pre-screeners", but no reference to this previous description is given.

      We thank the Reviewer for bringing our attention to this missing reference, which has now been added to the manuscript.

      In Supplementary Table S2, only the "on-diagonal correlations" are given, but off-diagonal correlations (indicative of trade-offs between parameters) would also be informative.

      We agree with the Reviewer that off-diagonal correlations between underlying and recovered parameters are crucial to assess confounding between parameters during model estimation. We reported this in Figure S1D, where we present the full correlation matrix between underlying and recovered parameters in a heatmap. We have now noticed that this plot was missing axis labels, which have been added now.

      I found it somewhat difficult to follow the results section without having read the methods section beforehand. At the beginning of the Results section, could the authors briefly sketch the outline of their study? Also, given they have a pre-registration, could the authors introduce each section with a statement of what they expected to find, and close with whether the data confirmed their expectations? In the current version of the manuscript, many results are presented without much context of what they mean.

      We agree a brief outline of the study procedure before reporting the results would be beneficial to following the subsequent text and have added the following to the end of our Introduction.

      Lines 101 – 106:

      “Here, we tested the relationship between motivational decision-making and three key neuropsychiatric syndromes: anhedonia, apathy, and depression, taking both a transdiagnostic and categorical (diagnostic) approach. To do this, we validate a newly developed effort-expenditure task, designed for online testing, and gamified to increase engagement. Participants completed the effort-expenditure task online, followed by a series of self-report questionnaires.”

      We have added references to our pre-registered hypotheses at multiple points in our manuscript.

      Lines 185 – 187:

      “In line with our pre-registered hypotheses, we found significant main effects for effort (F(1,14367)=4961.07, p<.0001) and reward (F(1,14367)=3037.91, p<.001), and a significant interaction between the two (F(1,14367)=1703.24, p<.001).”

      Lines 215 – 221:

      “Model comparison by out-of-sample predictive accuracy identified the model implementing three parameters (motivational tendency a, reward sensitivity , and effort sensitivity ), with a parabolic cost function (subsequently referred to as the full parabolic model) as the winning model (leave-one-out information criterion [LOOIC; lower is better] = 29734.8; expected log posterior density [ELPD; higher is better] = -14867.4; Fig. 31ED). This was in line with our pre-registered hypotheses.”

      Lines 252 – 258:

      “Bayesian GLMs confirmed evidence for psychiatric questionnaire measures predicting motivational tendency (SHAPS: M=-0.109; 95% highest density interval (HDI)=[-0.17,-0.04]; AES: M=-0.096; 95%HDI=[-0.15,-0.03]; DARS: M=-0.061; 95%HDI=[-0.13,-0.01]; Fig. 4A). Post-hoc GLMs on DARS sub-scales showed an effect for the sensory subscale (M=-0.050; 95%HDI=[-0.10,-0.01]). This result of neuropsychiatric symptoms predicting a lower motivational tendency is in line with our pre-registered hypothesis.”

      Lines 258 – 263:

      “For the MEQ (95%HDI=[-0.09,0.06]), MCTQ (95%HDI=[-0.17,0.05]), BMI (95%HDI=[-0.19,0.01]), and FINDRISC (95%HDI=[-0.09,0.03]) no meaningful relationship with motivational tendency was found, consistent with the smaller magnitude of reported component loadings from the PLS regression. This null finding for dimensional measures of circadian rhythm and metabolic health was not in line with our pre-registered hypotheses.”

      Lines 268 – 270:

      “For reward sensitivity, the intercept-only model outperformed models incorporating questionnaire predictors based on RMSE. This result was not in line with our pre-registered expectations.”

      Lines 295 – 298:

      “As in our transdiagnostic analyses of continuous neuropsychiatric measures (Results 2.3), we found evidence for a lower motivational tendency parameter in the MDD group compared to HCs (M=-0.111, 95% HDI=[-0.20,-0.03]) (Fig. 4B). This result confirmed our pre-registered hypothesis.”

      Lines 344 – 355:

      “Late chronotypes showed a lower motivational tendency than early chronotypes (M=-0.11, 95% HDI=[-0.22,-0.02])—comparable to effects of transdiagnostic measures of apathy and anhedonia, as well as diagnostic criteria for depression. Crucially, we found motivational tendency was modulated by an interaction between chronotype and time-of-day (M=0.19, 95% HDI=[0.05,0.33]): post-hoc GLMs in each chronotype group showed this was driven by a time-of-day effect within late, rather than early, chronotype participants (M=0.12, 95% HDI=[0.02,0.22], such that late chronotype participants showed a lower motivational tendency in the morning testing sessions, and a higher motivational tendency in the evening testing sessions; early chronotype: 95% HDI=[-0.16,0.04]) (Fig. 5A). These results of a main effect and an interaction effect of chronotype on motivational tendency confirmed our pre-registered hypothesis.”

      Lines 390 – 393:

      “Participants with an early chronotype had a lower reward sensitivity parameter than those with a late chronotype (M=0.27, 95% HDI=[0.16,0.38]). We found no effect of time-of-day on reward sensitivity (95%HDI=[-0.09,0.11]) (Fig. 5B). These results were in line with our pre-registered hypotheses.”

    1. eLife assessment

      This study describes fundamental findings related to early disruptions in disinhibitory modulation exerted by VIP+ interneurons, in CA1 in a transgenic model of Alzheimer's disease pathology. The authors provide a compelling analysis at the cellular, synaptic, network, and behavioral levels on how these changes correlate and might be related to behavioral impairments during these early stages of AD pathology.

    2. Reviewer #1 (Public Review):

      Summary:

      The work in the manuscript utilized patch-clamp techniques to explore the electrophysiological characteristics of VIP interneurons in the early stages of AD using the 3xTg mouse model. The study revealed that VIP interneurons exhibited prolonged action potentials and reduced firing rates. These changes could not be attributed to modifications in input signals or morphological transformations. The authors attributed aberrant VIP activity to the accumulation of beta-amyloid in those interneurons.

      The decreased frequency of VIP inhibitory events was associated with no observed changes in excitatory drive to these interneurons. Consequently, heightened activity in the general population of CA1 interneurons was observed during a decision-making task and an object recognition test. In light of these findings, the authors concluded that the altered firing patterns of VIP interneurons may initiate early-stage dysfunction in hippocampal CA1 circuits, potentially influencing the progression of AD pathology.

      Strengths:

      Overall the work is novel and moves the field of Alzheimer's disease forward in a significant way. The manuscript reports a novel concept of aberrant activity in VIP interneurons during the early stages of AD thus contributing to dysfunctions of the CA1 microcircuit. This results in enhancement of the inhibitory tone on the primary cells of CA1. Thus, the disinhibition by VIP interneurons of Principal Cells is dampened. The manuscript was skillfully composed, the study was of strong scientific rigor featuring well-designed experiments. Necessary controls were present. Both sexes were included.

      Major limitations were not adequately addressed in the revised manuscript:

      (1) The authors attributed aberrant circuit activity to accumulation of "Abeta intracellularly" inside IS-3 cells. That is problematic. 6E10 antibody recognizes amyloid plaques in addition to Amyloid Precursor Protein (APP) as well as the C99 fragment. There are no plaques at the ages 3xTg mice were examined. Lack of plaques was addressed in revised manuscript. The staining shown in Fig. 1a is of APP/C99 inside neurons, not abeta accumulations in neurons. At the ages of 3-6 months, 3xTg mice start producing and releasing extracellular abeta oligomers and potentially tau oligomers as well (Takeda et al., 2013 PMID: 23640054; Takeda et al., 2015 PMID: 26458742 and others). Emerging literature suggests that extracellular not intracellular abeta and tau oligomers disrupt circuit function. Thus, a more likely explanation of extracellular abeta and tau oligomers disrupting the activity of VIP neurons is plausible. Presence of intracellular abeta is currently controversial in the field and needs to be discussed as such. Some of the references added in the revised version of the manuscript are erroneously cited. The authors provide no original data in support of "intracellular" abeta.

      (2) Authors suggest that their animals do not exhibit loss of synaptic connections and show Fig. 3d in support of that suggestion. However, imaging with confocal microscopy of 70 micron thick sections would not allow resolution of pre- and post-synaptic terminals. More sensitive measures such as electron microscopy or array tomography are the appropriate techniques to pursue. It is important for the authors to either remove that data from the manuscript or address/discuss the limitations of their technique in the discussion section. There is a possibility of loss of synaptic connections in their mouse model at the ages examined. Discussion of that possibility and of the limitations of the methodology used is missing.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Strengths: 

      Overall the work is novel and moves the field of Alzheimer's disease forward in a significant way. The manuscript reports a novel concept of aberrant activity in VIP interneurons during the early stages of AD thus contributing to dysfunctions of the CA1 microcircuit. This results in the enhancement of the inhibitory tone on the primary cells of CA1. Thus, the disinhibition by VIP interneurons of Principal Cells is dampened. The manuscript was skillfully composed, and the study was of strong scientific rigor featuring well-designed experiments. Necessary controls were present. Both sexes were included.

      We express our gratitude to the reviewer for their keen appreciation of our efforts and their enthusiasm for the outcomes of this research.

      Limitations:

      (1) The authors attributed aberrant circuit activity to the accumulation of "Abeta intracellularly" inside IS-3 cells. That is problematic. 6E10 antibody recognizes amyloid plaques in addition to Amyloid Precursor Protein (APP) as well as the C99 fragment. There are no plaques at the ages 3xTg mice were examined. Thus, the staining shown in Figure 1a is of APP/C99 inside neurons, not abeta accumulations in neurons. At the ages of 3-6 months, 3xTg starts producing abeta oligomers and potentially tau oligomers as well (Takeda et al., 2013 PMID: 23640054; Takeda et al., 2015 PMID: 26458742 and others). Emerging literature suggests that abeta and tau oligomers disrupt circuit function. Thus, a more likely explanation of abeta and tau oligomers disrupting the activity of VIP neurons is plausible.

      The Reviewer correctly points out that 3xTg-AD mice typically do not exhibit plaques before 6 months of age, with limited amounts even up to 12 months, particularly in the hippocampus. To the best of our knowledge, the 6E10 antibody binds to an epitope in APP (682-687) that is also present in the Abeta (3-8) peptide. Consequently, 6E10 detects full-length APP, α-APP (soluble alpha-secretase-cleaved APP), and Abeta (LaFerla et al., 2007). Nonetheless, we concur with the Reviewer's observation that the detected signal includes Abeta oligomers and the C99 fragment, which is currently considered an early marker of AD pathology (Takasugi et al., 2023; Tanuma et al., 2023). Studies have demonstrated intracellular accumulation of C99 in 3-month-old 3xTg mice (Lauritzen et al., 2012), and its binding to the Kv7 potassium channel family, which results in inhibiting their activity (Manville and Abbott, 2021). If a similar mechanism operates in IS-3 cells, it could explain the changes in their firing properties observed in our study. Consequently, we have revised the manuscript to include this crucial information in both the Results and Discussion sections.

      (2) Authors suggest that their animals do not exhibit loss of synaptic connections and show Figure 3d in support of that suggestion. However, imaging with confocal microscopy of 70 micron thick sections would not allow the resolution of pre- and post-synaptic terminals. More sensitive measures such as electron microscopy or array tomography are the appropriate techniques to pursue. It is important for the authors to either remove that data from the manuscript or address the limitations of their technique in the discussion section. There is a possibility of loss of synaptic connections in their mouse model at the ages examined.

      We appreciate the Reviewer’s perspective on the techniques used for imaging synaptic connections. While we acknowledge the limitations of confocal microscopy for resolving pre- and post-synaptic structures in thick sections, we respectfully disagree regarding the exclusive suitability of electron microscopy (EM). Our approach involved confocal 3D image acquisition using a 63x objective at 0.2 µm lateral resolution and 0.25 Z-step, providing valuable quantitative insights into synaptic bouton density. Despite the challenges posed by thick sections, this method together with automatic analysis allows for careful quantification. Although EM offers unparalleled resolution, it presents challenges in quantification. We have included the important details regarding image acquisition and analysis in the revised manuscript.

      Reviewer #2 (Public Review):

      Summary:

      The submitted manuscript by Michaud and Francavilla et al., is a very interesting study describing early disruptions in the disinhibitory modulation exerted by VIP+ interneurons in CA1, in a triple transgenic model of Alzheimer's disease. They provide a comprehensive analysis at the cellular, synaptic, network, and behavioral level on how these changes correlate and might be related to behavioral impairments during these early stages of the disease.

      Main findings:

      - 3xTg mice show early Aß accumulation in VIP-positive interneurons.

      - 3xTg mice show deficits in a spatially modified version of the novel object recognition test.

      - 3xTg mice VIP cells present slower action potentials and diminished firing frequency upon current injection.

      - 3xTg mice show diminished spontaneous IPSC frequency with slower kinetics in Oriens / Alveus interneurons.

      - 3xTg mice show increased O/A interneuron activity during specific behavioral conditions.

      - 3xTg mice show decreased pyramidal cell activity during specific behavioral conditions.

      Strengths:

      This study is very important for understanding the pathophysiology of Alzheimer's disease and the crucial role of interneurons in the hippocampus in healthy and pathological conditions.

      We are thankful to the reviewer for their insightful recognition of our efforts and their enthusiasm for the results of this research.

      Weaknesses:

      Although results nicely suggest that deficits in VIP physiological properties are related to the differences in network activity, there is no demonstration of causality.

      We completely agree with the reviewer's observation regarding the lack of demonstration of causality in our results. Investigating causality in the relationship between deficits in VIP physiological properties and differences in network activity is indeed a crucial aspect of this project. However, achieving this goal will require a significant amount of time and dedicated manipulations in a new mouse model (VIP-Cre-3xTg). We appreciate the importance of this line of investigation and consider it as a priority for our future research endeavors.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Limitations:

      (1) The authors should describe their model and state the age at which these mice start depositing amyloid plaques and neurofibrillary tangles. Readers might not be familiar with this model. It is also important to mention that circuit disruptions are assessed prior to plaque and tangle formation.

      We have included a detailed description of the 3xTg-AD mouse model in the Introduction section, including information on the age at which amyloid plaques and neurofibrillary tangles begin to appear. Additionally, we have clarified that circuit disruptions were assessed before the formation of plaques and tangles. These details have been added to both the Introduction and the Results sections to ensure clarity for readers unfamiliar with the model.

      (2) Ns are presented in Supplemental Table 1. Units are presented in a note to Supplementary Table 1. It would be advisable to specify Ns and units as the data is being presented in the results section or figure legends for easy access.

      We have now included the Ns (sample sizes), specifying the number of cells or sections and the number of experimental animals, directly within the Results section and in the figure legends. This ensures that readers have immediate access to this information without needing to refer to the supplementary materials.

      (3) Several typos require correction:

      a. "mamory" - Line 22, page 5.

      b. The term "Interneurons" is abbreviated as both "INs" and "IN" throughout the manuscript. The author should consistently choose one abbreviation.

      We have corrected the typo "mamory" to "memory" on line 22, page 5. Additionally, we have standardized the abbreviation for "Interneurons" to "INs" throughout the manuscript for consistency.

      (4) Note 2 in Supplementary Table 1 states that animals of both sexes with equal distribution were used throughout the study. It would be best for the reader to assess the data distribution based on sex. Thus, it is advisable for the authors to depict male and female data points as distinct symbols throughout the figures.

      Unfortunately, we do not have detailed sex-disaggregated data for all datasets, which limits our ability to depict male and female data points separately across all figures. Therefore, we have opted to pool data from both sexes for a more comprehensive analysis. We believe this approach maintains the robustness of our findings.

      Reviewer #2 (Recommendations for the authors):

      Major Points:

      - To keep the logical line of reasoning and to be able to interpret the results, it would be important to use the same metrics when comparing the population activity of O/A interneurons and principal cells in the different behavioral conditions.

      We have revised Figures 4 and 5 to enhance the coherence in data presentation. This includes using consistent metrics for comparing the population activity of both O/A interneurons and principal cells across different behavioral conditions. These changes ensure a clearer and more logical interpretation of the results.

      - Although results nicely suggest that deficits in VIP physiological properties are related to the differences in network activity, there is no demonstration of causality. Would it be possible to test if manipulating VIP neurons one could obtain such specific results? Alternatively, it could be discussed more in detail how the decrease in disinhibition could lead to the changes in network activity demonstrated here.

      We agree with the reviewer that establishing causality between VIP neuron deficits and changes in network activity would be very important. However, demonstrating causality would require a new line of investigation, involving the use of specific mouse models to selectively manipulate VIP neurons. This is an exciting direction that we plan to prioritize in our future research. For this study, we have included a discussion on the potential mechanisms by which decreased disinhibition might lead to the observed changes in network activity. Specifically, we propose that in young adult 3xTg-AD mice, the altered firing of I-S3 cells may lead to enhanced inhibition of principal cells. This could shift the excitation/inhibition balance, input integration and firing output of principal cells thereby impacting overall network activity. These points are discussed in detail in the revised Discussion section.

      - Along the same lines, the correlations shown in the manuscript would be more robust if there were an in vivo demonstration that 3xTg mice indeed show decreased activity. The same experiments could also clarify whether VIP cells in control animals are more active at the time of decision-making and during object exploration, as suggested in the manuscript.

      Thank you for your comment. In response to the point raised, we would like to highlight that we have recently documented the increased activity of VIP-INs in the D-zone of the T-maze and during object exploration in a study published in Cell Reports (Tamboli et al., 2024). This publication is now referenced in our manuscript to support our findings. Regarding the in vivo activity of 3xTg mice, our observations indicated no significant differences in major behavioral patterns such as locomotion, rearing, and exploration of the T-maze when comparing Tg and non-Tg mice. These findings are presented in detail in Figure 4c and Supplementary Fig. 5. We believe these data support the robustness of our correlations by demonstrating that the overall behavioral activity of 3xTg mice is comparable to that of non-transgenic controls, thus focusing attention on the specific roles of VIP-INs in the early prodromal state of AD pathology.

      Minor Points:

      - Figure 1c: Heading of VIP-Tg should have capital letters.

      Thank you for pointing that out. We have corrected the heading to "VIP-Tg" with capital letters in Figure 1c.

      - Figure 1d: The finding that no change was observed in the percentage of VIP+/CR+ is based on three animals and 3-4 slices per mouse. However, the result of VIP+CR+ in tg-mice has an outlier that might bias the results. I would suggest increasing the number of animals to confirm these results.

      Thank you for your insightful suggestion. We addressed the potential impact of the outlier in the VIP+/CR+ cell density analysis by recalculating the results after removing the outlier using the interquartile range method. This reanalysis revealed a statistically significant difference in the VIP+/CR+ cell density between non-Tg and Tg mice, which we have now detailed in the Results section. Despite this, we have chosen to retain the outlier in our final presentation to accurately represent the biological variability observed in our sample. We agree that increasing the number of animals would further validate these findings and will consider this in future studies.
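
      The interquartile range rule mentioned above can be sketched as follows. This is only an illustration: the fence multiplier of 1.5, the quartile convention, the function name `iqr_outliers` and the density values are all assumptions, not details or data taken from the manuscript.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    k = 1.5 is the conventional Tukey fence; the multiplier actually
    used by the authors is not stated, so it is an assumption here.
    """
    # Quartiles with the "inclusive" convention (data treated as the full sample).
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical cell-density values, made up for illustration only.
example_densities = [10, 11, 12, 13, 14, 40]
flagged = iqr_outliers(example_densities)
print(flagged)
```

      With so few animals, flagging a point under this rule depends strongly on the quartile convention chosen, which is one reason to report results both with and without the flagged value.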

      - Figure 3d: Would it be possible to identify the recorded interneurons? Is it expected that most of those are OLM cells?

      Thank you for your question. We were unable to fully recover all recorded cells using biocytin staining. However, for those cells with preserved axonal structures, we identified both OLM and bistratified cells, which are the primary targets of I-S3 cells. We have now included this information in the Results section to clarify the types of interneurons identified.

      - Figure 3: Why quantify VGat terminals instead of VIP-GFP terminals? Combined with the calretinin labeling, it would be more useful to indicate that no changes were observed at the morphological bouton level specifically in disinhibitory interneurons. Please also describe which ImageJ plugin was used for the quantification.

      Thank you for your question. Our primary objective was to quantify the synaptic terminals of CR+ INs in the CA1 O/A region, which are predominantly formed by I-S3 cells. Therefore, VGaT and CR co-localization was used to guide this analysis. GFP expression in axonal boutons can sometimes be inconsistent and less reliable for precise quantification. For this analysis, we utilized the “Analyze Particles” function in ImageJ, combined with watershed segmentation, which is now specified in the Methods section.

      - Figure 4g: How was the statistical test performed? If data was averaged across mice, please add error bars and data points in the figure.

      Thank you for your question. To compare the alternation percentage between non-Tg and Tg mice, we used Fisher’s Exact test as detailed in Supplementary Table 1. In this analysis, we considered each animal's choice individually, comparing the preference for correct versus incorrect choices between the two groups. Since Fisher’s Exact test is designed for analyzing qualitative data rather than quantitative data, averaging across mice was not applicable, and therefore, we did not include error bars or data points in the figure.
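
      As a rough illustration of how Fisher's Exact test treats each animal's choice as a count in a 2x2 table, here is a minimal standard-library sketch. The counts are the incorrect-choice numbers reported later in this response (4 of 9 non-Tg and 1 of 6 Tg mice); whether these are exactly the counts entered in the test for Figure 4g is an assumption, as are the two-sided alternative and the function name `fisher_exact_two_sided`.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def table_prob(x):
        # Probability of x counts in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = table_prob(a)
    x_min, x_max = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        table_prob(x)
        for x in range(x_min, x_max + 1)
        if table_prob(x) <= p_observed * (1 + 1e-9)
    )


# Rows: non-Tg, Tg; columns: correct, incorrect (assumed from the text).
p_value = fisher_exact_two_sided(5, 4, 5, 1)
print(f"two-sided p = {p_value:.3f}")
```

      Because the test operates on counts rather than per-animal averages, there is no mean or spread to plot, which is consistent with the explanation given above.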

      - Figure 4h: To conclude that the increase in activity is larger in the 3xTg mice, there should be a statistical comparison for the magnitude of change between the decision and the stem zone for control and 3xTg mice. To show that there is no significant difference in this measurement in the control mice is insufficient.

      Thank you for your suggestion. We performed a statistical comparison of the magnitude of change in activity between the stem zone and the D-zone for non-Tg and 3xTg mice, as recommended. Our analysis showed no significant difference in this magnitude of change between the two genotypes. These results have now been included in the Results section. However, we would like to highlight an important finding regarding the nature of these changes. In the 3xTg mice, there was a consistent increase in the activity of O/A INs when entering the D-zone. In contrast, non-Tg mice displayed a range of responses, including both increases and decreases in activity. This indicates a higher reliability in the firing of O/A INs in the D-zone of 3xTg mice. Our recent study suggests that VIP-INs are particularly active in the D-zone (Tamboli et al., 2024). Therefore, the absence or reduction of input from VIP-INs in 3xTg mice may lead to the observed higher engagement of O/A INs in this zone. We believe this observation is crucial for understanding the differential yet nuanced changes in neural dynamics in these mice.

      - In the methods, it is stated that there was a pre-selection of animals depending on learning performance. Would it be possible to also show the data from animals that did not properly learn? Alternatively, it would be useful to plot the correlation between performance in this test and the difference between activity in the stem and the decision-making zone. The reason to ask for this is that there is a trend for control animals to show reduced alternations (50 vs 80%, although not significant, it is a big difference). Considering that there is also a trend in control animals to show increased activity in the decision-making zone, it would be important to confirm that this is not only due to differences in performance. The current statistical procedure does not allow discarding this.

      In this study, we excluded from the analysis the animals that refused to explore the T-maze and spent all their time in the stem corner, or refused to explore the objects and stayed in the open field maze (OFM) corner. These exclusions applied to both non-Tg (n = 6) and Tg (n = 5) groups, indicating that low exploratory activity is not necessarily linked to AD-related mutations. During the T-maze test, we also observed several animals that made incorrect choices (4 out of 9 non-Tg and 1 out of 6 Tg mice). However, due to the low number of animals making incorrect choices, we were unable to form a separate group for analysis based on incorrect choices. These details are now provided in the Methods section.

      - Figure 4i. It is not clear when exactly cell activity was measured. If it was during the entire recording time, I think it would be interesting to see if the activity of O/A interneurons is different specifically during interaction with the object in 3xTg mice.

      Cell activity was indeed measured throughout the entire recording session and analyzed in relation to animal behavior (immobility to walking; Fig. 4d,e), and periods specifically related to interaction with objects were extracted for analysis (Figure 4i).

      - Why was the object modulation measured during a different task in which both objects were the same? The figure is misleading in that sense, as it suggests the experiment was the same as for the other panels with two different objects. It would be important to correct this if the authors want to correlate the deficits in NOR in 3xTg mice and changes in IN activity.

      The study specifically investigated object-modulated neural activity during the Sampling phase. Therefore, two identical objects were placed in the arena for animal exploration. As mentioned above, several animals failed to explore the OFM and objects on the second day and were excluded from the analysis, which prevented us from conducting the novel-object exploration Test Trial. Both non-Tg and Tg mice showed a lack of exploration in the OFM and T-maze, for reasons that remain unclear. Consequently, we opted to present robust data on neural activity during the initial sampling of two identical objects. However, further investigation is needed to understand how this activity relates to deficits observed in the classical NOR test.

      - Figure 5c-f. I would strongly suggest performing the same quantification and displaying similar figures for the fiber photometry experiments in interneurons and principal cells. It would help to interpret the data.

      We have taken the reviewer's suggestion into account and standardized the data analysis and presentation. Figures 4d, e and 5c, d now depict the walk-induced activity in INs and PCs, respectively. Figures 4h and 5f compare activity between the stem and D-zone in the T-maze. Additionally, Figures 4j and 5h illustrate the object modulation of INs and PCs, respectively.

      - Although velocity and mobility were quantified, it would be important to show also that they are not different during those times when activity was dissimilar, as in the decision zone.

      We have analyzed these data and found no significant differences between the two genotypes in terms of velocity and mobility during these periods. This analysis is now presented in Supplementary Figure 5e, f and detailed in the Results section.

      - Figure 5g-h. Similarly, I would suggest using the same metrics in order to correlate the results from interneuron and principal cell activity photometry.

      We have updated this figure to align with the presentation of interneurons (Figure 4j) and included RMS analysis to emphasize lower variance in object modulation of PCs as an indicator of increased network inhibition.

      - Was object modulation variance also different for INs depending on the mouse phenotype?

      We conducted this additional analysis but did not find any significant difference.

      - Figure S4: would it be possible to identify the postsynaptic partners?

      As mentioned above, for those cells with preserved axonal structures, we identified both OLM and bistratified cells. We have now included this information in the Results section to clarify the types of interneurons identified.

    1. eLife assessment

      This study presents valuable findings on an unresolved question of cerebellar physiology: Do synapses between Purkinje cells and granule cells, made by the ascending part of the granule cells' axon, have different properties than those made by parallel fibers? The authors conducted patch-clamp recordings on rat cerebellar slices and found a new type of plasticity in the synapses of the ascending part of granule cell axons. The experiments are well-designed with appropriate controls, and the study provides solid evidence for the new form of cerebellar synaptic plasticity.

    2. Reviewer #1 (Public Review):

      In this study, the authors address a fundamental unresolved question in cerebellar physiology: do synapses between granule cells (GCs) and Purkinje cells (PCs) made by the ascending part of the axon (AA) have different synaptic properties to those made by parallel fibers? This is an important question because GCs integrate sensorimotor information from many brain areas with a precise and complex topography.

      The authors argue that GCs located close to the PCs essentially contact PC dendrites through the ascending part of their axon. They demonstrate that high-frequency (100 Hz) joint stimulation of distant parallel fibers and local GCs potentiates AA-PC synapses, while parallel fiber-PC synapses are depressed. On the basis of paired pulse ratio analysis, they concluded that evoked plasticity was postsynaptic. When individual pathways are stimulated alone, no LTP is observed. This associative plasticity appears to be sensitive to timing, as stimulation of parallel fibers first results in depression, while stimulation of the AA pathway has no effect. NMDA, mGluR1 and GABAA receptors are involved in this plasticity.

      Overall, associative modulation of synaptic transmission is convincing, and the experiments carried out support this conclusion.

      One of its weaknesses is that it contradicts the numerous experiments conducted by many groups that have studied plasticity at this connection (e.g. Bouvier et al 2016, Piochon et al 2016, Binda et al, 2016, Schonewille et al 2021). According to the literature, high-frequency stimulation of parallel fibers leads to postsynaptic potentiation under many different experimental conditions (blocked or unblocked inhibition, stimulation protocols, internal solution composition). This discrepancy was not investigated experimentally.

      Another weakness is the lack of evidence that AAs have been stimulated. Indeed, without filling the PC with fluorescent dye or biocytin during the experiment, and without reconstructing the anatomical organization, it is difficult to assess whether the stimulating pipette is actually positioned in the GC cluster that potentially contacts the PC with AAs. Although the idea that AAs repeatedly contact the same Purkinje cell has been propagated, to the reviewer's knowledge, no direct demonstration of this hypothesis has yet been published. In fact, what has been demonstrated (Walter et al 2009; Spaeth et al 2022) is that GCs have a higher probability of being connected to nearby PCs, but not necessarily associated with AAs.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors describe a form of synaptic plasticity at synapses from granule cells onto Purkinje cells in the mouse cerebellum, which is specific to synapses from granule cells close to the cell body but not to distal ones. This plasticity is induced by the paired or associative stimulation of the two types of synapses because it is not observed with stimulation of one type of synapse alone. In addition, this form of plasticity is dependent on the order in which the stimuli are presented and is dependent on NMDA receptors, metabotropic glutamate receptors and to some degree on GABAA receptors.

      Strengths:

      The focus of the authors on the properties of two different synapse-types on cerebellar Purkinje cells is interesting and relevant, given previous results that ascending and parallel fiber synapses might be functionally different and undergo different forms of plasticity (although it hasn't been proven here that the two types of synapses are indeed ascending vs parallel fiber synapses). Nevertheless, the interaction between proximal vs. distal stimulation driven synapse types during plasticity is important for understanding cerebellar function. The demonstration of timing and order-dependent potentiation of only one pathway, and not another, after associative stimulation of both pathways, changes our understanding of potential plasticity mechanisms. In addition, this observation opens up many new questions on underlying intracellular mechanisms as well as on its relevance for cerebellar learning.

      Weaknesses:

      A concern with this study is that all recordings demonstrate "rundown", a progressive decrease in the amplitude of the EPSC, starting during the baseline period and continuing after the plasticity-induction stimulus. The issues that are causing rundown are not known and may or may not be related to the cellular processes involved in synaptic plasticity. This concern applies in particular to all the experiments where there is a decrease in synaptic strength. However, a key finding of this paper is the associative potentiation of one pathway, which is clearly different from all conditions where there is a decrease in synaptic strength and raises confidence in the authors' conclusions.

      In addition, there is some inconsistency with previous results; specifically, that no PF-LTP was induced by PF-alone repeated stimulation.

      It remains for future work to identify what these two synapse types, distinguished by the stimulation location, actually are, and where they are on the Purkinje cell dendritic tree. What this specific timing rule is important for is also something that remains to be discovered. Its potential relevance for plasticity and learning will depend on what information these AA vs PF synapses carry, and why their association is meaningful for the circuit and for a behavior. Overall, this study opens up many new questions for the field.

    4. Reviewer #3 (Public Review):

      Summary:

      Granule cells' axons bifurcate to form parallel fibers (PFs) and ascending axons (AAs). While the significance of PFs on cerebellar plasticity is widely acknowledged, the importance of AAs remains unclear. In the current paper, Conti and Auger conducted electrophysiological experiments in rat cerebellar slices and identified a new form of synaptic plasticity in the AA-Purkinje cell (PC) synapses.

      Strengths:

      The authors applied simultaneous stimulation of AAs and PFs and recorded from PCs and discovered that the strength of AA-PC synapses and PF-PC synapses change in opposite directions: while AA-PC EPSCs increased, PFs-EPSCs decreased. This finding suggests that synaptic responses to AAs and PFs in PCs are jointly regulated, working as an additional mechanism to integrate motor/sensory input. The existence of such plasticity mechanisms may offer new perspectives in studying and modeling cerebellum-dependent behavior. Overall, the experiments are performed well.

      Weaknesses:

      There are two weaknesses. First, the baseline of electrophysiological recordings is influenced significantly by run-down, limiting the interpretability of the data. Because the amplitude of AA-EPSCs is relatively small, the run-down may have masked some of the changes in EPSCs. However, the authors managed this difficulty using appropriate controls and statistical analysis. Second, while the authors show AA-LTP depends on mGluR, NMDA receptors, and GABA-A receptors, which cell types express these receptors and how they contribute to plasticity is not clarified. Cell-type-specific knockdown of these receptors may clarify this point in future studies.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this study, the authors address a fundamental unresolved question in cerebellar physiology: do synapses between granule cells (GCs) and Purkinje cells (PCs) made by the ascending part of the axon (AA) have different synaptic properties from those made by parallel fibers? This is an important question, as GCs integrate sensorimotor information from numerous brain areas with a precise and complex topography.

      Summary:

      The authors argue that GCs located close to PCs essentially contact PC dendrites via the ascending part of their axons. They demonstrate that joint high-frequency (100 Hz) stimulation of distant parallel fibers and local GCs potentiates AA-PC synapses, while parallel fiber-PC synapses are depressed. On the basis of paired-pulse ratio analysis, they concluded that evoked plasticity was postsynaptic. When individual pathways were stimulated alone, no LTP was observed. This associative plasticity appears to be sensitive to timing, as stimulation of parallel fibers first results in depression, while stimulation of the AA pathway has no effect. NMDA, mGluR1 and GABAA receptors are involved in this plasticity.

      Strengths:

      Overall, the associative modulation of synaptic transmission is convincing, and the experiments carried out support this conclusion. However, weaknesses limit the scope of the results.

      Weaknesses:

      One of the main weaknesses of this study is the suggestion that high-frequency parallel-fiber stimulation cannot induce long term potentiation unless combined with AA stimulation. Although we acknowledge that the stimulation and recording conditions were different from those of other studies, according to the literature (e.g. Bouvier et al 2016, Piochon et al 2016, Binda et al, 2016, Schonewille et al 2021 and others), high-frequency stimulation of parallel fibers leads to long-term postsynaptic potentiation under many different experimental conditions (blocked or unblocked inhibition, stimulation protocols, internal solution composition). Furthermore, in vivo experiments have confirmed that high-frequency parallel fibers are likely to induce long-term potentiation (Jorntell and Ekerot, 2002; Wang et al, 2009).

      This article provides further evidence that long-term plasticity (LTP and LTD) at this connection is a complex and subtle mechanism underpinned by many different transduction pathways. It would therefore have been interesting to test different protocols or conditions to explain the discrepancies observed in this dataset.

      Even though this is not the main result of this study, we acknowledge that the control experiments done on PF stimulation add a puzzling result to an already contradictory literature. High frequency parallel fibre stimulation (in isolation) has been shown to induce long term potentiation in vitro, but not always, and most importantly, this has been shown in vivo. This was the reason for choosing that particular stimulation protocol. Examination of in vitro studies, however, shows that the results are variable and even contradictory. Most were done in the presence of GABAA receptor antagonists, including the SK channel blocker Bicuculline, whereas in the study by Binda (2016), LTP was blocked by GABAA receptor inhibition. Also, in some studies LTP was under the control of NMDAR activation only, whereas in Binda (2016), it was under the control of mGluR activation. Moreover, most experiments were done in mice, whereas our study was done in rats. Our results reveal multiple mechanisms working together to produce plasticity, which are highly sensitive to in vitro conditions. We designed our experiments to be close to the physiological conditions, with inhibition preserved and a physiological chloride gradient. It is likely that experimental differences have given rise to the variability of the results and our inability to reproduce PF-LTP, but it was not the aim of this study to dissect the subtleties of the different experimental protocols and models.

      We have modified the Discussion to cover that point fully.

      Another important weakness is the lack of evidence that the AAs were stimulated. Indeed, without filling the PC with fluorescent dye or biocytin during the experiment, and without reconstructing the anatomical organization, it is difficult to assess whether the stimulating pipette is positioned in the GC cluster that is potentially in contact with the PC with the AAs. According to EM microscopy, AAs account for 3% of the total number of synapses in a PC, which could represent a significant number of synapses. Although the idea that AAs repeatedly contact the same Purkinje cell has been propagated, to the best of the review author's knowledge, no direct demonstration of this hypothesis has yet been published. In fact, what has been demonstrated (Walter et al 2009; Spaeth et al 2022) is that GCs have a higher probability of being connected to nearby PCs, but are not necessarily associated with AAs.

      We fully agree with the reviewer that we have not identified morphologically ascending axon synapses, and we stress this fact both in the first paragraph of the Results section, and again at the beginning of Discussion. Our point is mainly topographical, given the well documented geometrical organisation of the cerebellar cortex. Strictly speaking, inputs are local (including AAs) or distal (PFs). Similarly, the studies by Isope and Barbour (2002) and Walter et al. (2009), just like Sims and Hartell (2005 and 2006), have coined the term ‘ascending axon’ when drawing conclusions about locally stimulated inputs. Moreover, our results do not rely on or assume multiple contacts, stronger connections, or higher probability of connections between ascending axons and Purkinje cells. Our results only demonstrate a different plasticity outcome for the two types of inputs. Therefore, our manuscript could be rephrased with the terms ‘local’ and ‘distal’ granule cell inputs, but this would have no more implication for the results or the computation performed in Purkinje cells. However, in our experience, these terms are more confusing, and consistent with the literature, we do not wish to make this modification. However, we have modified the abstract of the manuscript to clarify this point.

      Reviewer #2 (Public Review):

      Summary:

      The authors describe a form of synaptic plasticity at synapses from granule cells onto Purkinje cells in the mouse cerebellum, which is specific to synapses proximal to the cell body but not to distal ones. This plasticity is induced by the paired or associative stimulation of the two types of synapses because it is not observed with stimulation of one type of synapse alone. In addition, this form of plasticity is dependent on the order in which the stimuli are presented, and is dependent on NMDA receptors, metabotropic glutamate receptors and to some degree on GABAA receptors. However, under all experimental conditions described, there is a progressive weakening or run-down of synaptic strength. Therefore, plasticity is not relative to a stable baseline, but relative to a process of continuous decline that occurs whether or not there is any plasticity-inducing stimulus.

      As highlighted by the reviewer, we observed a postsynaptic rundown of the EPSC amplitude for both input pathways. Rundown could be mistaken for a depression of synaptic currents, not for a potentiation, and the progressive decrease of the EPSC amplitude during the course of an experiment leads to an underestimate of the absolute potentiation. We have taken the view to provide a strong set of control data rather than selecting experiments based on subjective criteria or applying a cosmetic compensation procedure. We have conducted control experiments with no induction (n = 17), which give a good indication of the speed and amplitude of the rundown. Comparison shows a highly significant potentiation of the ascending axon EPSC. Depression of the parallel fibre EPSC, on the other hand, was not significantly different from rundown, and we have not spoken of parallel fibre long term depression. The data thus show very clearly that ascending axon and parallel fibre synapses behave differently following the costimulation protocol.

      Strengths:

      The focus of the authors on the properties of two different synapse-types on cerebellar Purkinje cells is interesting and relevant, given previous results that ascending and parallel fiber synapses might be functionally different and undergo different forms of plasticity. In addition, the interaction between these two synapse types during plasticity is important for understanding cerebellar function. The demonstration of timing and order-dependent potentiation of only one pathway, and not another, after associative stimulation of both pathways, changes our understanding of potential plasticity mechanisms. In addition, this observation opens up many new questions on underlying intracellular mechanisms as well as on its relevance for cerebellar learning and adaptation.

      Weaknesses and suggested improvements:

      A concern with this study is that all recordings demonstrate "rundown", a progressive decrease in the amplitude of the EPSC, starting during the baseline period and continuing after the plasticity-induction stimulus. In the absence of a stable baseline, it is hard to know what changes in strength actually occur at any set of synapses. Moreover, the issues that are causing rundown are not known and may or may not be related to the cellular processes involved in synaptic plasticity. This concern applies in particular to all the experiments where there is a decrease in synaptic strength.

      We have provided an answer to that point directly below the summary paragraph. We will just add here that if the phenomenon causing rundown was involved in plasticity, it should affect plasticity of both inputs, which was not the case, clearly distinguishing the ascending axon and parallel fibre inputs.

      The authors should consider changes in the shape of the EPSC after plasticity induction, as in Fig 1 (orange trace) as this could change the interpretation.

      Figure 1 shows an average response composed of evoked excitatory and inhibitory synaptic currents. The third section of Supplementary material (supplementary figure 3) shows that this complex shape is given by an EPSC followed by a delayed disynaptic IPSC. We would like to point out that while separating EPSC from IPSC might appear difficult from average traces due to the averaged jitter in the onset of the synaptic currents, boundaries are much clearer when analysing individual traces. In the same section we discuss the results of experiments in which transient applications of SR 95531 before and after the induction protocol allowed us to measure the EPSC, while maintaining the same experimental conditions during induction. Analysis of the kinetics of the EPSCs during SR application at the beginning and end of experiments showed that there is no change in the time to peak of either the AA or the PF response. The decay times of AA- and PF-EPSCs are slightly longer at the end of the experiment, even if the difference is not significant for AA inputs. This analysis has been added to the Supplementary material. Our analysis, which uses as templates the EPSC kinetics measured at the beginning and at the end of the experiments, directly takes these changes into account. The results show clearly that the presence of disynaptic inhibition doesn’t significantly affect the measure of the peak EPSC after the induction protocol nor the estimate of plasticity.

      In addition, the inconsistency with previous results is surprising and is not explained; specifically, that no PF-LTP was induced by PF-alone repeated stimulation.

      In our experimental conditions, PF-LTP was not induced when stimulating PF only, the condition that reproduces experiments in the literature. As discussed in our response to reviewer 1, a close look at the literature, however, reveals variabilities and contradictions behind seemingly similar results. They reveal intricate mechanisms working together to produce plasticity, which are sensitive to in vitro conditions. We designed our experiments to be close to physiological conditions, with inhibition preserved and a physiological chloride gradient. It is likely that experimental differences have given rise to the variability of the results and our inability to observe PF-LTP. We have modified the Discussion section to cover that point thoroughly in the context of past results. 

      The authors test the role of NMDARs, GABAARs and mGluRs in the phenotype they describe. The data suggest that the form of plasticity described here is dependent on any one of the three receptors. However, the location of these receptors varies between the Purkinje cells, granule cells and interneurons. The authors do not describe a convincing hypothetical model in which this dependence can be explained. They suggest that there is crosstalk between AA and PF synapses via endocannabinoids downstream of mGluR or NO downstream of NMDARs. However, it is not clear how this could lead to the long-term potentiation that they describe. Also, there is no long-lasting change in paired-pulse ratio, suggesting an absence of changes in presynaptic release.

      We suggest in the result section that the transient change in paired pulse ratio (PPR) is linked to a transient presynaptic effect, but there was no significant long term change of the PPR, suggesting that the long term effects observed are linked to postsynaptic changes. We now stress this point in the Results and Discussion sections.

      Concerning the involvement of multiple molecular pathways, investigators often tested for the involvement of NMDAR or mGluRs in cerebellar plasticity, rarely both. Here we showed that both pathways are involved. The conjunctive requirement for NMDAR and mGluR activation could easily be explained based on the dependence of cerebellar LTP and LTD on the concentrations of both NO and postsynaptic calcium (Coesmans et al., 2004; Safo and Regehr, 2005; Bouvier et al., 2016; Piochon et al., 2016).

      We also observed an effect of GABAergic inhibition. GABAergic inhibition was elegantly shown by Binda et al. (2016) to regulate calcium entry together with mGluRs, and control plasticity induction. A similar mechanism could contribute to our results, although inhibition might have additional effects. We have modified the Discussion of the manuscript to clarify the pathways involved in plasticity and added a diagram to highlight the links between the different molecular pathways, potential cross talk mechanisms, and the location of receptors.

      Is the synapse that undergoes plasticity correctly identified? In this study, since GABAergic inhibition is not blocked for most experiments, PF stimulation can result in both a direct EPSC onto the Purkinje cell and a disynaptic feedforward IPSC. The authors do address this issue with Supplementary Fig 3, where the impact of the IPSC on the EPSC within the EPSC/IPSC sequence is calculated. However, a change in waveform would complicate this analysis. An experiment with pharmacological blockade will make the interpretation more robust. The observed dependence of the plasticity on GABAA receptors is an added point in favor of the suggested additional experiments.

      We did consider that due to long recording times there might be kinetic changes, and that’s the reason why the experiments of Supplementary figure 3 were done with pharmacological blockade of GABAAR with SR, both before and again after LTP induction. The estimate of the amplitude of the EPSC is based on the actual kinetics of the response at both times.

      A primary hypothesis of this study is that proximal, or AA, and distal, or PF, synapses are different and that their association is specifically what drives plasticity. The alternative hypothesis is that the two synapse-types are the same. Therefore, a good control for pairing AA with PF would be to pair AA with AA and PF with PF, thereby demonstrating that pairing with each other is different from pairing with self.

      Pairing AA with AA would be difficult because stimulation of AA can only be made from a narrow band below the PC and we would likely end up stimulating overlapping sets of synapses. However, Figure 5 shows the effect of stimulating PF and PF, while also mimicking the sparse and dense configuration of the control experiment. It shows that sparse PF do not behave like AA. Sims and Hartell (2006) also made an experiment with sparse PF inputs and observed clear differences between sparse local (AA) and sparse distal (PF) synapses.

      It is hypothesized that the association of a PF input with an AA input is similar to the association of a PF input with a CF input. However, the two are very different in terms of cellular location, with the CF input being in a position to directly interact with PF-driven inputs. Therefore, there are two major issues with this hypothesis: 1) how can subthreshold activity at one set of synapses affect another located hundreds of micrometers away on the same dendritic tree? 2) There is evidence that the CF encodes teaching/error or reward information, which is functionally meaningful as a driver of plasticity at PF synapses. The AA synapse on one set of Purkinje cells is carrying exactly the same information as the PF synapses on another set of Purkinje cells further up and down the parallel fiber beam. It is suggested that the two inputs carry sensory vs. motor information, which is why this form of plasticity was tested. However, the granule cells that lead to both the AA and PF synapses are receiving the same modalities of mossy fiber information. Therefore, one needs to presuppose different populations of granule cells for sensory and motor inputs or receptive field and contextual information. As a consequence, which granule cells lead to AA synapses and which to PF synapses will change depending on which Purkinje cell you're recording from. And that's inconsistent with there being a timing dependence of AA-PF pairing in only one direction. Overall, it would be helpful to discuss the functional implications of this form of plasticity.

      We do not hypothesise that association of the AA and PF inputs is similar to the association of PF and climbing fibre inputs. We compare them because it is the other known configuration triggering associative plasticity in Purkinje cells. It is indeed interesting to observe that even if the inputs are very small compared to the powerful climbing fibre input, they can be effective at inducing plasticity. Physiologically, the climbing fibre signal has been linked to error and reward signals, but reward signals are also encoded by granule cell inputs (Wagner et al., 2017). We have modified the discussion to make sure that we do not suggest equivalence with CF induced LTD.

      Moreover, we fully agree that AA and PF synapses made up by a given granule cell carry the same information, and cannot encode sensory and motor information at the same time. AA synapses from a local granule cell deliver information about the local receptive field, but PF synapses from the same granule cell will deliver contextual information about that receptive field to distant Purkinje cells. In the context of sensorimotor learning, movement is learnt with respect to a global context, not in isolation, therefore learning a particular association must be relevant. The associative plasticity we describe here could help explain this functional association. We have clarified the discussion.

      Reviewer #3 (Public Review):

      Granule cells' axons bifurcate to form parallel fibers (PFs) and ascending axons (AAs). While the significance of PFs on cerebellar plasticity is widely acknowledged, the importance of AAs remains unclear. In the current paper, Conti and Auger conducted electrophysiological experiments in rat cerebellar slices and identified a new form of synaptic plasticity in the AA-Purkinje cell (PC) synapses. Upon simultaneous stimulation of AAs and PFs, AA-PC EPSCs increased, while PFs-EPSCs decreased. This suggests that synaptic responses to AAs and PFs in PCs are jointly regulated, working as an additional mechanism to integrate motor/sensory input. This finding may offer new perspectives in studying and modeling cerebellum-dependent behavior. Overall, the experiments are performed well. However, there are two weaknesses. First, the baseline of electrophysiological recordings is influenced significantly by run-down, making it difficult to interpret the data quantitatively. The amplitude of AA-EPSCs is relatively small and the run-down may mask the change. The authors should carefully reexamine the data with appropriate controls and statistics. Second, while the authors show AA-LTP depends on mGluR, NMDA receptors, and GABA-A receptors, which cell types express these receptors and how they contribute to plasticity is not clarified. The recommended experiments may help to improve the quality of the manuscript.

      As highlighted by the reviewer and developed above in response to reviewer 2, we observed a postsynaptic rundown of the EPSC amplitude. Rundown could be mistaken for a depression of synaptic currents, not for a potentiation. Moreover, we have conducted control experiments with no induction (n = 17), which give a good indication of the speed and amplitude of the rundown, and provide a baseline. Comparison shows a highly significant potentiation of the ascending axon EPSC, relative to baseline and relative to these control experiments. Depression of the parallel fibre EPSC on the other hand was not significantly different from rundown. For that reason we have not spoken of parallel fibre long term depression. The data, however, show that ascending axon and parallel fibre synapses behave very differently following the costimulation protocol.

      We have discussed above in our response to reviewer 2 the potential involvement of mGluRs, NMDARs and GABAARs. We have clarified the discussion of the pathways involved in plasticity and added a diagram to highlight the links between the different molecular pathways, potential cross talk mechanisms, and the location of receptors.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      - If Chloride concentration cannot be modified, recordings should be performed at the Chloride reversal potential to avoid strong bias in amplitude measurements (e.g. in Figures 3 and 5 outward current was observed while not visible in Figures 1 and 4).

      The balance between excitation and inhibition dictates whether there is a visible outward component, and this varies with the connections tested. Careful control experiments with SR application presented in supplementary figure 3 show that the delay of the IPSC does not significantly affect measurement of the peak amplitude of the EPSC. The reversal potential for Cl⁻ in our study (-85 mV), chosen to reproduce the physiological gradient in Purkinje cells, is too low to record from Purkinje cells at this potential in good conditions as it activates the hyperpolarisation activated cation current Ih, generating huge inward currents.

      - It is not clear whether, during the current clamp, the potential was maintained at -65 mV throughout the induction protocol.

      The potential was set and maintained around -65 mV during the induction protocol. The Methods section has been amended to specify that point.

      - Experiments using GABAB or endocannabinoid antagonists would have been interesting to assess the role of presynaptic plasticity occluding postsynaptic plasticity.

      We are not sure why the reviewer suggested these particular experiments to test for the role of presynaptic plasticity. GABAB and endocannabinoid receptor activation both have presynaptic effects at granule cell to Purkinje cell synapses. They decrease release probability, and as a result increase the paired pulse ratio (Dittman and Regehr, 1997; Safo and Regehr, 2005). Here we only observed a transient decrease of the paired pulse ratio. Additionally, presynaptic endocannabinoid receptor activation, linked to postsynaptic mGluR1 activation and release of endocannabinoids, was shown to be required for induction of postsynaptic PF-LTD (Safo and Regehr, 2005). This effect required climbing fibre stimulation and mGluR activation. Here we show that mGluR1 inhibition did not inhibit the PF depression nor affect the transient change in PPR. Therefore there is no indication that activation of these receptors could induce a pre-synaptic depression occluding postsynaptic plasticity.

      - To give credit to this new plasticity in contradiction with many previous studies, induction pathways should be addressed more deeply.

      As developed earlier in response to the public review, this study does not contradict previous studies, except maybe that by Binda et al. (2016), conducted on mice. From our point of view, our study in fact reconciles past results which have alternatively involved the mGluR or NMDAR pathways, whereas the molecular downstream pathways they recruit can easily cooperate. We aim to describe a new phenomenon and we cannot cover the mechanistic dissection which has been performed to date on plasticity in the cerebellar cortex.

      - The quality of the figures could be enhanced by modifying the dashed line.

      We have made the dashed line more discreet.

      Reviewer #2 (Recommendations For The Authors):

      - Is there cross-talk between the two synaptic pathways?

      In order to explain the associative nature of AA-LTP we suggest that a signal is generated at the AA input during the induction protocol only when the PF input is also stimulated, i.e. a form of cross-talk takes place between the two synaptic territories. We have not tested for cross-talk during control conditions but we discuss the fact that given the size of the Purkinje cell dendritic tree, the size of the inputs and their geometrical configuration, it is highly unlikely. We discuss possible cross-talk mechanisms.

      - Clarification question: "While the peak amplitude of the first response in the pair of stimulations showed a progressive decline, the peak amplitude of the second response of both AA and PF underwent either LTP or LTD respectively..." Does this mean that all LTP/LTD figures show the amplitude of the second EPSC in the paired pulse stimulation, and that the first EPSC has a different response? If so, this should be mentioned in the Methods section and implications discussed.

      All figures show both the amplitude of the first and second EPSCs in the pair of stimulations. In Figure 1A, 3A, 4A and 5B the paired stimulation protocol is depicted with colours and symbols used in the associated graphs, with closed symbols for the first and open symbols for the second EPSC. Figure legends have been amended to clarify this point. The average values given in the Results section and figure legends relate to the first EPSC only for clarity. As can be seen from the figures, long term plasticity affected the first and second EPSC in a very similar manner. However, individual symbols show that during a transient period, the first and second EPSCs are differentially affected by the induction protocol, resulting in a transient change of the PPR.
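
      For readers less familiar with the measure, the paired pulse ratio referred to here can be sketched as the amplitude of the second EPSC divided by that of the first. The function name and the example amplitudes below are hypothetical and serve only as an illustration.

```python
def paired_pulse_ratio(first_amp, second_amp):
    """Paired pulse ratio: second EPSC amplitude divided by first EPSC amplitude."""
    if first_amp == 0:
        raise ValueError("first EPSC amplitude is zero")
    return second_amp / first_amp


# Hypothetical peak amplitudes in pA (inward currents are negative).
ppr_baseline = paired_pulse_ratio(-100.0, -150.0)   # facilitation: ratio above 1
ppr_transient = paired_pulse_ratio(-120.0, -150.0)  # first EPSC larger: ratio decreases
```

      In this toy case the ratio falls because the first EPSC changes more than the second, which is the kind of differential effect on the two responses that produces a transient change of the PPR.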

      Minor suggestions:

      - It would be helpful to have a reference for the statement that 1-2% of stimulated fibers come from nearby GCs when stimulation is distal.

      We have modified the text (p. 4, Results section) to explain our calculation based on the data of Pichitpornchai et al. (1994).

      - Does the shading over the plasticity time course traces come from the standard error of the mean?

      Shading over the plasticity time course plots shows the standard error of the mean. This is now clearly stated in figure legends.

      Reviewer #3 (Recommendations For The Authors):

      Major points:

      (1) Whether the plasticity between AAs and PCs is regulated by the post-synaptic or pre-synaptic mechanisms should be addressed or discussed. Based on the results of PPR (mostly unchanged after induction), the post-synaptic mechanism may be more significant. Supplemental Figure 2C shows a trend toward a positive correlation between AA-LTP and the number of spikes, suggesting intracellular calcium levels in the post-synaptic Purkinje cells may be important. Whether this is true or not can be directly tested by the addition of BAPTA in the recording pipettes.

      The absence of a long lasting effect on the paired pulse ratio (PPR) indicates that postsynaptic mechanisms are involved in long term changes. This is in line with the dependence of plasticity induced with similar protocols on the concentrations of NO and postsynaptic calcium, both affecting postsynaptic targets, as developed in our response to reviewer 2. BAPTA interferes with calcium and mGluR signalling, and could be used to further confirm the involvement of a postsynaptic mechanism, however, we did not wish to pursue further the dissection of the signalling cascade. We have modified the Results and Discussion sections to include a discussion of pre and postsynaptic mechanisms.

      (2) Most results from the plasticity experiments are shown as average/sem and do not include individual data, making it hard to appreciate the magnitude of the changes. The authors could show the individual data at some time points (e.g. 5 min before and 30 min after induction), plot bar-graphs (Figure 2C with individual data), or boxplots to compare different conditions and perform statistics.

      Individual data points are now visible for plasticity induction in Figure 2C and Supplementary Figure 2 for a number of conditions. Statistics have been performed as detailed in the text and legend of Fig 2.

      (3) In addressing point #2, it is strongly recommended that the authors include the values for controls without induction because AA/PF-EPSCs undergo significant run-down. In most experiments, the authors compare the magnitude of plasticity with baseline changes in Supplemental Figure 1. This should not be appropriate for some experiments, such as Figures 3 & 4, where pharmacological treatments are performed. The authors should carefully consider including the appropriate controls from baseline recording to rule out significant confound by the run-down.

      We agree that control experiments without stimulation (no Stim) are only appropriate controls for the initial synchronous stimulation and AA and PF only experiments (Fig 1). All the other experiments were compared to the synchronous stimulation experiments, not to control No Stim. The synchronous stimulation protocol is strictly the same as that applied in experiments with pharmacological treatments and the appropriate control to test whether treatments affected plasticity. This is now systematically specified in the Results section.

      (4) The authors recorded mixed EPSC/IPSCs and used a fitting approach to extract EPSCs. Applying AMPA-receptor blockers to check that extracted IPSCs are correctly predicted may solidify the reliability of the approach. An additional concern is that this approach can only be used if the waveform of EPSC/IPSC does not change with plasticity. The authors should compare the waveforms between conditions to address this point.

      Fits were not used to extract EPSCs. EPSCs were isolated by blocking IPSCs with SR95531, and the IPSCs were then extracted by subtraction from the mixed EPSC/IPSC. Fits were then done of the isolated EPSC and the extracted IPSC. This procedure was applied both at the start of the experiment and at the end to avoid changes in kinetics that would influence measurements. A section of supplementary material is devoted to this analysis. Isolating IPSCs using AMPAR blockers is not possible as IPSCs are disynaptic. AMPAR blockers would fully suppress inhibition.
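
      The subtraction step described above can be sketched as follows. This is a minimal illustration and not the analysis code used in the study: the function names, the representation of traces as lists of current samples, and the toy values are all assumptions.

```python
def extract_ipsc(mixed_trace, epsc_trace):
    """Estimate the IPSC by subtracting the isolated EPSC (recorded with
    GABA-A receptors blocked) from the mixed EPSC/IPSC trace, sample by sample."""
    if len(mixed_trace) != len(epsc_trace):
        raise ValueError("traces must have the same number of samples")
    return [m - e for m, e in zip(mixed_trace, epsc_trace)]


def peak_inward(trace):
    """Peak of an inward (negative) current, in the units of the trace."""
    return min(trace)


# Toy traces in pA (hypothetical values): the IPSC starts after the EPSC peak.
epsc = [0.0, -40.0, -100.0, -60.0, -20.0, 0.0]
mixed = [0.0, -40.0, -100.0, -30.0, 20.0, 10.0]
ipsc = extract_ipsc(mixed, epsc)
```

      In this toy example the delayed outward component leaves the inward peak of the mixed trace equal to that of the isolated EPSC, which is the situation the control experiments are meant to verify.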

      (5) While the AA-LTP depends on NMDA-Rs, which cell type is responsible is not clear. Recording NMDA components in AA/PF-EPSCs should be informative in addressing this point. Cesana et al suggested that AA induces significant activation of NMDA-Rs in Golgi cells (PMID: 23884948). Whether AA stimuli could significantly evoke NMDA current in the experimental condition used in this paper could provide essential information.

      The granule cell to Purkinje cell EPSCs are devoid of an NMDAR component (Llano et al., 1991), and there are no postsynaptic NMDARs at granule cell to PC synapses, but a proportion of presynaptic boutons show the presence of NMDARs (Bidoret et al., 2009). This is now stated clearly on p. 8. Presynaptic NMDARs have been implicated in LTP and LTD of parallel fibre synapses (Casado et al., 2002; Bouvier et al., 2016; Schonewille et al., 2021), and linked to the activation of NOS in granule cell axons. However, we do not know whether presynaptic NMDARs are also present at AA synapses. NMDARs and NOS are also expressed by molecular layer interneurons, and have sometimes been implicated in LTD induction (Kono et al., 2019), although this is disputed. In the paper by Cesana et al. (2013), white matter stimulation activated mossy fibre inputs to granule cells, and as a consequence, granule cell to Golgi cell disynaptic EPSCs. The authors identified AA synapses on the basolateral dendrites of Golgi cells, and showed NMDAR activation associated with the mossy fibre to granule cell EPSC. Granule cell to Golgi cell synapses were shown to activate both postsynaptic AMPA and NMDA receptors (Dieudonné, 1999). But to our knowledge, Golgi cells do not express NOS. Therefore it is unlikely that activation of NMDARs in Golgi cells is linked to synaptic plasticity in Purkinje cells.

      (6) Pharmacological experiments in Figure 3 show that AA-LTP is dependent on mGluR. The authors mentioned that it could be explained by the presence and absence of mGluRs in PFs and AAs, respectively. This is an important and reasonable possibility and should be tested. The authors could simply check whether slow EPSCs can be recorded by the AA activation.

      Activation of the mGluR slow EPSC by AA stimulation would reveal the presence of mGluRs at AA inputs. We know, however, that sparse PF stimulation does not activate the mGluR slow EPSC nor endocannabinoid release unless glutamate transporters are blocked (Marcaggi and Attwell, 2005). This is thought to reflect insufficient glutamate buildup in the sparse configuration to activate mGluR1s. AA inputs are sparsely distributed and are not expected to activate the slow EPSC either, and this is confirmed by our own experiments (CA personal communication). However, mGluR1 mediated Ca2+ release from stores shows a higher sensitivity to glutamate than the slow EPSC (Canepari and Ogden, 2006) and might take place with sparse inputs, but Ca2+ signals have not been investigated in this configuration. Therefore the absence of the slow EPSC is not sufficient proof that mGluR1s are not activated and not present at AA synapses. This is now further discussed on p. 12.

      Minor points:

      (1) The authors should describe how they adjusted the stimulation strength for both AAs and PFs.

      Adjustment of the stimulation intensity is now described in the Methods section.

      (2) A rationale explaining why the authors chose the current induction protocol (synchronous stimulation of both inputs) should be included. This will help the readers to understand the background of the study.

      Papers by Sims and Hartell (2005, 2006) and experimental evidence indicated that AA and PF inputs may have different properties, and as a result may play different roles. Moreover, based on the morphology of the cerebellar granule cell and Purkinje cell, AA and PF inputs can carry different information to a given Purkinje cell. We reasoned that co-presentation of the inputs might represent an important piece of information for the circuit, signalling functional association, and lead to plasticity, as seen for motor command and sensory feedback in cerebellar-like structures, or for PF and climbing fibre. We have tried to convey that rationale in the abstract and introduction.

      (3) Supplemental Figure 2B: the x-axis may be labeled incorrectly. Is the x-axis of the top graph for PF PF-EPSC? The x-axis for the bottom graphs should be the summation of AA- and PF-EPSCs.

      This has been corrected.

      (4) "mglur1" on page 10 should be mGluR1.

      This has been corrected.

    1. eLife assessment

      This study presents important findings on the differential activity of noradrenergic and dopaminergic input to dorsal hippocampus CA1 in head-fixed mice traversing a runway in a virtual environment that is familiar or novel. The data are rigorously analysed, and the observed divergence in the dynamics of activity in the dopaminergic and noradrenergic axons is solid. Future studies, using specific manipulations of the two distinct midbrain inputs combined with behavioral testing, are required to strengthen the claim that distinct signals to the hippocampus cause distinct behavioral effects.

    2. Reviewer #1 (Public Review):

      Summary:

      Heer and Sheffield used 2 photon imaging to dissect the functional contributions of convergent dopamine and noradrenaline inputs to the dorsal hippocampus CA1 in head restrained mice running down a virtual linear path. Mice were trained to collect water reward at the end of the track and on test days, calcium activity was recorded from dopamine (DA) axons originating in ventral tegmental area (VTA, n=7) and noradrenaline axons from the locus coeruleus (LC, n=87) under several conditions. When mice ran laps in a familiar environment, VTA DA axons exhibited ramping activity along the track that correlated with distance to reward and velocity to some extent, while LC input activity remained constant across the track, but correlated invariantly with velocity and time to motion onset. A subset of recordings taken when the reward was removed showed diminished ramping activity in VTA DA axons, but no changes in the LC axons, confirming that DA axon activity is locked to reward availability. When mice were subsequently introduced to a new environment, the ramping to reward activity in the DA axons disappeared, while LC axons showed a dramatic increase in activity lasting 90s (6 laps) following the environment switch. In the final analysis, the authors sought to disentangle LC axon activity induced by novelty vs. behavioral changes induced by novelty by removing periods in which animals were immobile, and established that the activity observed in the first 2 laps reflected novelty-induced signal in LC axons.

      The revised manuscript included additional evidence of increased (but transient) signal in LC axons after a transition to a novel environment during periods of immobility, and also that a change from dark to familiar environment induces a peak in LC axon activity, showing that LC input to dCA1 may not solely signal novelty.

      Strengths:

      The results presented in this manuscript provide insights into the specific contributions of catecholaminergic input to the dorsal hippocampus CA1 during spatial navigation in a rewarded virtual environment, offering a detailed analysis at the resolution of single axons. The data analysis is thorough and possible confounding variables and data interpretation are carefully considered.

      Weaknesses:

      Aspects of the methodology, data analysis, and interpretation diminish the overall significance of the findings, as detailed below.

      The LC axonal recordings are well powered, but the DA axonal recordings are severely underpowered, with recordings taken from a mere 7 axons (compare to 87 LC axons). Additionally, 2 different calcium indicators with differential kinetics and sensitivity to calcium changes (GCaMP6S and GCaMP7b) were used (n=3, n=4 respectively) and the data pooled. This makes it very challenging to draw any valid conclusions from the data, particularly in the novelty experiment. The surprising lack of novelty-induced DA axon activity may be a false negative. Indeed, at least 1 axon (axon 2) appears to be showing novelty-induced rise in activity in Figure 3C. Changes in activity in 4/7 axons are also referred to as a 'majority' occurrence in the manuscript, which again is not an accurate representation of the observed data.

      The authors conducted analysis on recording data exclusively from periods of running in the novelty experiment to isolate the effects of novelty from novelty-induced changes in behavior. However, if the goal is to distinguish between changes in locus coeruleus (LC) axon activity induced by novelty and those induced by motion, analyzing LC axon activity during periods of immobility would enhance the robustness of the results.

      The authors attribute the ramping activity of the DA axons to the encoding of the animals' position relative to reward. However, given the extensive data implicating the dorsal CA1 in timing, and the remarkable periodicity of the behavior, the fact that DA axons could be signalling temporal information should be considered.

      The authors should explain and justify the use of a longer linear track (3m, as opposed to 2m in the DAT-cre mice) in the LC axon recording experiments.

      AFTER REVISIONS:

      The authors have addressed my concerns in a thorough manner. The reviewer also appreciates the increased transparency of reporting in the revised manuscript.

      Listed below are some remaining comments.<br /> The increase in LC activity with any change in environment (from familiar to novel or from dark to familiar) suggests that LC input acts not solely as a novelty signal, but as a general arousal or salience signal in response to environmental changes. Based on this, I have a couple of questions:

      • Is the overall claim that LC input to the dHC signals novelty still valid based on observed findings - as claimed throughout the manuscript?<br /> • Would the omission of a reward be considered a salient change in the environment that activates LC signals, or is the LC not involved with processing reward-related information? Has the activity of LC and VTA axons been analysed in the seconds following reward presentation and/or omission?

    3. Reviewer #2 (Public Review):

      Summary:

      The authors used 2-photon Ca2+-imaging to study the activity of ventral tegmental area (VTA) and locus coeruleus (LC) axons in the CA1 region of the dorsal hippocampus in head-fixed male mice moving on linear paths in virtual reality (VR) environments.

      The main findings were as follows:<br /> - In a familiar environment, activity of both VTA axons and LC axons increased with the mice's running speed on the Styrofoam wheel, with which they could move along a linear track through a VR environment.<br /> - VTA, but not LC, axons showed marked reward position-related activity, showing a ramping-up of activity when mice approached a learned reward position.<br /> - In contrast, activity of LC axons ramped up before initiation of movement on the Styrofoam wheel.<br /> - In addition, exposure to a novel VR environment increased LC axon activity, but not VTA axon activity.

      Overall, the study shows that the activity of catecholaminergic axons from VTA and LC to dorsal hippocampal CA1 can partly reflect distinct environmental, behavioral and cognitive factors. Whereas both VTA and LC activity reflected running speed, VTA, but not LC axon activity reflected approach of a learned reward and LC, but not VTA, axon activity reflected initiation of running and novelty of the VR environment.

      I have no specific expertise with respect to 2-photon imaging, so cannot evaluate the validity of the specific methods used to collect and analyse 2-photon calcium imaging data of axonal activity.

      Strengths:

      (1) Using a state-of-the-art approach to record separately the activity of VTA and LC axons with high temporal resolution in awake mice moving through virtual environments, the authors provide convincing evidence that the activity of VTA and LC axons projecting to dorsal CA1 reflects partly distinct environmental, behavioral and cognitive factors.

      (2) The study will help a) to interpret previous findings on how hippocampal dopamine and norepinephrine or selective manipulations of hippocampal LC or VTA inputs modulate behavior and b) to generate specific hypotheses on the impact of selective manipulations of hippocampal LC or VTA inputs on behavior.

      Comments on revised version:

      I thank the authors for including a sample size justification.

      The justification is based on previous studies using similar sample sizes to characterize behavioral correlates of LC and VTA activity and on practical reasons. I note that to improve reproducibility, it would be preferable to have predefined target sample sizes based on predefined plans for statistical analysis.

    4. Reviewer #3 (Public Review):

      Summary:

      Heer and Sheffield provide a well-written manuscript that clearly articulates the theoretical motivation to investigate specific catecholaminergic projections to dorsal CA1 of the hippocampus during a reward-based behavior. Using 2-photon calcium imaging in two groups of cre transgenic mice, the authors examine activity of VTA-CA1 dopamine and LC-CA1 noradrenergic axons during reward seeking in a linear track virtual reality (VR) task. The authors provide a descriptive account of VTA and LC activities during walking, approach to reward, and environment change. Their results demonstrate LC-CA1 axons are activated by walking onset, modulated by walking velocity, and heighten their activity during environment change. In contrast, VTA-CA1 axons were most activated during approach to reward locations. Together the authors provide a functional dissociation between these catecholamine projections to CA1. A major strength to their approach is the methodological rigor of 2-photon recording, data processing, and analysis approaches to accommodate their unequal LC-CA1 and VTA-CA1 sample sizes. These important systems neuroscience studies provide solid evidence that will contribute to the broader field of navigation and memory.

      Weaknesses:

      The conclusions of this manuscript are mostly well supported by the data. However, increasing the sample size of the VTA-CA1 group and using experimental methods that are identical among LC-CA1 and VTA-CA1 groups would help to fully support the authors' conclusions.

    5. Author response:

      The following is the authors’ response to the previous reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Please reorder the supplementary figures in the order they are referred to in the Results section for ease of reading. Supp Fig 5 b - should read 'Mean normalized fluorescence of LC ROIs (n = 87) during immobile periods aligned to the switch from familiar to novel environment.’

      We thank the reviewer for highlighting these issues and have reordered the supplementary figures and edited the figure legends appropriately.

      Reviewer #2 (Recommendations For The Authors):

      The authors should include sample size justifications (e.g. based on previous studies, considerations of statistical power, practical considerations, or a combination of these factors).

      In response to this concern, we have added a statement to the “Imaging Sessions” section of the methods. Here we highlight that sample sizes were largely based on previous studies and/or limited by the difficulty of recordings and the limited number of visible axons per imaging session.

      Reviewer #3 (Recommendations For The Authors):

      The addition of Supp. Fig 5 partially addresses my previous point 3. However, the claim of dissociation between VTA-CA1 and LC-CA1 would be strengthened by showing that VTA-CA1 axons do not respond to the darkness -> familiar environment in Supp Fig 5. This is particularly important given that (1) the additional 2 VTA-CA1 axons in the revision were not recorded during transitions to novel environments and (2) the overall concern of the reviewers that the low n and heterogeneity of the VTA-CA1 dataset may lead to a false negative. Providing VTA-CA1 data for the darkness -> familiar environment would provide a within-manuscript replication that these axons are not responding to environment changes; a major claim of this manuscript.

      While we agree that data of VTA-CA1 axons during the switch from darkness to the familiar environment would provide additional evidence that these axons are not responding to environment changes, unfortunately, VTA axons were not recorded during the switch from familiar to novel.

    1. eLife assessment

      The authors present 16 new well-preserved specimens from the early Cambrian Chengjiang biota. These specimens potentially represent a new taxon which could be useful in sorting out the problematic topology of artiopodan arthropods - a topic of interest to specialists in Cambrian arthropods. The authors provide solid anatomical and phylogenetic evidence in support of a new interpretation of the homology of dorsal sutures in trilobites and their relatives.

    2. Reviewer #1 (Public Review):

      Summary:<br /> Du et al. report 16 new well-preserved specimens of artiopodan arthropods from the Chengjiang biota, which demonstrate both the dorsal and ventral anatomies of a potential new taxon of artiopodans that are closely related to trilobites. The authors assigned their specimens to Acanthomeridion serratum, and proposed A. anacanthus as a junior subjective synonym of Acanthomeridion serratum. Critically, the presence of ventral plates (interpreted as cephalic librigenae), together with phylogenetic results, leads the authors to conclude that the cephalic sutures originated multiple times within the Artiopoda.

      Strengths:<br /> The new specimens are of high quality and informative. The morphology of the dorsal exoskeleton, except for the supposed free cheek, was well illustrated and described in detail, which provides a wealth of information for taxonomic and phylogenetic analyses.

      Weaknesses:<br /> The weaknesses of this work are obvious in a number of aspects. Technically, the ventral morphology is less well revealed and is poorly illustrated. Additional diagrams are necessary to show the trunk appendages and suture lines. Taxonomically, I am not convinced by the authors' placement. The specimens are markedly different from either Acanthomeridion serratum Hou et al. 1989 or A. anacanthus Hou et al. 2017. The ontogenetic description is extremely weak and the morphological continuity is not established. Geometric and morphometric analyses might be helpful to resolve the taxonomic and ontogenetic uncertainties. I am confused by the authors' description of the free cheek (librigena) and ventral plate. Are they the same object? How do they connect with other parts of the cephalic shield, e.g. hypostome and fixigena? Critically, the homology of cephalic slits (eye slits, eye notch, dorsal suture, facial suture) is not extensively discussed either morphologically or functionally. Finally, the authors claimed that the phylogenetic results support two separate origins rather than a deep origin. However, the results in Figure 4 can be explained by a deep homology of the cephalic suture at the molecular level and multiple co-options within the Artiopoda.

      Comments on the revised version:

      I have seen the extensive revision of the manuscript. The main point "Multiple origins of dorsal ecdysial sutures in artiopodans" is now partially supported by results presented by the authors. I am still unsatisfied with the descriptions and interpretations of critical features newly revealed by the authors. The following points might be useful for the authors in making further revisions.

      (1) The antennae were well illustrated in a couple of specimens, but were described in only a short sentence.<br /> (2) There are also imprecise descriptions of features.<br /> (3) Ontogeny of the cephalon was not described.<br /> (4) The critical head element is the so-called "ventral plate". How this element connects with the cephalic shield is not adequately revealed. The authors claimed that the suture is along the cephalic margin. However, the lateral margin of the cephalon is not rounded but exhibits two notches (e.g. Fig 3C). This gives an indication that the supposed ventral plates have a dorsal extension to fit the notches. Alternatively, the "ventral plate" can be interpreted as a small free cheek with a large ventral extension, providing evidence for the librigenal hypothesis.

    3. Reviewer #3 (Public Review):

      Summary:

      Well-illustrated new material is documented for Acanthomeridion, a formerly incompletely known Cambrian arthropod. The formerly known facial sutures are proposed to be associated with ventral plates that the authors homologise with the free cheeks of trilobites (although also testing alternative homologies). An update of a published phylogenetic dataset permits reconsideration of whether dorsal ecdysial sutures have a single or multiple origins in trilobites and their relatives.

      Strengths:

      Documentation of an ontogenetic series makes a sound case that the proposed diagnostic characters of a second species of Acanthomeridion are variation within a single species. New microtomographic data shed light on appendage morphology that was not formerly known. The new data on ventral plates and their association with the ecdysial sutures are valuable in underpinning homologies with trilobites.

      I think the revision does a satisfactory job of reconciling the data and analyses with the conclusions drawn from them. Referee 1's valid concerns about whether a synonymy of Acanthomeridion anacanthus is justified have been addressed by the addition of a length/width scatterplot in Figure 6. Referee 2's doubts about homology between the librigenae of trilobites and ventral plates of Acanthomeridion have been taken on board by re-running the phylogenetic analyses with a coding for possible homology between the ventral plates and the doublure of olenelloid trilobites. The authors sensibly added more trilobite terminals to the matrix (including Olenellus) and did analyses with and without constraints for olenelloids being a grade at the base of Trilobita. My concerns about counting how many times dorsal sutures evolved on a consensus tree have been addressed (the authors now play it safe and say "multiple" rather than attempting to count them on a bushy topology). The treespace visualisation (Figure 9) is a really good addition to the revised paper.

      Weaknesses:

      The question of how many times dorsal ecdysial sutures evolved in Artiopoda was addressed by Hou et al (2017), who first documented the facial sutures of Acanthomeridion and optimised them onto a phylogeny to infer multiple origins, as well as in a paper led by the lead author in Cladistics in 2019. Du et al. (2019) presented a phylogeny based on an earlier version of the current dataset wherein they discussed how many times sutures evolved or were lost based on their presence in Zhiwenia/Protosutura, Acanthomeridion and Trilobita. The answer here is slightly different (because some topologies unite Acanthomeridion and trilobites). This paper is not a game-changer because these questions have been asked several times over the past seven years, but there are solid, worthy advances made here.

      I'd like to see some of the most significant figures from the Supplementary Information included in the main paper so they will be maximally accessed. The "stick-like" exopods are not best illustrated in the main paper; their best imagery is in Figure S1. Why not move that figure (or at least its non-redundant panels) as well as the reconstruction (Figure S7) to the main paper? The latter summarises the authors' interpretation that a large axe-shaped hypostome appears to be contiguous with ventral plates. The specimens depict evidence for three pairs of post-antennal cephalic appendages but it's a bit hard to picture how they functioned if there's no room between the hypostome and ventral plates. Also, a comment is required on the reconstruction involving all cephalic appendages originating against/under the hypostome rather than the first pair being paroral near the posterior end of the hypostome and the rest being post-hypostomal as in trilobites.

    4. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment 

      The authors present 16 new well-preserved specimens from the early Cambrian Chengjiang biota. These specimens potentially represent a new taxon which could be useful in sorting out the problematic topology of artiopodan arthropods - a topic of interest to specialists in Cambrian arthropods. Because the anatomic features in the new specimens were neither properly revealed nor correctly interpreted, the evidence for several conclusions is inadequate. 

      We thank the Senior Editor, Reviewing Editor and three reviewers for their work, and for their comments aimed at improving this project and manuscript. We have engaged with all the comments in detail, in order to strengthen our work. This includes adding additional data to support that all Acanthomeridion specimens belong to a single species, running further phylogenetic analyses including more trilobite terminals to test the specific hypothesis and interpretation raised by Reviewer 2, and visualising our results in treespace in order to determine support for the different interpretations of the ventral structures and their implications for the evolution of Artiopoda. We have also greatly expanded the introduction, which we feel adds clarity to areas misunderstood by some reviewers in the previous version of the manuscript.

      Our point-by-point response to the public reviews of the reviewers are outlined below. We have also made changes resulting from the additional suggestions which are not public, which we have not reproduced below. We submit a new version of the main text, and can provide a tracked changes version if required. The new main text includes 9 figures and is 8624 words including captions and reference list.

      Public Reviews: 

      Reviewer #1 (Public Review): 

      Summary: 

      Du et al. report 16 new well-preserved specimens of artiopodan arthropods from the Chengjiang biota, which demonstrate both dorsal and ventral anatomies of a potential new taxon of artiopodans that are closely related to trilobites. Authors assigned their specimens to Acanthomeridion serratum and proposed A. anacanthus as a junior subjective synonym of Acanthomeridion serratum. Critically, the presence of ventral plates (interpreted as cephalic librigenae), together with phylogenetic results, leads the authors to conclude that the cephalic sutures originated multiple times within the Artiopoda.

      We thank Reviewer 1 for their comments on the strengths and weaknesses of the previous version of the manuscript. We hope that the revised version strengthens our conclusions that Acanthomeridion anacanthus is a junior synonym of A. serratum.

      Strengths: 

      The new specimens are of high quality and informative. The morphology of the dorsal exoskeleton, except for the supposed free cheek, was well illustrated and described in detail, which provides a wealth of information for taxonomic and phylogenetic analyses.

      Weaknesses: 

      The weaknesses of this work are obvious in a number of aspects. Technically, ventral morphology is less well revealed and is poorly illustrated. Additional diagrams are necessary to show the trunk appendages and suture lines. Taxonomically, I am not convinced by the authors' placement. The specimens are markedly different from either Acanthomeridion serratum Hou et al. 1989 or A. anacanthus Hou et al. 2017. The ontogenetic description is extremely weak and the morphological continuity is not established. Geometric and morphometric analyses might be helpful to resolve the taxonomic and ontogenetic uncertainties.

      We appreciate that the reviewer was not convinced by our synonymisation in the first version of the manuscript. The recommendation of the reviewer to provide linear morphometric support for our synonymisation was much appreciated. We have provided measurements of the length and width of the thorax (Figure 6 in the new version), visualising the position of specimens previously assigned to A. anacanthus, to show this morphological continuity. These act as a complement to Figure 5, which shows the fossils in an ontogenetic trend.

      I am confused by the authors' description of the free cheek (librigena) and ventral plate. Are they the same object? How do they connect with other parts of the cephalic shield, e.g. hypostome and fixigena? Critically, the homology of cephalic slits (eye slits, eye notch, dorsal suture, facial suture) is not extensively discussed either morphologically or functionally.

      We appreciate that the brevity of the introduction in the previous version led to some misunderstandings and some confusion. We have provided a greatly expanded introduction, including a new Figure 1, which outlines the possible homologies of the ventral plates and the three hypotheses considered in this study. The functions of the cephalic and dorsal sutures are now discussed in more detail in both the introduction and the discussion.

      Finally, the authors claimed that the phylogenetic results support two separate origins rather than a deep origin. However, the results in Figure 4 can be explained by a deep homology of the cephalic suture at the molecular level and multiple co-options within the Artiopoda.

      A deep molecular origin is difficult to demonstrate using solely fossil material from an extinct group such as Artiopoda. Thus our study focuses on morphological origins. The number of losses required for a deep morphological origin means that we favour multiple independent morphological origins.

      Reviewer #2 (Public Review): 

      Overall: This paper describes new material of Acanthomeridion serratum that the authors claim supports its synonymy with Acanthomeridion anacanthus. The material is important and the description is acceptable after some modification. In addition, the paper offers thoughts and some exploration of the possibility of multiple origins of the dorsal facial suture among artiopods, at least once within Trilobita and also among other non-trilobite artiopods. Although this possibility is real and apparently correct, the suggestions presented in this paper are both surprising and, in my opinion, unlikely to be true because the potential homologies proposed with regard to Acanthomeridion and trilobite free cheeks are unconventional and poorly supported.

      What to do? I can see two possibilities. One, which I recommend, is to concentrate on improving the descriptive part of the paper and omit discussion and phylogenetic analysis of dorsal facial suture distribution, leaving that for more comprehensive consideration elsewhere. The other is to seek to improve both simultaneously. That may be possible but will require extensive effort. 

      We thank the reviewer for their detailed comments and suggestions for multiple ways in which we might revise the manuscript. We have taken the option that is more effort, but we hope more reward, in interrogating the larger question alongside improving the descriptive part of the paper. This has taken a long time and incorporation of new techniques, but has in our opinion greatly strengthened the work.

      Major concerns 

      Concern 1 - Ventral sclerites as free cheek homolog, marginal sutures, and the trilobite doublure 

      Firstly, a couple of observations that bear on the arguments presented - the eyes of A. serratum are almost marginal and it is not clear whether a) there is a circumocular suture in this animal and b) if there was, whether it merged with the marginal suture. These observations are important because this animal is not one in which an impressive dorsal facial suture has been demonstrated - with eyes that near marginal it simply cannot do so. Accordingly, the key argument of this paper is not quite what one would expect. That expectation would be that a non-trilobite artiopod, such as A. serratum, shows a clear dorsal facial suture. But that is not the case, at least with A. serratum, because of its marginal eyes. Rather, the argument made is that the ventral doublure of A. serratum is the homolog of the dorsal free cheeks of trilobites. This opens up a series of issues. 

      We appreciate that the reviewer disagrees with both interpretations we offered for the ventral plates, and has offered a third interpretation for the homology of this feature with the doublure of trilobites. Support for our original interpretation comes from the position of the eye stalks in Acanthomeridion, which fall very close to the suture between the ventral plate and the rest of the cephalon. However, we appreciate that the reviewer has a valid interpretation, that the ventral plates might be homologues of the doublure alone.

      To clarify the (two, now three) hypotheses of homology for the ventral plates considered in this study, we provide a new summary figure (Figure 1). In addition, the introduction has been greatly lengthened with further discussion of the different suture types in trilobites, their importance for trilobite classification schemes, and extensive references to older literature are now included. Further, we add background to the hypotheses around the origins of dorsal ecdysial sutures. 

      We add that the interpretation of A. serratum as having features homologous to the dorsal sutures of trilobites is already present in the literature, and so while the reviewer may disagree with it, it is certainly a hypothesis that requires testing.

      The paper's chief claim in this regard is that the "teardrop" shaped ventral, lateral cephalic plates in Acanthomeridion serratum are potential homologs of the "free cheeks" of those trilobites with a dorsal facial suture. There is no mention of the possibility that these ventral plates in A. serratum could be homologs of the lateral cephalic doublure of olenelloid trilobites, which is bound by an operative marginal suture or, in those trilobites with a dorsal facial suture, that it is a homolog of only the doublure portions of the free cheeks and not with their dorsal components. 

      We include this third possibility in our revised analyses and manuscript. To test this properly required adding an olenelloid trilobite to our matrix, as we needed a terminal that had both a marginal and a circumocular suture that were not fused. We chose Olenellus getzi for this purpose, as it is the only Olenellus with some appendages known (the antennae). We also added further characters to the morphological matrix, and additional trilobites from which soft tissues are known, in order to better resolve this part of the tree. Trilobites in the final analyses were: Anacheirurus adserai, Cryptolithus tesselatus, Eoredlichia intermedia, Olenoides serratus, Olenellus getzi, Triarthrus eatoni.

      However, addition of these trilobites added a further complication. Under unconstrained analysis, Olenellus getzi was resolved with Eoredlichia intermedia as a clade sister to all other trilobites.

      Thus the topology of Paterson et al. 2019 (PNAS) was not recovered, and so the hypothesis of Reviewer 2 could not be robustly tested. In order to achieve a topology comparable to Paterson et al., we ran a further three analyses, where we constrained a clade of all trilobites except for O. getzi. This recovered a topology where the earliest diverging trilobites had unfused sutures, and thus one suitable for considering the role of Acanthomeridion serratum ventral plates as homologues of the doublure of trilobites.

      Unfortunately, for these analyses (both constrained and unconstrained), Acanthomeridion was not resolved as sister to trilobites, but instead elsewhere in the tree (see Table 1 in main text, Fig. 9, and SFig 9). Thus our analyses do not find support for the reviewer’s hypothesis, as multiple origins of this feature are still required.

      It was still an excellent point that we should consider this hypothesis, and we have retained it, and discussion surrounding it, in our manuscript.

      The introduction to the paper does not inform the reader that all olenelloids had a marginal suture - a circumcephalic suture that was operative in their molting and that this is quite different from the situation in, say, "Cedaria" woosteri in which the only operative cephalic exoskeletal suture was circumocular. The conservative position would be that the olenelloid marginal suture is the homolog of the marginal suture in A. serratum: the ventral plates thus being homolog of the trilobite cephalic doublure, not only potential homolog to the entire or dorsal only part of the free cheeks of trilobites with a dorsal facial suture. As the authors of this paper decline to discuss the doublure of trilobites (there is a sole mention of the word in the MS, in a figure caption) and do not mention the olenelloid marginal suture, they give the reader no opportunity to assess support for this alternative. 

      At times the paper reads as if the authors are suggesting that olenelloids, which had a marginal cephalic suture broadly akin to that in Limulus, actually lacked a suture that permitted anterior egression during molting. The authors are right to stress the origin of the dorsal cephalic suture in more derived trilobites as a character seemingly of taxonomic significance but lines such as 56 and 67 may be taken by the non-specialist to imply that olenelloids lacked a forward egression-permitting suture. There is a notable difference between not knowing whether sutures existed (a condition apparently quite common among soft-bodied artiopods) and the well-known marginal suture of olenelloids, but as the MS currently reads most readers will not understand this because it remains unexplained in the MS.

      As noted in response to a previous point (above) we now have a greatly expanded introduction which should give the reader an opportunity to assess support for this alternative hypothesis. We now include Olenellus getzi in our analyses, and have added characters to the morphological matrix to make this clear.

      A reference to the case of ‘Cedaria’ woosteri is made in the introduction to highlight further the variability of trilobites, as is a reference to Foote’s analysis of cranidial shapes and the support this provides for a single origin of the dorsal suture.

      With that in mind, it is also worth further stressing that the primary function of the dorsal sutures in those which have them is essentially similar to the olenelloid/limulid marginal suture mentioned above. It is notable that the course of this suture migrated dorsally up from the margin onto the dorsal shield and merged with the circumocular suture, but this innovation does not seem to have had an impact on its primary function - to permit molting by forward egression. Other trilobites completely surrendered the ability to molt by forward egression, and there are even examples of this occurring ontogenetically within species, suggesting a significant intraspecific shift in suture functionality and molting pattern. The authors mention some of this when questioning the unique origin of the dorsal facial suture of trilobites, although I don't understand their argument: why should the history of subsequent evolutionary modification of a character bear on whether its origin was unique in the group? 

      We include reference to evolutionary modification and loss of this character as it is important to stress that if a character is known to have been lost multiple times it is possible that it had a deeper root (in an earlier diverging member of Artiopoda than Trilobita) and was lost in olenelloids. This is the question that we seek to address in our manuscript.

      The bottom line here is that for the ventral plates of A. serratum to be strict homologs of only the dorsal portion of the dorsal free cheeks, there would be no homolog of the trilobite doublure in A. serratum. The conventional view, in contrast, would be that the ventral plates are a homolog of the ventral doublure in all trilobites and ventral plates in artiopods. I do not think that this paper provides a convincing basis for preferring their interpretation, nor do I feel that it does an adequate job of explaining issues that are central to the subject. 

      We stress that our interpretations – that the ventral plates are not homologous to any artiopodan feature or that they are homologous to the free cheeks of trilobites – have both been raised in the literature before, whereas we could not find mention of the reviewer’s ‘conventional view’ relating to Acanthomeridion. We appreciate that this view is still valid and worth investigating, which we have done in the further analyses conducted. However, we did not find support for it. Instead, we find some support for the ventral plates both as homologues of free cheeks and as unique structures within Artiopoda.

      Concern 2. Varieties of dorsal sutures and the coexistence of dorsal and marginal sutures 

      The authors do not clarify or discuss connections between the circumocular sutures (a form of dorsal suture that separates the visual surface from the rest of the dorsal shield) and the marginal suture that facilitates forward egression upon molting. Both structures can exist independently in the same animal - in olenelloids for example. Olenelloids had both a suture that facilitated forward egression in molting (their marginal suture) and a dorsal suture (their circumocular suture). The condition in trilobites with a dorsal facial suture is that these two independent sutures merged - the formerly marginal suture migrating up the dorsal pleural surface to become confluent with the circumocular suture. (There are also interesting examples of the expansion of the circumocular suture across the pleural fixigena.) The form of the dorsal facial suture has long figured in attempts at higher-level trilobite taxonomy, with a number of character states that commonly relate to the proximity of the eye to the margin of the cephalic shield. The form of the dorsal facial suture that they illustrate in Xanderella, which is barely a strip crossing the dorsal pleural surface linking marginal and circumocular suture, is comparable to that in the trilobites Loganopeltoides and Entomapsis but that is a rare condition in that clade as a whole. The paper would benefit from a clear discussion of these issues at the beginning - the dorsal facial suture that they are referring to is a merged circumcephalic suture and circumocular suture - it is not simply the presence of a molt-related suture on the dorsal side of the cephalon. 

      We have added in an expanded introduction where these points are covered in detail. We appreciate that this was not clear in the earlier version, and this suggestion has greatly improved our work.

      Concern 3. Phylogenetics 

      While I appreciate that the phylogenetic database is a little modified from those of other recent authors, still I was surprised not to find a character matrix in the supplementary information (unless it was included in some way I overlooked), which I would consider a basic requirement of any paper presenting phylogenetic trees - after all, there's no space limit. It is not possible for a reviewer to understand the details of their arguments without seeing the character states and the matrix of state assignments. 

      A link to a morphobank project was included in the first submission. This project has been updated for the current submission, including an additional matrix to treat the reviewer’s hypothesis for the ventral plates. Morphobank Project #P4290. Email address: P4290, reviewer password: Acanthomeridion2023, accessible at morphobank.org. We have added in additional details for the reviewer and others to help them access the project:

      The project can be accessed at morphobank.org, using the below credentials to log in:  Email address: P4290, Password: Acanthomeridion 2023.

      The section "phylogenetic analyses" provides a description of how tree topology changes depending on whether sutures are considered homologous or not using the now standard application of both parsimony and maximum likelihood approaches but, considering that the broader implications of this paper rest on the phylogenetic interpretation, I also found the absence of detailed discussion of the meaning and implications of these trees to be surprising, because I anticipated that this was the main reason for conducting these analyses. The trees are presented and briefly described but not considered in detail. I am troubled by "Circles indicate presence of cephalic ecdysial sutures" because it seems that in "independent origin of sutures" trilobites are considered to have two origins (brown color dot) of cephalic ecdysial sutures - this may be further evidence that the team does not appreciate that olenelloids have cephalic ecdysial sutures, as the basal condition in all trilobites. Perhaps I'm misunderstanding their views, but from what's presented it's not possible to know that. Similarly, in the "sutures homologous" analyses why would there be two independent green dots for both Acanthomeridion and Trilobita, rather than at the base of the clade containing them both, as cephalic ecdysial sutures are basal to both of them? Here again, we appear to see evidence that the team considers dorsal facial sutures and cephalic ecdysial sutures to be synonymous - which is incorrect.

      We appreciate that the reviewer misunderstood the meaning of the dots, leading to confusion. The dots indicated how features were coded in the phylogenetic analysis. In our revised version of this figure (Figure 8 in the new version), these dots are now clearly labelled as indicating ‘coding in phylogenetic matrix’. Further, with the revised character list, we now can provide additional detail for the types of sutures (relevant as we now include more trilobite terminals).

      This point aside, and at a minimum, the team needs to do a more thorough job of characterizing and considering the variety of conditions of dorsal sutures among artiopods, their relationships to the marginal suture and to the circumocular suture, the number, and form of their branches, etc. 

      We thank the reviewer for this summary, and appreciate their concerns and thorough review. Our revised version takes into account all these points raised, and they have greatly improved the clarity, scope and thoroughness of the work.

      Reviewer #3 (Public Review): 

      Summary:

      Well-illustrated new material is documented for Acanthomeridion, a formerly incompletely known Cambrian arthropod. The formerly known facial sutures are shown to be associated with ventral plates that the authors very reasonably homologise with the free cheeks of trilobites. A slight update of a phylogenetic dataset developed by Du et al, then refined slightly by Chen et al, then by Schmidt et al, and again here, permits another attempt to optimise the number of origins of dorsal ecdysial sutures in trilobites and their relatives. 

      Strengths:

      Documentation of an ontogenetic series makes a sound case that the proposed diagnostic characters of a second species of Acanthomeridion are variations within a single species. New microtomographic data shed some light on appendage morphology that was not formerly known. The new data on ventral plates and their association with the ecdysial sutures are valuable in underpinning homologies with trilobites. 

      We thank Reviewer 3 for their positive comments about the manuscript. We appreciate the constructive comments for improvements, and detailed corrections, which we have incorporated into our revised work.

      Weaknesses:

      The main conclusion remains clouded in ambiguity because of a poorly resolved Bayesian consensus and is consistent with work led by the lead author in 2019 (thus compromising the novelty of the findings). The Bayesian trees being majority rules consensus trees, optimising characters onto them (Figure 7b, d) is problematic. Optimising on a consensus tree can produce spurious optimisations that inflate tree length or distort other metrics of fit. Line 264 refers to at least three independent origins of cephalic sutures in artiopodans but the fully resolved Figure 7c requires only two origins. 

      We thank the reviewer for pointing this out. However, now that the analyses have been re-run, we have new results to consider. The results still support multiple origins of sutures. We also note that the dots indicated how terminals were coded. This is now clearer in the revised version of this figure (Figure 8 in the new version).

      We have extended our interrogation of the trees by incorporating treespace analyses. These add support for the nodes of interest (around the base of trilobites), showing that the coding of Acanthomeridion ventral plate homologies impacts its position in the tree, and thus has implications for our understanding of the evolution of sutures in trilobites.

      The question of how many times dorsal ecdysial sutures evolved in Artiopoda was addressed by Hou et al (2017), who first documented the facial sutures of Acanthomeridion and optimised them onto a phylogeny to infer multiple origins, as well as in a paper led by the lead author in Cladistics in 2019. Du et al. (2019) presented a phylogeny based on an earlier version of the current dataset wherein they discussed how many times sutures evolved or were lost based on their presence in Zhiwenia/Protosutura, Acanthomeridion, and Trilobita. To their credit, the authors acknowledge this (lines 62-65). The answer here is slightly different (because some topologies unite Acanthomeridion and trilobites). 

      The following points are not meant to be "Weaknesses" but rather are refinements: 

      I recommend changing the title of the paper from "cephalic sutures" to "dorsal ecdysial sutures" to be more precise about the character that is being tracked evolutionarily. Lots of arthropods have cephalic sutures (e.g., the ventral marginal suture of xiphosurans; the Y-shaped dorsomedian ecdysial line in insects). The text might also be updated to change other instances of "cephalic sutures" to a more precise wording. 

      We appreciate this point and have changed the title as suggested. 

      The authors have provided (but not explicitly identified) support values for nodes in their Bayesian trees but not in their parsimony ones. Please do the jackknife or bootstrap for the parsimony analyses and make it clear that the Bayesian values are posterior probabilities. 

      With the addition of further trilobite terminals to our parsimony analyses, the results became poor. Specifically, the internal relationships of trilobites did not conform to any previous study, and Olenellus getzi was not resolved as an early diverging member of the group. This meant that these analyses could not be used for addressing the hypothesis of Reviewer 2. We decided to exclude reporting parsimony analysis results from this version to avoid confusion.

      We have added a note that the values reported at the nodes are posterior probabilities to figures S8, S9 and S10 where we show the full Bayesian results.

      In line 65 or somewhere else, it might be noted that a single origin of the dorsal facial sutures in trilobites has itself been called into question. Jell (2003) proposed that separate lineages of Eutrilobita evolved their facial sutures independently from separate sister groups within Olenellina. 

      We have added this to the introduction (Line 98). Thank you for raising this point.

      I have provided minor typographic or terminological corrections to the authors in a list of recommendations that may not be publicly available. 

      We appreciate the points made by the reviewer and their detailed corrections, which we have incorporated in the revised version.

    1. eLife assessment

      This study provides valuable new insights into how multisensory information is processed in the lateral cortex of the inferior colliculus, a poorly understood part of the auditory midbrain. By developing new imaging techniques that provide the first optical access to the lateral cortex in a living animal, the authors provide convincing in vivo evidence that this region contains separate subregions that can be distinguished by their sensory inputs and neurochemical profiles, as suggested by previous anatomical and in vitro studies. This work provides a foundation for future research exploring how this part of the auditory midbrain contributes to multisensory-based behavior.

    2. Reviewer #1 (Public Review):

      In this paper the authors provide a characterisation of auditory responses (tones, noise, and amplitude modulated sounds) and bimodal (somatosensory-auditory) responses and interactions in the higher order lateral cortex (LC) of the inferior colliculus (IC) and compare these characteristics with the higher order dorsal cortex (DC) of the IC - in awake and anaesthetised mice. Dan Llano's group have previously identified GABAergic patches (modules) in the LC distinctly receiving inputs from somatosensory structures, surrounded by matrix regions receiving inputs from auditory cortex. They here use 2P calcium imaging combined with an implanted prism to - for the first time - get functional optical access to these subregions (modules and matrix) in the lateral cortex of IC in vivo, in order to also characterise the functional difference in these subparts of LC. They find that both DC and LC of both awake and anaesthetised mice appear to be more responsive to more complex sounds (amplitude modulated noise) compared to pure tones and that under anesthesia the matrix of LC is more modulated by specific frequency and temporal content compared to the GABAergic modules in LC. However, while both LC and DC appear to have low frequency preferences, this preference for low frequencies is more pronounced in DC. Furthermore, in both awake and anesthetized mice somatosensory inputs are capable of driving responses on their own in the modules of LC, but very little in the matrix. The authors now compare bimodal interactions under anaesthesia and awake states and find that effects are different in some cases under awake and anesthesia - particularly related to bimodal suppression and enhancement in the modules.

      The paper provides new information about how subregions with different inputs and neurochemical profiles in the higher order auditory midbrain process auditory and multisensory information, and is useful for the auditory and multisensory circuits neuroscience community.

    3. Reviewer #2 (Public Review):

      Summary:

      The study describes differences in responses to sounds and whisker deflections as well as combinations of these stimuli in different neurochemically defined subsections of the lateral and dorsal cortex of the inferior colliculus in anesthetised and awake mice.

      Strengths:

      A major achievement of the work lies in obtaining the data in the first place as this required establishing and refining a challenging surgical procedure to insert a prism that enabled the authors to visualise the lateral surface of the inferior colliculus. Using this approach, the authors were then able to provide the first functional comparison of neural responses inside and outside of the GABA-rich modules of the lateral cortex. The strongest and most interesting aspects of the results, in my opinion, concern the interactions of auditory and somatosensory stimulation. For instance, the authors find that a) somatosensory-responses are strongest inside the modules and b) somatosensory-auditory suppression is stronger in the matrix than in the modules. This suggests that, while somatosensory inputs preferentially target the GABA-rich modules, they do not exclusively target GABAergic neurons within the modules (given that the authors record exclusively from excitatory neurons we wouldn't expect to see somatosensory responses if they targeted exclusively GABAergic neurons) and that the GABAergic neurons of the modules (consistent with previous work) preferentially impact neurons outside the modules, i.e. via long-range connections.

    4. Reviewer #3 (Public Review):

      The lateral cortex of the inferior colliculus (LC) is a region of the auditory midbrain noted for receiving both auditory and somatosensory input. Anatomical studies have established that somatosensory input primarily impinges on "modular" regions of the LC, which are characterized by high densities of GABAergic neurons, while auditory input is more prominent in the "matrix" regions that surround the modules. However, how auditory and somatosensory stimuli shape activity, both individually and when combined, in the modular and matrix regions of the LC has remained unknown.

      The major obstacle to progress has been the location of the LC on the lateral edge of the inferior colliculus where it cannot be accessed in vivo using conventional imaging approaches. The authors overcame this obstacle by developing methods to implant a microprism adjacent to the LC. By redirecting light from the lateral surface of the LC to the dorsal surface of the microprism, the microprism enabled two-photon imaging of the LC via a dorsal approach in anesthetized and awake mice. Then, by crossing GAD-67-GFP mice with Thy1-jRGECO1a mice, the authors showed that they could identify LC modules in vivo using GFP fluorescence while assessing neural responses to auditory, somatosensory, and multimodal stimuli using Ca2+ imaging. Critically, the authors also validated the accuracy of the microprism technique by directly comparing results obtained with a microprism to data collected using conventional imaging of the dorsal-most LC modules, which are directly visible on the dorsal IC surface, finding good correlations between the approaches.

      Through this innovative combination of techniques, the authors found that matrix neurons were more sensitive to auditory stimuli than modular neurons, modular neurons were more sensitive to somatosensory stimuli than matrix neurons, and bimodal, auditory-somatosensory stimuli were more likely to suppress activity in matrix neurons and enhance activity in modular neurons. Interestingly, despite their higher sensitivity to somatosensory stimuli than matrix neurons, modular neurons in the anesthetized prep were overall more responsive to auditory stimuli than somatosensory stimuli (albeit with a tendency to have offset responses to sounds). This suggests that modular neurons should not be thought of as primarily representing somatosensory input, but rather as being more prone to having their auditory responses modified by somatosensory input. However, this trend was different in the awake prep, where modular neurons became more responsive to somatosensory stimuli. Thus, to this reviewer, one of the most intriguing results of the present study is the extent to which neural responses in the LC changed in the awake preparation. While this is not entirely unexpected, the magnitude and stimulus specificity of the changes caused by anesthesia highlight the extent to which higher-level sensory processing is affected by anesthesia and strongly suggests that future studies of LC function should be conducted in awake animals.

      Together, the results of this study expand our understanding of the functional roles of matrix and module neurons by showing that responses in LC subregions are more complicated than might have been expected based on anatomy alone. The development of the microprism technique for imaging the LC will be a boon to the field, finally enabling much-needed studies of LC function in vivo. The experiments were well-designed and well-controlled, the limitations of two-photon imaging for tracking neural activity are acknowledged, and appropriate statistical tests were used.

    5. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this paper the authors provide a characterisation of auditory responses (tones, noise, and amplitude modulated sounds) and bimodal (somatosensory-auditory) responses and interactions in the higher order lateral cortex (LC) of the inferior colliculus (IC) and compare these characteristics with the higher order dorsal cortex (DC) of the IC - in awake and anaesthetised mice. Dan Llano's group have previously identified GABAergic patches (modules) in the LC distinctly receiving inputs from somatosensory structures, surrounded by matrix regions receiving inputs from auditory cortex. They here use 2P calcium imaging combined with an implanted prism to - for the first time - get functional optical access to these subregions (modules and matrix) in the lateral cortex of IC in vivo, in order to also characterise the functional difference in these subparts of LC. They find that both DC and LC of both awake and anaesthetised mice appear to be more responsive to more complex sounds (amplitude modulated noise) compared to pure tones and that under anesthesia the matrix of LC is more modulated by specific frequency and temporal content compared to the GABAergic modules in LC. However, while both LC and DC appear to have low frequency preferences, this preference for low frequencies is more pronounced in DC. Furthermore, in both awake and anesthetized mice somatosensory inputs are capable of driving responses on their own in the modules of LC, but very little in the matrix. The authors now compare bimodal interactions under anaesthesia and awake states and find that effects are different in some cases under awake and anesthesia - particularly related to bimodal suppression and enhancement in the modules.

      The paper provides new information about how subregions with different inputs and neurochemical profiles in the higher order auditory midbrain process auditory and multisensory information, and is useful for the auditory and multisensory circuits neuroscience community.

      The manuscript is improved by the response to reviewers. The authors have addressed my comments by adding new figures and panels, streamlining the analysis between awake and anaesthetised data (which has led to a more nuanced, and better supported conclusion), and adding more examples to better understand the underlying data. In streamlining the analyses between anaesthetised and awake data I would probably have opted for bringing these results into merged figures to avoid repetitiveness and aid comparison, but I acknowledge that that may be a matter of style. The added discussions of differences between awake and anaesthesia in the findings and the discussion of possible reasons why these differences are present help broaden the understanding of what the data looks like and how anaesthesia can affect these circuits.

      As mentioned in my previous review, the strength of this study is in its demonstration of using prism 2p imaging to image the lateral shell of IC to gain access to its neurochemically defined subdivisions, and they use this method to provide a basic description of the auditory and multisensory properties of lateral cortex IC subdivisions (and compare it to dorsal cortex of IC). The added analysis, information and figures provide a more convincing foundation for the descriptions and conclusions stated in the paper. The description of the basic functionality of the lateral cortex of the IC is useful for researchers interested in basic multisensory interactions and auditory processing and circuits. The paper provides a technical foundation for future studies (as the authors also mention), exploring how these neurochemically defined subdivisions receiving distinct descending projections from cortex contribute to auditory and multisensory based behaviour.

      Minor comment:

      - The authors have now added statistics and figures to support their claims about tonotopy in DC and LC. This is what I asked for, and I think it allows readers to better understand the tonotopic organisation in these areas. One of the conclusions by the authors is that the quadratic fit is a better fit than a linear fit in DCIC. Given the new plots shown and previous studies this is likely true, though it is worth highlighting that adding parameters to a fitting procedure (as in the case when moving from linear to quadratic fit) will likely lead to a better fit due to the increased flexibility of the fitting procedure.

      Thank you for the suggestion. We have highlighted that the quadratic function allowed the regression model to include the cells tuned to higher frequencies at the rostromedial part of the DC, resulting in a better fit, which is consistent with the tonotopic organization that was previously described, as shown in the text (lines 208-211).
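
      The reviewer's caution about added parameters can be illustrated with a minimal sketch. It is not the authors' analysis: the data are synthetic, the function names (`fit_rss`, `aic`, `compare_fits`) are hypothetical, and the Akaike information criterion is just one assumed way to penalise the extra parameter. Because a linear model is nested inside a quadratic one, the quadratic residual sum of squares can never be larger, so a lower residual alone does not show that the quadratic form is warranted.

```python
import numpy as np


def fit_rss(x, y, degree):
    """Least-squares polynomial fit; returns the residual sum of squares."""
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    return float(np.sum(residuals ** 2))


def aic(rss, n, k):
    """AIC for a Gaussian least-squares fit with k fitted coefficients."""
    return n * np.log(rss / n) + 2 * k


def compare_fits(x, y):
    """Compare a linear (2 coefficients) and a quadratic (3 coefficients) fit."""
    n = len(x)
    rss_lin = fit_rss(x, y, 1)
    rss_quad = fit_rss(x, y, 2)
    return {
        "rss_linear": rss_lin,
        "rss_quadratic": rss_quad,
        "aic_linear": aic(rss_lin, n, 2),
        "aic_quadratic": aic(rss_quad, n, 3),
    }


# Synthetic data (an assumption for illustration, not measurements from the study).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 60)
y_curved = 4.0 * (x - 0.5) ** 2 + rng.normal(0.0, 0.05, x.size)
y_straight = 2.0 * x + rng.normal(0.0, 0.05, x.size)

curved = compare_fits(x, y_curved)
straight = compare_fits(x, y_straight)
print(curved)
print(straight)
```

      In both synthetic cases the quadratic residual is no larger than the linear one; only a penalised criterion (or a nested-model test) indicates whether the extra term is justified by the data.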

      Reviewer #2 (Public Review):

      Summary:

      The study describes differences in responses to sounds and whisker deflections as well as combinations of these stimuli in different neurochemically defined subsections of the lateral and dorsal cortex of the inferior colliculus in anesthetised and awake mice.

      Strengths:

      A major achievement of the work lies in obtaining the data in the first place as this required establishing and refining a challenging surgical procedure to insert a prism that enabled the authors to visualise the lateral surface of the inferior colliculus. Using this approach, the authors were then able to provide the first functional comparison of neural responses inside and outside of the GABA-rich modules of the lateral cortex. The strongest and most interesting aspects of the results, in my opinion, concern the interactions of auditory and somatosensory stimulation. For instance, the authors find that a) somatosensory-responses are strongest inside the modules and b) somatosensory-auditory suppression is stronger in the matrix than in the modules. This suggests that, while somatosensory inputs preferentially target the GABA-rich modules, they do not exclusively target GABAergic neurons within the modules (given that the authors record exclusively from excitatory neurons we wouldn't expect to see somatosensory responses if they targeted exclusively GABAergic neurons) and that the GABAergic neurons of the modules (consistent with previous work) preferentially impact neurons outside the modules, i.e. via long-range connections.

      Weaknesses:

      While the findings are of interest to the subfield they have only rather limited implications beyond it and the writing is not quite as precise as it could be.

      Reviewer #3 (Public Review):

      The lateral cortex of the inferior colliculus (LC) is a region of the auditory midbrain noted for receiving both auditory and somatosensory input. Anatomical studies have established that somatosensory input primarily impinges on "modular" regions of the LC, which are characterized by high densities of GABAergic neurons, while auditory input is more prominent in the "matrix" regions that surround the modules. However, how auditory and somatosensory stimuli shape activity, both individually and when combined, in the modular and matrix regions of the LC has remained unknown.

      The major obstacle to progress has been the location of the LC on the lateral edge of the inferior colliculus where it cannot be accessed in vivo using conventional imaging approaches. The authors overcame this obstacle by developing methods to implant a microprism adjacent to the LC. By redirecting light from the lateral surface of the LC to the dorsal surface of the microprism, the microprism enabled two-photon imaging of the LC via a dorsal approach in anesthetized and awake mice. Then, by crossing GAD-67-GFP mice with Thy1-jRGECO1a mice, the authors showed that they could identify LC modules in vivo using GFP fluorescence while assessing neural responses to auditory, somatosensory, and multimodal stimuli using Ca2+ imaging. Critically, the authors also validated the accuracy of the microprism technique by directly comparing results obtained with a microprism to data collected using conventional imaging of the dorsal-most LC modules, which are directly visible on the dorsal IC surface, finding good correlations between the approaches.

      Through this innovative combination of techniques, the authors found that matrix neurons were more sensitive to auditory stimuli than modular neurons, modular neurons were more sensitive to somatosensory stimuli than matrix neurons, and bimodal, auditory-somatosensory stimuli were more likely to suppress activity in matrix neurons and enhance activity in modular neurons. Interestingly, despite their higher sensitivity to somatosensory stimuli than matrix neurons, modular neurons in the anesthetized prep were overall more responsive to auditory stimuli than somatosensory stimuli (albeit with a tendency to have offset responses to sounds). This suggests that modular neurons should not be thought of as primarily representing somatosensory input, but rather as being more prone to having their auditory responses modified by somatosensory input. However, this trend was different in the awake prep, where modular neurons became more responsive to somatosensory stimuli. Thus, to this reviewer, one of the most intriguing results of the present study is the extent to which neural responses in the LC changed in the awake preparation. While this is not entirely unexpected, the magnitude and stimulus specificity of the changes caused by anesthesia highlight the extent to which higher-level sensory processing is affected by anesthesia and strongly suggests that future studies of LC function should be conducted in awake animals.

      Together, the results of this study expand our understanding of the functional roles of matrix and module neurons by showing that responses in LC subregions are more complicated than might have been expected based on anatomy alone. The development of the microprism technique for imaging the LC will be a boon to the field, finally enabling much-needed studies of LC function in vivo. The experiments were well-designed and well-controlled, the limitations of two-photon imaging for tracking neural activity are acknowledged, and appropriate statistical tests were used.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      - Increase font size of scale bars on figure 6.

      Thank you for the suggestion. We have increased the font size of the scale bar.

      Reviewer #2 (Recommendations For The Authors):

      Line 505: typo: 'didtinction'

      Thank you for the suggestion and we do apologize for the typo. We have fixed the word as shown in the text (line 506).

      No further comments.

      Reviewer #3 (Recommendations For The Authors):

      Line 543: Change "contripute" to "contribute"

      Thank you for the suggestion and we do apologize for the typo. We have fixed the word as shown in the text (line 544).

    1. eLife assessment

      The work by Han and collaborators describes valuable findings on the role of Akkermansia muciniphila during ETEC infection. If confirmed, these findings will add to a growing list of beneficial properties of this organism. Although the strength of the evidence used to justify the conclusions in the manuscript is solid, the issues raised about the sequencing method used should be addressed.

    2. Reviewer #2 (Public Review):

      Ma X. et al proposed that A. muciniphila was a key strain that promotes the proliferation and differentiation of intestinal stem cells through acting on the Wnt/β-catenin signaling pathway. They used various models, such as a piglet model, a mouse model and intestinal organoids, to address how A. muciniphila and B. fragilis offer protection against ETEC infection. They showed that FMT with fecal samples, A. muciniphila or B. fragilis protected piglets and/or mice from ETEC infection, and this protection is manifested as reduced intestinal inflammation/bacterial colonization, increased tight junction/Muc2 proteins, as well as proper Treg/Th17 cells. Additionally, they demonstrated that A. muciniphila protected basal-out and/or apical-out intestinal organoids against ETEC infection via Wnt signaling.

      Comments on revised version:

      Please add proper references to indicate the invasion of ETEC into organoids after 1 h of infection.

    3. Reviewer #3 (Public Review):

      Summary:

      The manuscript by Ma et al. describes a multi-model (pig, mouse, organoid) investigation into how fecal transplants protect against E. coli infection. The authors identify A. muciniphila and B. fragilis as two important strains and characterize how these organisms impact the epithelium by modulating host signaling pathways, namely the Wnt pathway in lgr5 intestinal stem cells.

      Strengths:

      The strengths of this manuscript include the use of multiple model systems and follow up mechanistic investigations to understand how A. muciniphila and B. fragilis interacted with the host to impact epithelial physiology.

      Weaknesses:

      After an additional revision, the bioinformatics section of the methods has changed significantly from previous versions and now indicates that a third sequencer was used instead: Ion S5 XL. Important parameters required to replicate the analysis have still not been provided. Inspection of the SRA data indicates a mix of Illumina MiSeq and Illumina HiSeq 2500. It is now unclear which sequencing technology was used, as the authors have variably reported 4 different sequencers for these samples. Appropriate metadata was not provided in the SRA, although some groups may be inferred from sample names. These changing descriptions of the methodologies and the ambiguity in making the data available raise concerns about the rigor of the study and its results.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #2 (Public Review):

      The authors indicated that ETEC adheres to intestinal epithelial cells. However, it is also possible that the majority of ETEC reside in the intestinal mucus, particularly under in vivo infection conditions. The colonization of ETEC in the jejunum and colon of piglets (Fig 2C) and in the intestines of mice (Fig S2A) does not necessarily reflect the adherence of ETEC to epithelial cells. Please verify these observations with other methods, such as immunostaining. Also, while Salmonella enterica serovar Typhimurium or Listeria monocytogenes can invade organoids within 1 hour, it is unknown whether ETEC invades organoids in this study. Clarifying this will help resolve whether A. muciniphila blocks the adherence and/or invasion of ETEC. Please also address whether A. muciniphila metabolites could prevent ETEC infection in the organoid models.

      In the original manuscript, the sentence “ETEC K88 adheres to intestinal epithelial cells and induces gut inflammation (Yu et al., 2018)” in line 447 cited a reference to connect the preceding and following text; it is not our result. We have deleted this sentence on line 457. Previous studies have shown that ETEC enters intestinal epithelial cells after only one hour of infection (Xiao et al., 2022; Qian et al., 2023). Whether A. muciniphila metabolites prevent ETEC infection in the organoid models is not the focus of this manuscript; it may be explored further by other members of the research group in the future.

      References:

      Xiao K, Yang Y, Zhang Y, Lv QQ, Huang FF, Wang D, Zhao JC, Liu YL. 2022. Long-chain PUFA ameliorate enterotoxigenic Escherichia coli-induced intestinal inflammation and cell injury by modulating pyroptosis and necroptosis signaling pathways in porcine intestinal epithelial cells. Br. J. Nutr. 128(5):835-850.

      Qian MQ, Zhou XC, Xu TT, Li M, Yang ZR, Han XY. 2023. Evaluation of Potential Probiotic Properties of Limosilactobacillus fermentum Derived from Piglet Feces and Influence on the Healthy and E. coli-Challenged Porcine Intestine. Microorganisms. 11(4).

      Reviewer #3 (Public Review):

      Summary:

      The manuscript by Ma et al. describes a multi-model (pig, mouse, organoid) investigation into how fecal transplants protect against E. coli infection. The authors identify A. muciniphila and B. fragilis as two important strains and characterize how these organisms impact the epithelium by modulating host signaling pathways, namely the Wnt pathway in lgr5 intestinal stem cells.

      Strengths:

      The strengths of this manuscript include the use of multiple model systems and follow up mechanistic investigations to understand how A. muciniphila and B. fragilis interacted with the host to impact epithelial physiology.

      Weaknesses:

      After revision, the bioinformatics section of the methods is still jumbled and may indicate issues in the pipeline. Important parameters are not included to replicate analyses. Merging the forward and reverse reads may represent a problem for denoising. Chimera detection was performed prior to denoising.

      Potential denoising issues for NovaSeq data were not addressed in the response. The authors did not clarify whether multiple testing correction was applied; however, as written, it may be assumed that it was not. The raw sequencing data made available through the SRA accession (if for the correct project) indicate a MiSeq platform; however, the sample names do not appear to link up to this experimental design, and the metadata are not sufficient to replicate the analyses.

      We have redescribed the method for microbiome sequencing analysis on lines 298-327.

      Recommendations for the authors:

      Reviewer #3 (Recommendations For The Authors):

      SRA accession must be confirmed and metadata made available.

      We updated the SRA data.

    1. eLife assessment

      The aim of this valuable study is to uncover developmental roles of the neuropeptide prothoracicotropic hormone (PTTH) and ecdysone, which later regulate female receptivity of Drosophila melanogaster. The work combines spatially and temporally restricted genetic manipulation with behavior quantification to explore these molecular pathways and the neuronal substrates participating in the control of female sexual receptivity. At present, the implication of both signaling pathways in this process is convincing but the strength of the evidence is incomplete to support the main claim that PTTH pathway controls female sexual receptivity through the function of ecdysone in pC1 neurons.

    2. Reviewer #1 (Public Review):

      Summary

      This article delves into the role of Ecdysone in regulating female sexual receptivity in Drosophila. The researchers discovered that PTTH, a positive regulator of Ecdysone production, negatively affects the receptivity of adult virgin females. Specifically, the researchers found that losing larval PTTH before metamorphosis significantly increases female receptivity immediately after adult eclosion. In addition, Ecdysone, through its receptor EcR-A, is necessary during metamorphic neurodevelopment for the proper development of P1 neurons, as its silencing leads to morphological changes associated with reduced adult female receptivity. Furthermore, Torso enhances receptivity in the adult stage. The molecular mechanisms linking each molecule to female receptivity have yet to be fully understood; therefore, the involvement of the juvenile-to-adult hormonal pathway (PTTH/Torso/ecdysone) in female receptivity is not proven.

      Strengths

      (1) Robust Methodology and Experimental Design: The study employs a comprehensive and well-structured experimental approach, combining genetic manipulations, behavioral assays, and molecular analyses. This multi-faceted methodology allows for a thorough investigation of the role of PTTH and Ecdysone in regulating female sexual receptivity in Drosophila. The use of specific gene knockouts, RNA interference, and overexpression techniques provides strong evidence supporting the findings.<br /> (2) Clear and Substantial Findings: The authors provide compelling data showing that PTTH negatively regulates female receptivity during the larval stage, which is rescued by Ecdysone feeding. Instead, metamorphic Ecdysone has a positive role during neurodevelopment. The experiments demonstrate this dual and temporally distinct role of PTTH/Ecdysone, shedding light on a complex hormonal regulation mechanism.<br /> (3) Clarification of Experimental Details: In response to the initial review, the authors have clarified important experimental details, such as the precise timing of genetic manipulations and the specific developmental stages examined. This clarification enhances the reproducibility and understanding of the study.

      Weaknesses

      (1) Unresolved Contradictions and Complexity in Results: Despite the detailed responses, the paper still presents complex and somewhat contradictory findings regarding the roles of PTTH, Torso, and Ecdysone. The observed increase in EcR-A expression in PTTH mutants and the nuanced explanation regarding the feedforward relationship, while insightful, do not fully resolve the initial confusion about the differing effects of PTTH and Ecdysone manipulations on female receptivity. This requires more exploration.<br /> (2) Insufficient Exploration of Mechanistic Pathways: The potential mechanisms underlying the role of PTTH/Torso-Ecdysone across different developmental stages remain underexplored. While the authors suggest a feedforward relationship and possible interaction with other neurons, these hypotheses are not thoroughly tested or elaborated upon, leaving gaps in the mechanistic understanding.<br /> (3) Limited Scope of Validation Experiments: While the authors addressed some reviewer concerns about validation, the scope remains somewhat limited. The lack of existing PTTH mutants and the challenges in manipulating PTTH expression without affecting receptivity suggest that further work is needed to validate these pathways robustly. The inability to fully replicate the PTTHdelete phenotype through other means leaves some questions unanswered.<br /> (4) Complexity in Interpretation of dsx-Positive Neurons: The relevance of dsx-positive neurons in the context of PTTH's effects on female receptivity remains ambiguous. Although the authors provide some context, the biological significance of these observations is not fully clarified.

      Conclusion<br /> The manuscript presents a well-conceived study with significant findings that advance the understanding of hormonal regulation of female receptivity in Drosophila. However, complexities in the data and unresolved mechanistic questions suggest that further work is needed to clarify the exact pathways and interactions involved. The authors' responses to feedback have strengthened the paper, but additional experiments and more thorough mechanistic exploration would enhance the robustness and clarity of the conclusions.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors tried to identify novel adult functions of the classical Drosophila juvenile-adult transition axis (i.e. ptth-ecdysone). Surprisingly, larval ptth-expressing neurons expressed the sex-specific doublesex gene, thus belonging to the sexually dimorphic circuit. Lack of ptth during late larval development caused enhanced female sexual receptivity, an effect rescued by supplying ecdysone in the food. Among many other cellular players, pC1 neurons control receptivity by encoding the mating status of females. Interestingly, during metamorphosis a subtype of pC1 neurons required Ecdysone Receptor A in order to regulate such female receptivity. A transcriptomic analysis using pC1-specific Ecdysone signaling down-regulation gives some hints of possible downstream mechanisms.

      Strengths:

      The manuscript showed solid genetic evidence that lack of ptth during development caused enhanced copulation rate in female flies, which includes ptth mutant rescue experiments by over-expressing ptth as well as by adding ecdysone-supplemented food. They also present elegant data dissecting the temporal requirements of ptth-expressing neurons by shifting animals from non-permissive to permissive temperatures, in order to inactivate neuronal function (although not exclusively ptth function). They showed that EcR-A is up-regulated in ptth mutant background. By combining different drivers together with EcR-A RNAi and torso RNAi lines authors also identified the Ecdysone receptor and torso requirements of a particular subtype of pC1 neurons during metamorphosis. Convincing live calcium imaging showed no apparent effect of EcR-A in neural activity, although some effect on morphology is uncovered. Finally, bulk RNAseq shows differential gene expression after EcR-A down-regulation.

      Weaknesses:

      The paper has three main weaknesses. The first one refers to temporal requirements of ptth and ecdysone signaling. Whereas ptth is necessary during larval development, the ecdysone effect appears during pupal development. ptth induces ecdysone synthesis during larval development but there is no published evidence about a similar role for ptth during pupal stages. The down-regulation of EcR-A by RNAi requires at least 8 h to be complete, whereas the activation of ptth neurons in larval stages is immediate. Furthermore, larval and pupal ecdysone functions are different (triggering metamorphosis vs tissue remodeling). The second caveat is the fact that ptth and ecdysone/torso loss-of-function experiments render opposite effects (enhancing and decreasing copulation rates, respectively). The most plausible explanation is that both functions are independent of each other, also suggested by differential temporal requirements. Finally, in order to identify the effect in the transcriptional response of down-regulating EcR-A in a very small population of neurons, a scRNAseq study should have been performed instead of bulk RNAseq.

      In summary, despite the authors providing convincing evidence that ptth and ecdysone signaling pathways are involved in female receptivity, the main claim that ptth regulates this process through ecdysone is not supported by the results. More likely, they are independent processes.

    4. Reviewer #3 (Public Review):

      Summary:

      This manuscript shows that mutations that disable the gene encoding PTTH cause an increase in female receptivity (they mate more quickly), a phenotype that can be reversed by feeding these mutants the molting hormone, 20-hydroxyecdysone (20E). The use of an inducible system reveals that inhibition or activation of PTTH neurons during the larval stages increases and decreases female receptivity, respectively, suggesting that PTTH is required during the larval stages to affect the receptivity of the (adult) female fly. Showing that these neurons express the sex-determining gene dsx leads the authors to show that interfering with 20E actions in pC1 neurons, which are dsx-positive neurons known to regulate female receptivity, reduces female receptivity and increases the arborization pattern of pC1 neurons. The work concludes by showing that targeted knockdown of EcR-A in pC1 neurons causes 527 genes to be differentially expressed in the brains of female flies, of which 123 passed a false discovery rate cutoff of 0.01; interestingly, the gene showing the greatest down-regulation was the gene encoding dopamine beta-monooxygenase.

      This reviewer appreciates the effort that was made to revise the manuscript and address the various comments made by the reviewers. Nevertheless, I feel that the main concerns remain. These are not necessarily due to an unwillingness on the part of the authors to address them, but rather to difficulties that are inherent to trying to assign specific roles to EcR and pC1 neurons at a time when major changes are occurring (or are about to occur) in the nervous system, and to do so using tools that are currently not sharp or specific enough. Many of the conclusions are supported by the results and those that may have alternative interpretations can remain more speculative until better tools become available. It is, nevertheless, an interesting and provocative piece of work.

      Strengths

      This is an interesting piece of work, which may shed light on the basis for the observation noted previously that flies lacking PTTH neurons show reproductive defects ("... females show reduced fecundity"; McBrayer, 2007; DOI 10.1016/j.devcel.2007.11.003).

      Weaknesses:

      There are some results whose interpretation seems ambiguous and findings whose causal relationship is implied but not demonstrated.

      (1) At some level, the findings reported here are not at all surprising. Since 20E regulates the profound changes that occur in the central nervous system (CNS) during metamorphosis, it is not surprising that PTTH would play a role in this process. Although animals lacking PTTH (rather paradoxically) live to adulthood, they do show greatly extended larval instars and a corresponding great delay in the 20E rise that signals the start of metamorphosis. For this reason, concluding that PTTH plays a SPECIFIC role in regulating female receptivity seems a little misleading, since the metamorphic remodeling of the entire CNS is likely altered in PTTH mutants. Since these mutants produce overall normal (albeit larger--due to their prolonged larval stages) adults, these alterations are likely to be subtle. Courtship has been reported as one defect expressed by animals lacking PTTH neurons, but this behavior may stand out because reduced fertility and increased male-male courtship (McBrayer, 2007) would be noticeable defects to researchers handling these flies. By contrast, detecting defects in other behaviors (e.g., optomotor responses, learning and memory, sleep, etc) would require closer examination. For this reason I would ask the authors to temper their statement that PTTH is SPECIFICALLY involved in regulating female receptivity.<br /> (2) The link between PTTH and the role of pC1 neurons in regulating female receptivity is not clear. Again, since 20E controls the metamorphic changes that occur in the CNS, it is not surprising that 20E would regulate the arborization of pC1 neurons. And since these neurons have been implicated in female receptivity, it would therefore be expected that altering 20E signaling in pC1 neurons would affect this phenotype. However, this does not mean that the defects in female receptivity expressed by PTTH mutants are due to defects in pC1 arborization. 
For this the authors would at least have to show that PTTH mutants show the changes in pC1 arborization shown in Fig. 6. And even then the most that could be said is that the changes observed in these neurons "may contribute" to the observed behavioral changes. Indeed, the changes observed in female receptivity may be caused by PTTH/20E actions on different neurons.<br /> (3) Some of the results need commenting on, or refining, or revising:<br /> (a) In some assays PTTH behaves like a recessive gene, in others like a semi-dominant gene, and in yet others like a dominant gene. For instance, in Fig. 1D-G, PTTH[-]/+ flies behave like wildtype (D), express an intermediate phenotype (E-F), or behave like the mutant (G). This may all be correct but merits some comment.<br /> (b) Some of the conclusions are overstated. i) Although Fig. 2E-G does show that silencing the PTTH neurons during the larval stages affects copulation rate (E), the strength of the conclusion is tempered by the behavior of one of the controls (tub-GAL80[ts]/+, UAS-Kir2.1/+) in panels F and G, where it behaves essentially the same as the experimental group (and quite differently from the PTTH-GAL4/+ control; blue line). (Incidentally, the corresponding copulation latency should also be shown for these data.) ii) For Fig. 5I-K, the conclusion stated is that "Knock-down of EcR-A during pupal stage significantly decreased the copulation rate." Although strictly correct, the problem is that panel J is the only one for which the behavior of the control lacking the RNAi is not the same as that of the experimental group. Thus, it could just be that the pupal stage is the only situation in which both controls differed from the experimental group. Again, the results shown in J are strictly speaking correct but the statement is too definitive given the behavior of one of the controls in panels I and K. 
Note also that panel F shows that the UAS-RNAi control causes a massive decrease in female fertility, yet no mention is made of this fact.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review): 

      Summary: This article explores the role of Ecdysone in regulating female sexual receptivity in Drosophila. The researchers found that PTTH, through its role as a positive regulator of ecdysone production, negatively affects the receptivity of adult virgin females. Indeed, loss of larval PTTH before metamorphosis significantly increases female receptivity right after adult eclosion and also later. However, during metamorphic neurodevelopment, Ecdysone, primarily through its receptor EcR-A, is required to properly develop the P1 neurons since its silencing led to morphological changes associated with a reduction in adult female receptivity. Nonetheless, the result shown in this manuscript sheds light on how Ecdysone plays a dual role in female adult receptivity, inhibiting it during larval development and enhancing it during metamorphic development. Unfortunately, this dual and opposite effect in two temporally different developmental stages has not been highlighted or explained. 

      Strengths: This paper exhibits multiple strengths in its approach, employing a well-structured experimental methodology that combines genetic manipulations, behavioral assays, and molecular analysis to explore the impact of Ecdysone on regulating virgin female receptivity in Drosophila. The study provides clear and substantial findings, highlighting that removing PTTH, a positive Ecdysone regulator, increases virgin female receptivity. Additionally, the research expands into the temporal necessity of PTTH and Ecdysone function during development. 

      Weaknesses: 

      There are two important caveats with the data that are reflecting a weakness: 

      (1) Contradictory Effects of Ecdysone and PTTH: One notable weakness in the data is the contrasting effects observed between Ecdysone and its positive regulator PTTH. PTTH loss of function increases female receptivity, while ecdysone loss of function reduces it. Given that PTTH positively regulates Ecdysone, one would expect that the loss of function of both would result in a similar phenotype or at least a consistent directional change. 

      A1. As newly formed prepupae, ptth-Gal4>UAS-Grim flies display changes in gene expression similar to those of the genetic control flies in response to a high-titer ecdysone pulse. These include the repression of EcR (McBrayer et al., 2007). We tested whether there is a similar feedforward relationship between PTTH and EcR-A. We quantified the EcR-A mRNA level of PTTH -/- and PTTH -/+ in the whole body of newly formed prepupae. Indeed, PTTH -/- flies showed increased EcR-A expression in the whole body of newly formed prepupae compared with PTTH -/+ flies. Because of the function of EcR-A in gene expression, this suggests that PTTH -/- disturbs the regulation of a series of genes during metamorphosis. However, it is not certain whether EcR-A expression in pC1 neurons is increased compared with genetic controls when PTTH is deleted. Furthermore, PTTH -/- must affect the development of other neurons, not only pC1 neurons. So, the feedforward relationship between PTTH and EcR-A at the start of the prepupal stage is one possible cause of the contradictory effects of PTTH -/- and EcR-A RNAi in pC1 neurons.

      (2) Discordant Temporal Requirements for Ecdysone and PTTH: Another weakness lies in the different temporal requirements for Ecdysone and PTTH. The data from the manuscript suggest that PTTH is necessary during the larval stage, as shown in Figure 2 E-G, while Ecdysone is required during the pupal stage, as indicated in Figure 5 I-K. Ecdysone is a crucial developmental hormone with precisely regulated expression throughout development, exhibiting several peaks during both larval and pupal stages. PTTH is known to regulate Ecdysone during the larval stage, specifically by stimulating the kinetics of Ecdysone peaking at the wandering stage. However, it remains unclear whether pupal PTTH, expressed at higher levels during metamorphosis, can stimulate Ecdysone production during the pupal stage. Additionally, given the transient nature of the Ecdysone peak produced at wandering time, which disappears shortly before the end of the prepupal stage, it is challenging to infer that larval PTTH will regulate Ecdysone production during the pupal stage based on the current state of knowledge in the neuroendocrine field.  

      Considering these two caveats, the results suggest that the authors are witnessing distinct temporal and directional effects of Ecdysone on virgin female receptivity.  

      A2. First of all, it is necessary to clarify the detailed timing of the manipulation of the Ptth gene and PTTH neurons. In Figure 3, activation of PTTH neurons during stage 2 inhibited female receptivity. “Stage 2” runs from six hours before the 3rd-instar larvae to the end of the wandering larvae (the start of prepupae). In Figure 5, the “pupal stage” runs from the prepupal stage to the end of the pupal stage. This “pupal stage” includes the formation of prepupae, when the ecdysone peak has not yet disappeared. The periods during which Ptth and EcR-A in pC1 neurons were manipulated are therefore continuous. In addition, the pC1-Gal4-expressing neurons also appear at the start of the prepupal stage. So, it is possible that PTTH regulates female receptivity through the function of EcR-A in pC1 neurons.

      Reviewer #1 (Recommendations For The Authors): 

      In light of the significant caveat previously discussed, I will just make a few general suggestions: 

      (1) The paper primarily focuses on robust phenotypes, particularly in PTTH mutants, with a well-detailed execution of several experiments, resulting in thorough and robust outcomes. However, due to the caveat previously presented (opposite effect in larva and pupa), consider splitting the paper into two parts: Figures 1 to 4 deal with the negative effect of PTTH-Ecdysone on early virgin female receptivity, while Figures 5 to 7 focus on the positive metamorphic effect of Ecdysone in P1 metamorphic neurodevelopment. However, in this scenario, the mechanism by which PTTH loss of function increases female receptivity should be addressed.

      A3. Splitting the paper into two parts, dealing separately with PTTH function and with EcR function in pC1 neurons, would be a good suggestion if PTTH could not act on female receptivity through the function of EcR-A in pC1 neurons. However, because of the feedforward relationship between PTTH and EcR-A in newly formed prepupae, and because the periods during which Ptth and EcR-A in pC1 neurons were manipulated are continuous, it is possible that these two functions are not independent of each other. We have therefore kept the original structure.

      (2) Validate the PTTH mutants by examining homozygous mutant phenotypes and the dose-dependent heterozygous mutant phenotype using existing PTTH mutants. This could also be achieved using RNAi techniques.

      A4. We did not obtain other existing PTTH mutants. Instead, we decreased PTTH expression in PTTH neurons and dsx+ neurons, but did not detect a phenotype similar to that of PTTH -/-. Similarly, overexpression through PTTH-Gal4>UAS-PTTH was not sufficient to change female receptivity. It is possible that neither decreasing nor increasing PTTH expression is sufficient to change female receptivity.

      (3) Clarify if elav-Gal4 is not expressed in PTTH neurons and discuss how the rescue mechanisms work (hormonal, paracrine, etc.) in the text.

      A5. We tested for overlap between the elav-Gal4>GFP signal and PTTH stained with a PTTH antibody, and did not detect any. This suggests that elav-Gal4 is not expressed in PTTH neurons. However, we detected PTTH expression (PTTH antibody) in the CNS when PTTH was overexpressed using elav-Gal4>UAS-PTTH in the PTTH -/- background. Furthermore, this rescued the female receptivity phenotype of PTTH -/-. Insect PTTH isoforms share a similar probable signal peptide for secretion. Indeed, in addition to being delivered through axonal projections to the PG gland, PTTH also has an endocrine function, acting on its receptor Torso in light sensors to regulate light avoidance in larvae. PTTH overexpressed in other neurons through elav-Gal4>UAS-PTTH may act on the PG gland through this endocrine function and then induce ecdysone synthesis and release. Thus, although elav-Gal4 is not expressed in PTTH neurons, ecdysone synthesis triggered by PTTH from the hemolymph may account for the rescue of the PTTH -/- phenotype in female receptivity.

      (4) Consider renaming the new PTTH mutant to avoid confusion with the existing PTTHDelta allele. 

      A6. We have renamed our new PTTH mutant as PtthDelete.

      (5) Include the age of virgin females in each figure legend, especially for Figures 2 to 7, to aid in interpretation. This is essential information since wild-type early virgins (day 1) show no receptivity. In contrast, they reach a typical 80% receptivity later, and the mechanism regulating the first phase might differ from the one occurring later.

      A7. We have included the age of virgin females in each figure legend. 

      (6) Explain the relevance of observing that PTTH adult neurons are dsx-positive, as it's unclear why this observation is significant, considering that these neurons are not responsible for the observed receptivity effect in virgin females. Alternatively, address this in the context of the third instar larva or clarify its relevance.  

      A8. We decreased DsxF expression in PTTH neurons and did not detect a significant change in female receptivity. Almost all neurons regulating female receptivity, including pC1 neurons, express DsxF. We suppose that PTTH neurons have some relationship with other DsxF-positive neurons that regulate female receptivity. Indeed, we detected overlap of dsx-LexA>LexAop-RFP and torso-Gal4>UAS-GFP during the larval stage. Furthermore, decreasing Torso expression in pC1 neurons significantly inhibited female receptivity.

      These results suggest that PTTH regulates female receptivity not only through ecdysone but possibly also by directly regulating other neurons, especially DsxF-positive neurons associated with female receptivity.

      Reviewer #2 (Public Review): 

      Summary: The authors tried to identify novel adult functions of the classical Drosophila juvenile-adult transition axis (i.e. ptth-ecdysone). Surprisingly, larval ptth-expressing neurons expressed the sex-specific doublesex gene, thus belonging to the sexually dimorphic circuit. Lack of ptth during late larval development caused enhanced female sexual receptivity, an effect rescued by supplying ecdysone in the food. Among many other cellular players, pC1 neurons control receptivity by encoding the mating status of females. Interestingly, during metamorphosis, a subtype of pC1 neurons required Ecdysone Receptor A in order to regulate such female receptivity. A transcriptomic analysis using pC1-specific Ecdysone signaling down-regulation gives some hints of possible downstream mechanisms.

      Strengths: the manuscript showed solid genetic evidence that lack of ptth during development caused enhanced copulation rate in female flies, which includes ptth mutant rescue experiments by overexpressing ptth as well as by adding ecdysone-supplemented food. They also present elegant data dissecting the temporal requirements of ptth-expressing neurons by shifting animals from non-permissive to permissive temperatures, in order to inactivate neuronal function (although not exclusively ptth function). By combining different drivers together with a EcR-A RNAi line authors also identified the Ecdysone receptor requirements of a particular subtype of pC1 neurons during metamorphosis. Convincing live calcium imaging showed no apparent effect of EcR-A in neural activity, although some effect on morphology is uncovered. Finally, bulk RNAseq shows differential gene expression after EcR-A down-regulation. 

      Weaknesses: the paper has three main weaknesses. The first one refers to temporal requirements of ptth and ecdysone signaling. Whereas ptth is necessary during larval development, the ecdysone effect appears during pupal development. ptth induces ecdysone synthesis during larval development but there is no published evidence about a similar role for ptth during pupal stages. Furthermore, larval and pupal ecdysone functions are different (triggering metamorphosis vs tissue remodeling). The second caveat is the fact that ptth and ecdysone loss-of-function experiments render opposite effects (enhancing and decreasing copulation rates, respectively). The most plausible explanation is that both functions are independent of each other, also suggested by differential temporal requirements. Finally, in order to identify the effect in the transcriptional response of down-regulating EcR-A in a very small population of neurons, a scRNAseq study should have been performed instead of bulk RNAseq. 

      In summary, although the authors provide convincing evidence that the ptth and ecdysone signaling pathways are involved in female receptivity, the main claim that ptth regulates this process through ecdysone is not supported by the results. More likely, they are independent processes.

      B1. Clarification: in Figure 3, activation of PTTH neurons during stage 2 inhibited female receptivity. “Stage 2” runs from six hours before the 3rd-instar larval stage to the end of the wandering larval stage (the start of the prepupal stage). In Figure 5, the “pupal stage” runs from the start of the prepupal stage to the end of the pupal stage. This “pupal stage” therefore includes prepupa formation, when the ecdysone peak has not yet disappeared. The time windows for manipulating PTTH and for manipulating EcR-A in pC1 neurons are thus continuous. In addition, the pC1-Gal4-expressing neurons also appear at the start of the prepupal stage. So it is possible that PTTH regulates female receptivity through the function of EcR-A in pC1 neurons.

      B2. During prepupa formation, ptth-Gal4>UAS-Grim flies display changes in gene expression in response to a high-titer ecdysone pulse similar to those of genetic control flies. These include the repression of EcR (McBrayer et al., 2007). We tested whether there is a similar feedforward relationship between PTTH and EcR-A by quantifying the EcR-A mRNA level in the whole body of newly formed PTTH -/- and PTTH -/+ prepupae. Indeed, EcR-A was increased in PTTH -/- compared with PTTH -/+ flies. Given the role of EcR-A in gene expression, this suggests that PTTH -/- disturbs the regulation of a series of genes during metamorphosis. However, it is not certain that EcR-A expression in pC1 neurons is increased relative to genetic controls when PTTH is deleted. Furthermore, PTTH -/- must affect the development of other neurons, not only pC1 neurons. So the feedforward relationship between PTTH and EcR-A at the start of the prepupal stage is one possible cause of the contradictory effects of PTTH -/- and EcR-A RNAi in pC1 neurons.

      B3. In the future, we will perform single-cell sequencing of pC1 neurons to explore the detailed molecular mechanisms of female receptivity.

      Reviewer #2 (Recommendations For The Authors): 

      Additional experiments and suggestions: 

      - torso LOF in the PG to determine whether or not the ecdysone peak regulated by ptth (there is a 1-day delay in pupation) is responsible for the ptth effect in L3. In the same line, what happens if torso is downregulated in the pC1 neurons? Is there any effect on copulation rates? 

      B4. Because of the loss of phm-Gal4, we could not test female receptivity when decreasing the expression of Torso in the PG. However, decreasing Torso expression in pC1 neurons significantly inhibited female receptivity. This suggests that PTTH regulates female receptivity not only through ecdysone but also by directly regulating the dsx+ pC1 neurons involved in female receptivity.

      - What is the effect of down-regulating ptth in the dsx+ neurons? No ptth RNAi experiments are shown in the paper. 

      B5. We decreased PTTH expression in dsx+ neurons but did not detect a change in female receptivity. We also decreased PTTH expression in PTTH neurons using PTTH-Gal4 and again did not detect a change in female receptivity. Similarly, overexpression through PTTH-Gal4>UAS-PTTH was not sufficient to change female receptivity. It is possible that neither decreasing nor increasing PTTH expression is sufficient to change female receptivity.

      - Why are most copulation rate experiments performed between 4-6 days after eclosion? ptth LOF effect only lasts until day 3 after eclosion (but very weak-fig 1). Again, this supports the idea that ptth and ecdysone effects are unrelated.

      B6. Most behavioral experiments were performed between 4-6 days after eclosion, as in most other studies in flies, because female receptivity reaches its peak at that time. Ptth LOF enhanced female receptivity from the first day after eclosion. This resembles precocious puberty. Wild-type females reach high receptivity 2 days after eclosion (about 75% within 10 min). We suppose that the Ptth LOF effect only lasts until day 3 after eclosion because the receptivity of control flies is by then too high to exceed.

      It is therefore not certain whether the effect of PTTH -/- on female receptivity disappears after the 3rd day of adulthood, and so it is not certain whether the effects of PTTH and of EcR-A in pC1 neurons are unrelated.

      - The fact that pC1d neuronal morphology changes (and not pC1b) does not explain the effect of EcR-A LOF. Although this is highlighted in the discussion, the data do not support the hypothesis. What do these pC1 neurons look like in a ptth mutant animal regarding calcium imaging and/or morphology?

      B7. We examined the pattern of pC1 neurons when PTTH is deleted. Consistent with the feedforward relationship between PTTH and EcR-A expression in newly formed prepupae, PTTH deletion induced less established pC1-d neurons, the opposite of the effect induced by EcR-A reduction in pC1 neurons. However, it is not certain that EcR-A expression in pC1 neurons is increased when PTTH is deleted. Furthermore, on the one hand, manipulation of PTTH has a general effect on neurodevelopment and does not only regulate pC1 neurons. On the other hand, the detailed pattern of pC1-b neurons, the key subtype regulating female receptivity, could not be seen clearly either when EcR-A is decreased in pC1 neurons or when PTTH is deleted. So the abnormal development of pC1-b neurons, if confirmed, is just one of the possible reasons for the effect of PTTH deletion on female receptivity.

      - The discussion is incomplete, especially the link between ptth and ecdysone; discuss why the phenotype is the opposite (ptth as a negative regulator of ecdysone in the pupa, for instance); the difference in size due to ptth LOF might be related to differential copulation rates.  

      B8. We have revised the discussion. We cannot exclude an effect of body size on female receptivity when PTTH was deleted or PTTH neurons were manipulated, although there is not enough evidence for such an effect.

      - scheme of pC neurons may help. 

      B9. We tried to label pC1 neurons with GFP and sort them by flow cytometry, but did not succeed. This may be because the number of pC1 neurons in one brain is too low. We will try single-cell sequencing in the future.

      - Immunofluorescence images are too small.

      B10. We have resized the small images.

      Reviewer #3 (Public Review): 

      Summary: 

      This manuscript shows that mutations that disable the gene encoding PTTH cause an increase in female receptivity (they mate more quickly), a phenotype that can be reversed by feeding these mutants the molting hormone, 20-hydroxyecdysone (20E). The use of an inducible system reveals that inhibition or activation of PTTH neurons during the larval stages increases and decreases female receptivity, respectively, suggesting that PTTH is required during the larval stages to affect the receptivity of the (adult) female fly. Showing that these neurons express the sex-determining gene dsx leads the authors to show that interfering with 20E actions in pC1 neurons, which are dsx-positive neurons known to regulate female receptivity, reduces female receptivity and increases the arborization pattern of pC1 neurons. The work concludes by showing that targeted knockdown of EcRA in pC1 neurons causes 527 genes to be differentially expressed in the brains of female flies, of which 123 passed a false discovery rate cutoff of 0.01; interestingly, the gene showing the greatest down-regulation was the gene encoding dopamine beta-monooxygenase.

      Strengths 

      This is an interesting piece of work, which may shed light on the basis for the observation noted previously that flies lacking PTTH neurons show reproductive defects ("... females show reduced fecundity"; McBrayer, 2007; DOI 10.1016/j.devcel.2007.11.003). 

      Weaknesses: 

      There are some results whose interpretation seems ambiguous and findings whose causal relationship is implied but not demonstrated.

      (1) At some level, the findings reported here are not at all surprising. Since 20E regulates the profound changes that occur in the central nervous system (CNS) during metamorphosis, it is not surprising that PTTH would play a role in this process. Although animals lacking PTTH (rather paradoxically) live to adulthood, they do show greatly extended larval instars and a corresponding great delay in the 20E rise that signals the start of metamorphosis. For this reason, concluding that PTTH plays a SPECIFIC role in regulating female receptivity seems a little misleading, since the metamorphic remodeling of the entire CNS is likely altered in PTTH mutants. Since these mutants produce overall normal (albeit larger--due to their prolonged larval stages) adults, these alterations are likely to be subtle. Courtship has been reported as one defect expressed by animals lacking PTTH neurons, but this behavior may stand out because reduced fertility and increased male-male courtship (McBrayer, 2007) would be noticeable defects to researchers handling these flies. By contrast, detecting defects in other behaviors (e.g., optomotor responses, learning and memory, sleep, etc) would require closer examination. For this reason, I would ask the authors to temper their statement that PTTH is SPECIFICALLY involved in regulating female receptivity.  

      C1. We agree that it is not surprising that PTTH regulates, through ecdysone, the profound changes that occur in the CNS during metamorphosis. Also, the behavioral changes induced by PTTH mutants are not limited to female receptivity. We will temper the statement about the function of PTTH in female receptivity.

      We think there are two new points in our text, although more evidence is needed in the future. On the one hand, PTTH deletion and the reduction of EcR-A in pC1 neurons during metamorphosis have opposite effects on female receptivity. On the other hand, the development of pC1-b neurons regulated by EcR-A during metamorphosis is important for female receptivity.

      (2) The link between PTTH and the role of pC1 neurons in regulating female receptivity is not clear. Again, since 20E controls the metamorphic changes that occur in the CNS, it is not surprising that 20E would regulate the arborization of pC1 neurons. And since these neurons have been implicated in female receptivity, it would therefore be expected that altering 20E signaling in pC1 neurons would affect this phenotype. However, this does not mean that the defects in female receptivity expressed by PTTH mutants are due to defects in pC1 arborization. For this, the authors would at least have to show that PTTH mutants show the changes in pC1 arborization shown in Fig. 6. And even then the most that could be said is that the changes observed in these neurons "may contribute" to the observed behavioral changes. Indeed, the changes observed in female receptivity may be caused by PTTH/20E actions on different neurons.

      C2. As newly formed prepupae, ptth-Gal4>UAS-Grim flies display changes in gene expression in response to a high-titer ecdysone pulse similar to those of genetic control flies. These include the repression of EcR (McBrayer et al., 2007). We tested whether there is a similar feedforward relationship between PTTH and EcR-A by quantifying the EcR-A mRNA level in the whole body of newly formed PTTH -/- and PTTH -/+ prepupae. Indeed, EcR-A was upregulated in the whole body of newly formed PTTH -/- prepupae compared with PTTH -/+ flies. We also examined the pattern of pC1 neurons when PTTH is deleted. Consistent with the feedforward relationship between PTTH and EcR-A expression in newly formed prepupae, PTTH deletion induced less established pC1-d neurons, the opposite of the effect induced by EcR-A reduction in pC1 neurons.

      However, it is not certain that EcR-A expression in pC1 neurons increases relative to genetic controls when PTTH is deleted. Furthermore, on the one hand, manipulation of PTTH has a general effect on neurodevelopment. On the other hand, the detailed pattern of pC1-b neurons, the key subtype regulating female receptivity through EcR-A function in pC1 neurons, could not be seen clearly. So the abnormal development of pC1-b neurons, if confirmed, is just one of the possible reasons for the effect of PTTH deletion on female receptivity.

      (3) Some of the results need commenting on, or refining, or revising:  a- For some assays PTTH behaves sometimes like a recessive gene and at other times like a semidominant, and yet at others like a dominant gene. For instance, in Fig. 1D-G, PTTH[-]/+ flies behave like wildtype (D), express an intermediate phenotype (E-F), or behave like the mutant (G). This may all be correct but merits some comment.

      C3. Female receptivity increases with age after eclosion, in PTTH mutants as well as in wild-type flies. On the first day after eclosion (Figure 1D), the loss of PTTH in PTTH[-]/+ flies may not be sufficient to cause the sexual precocity seen in PTTH -/-. From the second day after eclosion (Figure 1E-G), the loss of PTTH in PTTH[-]/+ flies is sufficient to enhance female receptivity compared with wild-type flies. However, after the 2nd day of adulthood, the female receptivity of all genotypes increases sharply. From the 3rd day of adulthood, the receptivity of PTTH -/- reaches its peak, and the receptivity of PTTH[-]/+ approaches that of PTTH -/- as the flies get older.

      b - Some of the conclusions are overstated. i) Although Fig. 2E-G does show that silencing the PTTH neurons during the larval stages affects copulation rate (E) the strength of the conclusion is tempered by the behavior of one of the controls (tub-Gal80[ts]/+, UAS-Kir2.1/+) in panels F and G, where it behaves essentially the same as the experimental group (and quite differently from the PTTH-Gal4/+ control; blue line). (Incidentally, the corresponding copulation latency should also be shown for these data.) ii) For Fig. 5I-K, the conclusion stated is that "Knock-down of EcR-A during pupal stage significantly decreased the copulation rate." Although strictly correct, the problem is that panel J is the only one for which the behavior of the control lacking the RNAi is not the same as that of the experimental group. Thus, it could just be that when the experiment was done at the pupal stage is the only situation when the controls were both different from the experimental. Again, the results shown in J are strictly speaking correct but the statement is too definitive given the behavior of one of the controls in panels I and K. Note also that panel F shows that the UAS-RNAi control causes a massive decrease in female fertility, yet no mention is made of this fact.

      C4. i) For all figures in the text, we describe the assay group as significantly different only when it differed significantly from all control groups. In Figure 2E-G, both control groups differed from the assay group only at the larval stage. The difference between the two control groups may be due to genetic background. We have described the statistical analysis in more detail in the legend. In addition, the corresponding copulation latency is now shown. ii) For Figure 5, we have revised the conclusion in the text to “when the experiment was done at the pupal stage is the only situation when the controls were both different from the experimental.” In addition, we now mention that the UAS-RNAi control causes a massive decrease in female fertility in panel F.

      Reviewer #3 (Recommendations For The Authors): 

      (1) I am not sure that PTTH neurons should be referred to as "PG neurons". I am aware that this name has been used before but the PG is a gland that does not have neurons; it is not even innervated in all insects. 

      C5. Agree. “PG neurons” has been changed into “PTTH neurons”.

      (2) Fig. 1A warrants some explanation. One can easily imagine what it shows but a description is warranted. 

      C6. Explanation has been added.

      (3) When more than one genotype is compared it would be more useful to use letters to mark the genotypes that are not statistically different from each other rather than simply using asterisks. For instance, in the case of copulation latencies shown in Fig. 1E-G, which result does the comparison refer to? For example, since the comparisons are the result of ANOVAs, which comparison receives "*" in Fig. 1F? Is it PTTH[-]/+ vs PTTH[-]/PTTH[-] or vs. +/+? 

      C7. Referred genotypes and conditions were marked in all figure legends.

      (4) Fig. 1H: Why is copulation latency of PTTH[-]/PTTH[-]+elav-GAL4 significantly different from that of PTTH[-]/PTTH[-]? This merits a comment. Also, why was elav-GAL4 used to effect the rescue and not the PTTH-GAL4 driver? 

      C8. We cannot explain this phenomenon. It may be due to the different genetic backgrounds of the controls. We have mentioned this in the figure legend.

      (5) Fig. 2C, the genotype is written in a confusing order, GAL4+UAS should go together as should LexA+LexAop. 

      C9. We have revised this to avoid confusion.

      (6) In Fig. 2, is "larval stage" the same period that is shown in Fig. 3A? Please clarify.

      C10. We have clarified this in text and legends.

      (7) Fig. 6. The fact that pC1 neurons can be labeled using the pC1-ss2-Gal4 at the start of the pupal stage does not mean that this is when these neurons appear (are born), only when they start expressing this GAL4. Other types of evidence would be needed to make a statement about the birthdate of these neurons. 

      C11. We have revised the description of the appearance of pC1-ss2-Gal4>GFP. The precise birth time of pC1 neurons will be tested in the future.

      (8) The results shown in Fig. 7 are not pursued further and thus appear like a prelude to the next manuscript. Unless the authors have more to add regarding the role of one of the differentially expressed genes (e.g., dopamine beta-monooxygenase, which they single out) I would suggest leaving this result out. 

      C12. We have left this out.

      (9) Female flies lacking PTTH neurons were reported to show lower fecundity by McBrayer et al. (2007) and should be cited. 

      C13. This important study has been cited in the first manuscript. In this revision, we have cited it again when mentioning the lower fecundity of female flies lacking PTTH neurons.

      (10) Line 230: when were PTTH neurons activated? Since they are dead by 10h post-eclosion it isn't clear if this experiment even makes sense. 

      C14. Yes, we did this to confirm that PTTH neurons do not affect female receptivity at the adult stage.

      (11) Line 338: the statements in the figures say that PTTH function is required during the larval stages, not during metamorphosis 

      C15. This has been revised as “The result suggested that EcR-A in pC1 neurons plays a role in virgin female receptivity during metamorphosis. This is consistent with that PTTH regulates virgin female receptivity before the start of metamorphosis.”

      (12) Did the authors notice any abnormal behavior in males? McBrayer et al. (2007) mention that males lacking PTTH neurons show male-male courtship. This may remit to the impact of 20E on other dsx[+] neurons. 

      C16. Yes, we have noticed that males lacking PTTH show male-male courtship. It is possible that PTTH deletion induces male-male courtship through the impact of 20E on other dsx+ or fru+ neurons. We have added the corresponding discussion.

      (13) Line 145: please define CCT at first use 

      C17. CCT has been defined.

      (14) Overall the manuscript is well written; however, it would still benefit from editing by a native English speaker. I have marked a few corrections that are needed, but I probably missed some. 

      + Line 77: "If female is not willing..." should say "If THE female is not willing..." 

      + Line 78 "...she may kick the legs, flick the wings," should say "...she may kick HER legs, flick HER wings," 

      + Lines 93-94 this sentence is unclear: "...while the neurons in that fru P1 promoter or dsx is expressed regulate some aspects..." 

      + Line 108 "...similar as the function of hypothalamic-pituitary-gonadal (HPG).." should say "...similar TO the function of hypothalamic-pituitary-gonadal (HPG).."

      + Line 152 "Due to that 20E functions through its receptor EcR.." should say "BECAUSE 20E ACTS through its receptor EcR.."

      + Lines 155, 354 "unnormal" is not commonly used (although it is an English word); "abnormal" is usually used instead. 

      + Line 273: "....we then asked that whether ecdysone regulates" delete "that"

      + Sentences lines 306-309 need to be revised.

      C18. Thank you for your suggestions. We have revised the text as advised.

    1. eLife assessment

      This study presents valuable findings on the relationship between prediction errors and brain activation in response to unexpected omissions of painful electric shocks. The strengths are the research question posed, as it has remained unresolved if prediction errors in the context of biologically aversive outcomes resemble reward-based prediction errors. The evidence is solid but there are weaknesses in the experimental design, where verbal instructions do not align with experienced outcome probabilities. It is further unclear how to interpret neural prediction error signaling in the assumed absence of learning. The work will be of interest to cognitive neuroscientists and psychologists studying appetitive and aversive learning.

    2. Reviewer #1 (Public Review):

      Summary:

      Willems and colleagues test whether unexpected shock omissions are associated with reward-related prediction errors by using an axiomatic approach to investigate brain activation in response to unexpected shock omission. Using an elegant design that parametrically varies shock expectancy through verbal instructions, they see a variety of responses in reward-related networks, only some of which adhere to the axioms necessary for prediction error. In addition, there were associations between omission-related responses and subjective relief. They also use machine learning to predict relief-related pleasantness and find that none of the a priori "reward" regions were predictive of relief, which is an interesting finding that can be validated and pursued in future work.

      Strengths:

      The authors pre-registered their approach and the analyses are sound. In particular, the axiomatic approach tests whether a given region can truly be called a reward prediction error. Although several a priori regions of interest satisfied a subset of axioms, no ROI satisfied all three axioms, and the authors were candid about this. A second strength was their use of machine learning to identify a relief-related classifier. Interestingly, none of the ROIs that have been traditionally implicated in reward prediction error reliably predicted relief, which opens important questions for future research.

      Weaknesses:

      The authors have done many analyses to address weaknesses in response to reviews. I will still note that given that one third of participants (n=10) did not show parametric SCR in response to instructions, it seems like some learning did occur. As prediction error is so important to such learning, a weakness of the paper is that conclusions about prediction error might differ if dynamic learning were taken into account using quantitative models.

    3. Reviewer #3 (Public Review):

      Summary:

      The authors conducted a human fMRI study investigating the omission of expected electrical shocks with varying probabilities. Participants were informed of the probability of shock and shock intensity trial-by-trial. The time point corresponding to the absence of the expected shock (with varying probability) was framed as a prediction error producing the cognitive state of relief/pleasure for the participant. fMRI activity in the VTA/SN and ventral putamen corresponded to the surprising omission of a high probability shock. Participants' subjective relief at having not been shocked correlated with activity in brain regions typically associated with reward-prediction errors. The overall conclusion of the manuscript was that the absence of an expected aversive outcome in human fMRI looks like a reward-prediction error seen in other studies that use positive outcomes.

      Strengths:

      Overall, I found this to be a well-written human neuroimaging study investigating an often overlooked question on the role of aversive prediction errors, and how they may differ from reward-related prediction errors. The paper is well-written and the fMRI methods seem mostly rigorous and solid.

      Once again, the authors were very responsive to feedback. I have no further comments.

    4. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review): 

      The reviewer retained most of their comments from the previous reviewing round. In order to meet these comments and to further examine the dynamic nature of threat omission-related fMRI responses, we now re-analyzed our fMRI results using the single trial estimates. The results of these additional analyses are added below in our response to the recommendations for the authors of reviewer 1. However, we do want to reiterate that there was a factually incorrect statement concerning our design in the reviewer’s initial comments. Specifically, the reviewer wrote that “25% of shocks are omitted, regardless of whether subjects are told that the probability is 100%, 75%, 50%, 25%, or 0%.” We want to repeat that this is not what we did. 100% trials were always reinforced (100% reinforcement rate); 0% trials were never reinforced (0% reinforcement rate). For all other instructed probability levels (25%, 50%, 75%), the stimulation was delivered in 25% of the trials (25% reinforcement rate). We have elaborated on this misconception in our previous letter and have added this information more explicitly in the previous revision of the manuscript (e.g., lines 125-129; 223-224; 486-492).   
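      The reinforcement schedule stated in this response can be written as a simple lookup table. This is only an illustrative sketch: the names `REINFORCEMENT_RATE` and `omission_rate` are hypothetical and do not come from the authors' materials; the rates themselves are the ones given in the paragraph above.

```python
# Actual reinforcement rate per instructed shock probability (in percent),
# as described in the response: 100% trials are always reinforced, 0% trials
# never, and all intermediate levels are reinforced on 25% of trials.
# The dictionary and function names are hypothetical, for illustration only.
REINFORCEMENT_RATE = {0: 0.0, 25: 0.25, 50: 0.25, 75: 0.25, 100: 1.0}


def omission_rate(instructed_probability: int) -> float:
    """Proportion of trials without stimulation for an instructed level."""
    return 1.0 - REINFORCEMENT_RATE[instructed_probability]


if __name__ == "__main__":
    for level in sorted(REINFORCEMENT_RATE):
        print(level, REINFORCEMENT_RATE[level], omission_rate(level))
```

      Written this way, it is clear that instructed and experienced probabilities coincide only for the 0%, 25%, and 100% levels, and that the omission rate is 75% rather than 25% for the intermediate levels.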

      Reviewer #1 (Recommendations For The Authors): 

      I do not have any further recommendations, although I believe an analysis of learning-related changes is still possible with the trial-wise estimates from unreinforced trials. The authors' response does not clarify whether they tested for interactions with run, and thus the fact that there are main effects does not preclude learning. I kept my original comments regarding limitations, with the exception of the suggestion to modify the title. 

      We thank the reviewer for this recommendation. In line with their suggestion, we have now reanalyzed our main ROI results using the trial-by-trial estimates we obtained from the first-level omission>baseline contrasts. Specifically, we extracted beta-estimates from each ROI and entered them into the same Probability x Intensity x Run LMM we used for the relief and SCR analyses. Results from these analyses (in the full sample) were similar to our main results. For the VTA/SN model, we found main effects of Probability (F = 3.12, p = .04), and Intensity (F = 7.15, p < .001) (in the model where influential outliers were rescored to 2 SD from the mean). There was no main effect of Run (F = 0.92, p = .43) and no Probability x Run interaction (F = 1.24, p = .28). If the experienced contingency had interfered with the instructions, there should have been a Probability x Run interaction (with the effect of Probability only being present in the first runs). Since we did not observe such an interaction, our results indicate that even though some learning might still have taken place, the main effect of Probability remained present throughout the task.

      There is an important side note regarding these analyses: For the first level GLM estimation, we concatenated the functional runs and accounted for baseline differences between runs by adding run-specific intercepts as regressors of no-interest. Hence, any potential main effect of run was likely modeled out at first level. This might explain why, in contrast to the rating and SCR results (see Supplemental Figure 5), we found no main effect of Run. Nevertheless, interaction effects should not be affected by including these run-specific intercepts.

      Note that when we ran the single-trial analysis for the ventral putamen ROI, the effect of intensity became significant (F = 3.89, p = .02). Results neither changed for the NAc, nor the vmPFC ROIs.  

      Reviewer #2 (Public Review): 

      Comments on revised version: 

      I want to thank the authors for their thorough and comprehensive work in revising this manuscript. I agree with the authors that learning paradigms might not be a necessity when it comes to studying PE signals, but I don't particularly agree with some of the responses in the rebuttal letter ("Furthermore, conditioning paradigms generally only include one level of aversive outcome: the electrical stimulation is either delivered or omitted."). This is of course a correct description of the conditioning paradigm, but the same can be said for an instructed design: the aversive outcome was either delivered or not. That being said, adopting the instructed design itself is legitimate in my opinion.

      We thank the reviewer for this comment. We have now modified the phrasing of this argument to clarify our reasoning (see lines 102-104: “First, these only included one level of aversive outcome: the electrical stimulation was either delivered at a fixed intensity, or omitted; but the intensity of the stimulation was never experimentally manipulated within the same task.”).  

      The reason why we mentioned that “the aversive outcome is either delivered or omitted” is because in most contemporary conditioning paradigms only one level of aversive US is used. In these cases, it is therefore not possible to investigate the effect of US Intensity. In our paradigm, we included multiple levels of aversive US, allowing us to assess how the level of aversiveness influences threat omission responding. It is indeed true that each level was delivered or not. However, our data clearly (and robustly across experiments, see Willems & Vervliet, 2021) demonstrate that the effects of the instructed and perceived unpleasantness of the US (as operationalized by the mean reported US unpleasantness during the task) on the reported relief and the omission fMRI responses are stronger than the effect of instructed probability.  

      My main concern, which the authors addressed at some length in the rebuttal letter, still remains about the validity of the different instructed probabilities. Although subjects were told that the trials were independent, the big difference between 75% and 25% would more than likely confuse the subjects, especially given that most of us would fall prey to the Gambler's fallacy (or the law of small numbers) to some degree. When the instruction and subjective experience collide, some form of inference or learning must have occurred, making the otherwise straightforward analysis more complex. Therefore, I believe that more rigorous, quantitative learning modeling work can dramatically improve the validity of the results. Of course, I also realize how much extra work is needed to append the computational part, but without it there is always a theoretical loophole in the current experimental design.

      We agree with the reviewer that some learning may have occurred in our task. However, we believe the most important question in relation to our study is: to what extent did this learning influence our manipulations of interest?  

      In our reply to reviewer 1, we already showed that a re-analysis of the fMRI results using the trial-by-trial estimates of the omission contrasts revealed no Probability x Run interaction, suggesting that – overall – the probability effect remained stable over the course of the experiment. However, inspired by the alternative explanation that was proposed by this reviewer, we now also assessed the role of the Gambler’s fallacy in a separate set of analyses. Indeed, it is possible that participants start to expect a stimulation more after more time has passed since the last stimulation was experienced. To test this alternative hypothesis, we specified two new regressors that calculated for each trial of each participant how many trials had passed since the last stimulation (or since the beginning of the experiment) either overall (across all trials of all probability types; hence called the overall-lag regressor) or per probability level (across trials of each probability type separately; hence called the lag-per-probability regressor). For both regressors a value of 0 indicates that the previous trial was either a stimulation trial or the start of the experiment, a value of 1 means that the last stimulation trial was 2 trials ago, etc.

      The results of these additional analyses are added in a supplemental note (see supplemental note 6), and referred to in the main text (see lines 231-236: “Likewise, a post-hoc trial-by-trial analysis of the omission-related fMRI activations confirmed that the Probability effect for the VTA/SN activations was stable over the course of the experiment (no Probability x Run interaction) and remained present when accounting for the Gambler’s fallacy (i.e., the possibility that participants start to expect a stimulation more when more time has passed since the last stimulation was experienced) (see supplemental note 6). Overall, these post-hoc analyses further confirm the PE-profile of omission-related VTA/SN responses”).

      Addition to supplemental material (pages 16-18)

      Supplemental Note 6: The effect of Run and the Gambler’s Fallacy 

      A question that was raised by the reviewers was whether omission-related responses could be influenced by dynamical learning or the Gambler’s Fallacy, which might have affected the effectiveness of the Probability manipulation.  

      Inspired by this question, we exploratorily assessed the role of the Gambler’s Fallacy and the effects of Run in a separate set of analyses. Indeed, it is possible that participants start to expect a stimulation more when more time has passed since the last stimulation was experienced. To test this alternative hypothesis, we specified two new regressors that calculated for each trial of each participant how many trials had passed since the last stimulation (or since the beginning of the experiment) either overall (across all trials of all probability types; hence called the overall-lag regressor) or per probability level (across trials of each probability type separately; hence called the lag-per-probability regressor). For both regressors a value of 0 indicates that the previous trial was either a stimulation trial or the start of the experiment, a value of 1 means that the last stimulation trial was 2 trials ago, etc.
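
      The two lag regressors described above can be sketched in a few lines of Python. This is only an illustration of the counting rule as stated in the note (0 when the previous trial was a stimulation or the start of the experiment, 1 when the last stimulation was 2 trials ago, and so on). The function names and the example trial sequence are hypothetical, and details such as whether 0% and 100% trials enter the count are not specified in the note and are not modelled here.

```python
from typing import Hashable, List, Sequence


def overall_lag(stimulated: Sequence[bool]) -> List[int]:
    """Trials elapsed since the last stimulation, counted across all trials.

    0 = previous trial was a stimulation trial (or this is the first trial),
    1 = the last stimulation was 2 trials ago, etc.
    """
    lags = []
    counter = 0  # hypothetical convention: the experiment start counts as a reset
    for was_stimulated in stimulated:
        lags.append(counter)
        counter = 0 if was_stimulated else counter + 1
    return lags


def lag_per_probability(
    stimulated: Sequence[bool], probability: Sequence[Hashable]
) -> List[int]:
    """Same counting rule, applied separately within each probability level."""
    counters = {}
    lags = []
    for was_stimulated, level in zip(stimulated, probability):
        current = counters.get(level, 0)
        lags.append(current)
        counters[level] = 0 if was_stimulated else current + 1
    return lags


# Hypothetical six-trial sequence (True = stimulation delivered).
stimulated = [False, True, False, False, True, False]
probability = [25, 75, 25, 75, 25, 75]
print(overall_lag(stimulated))
print(lag_per_probability(stimulated, probability))
```

      Keeping one counter per probability level is what distinguishes the lag-per-probability regressor from the overall-lag regressor; everything else about the counting rule is shared.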

      The new models including these regressors for each omission response type (i.e., omission-related activations for each ROI, relief, and omission-SCR) were specified as follows:   

      (1) For the overall lag:

      Omission response ~ Probability * Intensity * Run + US-unpleasantness + Overall-lag + (1|Subject).  

      (2) For the lag per probability level:

      Omission response ~ Probability * Intensity * Run + US-unpleasantness + Lag-per-probability : Probability + (1|Subject).  

      Where US-unpleasantness scores were mean-centered across participants; “*” represents main effects and interactions, and “:” represents an interaction (without main effect). Note that we only included an interaction for the lag-per-probability model to estimate separate lag-parameters for each probability level.  
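
      Written out as equations, the two model formulas can be read roughly as follows. This is a sketch under stated assumptions, not the authors' own notation: all symbols are introduced here for illustration, Gaussian residuals and random intercepts are assumed (as in a standard linear mixed model), and the indexing of US-unpleasantness by trial is a guess. Here $y_{ij}$ is the omission response on trial $i$ of subject $j$, $f(\cdot)$ collects all main effects and two- and three-way interactions of Probability ($P$), Intensity ($I$) and Run ($R$), $U$ is mean-centered US-unpleasantness, and $L$ is the lag regressor.

```latex
\begin{align}
% Model (1): a single lag slope shared by all probability levels
y_{ij} &= \beta_0 + f(P_{ij}, I_{ij}, R_{ij}) + \beta_U U_{ij}
          + \beta_L L^{\text{overall}}_{ij} + u_j + \varepsilon_{ij} \\
% Model (2): the ":" term yields one lag slope per probability level p
y_{ij} &= \beta_0 + f(P_{ij}, I_{ij}, R_{ij}) + \beta_U U_{ij}
          + \sum_{p} \beta_{L,p}\, \mathbf{1}[P_{ij} = p]\, L^{\text{per-prob}}_{ij}
          + u_j + \varepsilon_{ij} \\
u_j &\sim \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{align}
```

      The only structural difference between the two models is the lag term: model (1) estimates one slope $\beta_L$, whereas model (2) estimates a separate slope $\beta_{L,p}$ for each probability level without adding a lag main effect.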

      The results of these analyses are presented in the tables below. Overall, we found that adding these lag-regressors to the model did not alter our main results. That is: for the VTA/SN, relief and omission-SCR, the main effects of Probability and Intensity remained. Interestingly, the overall-lag-effect itself was significant for VTA/SN activations and omission SCR, indicating that VTA/SN activations were larger when more time had passed since the last stimulation (beta = 0.19), whereas SCRs were smaller when more time had passed (beta = -0.03). This pattern is reminiscent of the Perruchet effect, namely that the explicit expectancy of a US increases over a run of non-reinforced trials (in line with the gambler’s fallacy effect) whereas the conditioned physiological response to the conditional stimulus declines (in line with an extinction effect; Perruchet, 1985; McAndrew, Jones, McLaren, & McLaren, 2012). Thus, the observed dissociation between the VTA/SN activations and omission SCR might similarly point to two distinct processes, where VTA/SN activations are more dependent on a consciously controlled process that is subject to the gambler’s fallacy, whereas the strength of the omission SCRs is more dependent on an automatic associative process that is subject to extinction. Importantly, however, even though the temporal distance to the last stimulation had these opposing effects on VTA/SN activations and omission SCRs, the main effects of the probability manipulation remained significant for both outcome variables. This means that the core results of our study still hold.

      Next to the overall-lag effect, the lag-per-probability regressor was only significant for the vmPFC. A follow-up of the beta estimates of the lag-per-probability regressors for each probability level revealed that vmPFC activations increased with increasing temporal distance from the stimulation, but only for the 50% trials (beta = 0.47, t = 2.75, p < .01), and not the 25% (beta = 0.25, t = 1.49, p = .14) or the 75% trials (beta = 0.28, t = 1.62, p = .10).

      Author response table 1.

      F-statistics and corresponding p-values from the overall lag model

      (*) F-test and p-values were based on the model where outliers were rescored to 2SD from the mean. Note that when retaining the influential outliers for this model, the p-value of the probability effect was p = .06. For all other outcome variables, rescoring the outliers did not change the results. Significant effects are indicated in bold.

      Author response table 2.

      F-statistics and corresponding p-values from the lag per probability level model

      (*) F-test and p-values were based on the model where outliers were rescored to 2SD from the mean. Note that when retaining the influential outliers for this model, the p-value of the Intensity x Run interaction was p = .05. For all other outcome variables, rescoring the outliers did not change the results. Significant effects are indicated in bold.

      As the authors mentioned in the rebuttal letter, "selecting participants only if their anticipatory SCR monotonically increased with each increase in instructed probability 0% < 25% < 50% < 75% < 100%, N = 11 participants", only ~1/3 of the subjects actually showed strong evidence for the validity of the instructions. This further raises the question of whether the instructed design, due to the interference of false instruction and the dynamic learning among trials, is solid enough to test the hypothesis.

      We agree with the reviewer that a monotonic increase in anticipatory SCR with increasing probability instructions would provide the strongest evidence that the manipulation worked. However, it is well known that SCR is a noisy measure, and so the chances to see this monotonic increase are rather small, even if the underlying threat anticipation increases monotonically. Furthermore, between-subject variation is substantial in physiological measures, and it is not uncommon to observe, e.g., differential fear conditioning in one measure, but not in another (Lonsdorf & Merz, 2017). It is therefore not so surprising that ‘only’ 1/3 of our participants showed the perfect pattern of monotonically increasing SCR with increasing probability instructions. That being said, it is also important to note that not all participants were considered for these follow-up analyses because valid SCR data was not always available.

      Specifically, N = 4 participants were identified as anticipation non-responders (i.e. participants with a smaller average SCR to the clock on 100% than on 0% trials; pre-registered criterion) and were excluded from the SCR-related analyses, and N = 1 participant had missing data due to technical difficulties. This means that only 26 (and not 31) participants were considered for the post hoc analyses. Taking this information into account, 21 out of 26 participants (approximately 80%) showed stronger anticipatory SCR following 75% instructions compared to 25% instructions, and 11 out of 26 participants (approximately 40%) even showed the monotonic increase in their anticipatory SCR (see supplemental figure 4). Furthermore, although anticipatory SCR gradually decreased over the course of the experiment, there was no Run x Probability interaction, indicating that the effect of the instructions remained stable throughout the task (see supplemental figure 3).

      Reviewer #2 (Recommendations For The Authors):

      A more operational approach might be to break the trials into different sections along the timeline and examine how much the results might have been affected across time. I expect the manipulation checks would hold for the first one or two runs and the authors then would have good reasons to focus on the behavioral and imaging results for those runs. 

      This recommendation resembles the recommendation by reviewer 1. In our reply to reviewer 1, we showed the results of a re-analysis of the fMRI data using the trial-by-trial estimates of the omission contrasts, which revealed no Probability x Run interaction, suggesting that – overall – the probability effect remained (more or less) stable over the course of the experiment. For a more in-depth discussion of the results of this additional analysis, we refer to our answer to reviewer 1.

      Reviewer #3 (Public Review): 

      Comments on revised version: 

      The authors were extremely responsive to the comments and provided a comprehensive rebuttal letter with a lot of detail to address the comments. The authors clarified their methodology, and rationale for their task design, which required some more explanation (at least for me) to understand. Some of the design elements were not clear to me in the original paper. 

      The initial framing for their study is still in the domain of learning. The paper starts off with a description of extinction as the prime example of when threat is omitted. This could lead a reader to think the paper would speak to the role of prediction errors in extinction learning processes. But this is not their goal, as they emphasize repeatedly in their rebuttal letter. The revision also now details how using a conditioning/extinction framework doesn't suit their experimental needs. 

      We thank the reviewer for pointing out this potential cause of confusion. We have now rewritten the starting paragraph of the introduction to more closely focus on prediction errors, and only discuss fear extinction as a potential paradigm that has been used to study the role of threat omission PE for fear extinction learning (see lines 40-55). We hope that these adaptations are sufficient to prevent any false expectations. However, as we have mentioned in our previous response letter, not talking about fear extinction at all would also not make sense in our opinion, since most of the knowledge we have gained about threat omission prediction errors to date is based on studies that employed these paradigms.  

      Adaptation in the revised manuscript (lines 40-55):  

      “We experience pleasurable relief when an expected threat stays away1. This relief indicates that the outcome we experienced (“nothing”) was better than we expected it to be (“threat”). Such a mismatch between expectation and outcome is generally regarded as the trigger for new learning, and is typically formalized as the prediction error (PE) that determines how much there can be learned in any given situation2. Over the last two decades, the PE elicited by the absence of expected threat (threat omission PE) has received increasing scientific interest, because it is thought to play a central role in learning of safety. Impaired safety learning is one of the core features of clinical anxiety4. A better understanding of how the threat omission PE is processed in the brain may therefore be key to optimizing therapeutic efforts to boost safety learning. Yet, despite its theoretical and clinical importance, research on how the threat omission PE is computed in the brain is only emerging.  

      To date, the threat omission PE has mainly been studied using fear extinction paradigms that mimic safety learning by repeatedly confronting a human or animal with a threat predicting cue (conditional stimulus, CS; e.g. a tone) in the absence of a previously associated aversive event (unconditional stimulus, US; e.g., an electrical stimulation). These (primarily non-human) studies have revealed that there are striking similarities between the PE elicited by unexpected threat omission and the PE elicited by unexpected reward.”

      It is reasonable to develop a new task to answer their experimental questions. By no means is there a requirement to use a conditioning/extinction paradigm to address their questions. As they say, "it is not necessary to adopt a learning paradigm to study omission responses", which I agree with.  But the authors seem to want to have it both ways: they frame their paper around how important prediction errors are to extinction processes, but then go out of their way to say how they can't test their hypotheses with a learning paradigm.

      Part of their argument that they needed to develop their own task "outside of a learning context" goes as follows: 

      (1) "...conditioning paradigms generally only include one level of aversive outcome: the electrical stimulation is either delivered or omitted. As a result, the magnitude-related axiom cannot be tested." 

      (2) "....in conditioning tasks people generally learn fast, rendering relatively few trials on which the prediction is violated. As a result, there is generally little intra-individual variability in the PE responses" 

      (3) "...because of the relatively low signal to noise ratio in fMRI measures, fear extinction studies often pool across trials to compare omission-related activity between early and late extinction, which further reduces the necessary variability to properly evaluate the probability axiom" 

      These points seem to hinge on how tasks are "generally" constructed. However, there are many adaptations to learning tasks:

      (1) There is no rule that conditioning can't include different levels of aversive outcomes following different cues. In fact, their own design uses multiple cues that signal different intensities and probabilities. Saying that conditioning "generally only include one level of aversive outcome" is not an explanation for why "these paradigms are not tailored" for their research purposes. There are also several conditioning studies that have used different cues to signal different outcome probabilities. This is not uncommon, and in fact is what they use in their study, only with an instruction rather than through learning through experience, per se.

      (2) Conditioning/extinction doesn't have to occur fast. Just because people "generally learn fast" doesn't mean this has to be the case. Experiments can be designed to make learning more challenging or take longer (e.g., partial reinforcement). And there can be intra-individual differences in conditioning and extinction, especially if some cues have a lower probability of predicting the US than others. Again, the fact that most conditioning tasks are usually constructed in a fairly simplistic manner doesn't negate the utility of learning paradigms to address PE axioms.

      (3) Many studies have tracked trial-by-trial BOLD signal in learning studies (e.g., using parametric modulation). Again, just because other studies "often pool across trials" is not an explanation for these paradigms being ill-suited to study prediction errors. Indeed, most computational models used in fMRI are predicated on analyzing data at the trial level. 

      We thank the reviewer for these remarks. The “fear conditioning and extinction paradigms” that we were referring to in this paragraph were the ones that have been used to study threat omission PE responses in previous research (e.g., Raczka et al., 2011; Thiele et al., 2021; Lange et al., 2020; Esser et al., 2021; Papalini et al., 2021; Vervliet et al., 2017). These studies have mainly used differential/multiple-cue protocols where either one (or two) CS+ and one CS- are trained in an acquisition phase and extinguished in the next phase. Thus, in these paradigms: (1) only one level of aversive US is used; (2) as safety learning develops over the course of extinction, there are relatively few omission trials during which “large” threat omission PEs can be observed (e.g. from the 24 CS+ trials that were used during extinction in Esser et al., the steepest decreases in expectancy – and thus the largest PE – were found in the first 6 trials); and (3) there is never absolute certainty that the stimulation will no longer follow. Some of these studies have indeed estimated the threat omission PE during the extinction phase based on learning models, and have entered these estimates as parametric modulators to CS-offset regressors. This is very informative. However, the exact model that was used differed per study (e.g. Rescorla-Wagner in Raczka et al. and Thiele et al.; or a Rescorla-Wagner–Pearce-Hall hybrid model in Esser et al.). We wanted to analyze threat omission responses without commitment to a particular learning model. Thus, in order to examine how threat omission responses vary as a function of probability-related expectations, a paradigm that has multiple probability levels is recommended (e.g. Rutledge et al., 2010; Ojala et al., 2022).
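
      The point that large omission PEs are confined to the first few extinction trials can be illustrated with a minimal Rescorla-Wagner sketch. This is not the model fitted in any of the cited studies: the learning rate and the starting expectancy below are hypothetical values chosen purely for illustration, and only the number of trials (24) is taken from the example mentioned above.

```python
from typing import List


def extinction_prediction_errors(
    initial_expectancy: float = 1.0,  # hypothetical: US fully expected after acquisition
    learning_rate: float = 0.3,       # hypothetical value, not estimated from data
    n_trials: int = 24,               # number of CS+ extinction trials in the example
) -> List[float]:
    """Rescorla-Wagner prediction errors on unreinforced (omission) trials."""
    expectancy = initial_expectancy
    errors = []
    for _ in range(n_trials):
        outcome = 0.0                      # the US is omitted on every extinction trial
        pe = outcome - expectancy          # negative PE: outcome better than expected
        errors.append(pe)
        expectancy += learning_rate * pe   # expectancy moves toward the outcome
    return errors


errors = extinction_prediction_errors()
magnitudes = [abs(pe) for pe in errors]
early_share = sum(magnitudes[:6]) / sum(magnitudes)
print(f"Share of summed PE magnitude in the first 6 trials: {early_share:.2f}")
```

      Under these assumed parameters the PE magnitude shrinks geometrically from trial to trial, so most of the summed PE falls in the earliest trials; with a different learning rate the decay would be faster or slower, which is exactly the between-participant variability the response refers to.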

      The reviewer rightfully pointed out that conditioning paradigms (more generally) can be tailored to fit our purposes as well. Still, when doing so, the same adaptations as we outlined above need to be considered: i.e. include different levels of US intensity; different levels of probability; and conditions with full certainty about the US (non)occurrence. In our attempt to keep the experimental design as simple and straightforward as possible, we decided to rely on instructions for this purpose, rather than to train 3 (US levels) x 5 (reinforcement levels) = 15 different CSs. It is certainly possible to train multiple CSs of varying reinforcement rates (e.g. Grings et al., 1971; Ojala et al., 2022). However, given that US-expectation on each trial would primarily depend on the individual learning processes of the participants, using a conditioning task would make it more difficult to maintain experimental control over the level of US-expectation elicited by each CS. As a result, this would likely require more extensive training, and thus prolong the study procedure considerably. Furthermore, even though previous studies have trained different CSs for different reinforcement rates, most of these studies have only used one level of US. Thus, in order not to overcomplicate our task, we decided to rely on instructions rather than to train CSs for multiple US levels (in addition to multiple reinforcement rates).

      We have tried to clarify our reasoning in the revised version of the manuscript (see introduction, lines 100-113):  

      “The previously discussed fear conditioning and extinction studies have been invaluable for clarifying the role of the threat omission PE within a learning context. However, these studies were not tailored to create the varying intensity and probability-related conditions that are required to systematically evaluate the threat omission PE in the light of the PE axioms. First, these only included one level of aversive outcome: the electrical stimulation was either delivered or omitted; but the intensity of the stimulation was never experimentally manipulated within the same task. As a result, the magnitude-related axiom could not be tested. Second, as safety learning progressively developed over the course of extinction learning, the most informative trials to evaluate the probability axiom (i.e. the trials with the largest PE) were restricted to the first few CS+ offsets of the extinction phase, and the exact number of these informative trials likely differed across participants as a result of individually varying learning rates. This limited the experimental control and necessary variability to systematically evaluate the probability axiom. Third, because CS-US contingencies changed over the course of the task (e.g. from acquisition to extinction), there was never complete certainty about whether the US would (not) follow. This precluded a direct comparison of fully predicted outcomes. Finally, within a learning context, it remains unclear whether brain responses to the threat omission are in fact responses to the violation of expectancy itself, or whether they are the result of subsequent expectancy updating.”

      Again, the authors are free to develop their own task design that they think is best suited to address their experimental questions, for instance if they truly believe that omission-related responses should be studied independent of updating. The question I'm still left puzzling over is why the paper is so strongly framed around extinction (the word appears several times in the main body of the paper), which is a learning process, and yet the authors go out of their way to say that they can only test their hypotheses outside of a learning paradigm.

      As we have mentioned before, the reason why we refer to extinction studies is because most evidence on threat omission PE to date comes from fear extinction paradigms.  

      The authors did address other areas of concern, to varying extents. Some of these issues were somewhat glossed over in the rebuttal letter by noting them as limitations. For example, the issue with comparing 100% stimulation to 0% stimulation, when the shock contaminates the fMRI signal. This was noted as a limitation that should be addressed in future studies, bypassing the critical point. 

      It is unclear to us what the reviewer means by “bypassing the critical point”. We argued in the manuscript that the contrast we initially specified and preregistered to study axiom 3 (fully predicted outcomes elicit equivalent activation) could not be used for this purpose, as it was confounded by the delivery of the stimulation. Because 100% trials always included the stimulation and 0% trials never included stimulation, there was no way to disentangle activations related to full predictability from activations related to the stimulation as such.

      Reviewer #3 (Recommendations For The Authors): 

      I'm not sure the new paragraph explaining why they can't use a learning task to test their hypotheses is very convincing, as I noted in my review. Again, it is not a problem to develop a new task to address their questions. They can justify why they want to use their task without describing (incorrectly in my opinion) that other tasks "generally" are constructed in a way that doesn't suit their needs. 

      For an overview of the changes we made in response to this recommendation, we refer to our reply to the public review.   

      We look forward to your reply and are happy to provide answers to any further questions or comments you may have.

    1. eLife assessment

      This useful study describes the second earliest known winged ovule without a cupule in the Famennian of the Late Devonian. Using solid mathematical analysis, the authors demonstrate that three-winged seeds are more adapted to wind dispersal than one-, two- and four-winged seeds. The manuscript will help the scientific community to understand the origin and early evolutionary history of the wind dispersal strategy of early land plants.

    2. Reviewer #1 (Public Review):

      Summary:

      Winged seeds or ovules from the Devonian are crucial to understanding the origin and early evolutionary history of wind dispersal strategy. Based on exceptionally well-preserved fossil specimens, the present manuscript documented a new fossil plant taxon (new genus and new species) from the Famennian Series of Upper Devonian in eastern China and demonstrated that three-winged seeds are more adapted to wind dispersal than one-, two- and four-winged seeds by using mathematical analysis.

      Strengths:

      The manuscript is well organised and well presented, with superb illustrations. The methods used in the manuscript are appropriate.

      Weaknesses:

      I would only like to suggest moving the "Mathematical analysis of wind dispersal of ovules with 1-4 wings" section from the supplementary information to the main text, leaving the supplementary figures as supplementary materials.

    3. Author response:

      The following is the authors’ response to the original reviews.

      The manuscript lacks a conclusion section to summarize its findings. The rebuttal is too brief to state where and in what way the authors have made their revisions. In this case, please return this revision to the authors and ask them to revise their contribution carefully.

      We now indicate in detail the places and the way that we make revisions. Specific revisions in sentences/words are marked with blue color in the main text where necessary. A conclusion is now provided at the end of the main text (lines 264-275). Other major revisions include:

      (1) We add Fig. 5 as a new figure to reconstruct ovule structure of Alasemenia and to compare three- and four-winged ovules. This is followed by Fig. 6 relating to mathematical analysis.

      (2) We re-organize (sequences of some) paragraphs and revise sentences in Discussion, and then divide Discussion into three parts: “Late Devonian acupulate ovules and their functions” (lines 124-150), “Late Devonian winged ovules and evolution of ovular wings” (lines 151-179), “Mathematical analysis of wind dispersal of ovules with 1-4 wings” (lines 180-262).

      (3) We move “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section from the supplementary information to the main text as the third part of Discussion (lines 180-262). The original paragraph headed with Mathematical analysis in Results is now modified and inserted to “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section (lines 250-256). The last paragraph in the original Supplementary information is now greatly modified and presented at the end of “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section (lines 256-262).

      (4) With moving “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section from the supplementary information to the main text, five references are accordingly added to the list (lines 278-282, 296-300, 329-330).

      (5) We change the format of citing references in the main text.

      We have therefore returned your manuscript to you to allow you to make the updates necessary to address the editors' comments. Please ensure that you also update your preprint with the newly revised version once complete.

      Many thanks for this allowance and we now make the necessary updates to address the editors’ and reviewers’ comments. At the same time, the new version is also provided as a preprint.

      Reviewer #1 (Public Review):

      Summary:

      Winged seeds or ovules from the Devonian are crucial to understanding the origin and early evolutionary history of wind dispersal strategy. Based on exceptionally well-preserved fossil specimens, the present manuscript documented a new fossil plant taxon (new genus and new species) from the Famennian Series of Upper Devonian in eastern China and demonstrated that three-winged seeds are more adapted to wind dispersal than one-, two- and four-winged seeds by using mathematical analysis.

      Many thanks for these positive comments by the reviewer.

      Strengths:

      The manuscript is well organised and well presented, with superb illustrations. The methods used in the manuscript are appropriate.

      Many thanks for the reviewer’s positive comments.

      Weaknesses:

      I would only like to suggest moving the "Mathematical analysis of wind dispersal of ovules with 1-4 wings" section from the supplementary information to the main text, leaving the supplementary figures as supplementary materials.

      Ok, following the suggestion, we have moved this “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section to the main text (lines 180-262). It now represents the third part of Discussion. The original paragraph headed with Mathematical analysis in Results is now modified and inserted to “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section (lines 250-256). The last paragraph in the original Supplementary information is now greatly modified and presented at the end of “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section (lines 256-262).

      Reviewer #2 (Public Review):

      Summary:

      This manuscript described the second earliest known winged ovule without a cupule in the Famennian of the Late Devonian. Using mathematical analysis, the authors suggest that the integuments of the earliest ovules without a cupule, as in the new taxon and Guazia, evolved functions in wind dispersal.

      Yes, these include our description, mathematical analysis and suggestion.

      Strengths:

      The new ovule taxon's morphological part is convincing. It provides additional evidence for the earliest winged ovules, and the mathematical analysis helps to understand their function.

      Many thanks for these positive comments of the reviewer.

      Weaknesses:

      The discussion should be enhanced to clarify the significance of this finding. What is the new advance compared with the Guazia finding? The authors can illustrate the character transformations using a simplified cladogram. The present version of the main text looks flat.

      To clarify the significance of this finding, we have enhanced the discussion in the following respects. We have reorganized the contents of the Discussion and divided it into three parts. These three parts are entitled “Late Devonian acupulate ovules and their functions” (lines 124-150), “Late Devonian winged ovules and evolution of ovular wings” (lines 151-179), and “Mathematical analysis of wind dispersal of ovules with 1-4 wings” (lines 180-262). The third part is adapted from the original Supplementary information.

      Regarding new advance (Alasemenia) compared with Guazia and illustration of the character transformations:

      (1) we now provide a new figure (Fig. 5) to reconstruct the ovule of Alasemenia and to compare the structure of these two ovules.

      (2) in the second part of Discussion, we now say “As in Alasemenia (Fig. 5a), the integumentary wings of acupulate ovule of Guazia are broad, thin and fold inwards along the abaxial side, but their numbers are four in each ovule and their free portions usually arch centripetally (Fig. 5c; Wang et al., 2022, Figure 5).”

      (3) also in the second part of Discussion, we now say “Compared to Warsteinia with short and straight wings and Guazia with long but distally inwards curving wings, Alasemenia with longer and outwards extending wings would efficiently reduce the rate of descent and be more capably moved by wind. Furthermore, the quantitative analysis in mathematics indicates that three-winged ovules such as Alasemenia are more adapted to wind dispersal than four-winged ovules including Warsteinia and Guazia (see following).”

      (4) in the third part of Discussion, we now say “Significantly, the maximum windward area of each wing of Alasemenia is greater than that of Guazia and Warsteinia with four wings. All these factors suggest that Alasemenia is well adapted for anemochory.”

      (5) in Conclusion, we now say “Compared to Famennian four-winged ovules of Warsteinia and Guazia, Alasemenia with three distally outwards extending wings shows advantage in anemochory.”

      Recommendations for the authors:

      Ok, we have made some revisions and kept some of the original content.

      Reviewer #1 (Recommendations For The Authors):

      I would only like to suggest moving the "Mathematical analysis of wind dispersal of ovules with 1-4 wings" section from the supplementary information to the main text, leaving the supplementary figures as supplementary materials.

      Ok, following the suggestion, we have now moved this “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section to the main text (lines 180-262). It now represents the third part of the Discussion.

      Reviewer #2 (Recommendations For The Authors):

      (1) The mathematical part as the supplement can be incorporated into the text.

      Ok, following the suggestion, we have now moved this “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section to the main text (lines 180-262). It now represents the third part of the Discussion. The original paragraph headed “Mathematical analysis” in the Results has been modified and inserted into the “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section (lines 250-256). The last paragraph of the original Supplementary information has been substantially modified and is now presented at the end of the “Mathematical analysis of wind dispersal of ovules with 1-4 wings” section (lines 256-262).

      (2) The comparisons between three- or four-winged ovules are not addressed enough.

      We now add Fig. 5 as a new figure. Based on this figure and revisions, the comparisons between three- and four-winged ovules now include:

      a) “Their integumentary wings illustrate diversity in number (three or four per ovule), length, folding or flattening, and being straight or curving distally. As in Alasemenia (Fig. 5a), the integumentary wings of acupulate ovule of Guazia are broad, thin and fold inwards along the abaxial side, but their numbers are four in each ovule and their free portions usually arch centripetally (Fig. 5c; Wang et al., 2022, Figure 5). In contrast to Alasemenia, Warsteinia has four integumentary wings without folding and their free portions are short and straight (Rowe, 1997, TEXT-FIG. 4).” (lines 154-160).

      b) “Furthermore, the quantitative analysis in mathematics indicates that three-winged ovules such as Alasemenia are more adapted to wind dispersal than four-winged ovules including Warsteinia and Guazia (see following).” (lines 166-168).

      c) “The relative wind dispersal efficiency of three-winged seeds is obviously better than that of single- and two- winged seeds, and is close to that of four-winged seeds (Fig. 6). In addition, three-winged seeds have the most stable area of windward, which also ensures the motion stability in wind dispersal. Significantly, the maximum windward area of each wing of Alasemenia is greater than that of Guazia and Warsteinia with four wings.” (lines 256-261).

      d) “Compared to Famennian four-winged ovules of Warsteinia and Guazia, Alasemenia with three distally outwards extending wings shows advantage in anemochory.” (lines 272-274).

      (3) The significance of this finding should be well summarized with solid evidence.

      It has been summarized in Abstract (lines 19-28) and is now further summarized especially in the newly provided Conclusion (lines 264-275).

    1. eLife assessment

      This valuable study provides in vivo evidence for the synchronization of projection neurons in the olfactory bulb at gamma frequency in an activity-dependent manner. This study uses optogenetics in combination with single-cell recordings to selectively activate sensory input channels within the olfactory bulb. The data are thoughtfully analyzed and presented; the evidence is solid, although some of the conclusions are only partially supported.

    2. Reviewer #1 (Public Review):

      Summary:

      Dalal and Haddad investigated how neurons in the olfactory bulb are synchronized in oscillatory rhythms at gamma frequency. Temporal coordination of action potentials fired by projection neurons can facilitate information transmission to downstream areas. In a previous paper (Dalal and Haddad 2022, https://doi.org/10.1016/j.celrep.2022.110693), the authors showed that gamma frequency synchronization of mitral/tufted cells (MTCs) in the olfactory bulb enhances the response in the piriform cortex. The present study builds on these findings and takes a closer look at how gamma synchronization is restricted to a specific subset of MTCs in the olfactory bulb. They combined odor and optogenetic stimulations in anesthetized mice with extracellular recordings.

      The main findings are that lateral synchronization of MTCs at gamma frequency is mediated by granule cells (GCs), independent of the spatial distance, and strongest for MTCs with firing rates close to 40 Hz. The authors conclude that this reveals a simple mechanism by which spatially distributed neurons can form a synchronized ensemble. In contrast to lateral synchronization, they found no evidence for the involvement of GCs in lateral inhibition of nearby MTCs.

      Strengths:

      Investigating the mechanisms of rhythmic synchronization in vivo is difficult because of experimental limitations for the readout and manipulation of neuronal populations at fast timescales. Using spatially patterned light stimulation of opsin-expressing neurons in combination with extracellular recordings is a nice approach. The paper provides evidence for an activity-dependent synchronization of MTCs in gamma frequency that is mediated by GCs.

      Weaknesses:

      An important weakness of the study is the lack of direct evidence for the main conclusion - the synchronization of MTCs in gamma frequency. The data shows that paired optogenetic stimulation of MTCs in different parts of the olfactory bulb increases the rhythmicity of individual MTCs (Figure 1) and that combined odor stimulation and GC stimulation increases rhythmicity and gamma phase locking of individual MTCs (Figure 4). However, a direct comparison of the firing of different MTCs is missing. This could be addressed with extracellular recordings at two different locations in the olfactory bulb. The minimum requirement to support this conclusion would be to show that the MTCs lock to the same phase of the gamma cycle. Also, showing the evoked gamma oscillations would help to interpret the data.

      Another weakness is that all experiments are performed under anesthesia with ketamine/medetomidine. Ketamine is an antagonist of NMDA receptors and NMDA receptors are critically involved in the interactions of MTCs and GCs at the reciprocal synapses (see for example Lage-Rupprecht et al. 2020, https://doi.org/10.7554/eLife.63737; Egger and Kuner 2021, https://doi.org/10.1007/s00441-020-03402-7). This should be considered for the interpretation of the presented data.

      Furthermore, the direct effect of optogenetic stimulation on GC activity is not shown. This is particularly important because they use Gad2-cre mice with virus injection in the olfactory bulb and expression might not be restricted to granule cells and might not target all subtypes of granule cells (Wachowiak et al., 2013, https://doi.org/10.1523/JNEUROSCI.4824-12.2013). This should be considered for the interpretation of the data, particularly for the absence of an effect of GC stimulation on lateral inhibition.

      Several conclusions are only supported by data from example neurons. The paper would benefit from a more detailed description of the analysis and the display of some additional analysis at the population level:

      - What were the criteria based on which the spots for light-activation were chosen from the receptive field map?

      - The absence of an effect on firing rate for paired stimulations is only shown for one example (Figure 1c). A quantification of the population level would be interesting.

      - Only one example neuron is shown to support the conclusion that "two different neural circuits mediate suppression and entrainment" in Figure 3. A population analysis would provide more evidence.

      - Only one example neuron is shown to illustrate the effect of GC stimulation on gamma rhythmicity of MTCs in Figures 4 f,g.

      - In Figure 5 and the corresponding text, "proximal" and "distal" GC activation are not clearly defined.

    3. Reviewer #2 (Public Review):

      Summary

      This study provides a detailed analysis and dissociation between two effects of activation of lateral inhibitory circuits in the olfactory bulb on ongoing single mitral/tufted cell (MTC) spiking activity, namely enhanced synchronization in the gamma frequency range or lateral inhibition of firing rate.

      The authors use a clever combination of single-cell recordings, optogenetics with variable spatial stimulation of MTCs and sensory stimulation in vivo, and established mathematical methods to describe changes in autocorrelation/synchronization of a single MTC's spiking activity upon activation of lateral glomerular MTC ensembles. This assay is rounded off by a gain-of-function experiment in which the authors enhance granule cell (GC) excitation to establish a causal relation between GC activation and enhanced synchronization to gamma (they had used this manipulation in their previous paper Dalal & Haddad 2022, but use a smaller illumination spot here for spatially restricted activation).

      Strengths

      This study is of high interest for olfactory processing - since it shows directly that interactions between only two selected active receptor channels are sufficient to enhance the synchronization of single neurons to gamma in one channel (and thus by inference most likely in both). These interactions are distance-independent over many 100s of µms and thus can allow for non-topographical inhibitory action across the bulb, in contrast to the center-surround lateral inhibition known from other sensory modalities.

      In my view, parallels between vision and olfaction might have been overemphasized so far, since the combinatorial encoding of olfactory stimuli across the glomerular map might require different mechanisms of lateral interaction versus vision. This result is indicative of such a major difference.

      Such enhanced local synchronization was observed in a subset of activated channel pairs; in addition, the authors report another type of lateral interaction that does involve the reduction of firing rates, drops off with distance and most likely is caused by a different circuit mediated by PV+ neurons (PVN; the evidence for which is circumstantial).

      Weaknesses/Room for improvement

      Thus this study is an impressive proof of concept that however does not yet allow for broad generalization. Therefore the framing of results should be slightly more careful in my opinion.

      Along this line, the conclusions regarding two different circuits underlying lateral inhibition vs enhanced synchronization are not quite justified by the data, e.g.

      (1) The authors mention that their granule cell stimulation results in a local cold spot (l. 527 ff) - how can GCs then be said not to be involved in the inhibition of firing rate (bullet point in Highlights)? Please elaborate further. In l.406 they also state that GCs can inhibit MTCs under certain conditions. The argument that this stimulation is not physiological makes sense, but still does not rule out anything. You might want to cite Aghvami et al 2022 on the very small amplitude of GC-mediated IPSPs, also McIntyre and Cleland 2015.

      (2) Even from the shown data, it appears that laterally increased synchronization might co-occur with lateral suppression (see also comment on Figures 1d,e and Figure S1c).

      (3) There are no manipulations of PVN activity in this study, thus there is no direct evidence for the substrate of the second circuit.

      (4) The manipulation of GC activity was performed in a transgenic line with viral transfection, which might result in a lower permeation of the population compared to the line used for optogenetic stimulation of MTCs.

      In some instances, the authors tend to cite older literature - which was not yet aware of the prominent contribution of EPL interneurons including PVN to recurrent and lateral inhibition of MT cells - as if roles that then were ascribed to granule cells for lack of better knowledge can still be unequivocally linked to granule cells now. For example, they should discuss Arevian et al (2006), Galan et al 2006, Giridhar et al., Yokoi et al. 1995, etc in the light of PVN action.

      Therefore it is also not quite justified to state that their result regarding the role of GCs specifically for synchronization, not suppression, is "in contrast to the field" (e.g. l.70 f., l.365, l. 400 ff).

      Why did the authors choose to use the term "lateral suppression", often interchangeably with lateral inhibition? If this term is intended to specifically reflect reductions of firing rates, it might be useful to clearly define it at first use (and cite earlier literature on it) and then use it consistently throughout.

      A discussion of anesthesia effects is missing - e.g. GC activity is reportedly stronger in awake mice (Kato et al). This is not a contentious point at all since the authors themselves show that additional excitation of GCs enhances synchrony, but it should be mentioned.

      Some citations should be added, in particular relevant recent preprints - e.g. Peace et al. BioRxiv 2024, Burton et al. BioRxiv 2024 and the direct evidence for a glutamate-dependent release of GABA from GCs (Lage-Rupprecht et al. 2020).

      The introduction on the role of gamma oscillations in sensory systems (in particular vision) could be more elaborated.

    4. Reviewer #3 (Public Review):

      Summary:

      This study by Dalal and Haddad analyzes two facets of cooperative recruitment of M/TCs as discerned through direct, ChR2-mediated spot stimulations:

      (1) mutual inhibition and

      (2) entrainment of action potential timing within the gamma frequency range.

      This investigation is conducted by contrasting the evoked activity elicited by a "central" stimulus spot, which induces an excitatory response alone, with that elicited when paired with stimulations of surrounding areas. Additionally, the effect of Gad2-expressing granule cells is examined.

      Based on the observed distance dependence and the impact of GC stimulations, the authors infer that mutual inhibition and gamma entrainment are mediated by distinct mechanisms.

      Strengths:

      The results presented in this study offer a nice in vivo validation of the significant in vitro findings previously reported by Arevian, Kapoor, and Urban in 2008. Additionally, the distance-dependent analysis provides some mechanistic insights.

      Weaknesses:

      The results largely reproduce previously reported findings, including those from the authors' own work, such as Dalal and Haddad (2022), where a key highlight was "Modulating GC activities dissociates MTCs odor-evoked gamma synchrony from firing rates." Some interpretations, particularly the claim regarding the distance independence of the entrainment effect, may be considered over-interpretations.

    5. Author response:

      We sincerely appreciate the reviewers' time, effort, and thoughtful feedback, which have significantly contributed to our research.

      A key concern raised was the potential overinterpretation of our data. While the reviewers acknowledged our identification of a possible synchronization mechanism among active mitral and tufted cells (MTCs) that is distance-independent, they correctly pointed out that we did not provide direct evidence showing how ensemble MTCs synchronize. We concur with their assessment and will address this in our forthcoming response to ensure a precise interpretation of our findings.

      Another concern raised involves the interpretation of results obtained under ketamine anesthesia. Since ketamine is an antagonist of NMDA receptors, which play a crucial role at MTC-GC reciprocal synapses, this might impact our conclusions. To address this, we will include analyses demonstrating that optogenetic activation of granule cells (GCs) in an anesthetized state inhibits recorded MTCs during baseline but does not affect odor-evoked MTC firing rates. Additionally, we will thoroughly discuss the potential influence of ketamine anesthesia on GC-MTC synapses and its implications for our findings.

      Lastly, in our detailed response to the reviewers' comments, we will discuss several recent studies that are particularly relevant to our research. We will also expand on our hypothesis that parvalbumin-positive cells in the olfactory bulb may serve as key mediators of the activity- and distance-dependent lateral inhibition observed in our findings.

    1. eLife assessment

      To test if somatic mutations in cancer genomes are enriched with mutations in polyadenylation signal regions, the authors observed an increased enrichment of somatic mutations that may affect the function of polyA signals and confirmed that these mutations may influence gene expression through a minigene expression experiment. This important study advances our understanding of noncoding somatic mutations by identifying a novel class of mutations that affect 3'UTR polyadenylation signals enriched in tumor suppressor genes in cancer. The evidence supporting the conclusions is convincing, with rigorous statistical analyses and experimental validation.

    2. Reviewer #1 (Public Review):

      Kainov et al investigated the prevalence of mutations in 3'UTR that affect gene expression in cancer to identify noncoding cancer drivers.

      The authors used data from normal controls (1000 Genomes data) and compared them to cancer data (PCAWG). They found that 3'UTR mutations had a stronger effect on cleavage in cancer than in the normal population. These mutations are negatively selected in the normal population and positively selected in cancers. The authors used the PCAWG data set to identify such mutations and found that the mutations that reduce gene expression are enriched in tumor suppressor genes (TSGs) and those that increase gene expression are enriched in oncogenes. 3'UTR mutations that reduce gene expression or occur in TSGs co-occur with non-synonymous mutations. The authors then validated the effect of 3'UTR mutations experimentally using a luciferase reporter assay. These data identify a novel class of noncoding driver genes with mutations in the 3'UTR that impact polyadenylation and thus gene expression.

      This is an elegant study with fundamental insight into identifying cancer driver genes. The conclusions of this paper are mostly well supported by data, but some aspects of data analysis need to be extended.

      (1) It would be important for the authors to show if the findings of this study hold for metastatic cancers since most deaths occur due to metastasis and tumor heterogeneity changes when cancer progresses to metastasis. The authors should use the Hartwig data and show if metastatic cancers are enriched for 3'UTR mutations.

      (2) Figure 2 should show the distribution of 3'UTR mutations by cancer type especially since authors go on to use colorectal cancer only for validations. It would be helpful to bring Figures S3A and S3C to this panel since these findings make the connections to cancer biology. Are any molecular functions enriched in addition to biological processes? Are kinases, phosphatases, etc more or less affected by 3'UTR mutations?

      (3) Figure 3 looks at the co-occurrence of 3'UTR mutations with non-synonymous mutations but what about copy number change? You would expect the loss of the other allele to be enriched. Along the same line, are these data phased? Do you know that the non-synonymous mutations are in the other allele or in the same allele that shows 3'UTR mutation?

    3. Reviewer #2 (Public Review):

      Summary:

      To evaluate whether somatic mutations in cancer genomes are enriched with mutations in polyadenylation signal regions, the authors analyzed 1000 genomes data and PCAWG data as a control and experimental set, respectively. They observed increased enrichment of somatic mutations that may affect the function of polyA signals and confirmed that these mutations may influence the expression of the gene through a minigene expression experiment.

      Strengths:

      This study provides a systematic evaluation of polyA signal, which makes it valuable. Overall, the analytic approach and results are solid and supported by experimental validation.

      Weaknesses:

      (1) This study uses APARENT2 as a tool to evaluate functional alteration in polyA signal sequences. Based on the original paper and the results shown in this paper, the algorithm appears to be of high quality. However, the whole study is dependent on the output of APARENT2. Therefore, it would be nice to

      (a) run and show a positive control run, which can show that the algorithm works well, and

      (b) describe the rationale for selecting this algorithm in the main text.

      (2) Are there recurrent somatic mutation calls (= exactly the same mutation across different tumor samples) in the poly(A) region of certain genes?

      (3) The authors nicely showed that the minigene with A>G mutation altered gene expression. Maybe one can reach a similar conclusion by analyzing a cancer dataset that has mutation and gene expression data? That is, genes with or without polyA mutations show different expression levels.

    4. Author response:

      We thank both reviewers for their constructive comments. We will do our best to incorporate the requested analyses and answer the reviewers’ questions in the revision.

    1. eLife assessment

      The manuscript reports fundamental findings that extra-embryonic visceral yolk sac endoderm is critical for NAD de novo synthesis during early organogenesis, and perturbations of this pathway may cause Congenital NAD Deficiency Disorder. The supporting evidence is solid. This work will be of interest to developmental biologists.

    2. Reviewer #1 (Public Review):

      Summary:

      This study investigated the mechanism underlying Congenital NAD Deficiency Disorder (CNDD) using a mouse model with loss of function of the HAAO enzyme which mediates a key step in the NAD de novo synthesis pathway. This study builds on the observation that the kynurenine pathway is required in the conceptus, as HAAO null embryos are sensitive to maternal deficiency of NAD precursors (vitamin B3) and tryptophan, and narrows the window of sensitivity to a 3-day period.

      An important finding is that de novo NAD synthesis occurs in an extra-embryonic tissue, the visceral yolk sac, before the liver develops in the embryo. It is suggested that lack of this yolk sac activity leads to impaired NAD supply in the embryo leading to structural abnormalities found later in development.

      Strengths:

      Previous studies show a requirement for HAAO activity for the normal development of embryos. Abnormalities develop under conditions of maternal vitamin B3 deficiency, indicating a requirement for NAD synthesis in the conceptus. Analysis of scRNA-seq datasets combined with metabolite analysis of yolk sac tissue shows that the NAD synthesis pathway is expressed and functional in the yolk sac from E10.5 onwards (prior to liver development).

      HAAO enzyme assay enabled quantification of enzyme activity in relevant tissues including the liver (from E12.5), placenta, and yolk sac (from E11.5).

      Comprehensive metabolite analysis of the NAD synthesis pathway supports the predicted effects of Haao knockout and provides analysis of the yolk sac, placenta, and embryo at a series of stages.

      The dietary study (with lower vitamin B3 in maternal diet from E7.5-10.5) is an incremental addition to previous studies that imposed similar restrictions from E7.5-12.5.

      Nevertheless, this emphasises the importance of the synthesis pathway on the conceptus at stages before the liver activity is prominent.

      Weaknesses:

      The current dietary study narrows the period when deficiency can cause malformations (analysed at E18.5), and altered metabolite profiles (eg, increased 3HAA, lower NAD) are detected in the yolk sac and embryo at E10.5. However, without analysis of embryos at later stages in this experiment it is not known how long is needed for NAD synthesis to be recovered - and therefore until when the period of exposure to insufficient NAD lasts. This information would inform the understanding of the developmental origin of the observed defects.

      More importantly, there is still a question of whether in addition to the yolk sac, there is HAAO activity within the embryo itself prior to E12.5 (when it has first been assayed in the liver - Figure 1C). The prediction is that within the conceptus (embryo, chorioallantoic placenta, and visceral yolk sac) the embryo is unlikely to be the site of NAD synthesis prior to liver development. Reanalysis of scRNA-seq (Fig 1B) shows expression of all the enzymes of the kynurenine pathway from E9.5 onwards. However, the expression of another available dataset at E10.5 (Fig S3) suggested that expression is 'negligible'. While the expression in Figure 1B, Figure S1 is weak this creates a lack of clarity about the possible expression of HAAO in the hepatocyte lineage, or especially elsewhere in the embryo prior to E10.5 (corresponding to the period when the authors have demonstrated that de novo NAD synthesis in the conceptus is needed). Given these questions, a direct analysis of RNA and/or protein expression in the embryos at E7.5-10.5 would be helpful.

    3. Reviewer #2 (Public Review):

      Summary:

      Disruption of the nicotinamide adenine dinucleotide (NAD) de novo synthesis pathway, by which L-tryptophan is converted to NAD, results in multi-organ malformations, which collectively have been termed Congenital NAD Deficiency Disorder (CNDD).

      While NAD de novo synthesis is primarily active in the liver postnatally, the site of activity prior to and during organogenesis is unknown. However, mouse embryos are susceptible to CNDD between E7.5-E12.5, before the embryo has developed a functional liver. Therefore, NAD de novo synthesis is likely active in another cell or tissue during this time window of susceptibility.

      The body of work presented in this paper continues the corresponding author's lab investigation of the cause and effects of NAD Deficiency and the primary goal was to determine the cell or tissue responsible for NAD de novo synthesis during early embryogenesis.

      The authors conclude that visceral yolk sac endoderm is the source of NAD de novo synthesis, which is essential for mouse embryonic development, and furthermore that the dynamics of NAD synthesis are conserved in human equivalent cells and tissues, the perturbation of which results in CNDD.

      Strengths:

      Overall, the primary findings regarding the source of NAD synthesis, the temporal requirement, and conservation between rodent and human species are quite novel and important for our understanding of NAD synthesis and its function and role in CNDD.

      The authors used UHPLC-MS/MS to quantify NAD+ and NAD-related metabolites and showed convincingly that the NAD salvage pathway can compensate for the loss of NAD synthesis in Haao-/- embryos, then determined that Haao activity was present in the yolk sac prior to hepatic development identifying this organ as the site of de novo NAD synthesis. Dietary modulation between E7.5-10.5 was sufficient to induce CNDD phenotypes, narrowing the window of susceptibility, and then re-analysis of RNA-seq datasets suggested the endoderm was the cell source of NAD synthesis.

      Weaknesses:

      Page 4 and Table S4. The descriptors for malformations of organs such as the kidney and vertebrae are quite vague and uninformative. More specific details are required to convey the type and range of anomalies observed as a consequence of NAD deficiency.

      Can the authors define whether the role of the NAD pathway is the same across different tissues or organ systems? By this I mean, is the molecular or cellular effect of NAD deficiency the same in the vertebrae and in organs such as the kidney? What unifies the effects on these specific tissues and organs, and are all tissues and organs affected? If some are not, can the authors explain why they escape the need for the NAD pathway?

      Page 5 and Figure 6C. The expectation and conclusion for whether specific genes are expressed in particular cell types in scRNA-seq datasets depend on the number of cells sequenced, the technology (methodology) used, the depth of sequencing, and also the resolution of the analysis. It is therefore essential to perform secondary validation of the analysis of scRNA-seq data. At a minimum, the authors should perform in situ hybridization or immunostaining for Tdo2, Afmid, Kmo, Kynu, Haao, Qprt, and Nadsyn1 or some combination thereof at multiple time points during early mouse embryogenesis to truly understand the spatiotemporal dynamics of expression and NAD synthesis.

      Absolute functional proof of the yolk sac endoderm as being essential and required for NAD synthesis in the context of CNDD might require conditional deletion of Haao in the yolk sac versus embryo using appropriate Cre driver lines or, in the absence of a conditional allele, could be performed by tetraploid embryo-ES cell complementation approaches. But temporal dietary intervention can also approximate the same thing by perturbing NAD synthesis when the yolk sac is the primary source versus when the liver becomes the primary source in the embryo.

    4. Author response:

      General comments, factual mistakes:

      Reviewer 1 - Summary: “This study builds on the observation that the kynurenine pathway is required in the conceptus, as Haao null embryos are sensitive to maternal deficiency of NAD precursors (vitamin B3) and tryptophan, and narrows the window of sensitivity to a 3-day period.”

      Correction:

      Vitamin B3 should not be in parentheses, because vitamin B3 and tryptophan are both NAD precursors. We also suggest that the second half of this sentence is changed to “…and narrows the window of sensitivity to a 3-day period from embryonic day 7.5 to E10.5.” Currently, it reads as if Haao-null embryos are sensitive to any 3-day period of maternal NAD precursor restriction.

      Reviewer 1 – Strengths: “Abnormalities develop under conditions of maternal vitamin B3 deficiency, indicating…”

      Correction:

      We suggest replacing “vitamin B3 deficiency” with “NAD deficiency”, as this is more accurate.

      Reviewer 2 – Strengths: “…and then re-analysis of RNA-seq datasets suggested the endoderm was the cell source of NAD synthesis.”

      Correction:

      We suggest re-phrasing this sentence to “…and then re-analysis of RNA-seq datasets suggested the yolk sac endoderm cells are the source of NAD de novo synthesis.”

      Reviewer 1 (Public Review):

      However, without analysis of embryos at later stages in this experiment it is not known how long is needed for NAD synthesis to be recovered - and therefore until when the period of exposure to insufficient NAD lasts. This information would inform the understanding of the developmental origin of the observed defects.

      We are currently seeking funds to investigate the developmental origin of the observed defects. This study includes assessing how the timing of maternal NAD precursor restriction corresponds to the timing of NAD deficiency in the embryo.

      More importantly, there is still a question of whether in addition to the yolk sac, there is HAAO activity within the embryo itself prior to E12.5 (when it has first been assayed in the liver - Figure 1C).

      We have additional data showing that at E11.5 the embryo has no HAAO activity. We also tested E14.5 embryos with their livers removed, and these also do not have HAAO activity. We are planning to include these data sets in the revised version of this manuscript.

      Reviewer 2 (Public Review):

      Page 4 and Table S4. The descriptors for malformations of organs such as the kidney and vertebrae are quite vague and uninformative. More specific details are required to convey the type and range of anomalies observed as a consequence of NAD deficiency.

      Kidney defects were classified as described in Cuny et al. 2020 PNAS (PMID:32015132). In brief, kidneys with a tip-to-tip length of ≤ 1.5 mm were counted as hypoplastic, because the average length of a normal kidney at E18.5 is 2.98 mm (2.75-3.375 mm). The one dysmorphic kidney we observed in our dataset had a cyst. We plan to include this information plus more details of the observed vertebral defects in the revised version of this manuscript.

      Can the authors define whether the role of the NAD pathway in a couple of tissue or organ systems is the same? By this I mean, is the molecular or cellular effect of NAD deficiency the same in the vertebrae and in organs such as the kidney? What unifies the effects on these specific tissues and organs, and are all tissues and organs affected? If some are not, can the authors explain why they escape the need for the NAD pathway?

      We agree that this is a very important question, but consider it beyond the scope of this manuscript. To elucidate the underlying cellular and molecular mechanisms in individual organs will require a multiomic approach because NAD is involved in hundreds of molecular and cellular processes affecting gene expression, protein levels, metabolism, etc. For details of NAD functions that have relevance to embryogenesis see Dunwoodie et al 2023 https://doi.org/10.1089/ars.2023.0349. Furthermore, organs develop at different times during embryogenesis with both distinct, but in some cases shared, molecular and cellular processes. Relating these to specific NAD functions is the challenge. We are currently seeking funds to investigate how NAD deficiency disrupts organogenesis.

      Page 5 and Figure 6C. The expectation and conclusion for whether specific genes are expressed in particular cell types in scRNA-seq datasets depend on the number of cells sequenced, the technology (methodology) used, the depth of sequencing, and also the resolution of the analysis. It is therefore essential to perform secondary validation of the analysis of scRNA-seq data. At a minimum, the authors should perform in situ hybridization or immunostaining for Tdo2, Afmid, Kmo, Kynu, Haao, Qprt, and Nadsyn1 or some combination thereof at multiple time points during early mouse embryogenesis to truly understand the spatiotemporal dynamics of expression and NAD synthesis.

      We have tested antibodies against HAAO, KYNU, and QPRT in adult mouse liver samples (the main site of NAD de novo synthesis), and they produced non-specific bands on western blots. Therefore, in situ immunostaining studies on embryonic tissues are not feasible. We will investigate the possibility of effectively localizing transcripts of NAD de novo synthesis enzymes using in situ hybridization.

      Absolute functional proof of the yolk sac endoderm as being essential and required for NAD synthesis in the context of CNDD might require conditional deletion of Haao in the yolk sac versus embryo using appropriate Cre driver lines or, in the absence of a conditional allele, could be obtained by tetraploid embryo-ES cell complementation approaches. But temporal dietary intervention can also approximate the same thing by perturbing NAD synthesis when the yolk sac is the primary source versus when the liver becomes the primary source in the embryo.

      Reviewer 1 has a related comment. We have additional data showing that at E11.5 the embryo has no HAAO activity, like the placenta. Similarly, E14.5 embryos with their livers removed do not have HAAO activity either. We believe this provides sufficient proof that the yolk sac endoderm is the only site of NAD de novo synthesis in the conceptus until the liver has formed and takes over this function.

    1. eLife assessment

      The authors have developed a biosensor for programmed cell death. They use this biosensor to provide valuable measurements of cell death in a specific early time window of development. However, the title and the discussion suggest a broader window of applicability of the results. The evidence supporting the claims is therefore incomplete. The authors should modify the introduction and discussion to examine their work in the context of extant literature and modify their title to reflect the conclusion that "Zebrafish live imaging reveals around 2% of motor neurons die through apoptosis during a 24-120 hour window in early development".

    2. Reviewer #1 (Public Review):

      Summary:

      The authors aim to measure the apoptotic fraction of motorneurons in developing zebrafish spinal cord to assess the extent of neuronal apoptosis during the development of a vertebrate embryo in an in vivo context.

      Strengths:

      The transgenic fish line tg (mnx1:sensor C3) appears to be a good reagent for motorneuron apoptosis studies, while further validation of its motorneuron specificity should be performed.

      Weaknesses:

      The results do not support the conclusions. The main "selling point" as summarized in the title is that the apoptotic rate of zebrafish motorneurons during development is strikingly low (~2%) as compared to the much higher estimate (~50%) by previous studies in other systems. The results used to support the conclusion are that only a small percentage (under 2%) of apoptotic cells were found over a large population at a variety of stages from 24 to 120 hpf. This is fundamentally flawed logic, as a percentage measured in a short time window cannot represent the percentage in the long term. For example, in any given year under 1% of the human population dies, but over 100 years >99% of the starting group will have died. To find the real percentage of motorneurons that died, the motorneurons born at different times must be tracked over the long term or the new motorneuron birth rate must be estimated.
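
      The reviewer's point about snapshot versus cumulative percentages can be illustrated with a small calculation. The sketch below is not taken from the manuscript: it assumes a closed population (no newly born neurons), a constant death probability per detection interval equal to the snapshot fraction, and a hypothetical duration for which a dying cell remains visible before clearance. The function name and all durations are illustrative assumptions.

```python
def cumulative_death_fraction(snapshot_fraction: float,
                              window_hours: float,
                              visible_hours: float) -> float:
    """Fraction of an initial cohort lost over an observation window.

    Assumptions (hypothetical, for illustration only):
      * closed population, i.e. no neurons are born during the window;
      * at any snapshot, `snapshot_fraction` of cells are visibly apoptotic;
      * each dying cell stays detectable for `visible_hours` before clearance,
        so the window contains window_hours / visible_hours independent
        detection intervals.
    """
    if not 0.0 <= snapshot_fraction <= 1.0:
        raise ValueError("snapshot_fraction must lie in [0, 1]")
    if visible_hours <= 0 or window_hours < 0:
        raise ValueError("durations must be positive")
    n_intervals = window_hours / visible_hours
    # Survival compounds across intervals; loss is the complement.
    return 1.0 - (1.0 - snapshot_fraction) ** n_intervals


# A 2% snapshot over a 96 h window (24-120 hpf) under different,
# purely hypothetical, visibility durations.
for visible in (96.0, 24.0, 6.0, 1.0):
    lost = cumulative_death_fraction(0.02, 96.0, visible)
    print(f"visible for {visible:5.1f} h -> cumulative loss {lost:.1%}")
```

      Under these assumptions the same 2% snapshot is compatible with anything from 2% to well over half of the cohort dying, depending on how long an apoptotic cell remains detectable, which is why the reviewer asks for long-term tracking or an estimate of the birth rate.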

      A similar argument can be applied to the macrophage results. Here the authors probably want to discuss well-established mechanisms of apoptotic neuron clearance such as by glia and microglia cells.

      The conclusion regarding the timing of axon and cell body caspase activation and apoptosis timing also has clear issues. The measurement interval of minutes is too long compared to the transport/diffusion timescale between the cell body and the axon: caspase activity could have been activated in the cell body, with either caspase or the cleaved sensor moving to the axon within several seconds. The authors' imaging frequency is not high enough to resolve these dynamics.

      Many statements suggest oversight of literature, for example, in the abstract "However, there is still no real-time observation showing this dying process in live animals.".

      Many statements should use more scholarly terms and descriptions from the spinal cord or motor neuron, neuromuscular development fields, such as line 87 "their axons converged into one bundle to extend into individual somite, which serves as a functional unit for the development and contraction of muscle cells"

      The transgenic line is perhaps the most meaningful contribution to the field as the work stands. However, the mnx1 promoter is well known for its non-specific activation - while the images suggest the authors' line is good, motor neuron markers should be used to validate the line. This is especially important for assessing this population later as mnx1 may be turned off in mature neurons.

      Overall, this work does not substantiate its biological conclusions and therefore does not advance the field. The transgenic line has the potential to address the questions raised but requires different sets of experiments. The line and the data as reported are useful on their own by providing a short-term rate of apoptosis of the motorneuron population.

    3. Reviewer #2 (Public Review):

      Summary:

      Jia and colleagues developed a fluorescence resonance energy transfer (FRET)-based biosensor to study programmed cell death in the zebrafish spinal cord. They applied this tool to study the death of zebrafish spinal motor neurons.

      Strengths:

      Their analysis shows that the tool is a useful biosensor of motor neuron apoptosis in living zebrafish.

      Weaknesses:

      However, they have ignored significant literature describing the death of an identified zebrafish motor neuron, expression of the mnx gene in interneurons that are closely related to motor neurons, the increase in number of zebrafish motor neurons over developmental time, and potential differences between the limb-innervating motor neurons whose death has been characterized in chicks and rodents and the body wall-innervating motor neurons whose death they characterized using their biosensor. Thus, although their new tool is likely to be useful in the future, it does not provide new insights into zebrafish motor neuron programmed cell death.

    4. Author response:

      We are grateful to the reviewers for recognizing the importance of our work and for their helpful suggestions. We will address their comments in the revised version of our manuscript. In the meantime, we would like to provide provisional responses to the key questions and comments from the reviewers.

      (1) Both reviewers asked why we chose 24-120 hpf to measure the apoptotic rates. We chose this time window based on the following two reasons: 1) Previous studies showed that although the motor neuron death time windows vary in chick (E5-E10), mouse (E11.5-E15.5), rat (E15-E18) and human (11-25 weeks of gestation), the common feature of these time windows is that they are all the developmental periods when motor neurons contact with muscle cells. The contact between zebrafish motor neurons and muscle cells occurs before 72 hpf, which is included in our observation time window. 2) Zebrafish complete hatching during 48-72 hpf, and most organs form before 72 hpf. More importantly, zebrafish start swimming around 72 hpf, indicating that motor neurons are fully functional.

      Thus, we are confident that this 24-120 hpf time window covers the time window during which motor neurons undergo programmed cell death during zebrafish early development. We frequently used “early development” in this manuscript to describe our observation. However, we missed “early” in our title. We will add “early” in the title in the revised version.

      (2) Both reviewers also asked about the neurogenesis of motor neurons. Previous studies have shown that the production of spinal cord motor neurons largely ceases before 48 hpf and then the motor neurons remain largely constant until adulthood. Our observation time window covers the major motor neuron production process. Therefore, we believe that neurogenesis will not affect our data and conclusions.

      (3) Both reviewers questioned the specificity of using the mnx1 promoter to label motor neurons. The mnx1 promoter has been widely used to label motor neurons in transgenic zebrafish. Previous studies have shown that most of the cells labeled in the mnx1 transgenic zebrafish are motor neurons. In this study, we observed that the neuronal cells in our sensor zebrafish formed green cell bodies inside of the spinal cord and extended to the muscle region, which is an important morphological feature of the motor neurons. Furthermore, a few of those green cell bodies turned into blue apoptotic bodies inside the spinal cord and changed to blue axons in the muscle regions at the same time, which strongly suggests that those apoptotic neurons are not interneurons. Although the mnx1 promoter might have labeled some interneurons, this will not affect our major finding that only a small portion of motor neurons died during zebrafish early development.

      (4) Reviewer 2 is concerned that the estimated 50% of motor neuron death was in limb-innervating motor neurons but not in body wall-innervating motor neurons. The death of limb-innervating motor neurons has been extensively studied in chicks and rodents, as limbs are readily amenable to operations such as amputation. However, previous studies have shown that this dramatic motor neuron death does not only occur in limb-innervating motor neurons but also occurs in other spinal cord motor neurons. In our manuscript, we studied the naturally occurring motor neuron death in the whole spinal cord during the early stage of zebrafish development.

      (5) Reviewer 2 mentioned that we ignored the death of an identified motor neuron. Our study was to examine the overall motor neuron apoptosis rather than a specific type of motor neuron death, so we did not emphasize the death of VaP motor neurons. We agree that the dead motor neurons observed in our manuscript contain VaP motor neurons. However, there were also other types of dead motor neurons observed in our study. The reasons are as follows: 1) VaP primary motor neurons die before 36 hpf, but our study found motor neuron cells died after 36 hpf and even at 84 hpf. 2) The position of the VaP motor neuron is together with that of the CaP motor neuron, that is, at the caudal region of the motor neuron cluster. Although it’s rare, we did observe the death of motor neurons in the rostral region of the motor neuron cluster. 3) There is only one or zero VaP motor neuron in each hemisegment. Although our data showed that usually one motor neuron died in each hemisegment, we did observe that sometimes more than one motor neuron died in the motor neuron cluster. We will include this information in the revised manuscript.

      (6) For the morpholinos, we did not confirm the downregulation of the target genes. These morpholino-related data are a minor part of our manuscript and should not affect our major findings. Thus, we did not think we had missed “important” controls. We will perform experiments to confirm the efficiency of the morpholinos or remove these morpholino-related data from the revised version.

    1. eLife assessment

      This study evaluated the role of transcutaneous auricular vagal nerve stimulation (taVNS) in patients with subarachnoid hemorrhage (SAH) randomized to taVNS vs sham, finding that those with active taVNS exhibited increased parasympathetic activity. The findings are important and cross-disciplinary, while the level of evidence is solid.

    2. Reviewer #1 (Public Review):

      The authors report the results of a randomized clinical trial of taVNS as a neuromodulation technique in SAH patients. They found that taVNS appears to be safe without inducing bradycardia or QT prolongation. taVNS also increased parasympathetic activity, as assessed by heart rate variability measures. Acute elevation in heart rate might be a biomarker to identify SAH patients who are likely to respond favorably to taVNS treatment. The latter is very important in light of the need for acute biomarkers of response to neuromodulation treatments.

      Comments:

      (1) Frequency domain heart rate variability measures should be analyzed and reported. Given the short duration of the ECG recording, the frequency domain may more accurately reflect autonomic tone.

      (2) How was the "dose" chosen (20 minutes twice daily)?

      (3) The use of an acute biomarker of response is very important. A bimodal response to taVNS has been previously shown in patients with atrial fibrillation (Kulkarni et al. JAHA 2021).

    3. Reviewer #2 (Public Review):

      Summary:

      This study investigated the effects of transcutaneous auricular vagus nerve stimulation (taVNS) on cardiovascular dynamics in subarachnoid hemorrhage (SAH) patients. The researchers conducted a randomized clinical trial with 24 SAH patients, comparing taVNS treatment to a sham treatment group (20 minutes twice a day during the ICU stay). They monitored electrocardiogram (ECG) readings and vital signs to assess acute as well as middle-term changes in heart rate, heart rate variability, QT interval, and blood pressure between the two groups. The results showed that repetitive taVNS did not significantly alter heart rate, corrected QT interval, blood pressure, or intracranial pressure. However, it increased overall heart rate variability and parasympathetic activity after 5-10 days of treatment compared to the sham treatment. Acute taVNS led to an increase in heart rate, blood pressure, and peripheral perfusion index without affecting corrected QT interval, intracranial pressure, or heart rate variability. The acute post-treatment elevation in heart rate was more pronounced in patients who showed clinical improvement. In conclusion, the study found that taVNS treatment did not cause adverse cardiovascular effects, suggesting it is a safe immunomodulatory treatment for SAH patients. The mild acute increase in heart rate post-treatment could potentially serve as a biomarker for identifying SAH patients who may benefit more from taVNS therapy.

      Strengths:

      The paper is overall well written, and the topic is of great interest. The methods are solid and the presented data are convincing.

      Weaknesses:

      (1) It should be clearly pointed out that the current paper is part of the NAVSaH trial (NCT04557618) and presents one of the secondary outcomes of that study, while the declared primary outcomes (change in the inflammatory cytokine TNF-α in plasma and cerebrospinal fluid between day 1 and day 13, rate of radiographic vasospasm, and rate of requirement for long-term CSF diversion via a ventricular shunt) are available as a pre-print and currently under review (doi: 10.1101/2024.04.29.24306598). The authors should better stress this point as well as the potential association of the primary with the secondary outcomes.

      (2) The references should be expanded to include other relevant papers (including reviews and meta-analyses) on taVNS safety, particularly from a cardiovascular standpoint, such as doi: 10.1038/s41598-022-25864-1 and doi: 10.3389/fnins.2023.1227858.

      (3) The dose-response issue that affects both VNS and taVNS applications in different settings should be mentioned (doi: 10.1093/eurheartjsupp/suac036), as well as the need for more dose-finding preclinical and clinical studies in different settings (the best stimulation protocol is likely to be disease-specific).

      Overall, the present work has the important potential to further promote the usage of taVNS even on critically ill patients and might set the basis for future randomized studies in this setting.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors aimed to characterize the cardiovascular effects of acute and repetitive taVNS as an index of safety. The authors concluded that taVNS treatment did not induce adverse cardiovascular effects, such as bradycardia or QT prolongation.

      Strengths:

      This study has the potential to contribute important information about the clinical utility of taVNS as a safe immunomodulatory treatment approach for SAH patients.

      Weaknesses:

      A number of limitations were identified:

      (1) A primary hypothesis should be clearly stated. Even though the authors state the design is a randomized clinical trial, several aspects of the study appear to be exploratory. The method of randomization was not stated. I am assuming it is a forced randomization given the small sample size and approximately equal numbers in each arm.

      (2) The authors "first investigated whether taVNS treatment induced bradycardia or QT prolongation, both potential adverse effects of vagus nerve stimulation. This analysis showed no significant differences in heart rate calculated from 24-hour ECG recording between groups." A justification should be provided for why a difference is expected from 20 minutes of taVNS over a period of 24 hours. Acute ECG changes are a concern for increasing arrhythmic risk, for example, due to cardiac electrical restitution properties.

      (3) More rigorous evaluation is necessary to support the conclusion that taVNS did not change heart rate, HRV, QTc, etc. For example, shifts in peak frequencies of the high-frequency vs. low-frequency power may be effective at distinguishing the effects of taVNS. Further, compensatory sympathetic responses due to taVNS should be explored by quantifying the changes in the trajectory of these metrics during and following taVNS.

      (4) The authors do not state how the QT was corrected and at what range of heart rates. Because all forms of corrections are approximations, the actual QT data should be reported along with the corrected QT.

      (5) The QT extraction method needs to be more robust. For example, in Figure 2C, the baseline voltage of the ECG is shifting while the threshold appears to be fixed. If indeed the threshold is not dynamic and does not account for baseline fluctuations (e.g., due to impedance changes from respiration), then the measures of the QT intervals were likely inaccurate.

      (6) More statistical rigor is needed. For example, in Figure 2D, the change in heart rate for days 5-7, 8-10, and 11-13 is clearly a bimodal distribution and as such, should not be analyzed as a single distribution. Similarly, Figure 2E also shows a bimodal distribution. Without the QT data, it is unclear whether this is due to the application of the heart rate correction method.

      (7) Figure 3A shows a number of outliers. A SDNN range of 200 msec should raise concern for a non-sinus rhythm such as arrhythmia or artifact, instead of sinus arrhythmia. Moreover, Figure 3B shows that the Sham RMSSD data distribution is substantially skewed by the presence of at least 3 outliers, resulting in lower RMSSD values compared to taVNS. What types of artifact or arrhythmia discrimination did the authors employ to ensure the reported analysis is on sinus rhythm? The overall results seem to be driven by outliers.

      (8) The above concern will also affect the power analysis, which was reported by the authors to have been performed based on the t-test assuming a medium effect size, but the details of the sample size calculations were not reported, e.g., X% power, whether the t-test assumed Bonferroni correction in the power analysis, etc.

      (9) If the study was designed to show a cardiovascular effect, I am surprised that N=10 per group was considered to be sufficiently powered given the extensive reports in the literature on how HRV measures (except when pathologically low) vary within individuals. Moreover, HRV measures are especially susceptible to noise, artifacts, and outliers.

      If the study was designed to show a lack of cardiovascular effect (as the conclusions and introduction seem to suggest), then a several-fold larger sample size is warranted.
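
      Several of the points above (4, 7, 8, and 9) concern quantities with standard textbook definitions, and a short sketch may help make the concerns concrete. The code below is not the authors' analysis pipeline: it uses the standard Bazett and Fridericia QT corrections, the standard definitions of SDNN and RMSSD, and a normal-approximation sample-size formula for a two-sided, two-sample comparison. All numerical inputs (QT of 400 ms, the NN-interval series, the effect sizes) are made-up illustrative values.

```python
import math
from statistics import NormalDist, mean, stdev


def qtc_bazett(qt_ms: float, hr_bpm: float) -> float:
    """Bazett correction: QTc = QT / sqrt(RR), with RR in seconds."""
    return qt_ms / math.sqrt(60.0 / hr_bpm)


def qtc_fridericia(qt_ms: float, hr_bpm: float) -> float:
    """Fridericia correction: QTc = QT / RR^(1/3), with RR in seconds."""
    return qt_ms / (60.0 / hr_bpm) ** (1.0 / 3.0)


def sdnn(nn_ms: list[float]) -> float:
    """Sample standard deviation of NN intervals (ms)."""
    return stdev(nn_ms)


def rmssd(nn_ms: list[float]) -> float:
    """Root mean square of successive NN-interval differences (ms)."""
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]
    return math.sqrt(mean(d * d for d in diffs))


def n_per_group(effect_size_d: float, alpha: float = 0.05,
                power: float = 0.80) -> int:
    """Per-group n for a two-sided two-sample comparison.

    Normal approximation: n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2.
    An exact t-based calculation gives a slightly larger n.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)
    z_power = z.inv_cdf(power)
    return math.ceil(2.0 * (z_alpha + z_power) ** 2 / effect_size_d ** 2)


# Point (4): the two corrections agree at 60 bpm and diverge at higher rates.
for hr in (60, 80, 100):
    print(hr, round(qtc_bazett(400, hr), 1), round(qtc_fridericia(400, hr), 1))

# Point (7): a single artifact-like short interval inflates both HRV metrics.
clean = [800, 810, 790, 805, 795, 800, 815, 785]   # hypothetical NN series
with_artifact = clean[:4] + [400] + clean[4:]       # one spurious beat
print(sdnn(clean), sdnn(with_artifact))
print(rmssd(clean), rmssd(with_artifact))

# Points (8)-(9): per-group sample size for medium and large effects,
# and for a Bonferroni-adjusted alpha (five comparisons, hypothetical).
print(n_per_group(0.5), n_per_group(0.8), n_per_group(0.5, alpha=0.05 / 5))
```

      Under these assumptions, a medium effect (d = 0.5) at 80% power already calls for roughly 63 participants per group before any multiplicity correction, which gives a sense of scale for the reviewer's concern about N=10 per group.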

    1. eLife assessment

      This valuable study investigates the role of Drp1 in early embryo development, providing solid evidence on how this protein influences mitochondrial localization and partitioning during the first embryonic divisions. The research employs the Trim-Away technique to eliminate Drp1 in zygotes, revealing critical insights into mitochondrial clustering, spindle formation, and embryonic development.

    2. Reviewer #1 (Public Review):

      Summary:

      Gekko, Nomura et al., show that Drp1 elimination in zygotes using the Trim-Away technique leads to mitochondrial clustering and uneven mitochondrial partitioning during the first embryonic cleavage, resulting in embryonic arrest. They monitor organellar localization and partitioning using specific targeted fluorophores. They also describe the effects of mitochondrial clustering in spindle formation and the detrimental effect of uneven mitochondrial partitioning to daughter cells.

      Strengths:

      The authors have gathered solid evidence for the uneven segregation of mitochondria upon Drp1 depletion through different means: mitochondrial labelling, ATP labelling and mtDNA copy number assessment in each daughter cell. The authors have also characterised the defects in cleavage mitotic spindles upon Drp1 loss.

      Weaknesses:

      While this study convincingly describes the phenotype seen upon Drp1 loss, my major concern is that the mechanism underlying these defects in zygotes remains unclear. The authors refer to mitochondrial fragmentation as the mechanism ensuring organelle positioning and partitioning into functional daughters during the first embryonic cleavage. However, could Drp1 have a role beyond mitochondrial fission in zygotes? I raise these concerns because, as opposed to other Drp1 KO models (including those in oocytes) which lead to hyperfused/tubular mitochondria, Drp1 loss in zygotes appears to generate enlarged yet not tubular mitochondria. Lastly, while the authors discard the role of mitochondrial transport in the clustering observed, more refined experiments should be performed to reach that conclusion.

    3. Reviewer #2 (Public Review):

      Gekko et al investigate the impact of perturbing mitochondrial dynamics during early embryo development, through modulation of the mitochondrial fission protein Drp1 using Trim-Away technology. They aimed to validate a role for mitochondrial dynamics in modulating chromosomal segregation, mitochondrial inheritance and embryo development, and they achieve this through the examination of mitochondrial and endoplasmic reticulum distribution, as well as actin filament involvement, using targeted plasmids, molecular probes and TEM in pronuclear stage embryos through the first cleavage divisions. Drp1 depletion perturbed mitochondrial distribution, leading to asymmetric partitioning of mitochondria to the 2-cell stage embryo, prevented appropriate chromosomal segregation and culminated in embryo arrest. Resultant 2-cell embryos displayed altered ATP, mtDNA and calcium levels. Microinjection of Drp1 mRNA partially rescued embryo development. A role for actin filaments in mitochondrial inheritance is described; however, the actin-based motor Myo19 does not appear to contribute.

      Overall, this study builds upon their previous work and provides further support for the role of mitochondrial dynamics in mediating chromosomal segregation and mitochondrial inheritance. In particular, Drp1 is required for redistribution of mitochondria to support symmetric partitioning and support ongoing development.

      Strengths:

      The study is well designed, the methods appropriate and the results clearly presented. The findings are nicely summarised in a schematic.

      Understanding the role of mitochondria in binucleation and mitochondrial inheritance is of clinical relevance for patients undergoing infertility treatment, particularly those undergoing mitochondrial replacement therapy.

      Weaknesses:

      The authors first describe the redistribution of mitochondria during normal development, followed by alterations induced by Drp1 depletion. It would be useful to indicate the time post-hCG for imaging of fertilised zygotes (first paragraph of the results/Figure 1) to compare with subsequent Drp1 depletion experiments.

      It is noted that Drp1 protein levels were undetectable 5h post-injection, suggesting earlier times were not examined, yet in Figure 3A it would seem that aggregation has occurred within 2 hours (relative to Figure 1).

      Mitochondria appear to be slightly more aggregated in Drp1 fl/fl embryos than in control, though comparison with untreated controls does not appear to have been undertaken. There also appears to be some variability in mitochondrial aggregation patterns following Drp1 depletion (Figure 2-suppl 1 B) which are not discussed.

      The authors use western blotting to validate the depletion of Drp1, however do not quantify band intensity. It is also unclear whether pooled embryo samples were used for western blot analysis.

      Likewise, intracellular ROS levels are examined however quantification is not provided. It is therefore unclear whether 'highly accumulated levels' are of significance or related to Drp1 depletion.

      In previous work, Drp1 was found to have a role as a spindle assembly checkpoint (SAC) protein. It is therefore unclear from the experiments performed whether aggregation of mitochondria separating the pronuclei physically (or other aspects of mitochondrial function) prevents appropriate chromosome segregation or whether Drp1 is acting directly on the SAC.

    4. Reviewer #3 (Public Review):

      Why mitochondria are finely maintained in the female germ cell (oocyte), zygotes, and preimplantation embryos? Mitochondrial fusion seems beneficial in somatic cells to compensate for mitochondria with mutated mtDNA that potentially defuel the respiratory activity if accumulated above a certain threshold. However, in the germ cells, it may rather increase the risk of transmitting mutated mtDNA to the next generation, as authors discussed. Also, finely maintained mitochondria would also be beneficial for efficient removal when damaged. Due in part to the limited suitable model, the physiological role of mitochondrial fission in embryos were obscure. In this study, authors demonstrated that mitochondrial fission prevents multiple adverse outcomes, even including the aberrant demixing of parental genome in zygotic stage. This is an important study that could contribute by proposing a new mechanism for solving problems that actually arise in the field of reproductive medicine. The conclusion is simple and clear, but the high level of technology has made it possible to overcome the difficulties of proving the results, making this an extremely excellent study.

      Seemingly, there are few apparent shortcomings. Following are the specific comments to activate the further open discussion.

      - Line 246: Comments on cristae morphology of mitochondria in Drp1-depleted embryos would better be added.

      - Regarding Figure 2H: If possible, a representative picture of Ateam would better be included in the figure. As the authors discussed in line 458, Ateam may be able to detect whether any alterations of local energy demand occurred in the Drp1-depleted embryos.

      - Line 282: In Figure 3-Video 1, mitochondria were seemingly more aggregated around female pronucleus. Is it OK to understand that there is no gender preference of pronuclei being encircled by more aggregated mitochondria?

      - Line 317: A little more explanation of the "variability" would be fine. Does that basically mean that the Ca2+ response in both Drp1-depleted blastomeres were lower than control and blastomere with more highly aggregated mitochondria show severer phenotype compared to the other blastomere with fewer mito?

      - Regarding Figure 5B (& Figure 1-figure supplement 1B): Do authors think that there would be less abnormalities in the embryos if Drp1 is trim-awayed after 2-cell or 4-cell, in which mitochondria are less involved in the spindle?

    5. Author Response:

      We would like to thank the editors and reviewers for the careful consideration of our manuscript and their many helpful comments. We would like to provide provisional author responses to address the public reviews.

      Response to Reviewer 1:

      Weaknesses:

      While this study convincingly describes the phenotype seen upon Drp1 loss, my major concern is that the mechanism underlying these defects in zygotes remains unclear. The authors refer to mitochondrial fragmentation as the mechanism ensuring organelle positioning and partitioning into functional daughters during the first embryonic cleavage. However, could Drp1 have a role beyond mitochondrial fission in zygotes? I raise these concerns because, as opposed to other Drp1 KO models (including those in oocytes) which lead to hyperfused/tubular mitochondria, Drp1 loss in zygotes appears to generate enlarged yet not tubular mitochondria. Lastly, while the authors discard the role of mitochondrial transport in the clustering observed, more refined experiments should be performed to reach that conclusion.

      It would be difficult to answer from this study whether Drp1 has a role beyond mitochondrial fission in zygotes. However, there are several possible reasons why the Drp1 KO zygotes differ from the somatic cell Drp1 KO models.

      First, the reviewer mentions that the loss of Drp1 in oocytes leads to hyperfused/tubular mitochondria, but in fact, unlike in somatic cells, the EM images in Drp1 KO oocytes show enlarged mitochondria rather than tubular structures  (Udagawa et al. Current Biology 2014, Fig. 2C and Fig. S1B-D), as in the case of zygotes in this study. 

      These mitochondrial morphologies in Drp1-deficient oocytes/zygotes may be attributed to the unique mitochondrial architecture in these cells. Mitochondria in oocytes have the shape of a small sphere with irregular cristae located peripherally or transversely. These structural features might be the cause of insensitivity or resistance to inner membrane fusion. In addition, in our previous study (Wakai et al., Molecular Human Reproduction 2014, Fig. 2), overexpression of the outer membrane fusion factors Mfn1/Mfn2 in oocytes resulted in mitochondrial aggregation, while overexpression of Opa1 did not cause any morphological changes. Thus, while mitochondria in oocytes/zygotes divide actively, complete fusion, including the inner membrane, as seen in somatic cells, is unlikely to occur.

      As for mitochondrial transport, we do not entirely discard its role. Although mitochondrial intrinsic dynamics such as fission are of primary importance for mitochondrial distribution and partitioning in embryos, the regulation of these dynamics by the cytoskeleton may be important and thus needs further study, as the reviewer pointed out.

      Response to Reviewer 2:

      Weaknesses:

      The authors first describe the redistribution of mitochondria during normal development, followed by alterations induced by Drp1 depletion. It would be useful to indicate the time post-hCG for imaging of fertilised zygotes (first paragraph of the results/Figure 1) to compare with subsequent Drp1 depletion experiments.

      We will indicate the time after hCG as the reviewer pointed out. The only problem is that in this experiment, there may be a slight deviation from the actual mitochondrial distribution change (Fig. S1A) due to the manipulation time for Trim-Away (since it was performed outside of the incubator). Also, no significant delay in pronuclear formation or embryonic development was observed with Drp1 depleted zygotes.

      It is noted that Drp1 protein levels were undetectable 5h post-injection, suggesting earlier times were not examined, yet in Figure 3A it would seem that aggregation has occurred within 2 hours (relative to Figure 1).

      As the reviewer pointed out, the depletion of Drp1 is likely to have occurred at an earlier stage. In this study, because various RNAs were injected to visualize organelles such as mitochondria and chromosomes, observations were started after about 5 hours of incubation so that the fluorescent proteins were sufficiently expressed. Therefore, samples for the western blotting analysis were collected to reflect the condition of the embryos at the start of the observation.

      Mitochondria appear to be slightly more aggregated in Drp1 fl/fl embryos than in control, though comparison with untreated controls does not appear to have been undertaken. There also appears to be some variability in mitochondrial aggregation patterns following Drp1 depletion (Figure 2-suppl 1 B) which are not discussed.

      We would like to add quantitative data on mitochondrial aggregation in Drp1-depleted embryos.

      The authors use western blotting to validate the depletion of Drp1, however do not quantify band intensity. It is also unclear whether pooled embryo samples were used for western blot analysis.

      We would like to add quantification of band intensity for the western blot analysis. The number of embryos analyzed is given in the figure legends; pooled samples of 20 (Fig. 4) to 30 (Fig. 2) embryos were used.

      Likewise, intracellular ROS levels are examined however quantification is not provided. It is therefore unclear whether 'highly accumulated levels' are of significance or related to Drp1 depletion.

      We will present quantitative results on the accumulation of ROS.

      In previous work, Drp1 was found to have a role as a spindle assembly checkpoint (SAC) protein. It is therefore unclear from the experiments performed whether aggregation of mitochondria separating the pronuclei physically (or other aspects of mitochondrial function) prevents appropriate chromosome segregation or whether Drp1 is acting directly on the SAC.

      It has been reported that Drp1 regulates the meiotic spindle through the spindle assembly checkpoint (SAC) (Zhou et al., Nature Communications 2022). We would like to mention the possibility raised by the reviewer in the Discussion.

      Response to Reviewer 3:

      Seemingly, there are few apparent shortcomings. Following are the specific comments to activate the further open discussion.

      - Line 246: Comments on cristae morphology of mitochondria in Drp1-depleted embryos would better be added.

      We would like to add a comment regarding cristae morphology.

      - Regarding Figure 2H: If possible, a representative picture of Ateam would better be included in the figure. As the authors discussed in line 458, Ateam may be able to detect whether any alterations of local energy demand occurred in the Drp1-depleted embryos.

      ATeam fluorescence is analyzed using a regular fluorescence microscope, not a confocal laser microscope, in order to analyze the intensity in the whole embryo (or the whole blastomere). Therefore, we are currently unable to obtain images of localized areas within the cell (e.g., around the spindle) as expected by the reviewer; as shown in the images in Figure 3-figure supplement 1C, there is a tendency to see high ATP levels at the cell periphery, but further analysis is needed for clear and definitive results.

      - Line 282: In Figure 3-Video 1, mitochondria were seemingly more aggregated around female pronucleus. Is it OK to understand that there is no gender preference of pronuclei being encircled by more aggregated mitochondria?

      Aggregated mitochondria are localized toward the cell center, but do not behave in such a way that they are preferentially concentrated near the female pronucleus.

      - Line 317: A little more explanation of the "variability" would be fine. Does that basically mean that the Ca2+ response in both Drp1-depleted blastomeres were lower than control and blastomere with more highly aggregated mitochondria show severer phenotype compared to the other blastomere with fewer mito?

      We assume that what the reviewer has pointed out is right. However, although we were able to show the bias in Ca2+ store levels between blastomeres of Drp1-depleted embryos, we did not stain mitochondria simultaneously, so we are unable to say in detail whether, for example, there are more Ca2+ stores in the blastomere that inherited more mitochondria or fewer Ca2+ stores in the blastomere with more aggregated mitochondria.

      - Regarding Figure 5B (& Figure 1-figure supplement 1B): Do authors think that there would be less abnormalities in the embryos if Drp1 is trim-awayed after 2-cell or 4-cell, in which mitochondria are less involved in the spindle?

      The marked accumulation of mitochondria around the spindle is unique to the first cleavage and seems to be coincident with the migration of the pronuclei toward the center. Since the process of assembly of the male and female pronuclei is also an event unique to the first cleavage, abnormalities such as binucleation due to mitochondrial misplacement are thought to be a phenomenon seen only in the first cleavage. Therefore, if Drp1 is depleted at the 2-cell or 4-cell stage, chromosome segregation errors may be less frequent. However, since unequal partitioning of mitochondria is thought to occur, some abnormalities in embryonic development are likely to be observed.

    1. eLife assessment

      This important paper implicates S-acylation of Cys-130 in recruitment of the inflammasome receptor NLRP3 to the Golgi, and it provides convincing evidence that S-acylation plays a key role in response to the stress induced by nigericin treatment. While Cys-130 does seem to play a previously unappreciated role in membrane association of NLRP3, further work will be needed to clarify the details of the mechanism.

    2. Reviewer #2 (Public Review):

      This paper examines the recruitment of the inflammasome seeding pattern recognition receptor NLRP3 to the Golgi. Previously, electrostatic interactions between the polybasic region of NLRP3 and negatively charged lipids were implicated in membrane association. The current study concludes that reversible S-acylation of the conserved Cys-130 residue, in conjunction with upstream hydrophobic residues plus the polybasic region, act together to promote Golgi localization of NLRP3, although additional parts of the protein are needed for full Golgi localization. Treatment with the bacterial ionophore nigericin inhibits membrane traffic and apparently prevents Golgi-associated thioesterases from removing the acyl chain, causing NLRP3 to become immobilized at the Golgi. This mechanism is put forth as an explanation for how NLRP3 is activated in response to nigericin.

      The experiments are generally well presented. It seems likely that Cys-130 does indeed play a previously unappreciated role in Golgi association of NLRP3. However, the evidence for S-acylation at Cys-130 is largely indirect, and the process by which nigericin enhances membrane association is not yet fully understood. Therefore, this interesting study points the way for further analysis.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      This is an interesting study investigating the mechanisms underlying membrane targeting of the NLRP3 inflammasome and reporting a key role for the palmitoylation-depalmitoylation cycle of cys130 in NLRP3. The authors identify ZDHHC3 and APT2 as the specific ZDHHC and APT/ABHD enzymes that are responsible for the S-acylation and de-acylation of NLRP3, respectively. They show that the levels of ZDHHC3 and APT2, both localized at the Golgi, control the level of palmitoylation of NLRP3. The S-acylation-mediated membrane targeting of NLRP3 cooperates with polybasic domain (PBD)-mediated PI4P-binding to target NLRP3 to the TGN under steady-state conditions and to the disassembled TGN induced by the NLRP3 activator nigericin.

      However, the study has several weaknesses in its current form as outlined below.

      (1) The novelty of the findings concerning cys130 palmitoylation in NLRP3 is unfortunately compromised by recent reports on the acylation of different cysteines in NLRP3 (PMID: 38092000), including palmitoylation of the very same cys130 in NLRP3 (Yu et al https://doi.org/10.1101/2023.11.07.566005), which was shown to be relevant for NLRP3 activation in cell and animal models. What remains novel and intriguing is the finding that NLRP3 activators induce an imbalance in the acylation-deacylation cycle by segregating NLRP3 in late Golgi/endosomes from de-acylating enzymes confined in the Golgi. The interesting hypothesis put forward by the authors is that the increased palmitoylation of cys130 would finally contribute to the activation of NLRP3. However, the authors should clarify the trafficking pathway of acylated-NLRP3. This pathway should, in principle, coincide with that of TGN46 which constitutively recycles from the TGN to the plasma membrane and is trapped in endosomes upon treatment with nigericin. 

      We think the data presented in our manuscript are consistent with the majority of S-acylated NLRP3 remaining on the Golgi via S-acylation in both untreated and nigericin treated cells. We have performed an experiment with Brefeldin A (BFA), a fungal metabolite that disassembles the Golgi without causing dissolution of early endosomes, that further supports the conclusion that NLRP3 predominantly resides on Golgi membranes pre and post activation. Treatment of cells with BFA prevents recruitment of NLRP3 to the Golgi in untreated cells and blocks the accumulation of NLRP3 on the structures seen in the perinuclear area after nigericin treatment (see new Supplementary Figure 4A-D). We do see some overlap of NLRP3 signal with TGN46 in the perinuclear area after nigericin treatment (see new Supplementary Figure 2E); however, this likely represents TGN46 at the Golgi rather than endosomes given that the NLRP3 signal in this area is BFA sensitive. As with 2-BP and GFP-NLRP3C130S, GFP-NLRP3 spots also form in BFA / nigericin co-treated cells but not with untagged NLRP3. These spots also do not show any co-localisation with EEA1, suggesting that under these conditions, endosomes don’t appear to represent a secondary site of NLRP3 recruitment in the absence of an intact Golgi. However, we cannot completely rule out that some NLRP3 may be recruited to endosomes at some point during its activation.

      (2) To affect the S-acylation, the authors used 16 hrs treatment with 2-bromopalmitate (2BP). In Figure 1f, it is quite clear that NLRP3 in 2-BP treated cells completely redistributed in spots dispersed throughout the cells upon nigericin treatment. What is the Golgi like in those cells? In other words, does 2-BP alter/affect Golgi morphology? What about PI4P levels after 2-BP treatment? These are important missing pieces of data since both the localization of many proteins and the activity of one key PI4K in the Golgi (i.e. PI4KIIalpha) are regulated by palmitoylation.

      We thank the reviewer for highlighting this point and agree that it is possible the observed loss of NLRP3 from the Golgi might be due to an adverse effect of 2-BP on Golgi morphology or PI4P levels. We have tested the effect of 2-BP on the Golgi markers GM130, p230 and TGN46. 2-BP has marginal effects on Golgi morphology with cis, trans and TGN markers all present at similar levels to untreated control cells (Supplementary Figure 2B-D). We also tested the effect of 2-BP on PI4P levels using mCherry-P4M, a PI4P biosensor. Surprisingly, as noted by the reviewer, despite recruitment of PI4K2A being dependent on S-acylation, PI4P was still present on the Golgi after 2-BP treatment, suggesting that a reduction in Golgi PI4P levels does not underlie loss of NLRP3 from the Golgi (Supplementary Figure 2A). The pool of PI4P still present on the Golgi following 2-BP treatment is likely generated by other PI4K enzymes that localise to the Golgi independently of S-acylation, such as PI4KIIIB. We have included this data in our manuscript as part of a new Supplementary Figure 2.

      (3) The authors argue that the spots observed with NLRP-GFP result from non-specific effects mediated by the addition of the GFP tag to the NLRP3 protein. However, puncta are visible upon nigericin treatment, as a hallmark of endosomal activation. How do the authors reconcile these data? Along the same lines, the NLRP3-C130S mutant behaves similarly to wt NLRP3 upon 2-BP treatment (Figure 1h). Are those NLRP3-C130S puncta positive for endosomal markers? Are they still positive for TGN46? Are they positive for PI4P?

      This is a fair point given the literature showing overlap of NLRP3 puncta formed in response to nigericin with endosomal markers and the similarity of the structures we see in terms of size and distribution to endosomes after 2BP + nigericin treatment. We have tested whether these puncta overlap with EEA1, TGN46 or PI4P (Supplementary Figure 2A, E-G). The vast majority of spots formed by GFP-NLRP3 co-treated with 2-BP and nigericin do not co-localise with EEA1, TGN46 or PI4P. This is consistent with these spots potentially being an artifact, although it has recently been shown that human NLRP3 unable to bind to the Golgi can still respond to nigericin (Mateo-Tórtola et al., 2023). These puncta might represent a conformational change cytosolic NLRP3 undergoes in response to stimulation, although our results suggest that this doesn’t appear to happen on endosomes.

      (4) The authors expressed the minimal NLRP3 region to identify the domain required for NLRP3 Golgi localization. These experiments were performed in control cells. It might be informative to perform the same experiments upon nigericin treatment to investigate the ability of NLRP3 to recognize activating signals. It has been reported that PI4P increases on Golgi and endosomes upon NG treatment. Hence, all the differences between the domains may be lost or preserved. In parallel, also the timing of such recruitment upon nigericin treatment (early or late event) may be informative for the dynamics of the process and of the contribution of the single protein domains.

      This is an interesting point which we thank the reviewer for highlighting. However, we think that each domain on its own is not capable of responding to nigericin as shown by the effect of mutations in helix115-125 or the PB region in the full-length NLRP3 protein. NLRP3HF, which still contains a functional PB region, isn’t capable of responding to nigericin in the same way as wild type NLRP3 (Supplementary Figure 6C-D). Similarly, mutations in the PB region of full length NLRP3 that leave helix115-125 intact show that helix115-125 is not sufficient to allow enhanced recruitment of NLRP3 to Golgi membranes after nigericin treatment (Supplementary Figure 9A). We speculate that helix115-125, the PB region and the LRR domain all need to be present to provide maximum affinity of NLRP3 for the Golgi prior to encounter with and S-acylation by ZDHHC3/7. Mutation or loss of any one of the PB region, helix115-125 or the LRR lowers NLRP3 membrane affinity, which is reflected by reduced levels of NLRP3 captured on the Golgi by S-acylation at steady state and in response to nigericin. 

      (5) As noted above for the chemical inhibitors (1) the authors should check the impact of altering the balance between acyl transferase and de-acylases on the Golgi organization and PI4P levels. What is the effect of overexpressing PATs on Golgi functions?

      We have checked the effect of APT2 overexpression on Golgi morphology and can show that it has no noticeable effect, ruling out an impact of APT on Golgi integrity as the reason for loss of NLRP3 from the Golgi in the presence of overexpressed APT2. We have included these images as Supplementary Figure 11H-J. 

      It is plausible that the effects of ZDHHC3 or ZDHHC7 on enhanced recruitment of NLRP3 to the Golgi may be via an effect on PI4P levels since, as mentioned above, both enzymes are involved in recruitment of PI4K2A to the Golgi and have previously been shown to enhance levels of PI4K2A and PI4P on the Golgi when overexpressed (Kutchukian et al., 2021). However, NLRP3 mutants with most of the charge removed from the PB region, which are presumably unable to interact with PI4P or other negatively charged lipids, are still capable of being recruited to the Golgi by excess ZDHHC3. This would suggest that the effect of overexpressed ZDHHC3 on NLRP3 is largely independent of changes in PI4P levels on the Golgi and instead driven by helix115-125 and S-acylation at Cys-130. The latter point is supported by the observation that NLRP3HF and NLRP3Cys130 are insensitive to ZDHHC3 overexpression.

      At the levels of HA-ZDHHC3 used in our experiments with NLRP3 (200ng pEF-Bos-HAZDHHC3 / c.a. 180,000 cells) we don’t see any adverse effect on Golgi morphology (Author response image 1), although it has been noted previously by others that higher levels of ZDHHC3 can have an impact on TGN46 (Ernst et al., 2018). ZDHHC3 overexpression surprisingly has no adverse effects on Golgi function and in fact enhances secretion from the Golgi (Ernst et al., 2018).  

      Author response image 1.

      Overexpression of HA-ZDHHC3 does not impact Golgi morphology. A) Representative confocal micrographs of HeLaM cells transfected with 200 ng HA-ZDHHC3 fixed and stained with antibodies to STX5 or TGN46. Scale bars = 10 µm. 

      Reviewer #2 (Public Review):

      Summary:

      This paper examines the recruitment of the inflammasome seeding pattern recognition receptor NLRP3 to the Golgi. Previously, electrostatic interactions between the polybasic region of NLRP3 and negatively charged lipids were implicated in membrane association. The current study reports that reversible S-acylation of the conserved Cys-130 residue, in conjunction with upstream hydrophobic residues plus the polybasic region, act together to promote Golgi localization of NLRP3, although additional parts of the protein are needed for full Golgi localization. Treatment with the bacterial ionophore nigericin inhibits membrane traffic and prevents Golgi-associated thioesterases from removing the acyl chain, causing NLRP3 to become immobilized at the Golgi. This mechanism is put forth as an explanation for how NLRP3 is activated in response to nigericin.

      Strengths:

      The experiments are generally well presented. It seems likely that Cys-130 does indeed play a previously unappreciated role in the membrane association of NLRP3.

      Weaknesses:

      The interpretations about the effects of nigericin are less convincing. Specific comments follow.

      (1) The experiments of Figure 4 bring into question whether Cys-130 is S-acylated. For Cys130, S-acylation was seen only upon expression of a severely truncated piece of the protein in conjunction with overexpression of ZDHHC3. How do the authors reconcile this result with the rest of the story?

      Providing direct evidence of S-acylation at Cys-130 in the full-length protein proved difficult. We attempted to detect S-acylation of this residue by mass spectrometry. However, the presence of the PB region and multiple lysines / arginines directly after Cys-130 made this approach technically challenging and we were unable to convincingly detect S-acylation at Cys-130 by M/S. However, Cys-130 is clearly important for membrane recruitment as its mutation abolishes the localisation of NLRP3 to the Golgi. It is feasible that it is the hydrophobic nature of the cysteine residue itself which supports localisation to the Golgi, rather than S-acylation of Cys-130. A similar role for cysteine residues present in SNAP-25 has been reported (Greaves et al., 2009). However, the rest of our data are consistent with Cys-130 in NLRP3 being S-acylated. We also refer to another recently published study which provides additional biochemical evidence that mutation of Cys-130 impacts the overall levels of NLRP3 S-acylation (Yu et al., 2024). 

      (2) Nigericin seems to cause fragmentation and vesiculation of the Golgi. That effect complicates the interpretations. For example, the FRAP experiment of Figure 5 is problematic because the authors neglected to show that the FRAP recovery kinetics of nonacylated resident Golgi proteins are unaffected by nigericin. Similarly, the colocalization analysis in Figure 6 is less than persuasive when considering that nigericin significantly alters Golgi structure and could indirectly affect colocalization. 

      We agree that it is likely that the behaviour of other Golgi resident proteins is altered by nigericin. This is in line with a recent proteomics study showing that nigericin alters the amount of Golgi resident proteins associated with the Golgi (Hollingsworth et al., 2024) and other work demonstrating that changes in organelle pH can influence the membrane on / off rates of Rab GTPases (Maxson et al., 2023). However, Golgi levels of other peripheral membrane proteins that associate with the Golgi through S-acylation, such as N-Ras, appear unaltered (Author response image 2), indicating a degree of selectivity in the proteins affected. Our main point here is that NLRP3 is amongst those proteins whose behaviour on the Golgi is sensitive to nigericin and that this change in behaviour may be important to the NLRP3 activation process, although this requires further investigation and will form the basis of future studies.

      The reduction in co-localisation between NLRP3 and APT2, due to alterations in Golgi organisation and trafficking, was the point we were trying to make with this figure, and we apologise if this was not clear. We think that the changes in Golgi structure and function caused by nigericin potentially affect the ability of APT2 to encounter NLRP3 and de-acylate it. We have added a new paragraph to the results section to hopefully explain this more clearly. We recognise that our results supporting this hypothesis are at present limited and we have toned down the language used in the results section to reflect the nature of these findings.

      Author response image 2.

      S-acylated peripheral membrane proteins show differential sensitivity to nigericin. A) Representative confocal micrographs of HeLaM cells coexpressing GFP-NRas and an untagged NLRP3 construct. Cells were left untreated or treated with 10 µM nigericin for 1 hour prior to fixation. Scale bars = 10 µm. B) Quantification of GFP-NRas or NLRP3 signal in the perinuclear region of cells treated with or without nigericin.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      (1) Does overnight 2-BP treatment potentially have indirect effects that could prevent NLRP3 recruitment? It would be useful here to show some sort of control confirming that the cells are not broadly perturbed.

      Please see our response to point (2) raised by reviewer #1 which is along similar lines. 

      (2) In Figure 5, "Veh" presumably is short for "Vehicle". This term should be defined in the legend.

      We have now corrected this.

      References

      Ernst, A.M., S.A. Syed, O. Zaki, F. Bottanelli, H. Zheng, M. Hacke, Z. Xi, F. Rivera-Molina, M. Graham, A.A. Rebane, P. Bjorkholm, D. Baddeley, D. Toomre, F. Pincet, and J.E. Rothman. 2018. S-Palmitoylation Sorts Membrane Cargo for Anterograde Transport in the Golgi. Dev Cell. 47:479-493 e477.

      Greaves, J., G.R. Prescott, Y. Fukata, M. Fukata, C. Salaun, and L.H. Chamberlain. 2009. The hydrophobic cysteine-rich domain of SNAP25 couples with downstream residues to mediate membrane interactions and recognition by DHHC palmitoyl transferases. Mol Biol Cell. 20:1845-1854.

      Hollingsworth, L.R., P. Veeraraghavan, J.A. Paulo, J.W. Harper, and I. Rauch. 2024. Spatiotemporal proteomic profiling of cellular responses to NLRP3 agonists. bioRxiv.

      Kutchukian, C., O. Vivas, M. Casas, J.G. Jones, S.A. Tiscione, S. Simo, D.S. Ory, R.E. Dixon, and E.J. Dickson. 2021. NPC1 regulates the distribution of phosphatidylinositol 4-kinases at Golgi and lysosomal membranes. EMBO J. 40:e105990.

      Mateo-Tórtola, M., I.V. Hochheiser, J. Grga, J.S. Mueller, M. Geyer, A.N.R. Weber, and A. Tapia-Abellán. 2023. Non-decameric NLRP3 forms an MTOC-independent inflammasome. bioRxiv:2023.2007.2007.548075.

      Maxson, M.E., K.K. Huynh, and S. Grinstein. 2023. Endocytosis is regulated through the pH-dependent phosphorylation of Rab GTPases by Parkinson’s kinase LRRK2. bioRxiv:2023.2002.2015.528749.

      Yu, T., D. Hou, J. Zhao, X. Lu, W.K. Greentree, Q. Zhao, M. Yang, D.G. Conde, M.E. Linder, and H. Lin. 2024. NLRP3 Cys126 palmitoylation by ZDHHC7 promotes inflammasome activation. Cell Rep. 43:114070.

    1. Reviewer #3 (Public Review):

      Summary:

      The pathomechanism underlying Sjögren's syndrome (SS) remains elusive. The Authors have studied whether altered calcium signaling might be a factor in SS development in a commonly used mouse model. They provide a thorough and straightforward characterization of salivary gland fluid secretion, cytoplasmic calcium signaling and mitochondrial morphology and respiration. A special strength of the study is the spectacular in vivo imaging; very few groups, if any, could have succeeded with these studies. The Authors show that cytoplasmic calcium signaling is upregulated in the SS model and that the Ca2+-regulated Cl- channels are normally localized and functional, yet fluid secretion is suppressed. They also find altered localization of the IP3R and speculate about lesser exposure of Cl- channels to high local [Ca2+]. In addition, they describe changes in mitochondrial morphology and function that might also contribute to the attenuated secretory response. Although the exact contribution of calcium and mitochondria to secretory dysfunction remains to be determined, the results seem to be useful for a range of scientists.

      Comments on revised version:

      I appreciate the Authors' responses and am satisfied with the revised manuscript.

    2. eLife assessment

      This manuscript presents important observations on the early changes that occur in calcium signaling, TMEM16a channel activation, and mitochondrial dysfunction in salivary gland cells in a murine model of autoimmune Sjögren's disease. The study reports that in response to DMXAA treatment which induces a murine model of Sjögren's disease, salivary gland cells show significant changes in saliva release, calcium signaling, TMEM16a activation, mitochondrial function, and sub-cellular morphology of the endoplasmic reticulum. The work is compelling and will be of strong interest to physiologists working on secretion, calcium signaling, and mitochondria.

    3. Reviewer #1 (Public Review):

      Summary:

      The authors address cellular mechanisms underlying the early stages of Sjogren's syndrome, using a mouse model in which 5,6-Dimethyl-9-oxo-9H-xanthene-4-acetic acid (DMXAA) is applied to activate the stimulator of interferon genes (STING) pathway. They show that in this model salivary secretion in response to neural stimulation is greatly reduced, even though the calcium responses of individual secretory cells were enhanced. They attribute the secretion defect to reduced activation of Ca2+-activated Cl- channels (TMEM16a), due to an increased distance between Ca2+ release channels (IP3 receptors) and TMEM16a, which is expected to reduce the [Ca2+] sensed by TMEM16a. A variety of disruptions in mitochondria were also observed after DMXAA treatment, including reduced abundance, altered morphology, depolarization and reduced oxygen consumption rate. The results of this study shed new light on some of the early events leading to the loss of secretory function in Sjogren's syndrome, at a time before inflammatory responses cause the death of secretory cells.

      Strengths:

      Two-photon microscopy enabled Ca2+ measurements in the salivary glands of intact animals in response to physiological stimuli (nerve stimulation). This approach has been shown previously by the authors to be necessary to preserve the normal spatiotemporal organization of calcium signals that lead to secretion under physiological conditions.

      Superresolution (STED) microscopy allowed precise measurements of the spacing of IP3R and TMEM16a and the cell membranes that would otherwise be prevented by the diffraction limit. The measured increase of distance (from 84 to 155 nm) would be expected to reduce [Ca2+] at the TMEM16a channel.

      The authors effectively ruled out a variety of alternative explanations for reduced secretion, including changes in AQP5 expression, and TMEM16a expression, localization and Ca2+ sensitivity as indicated by Cl- current in response to defined levels of Ca2+. Suppression of Cl- currents by a fast buffer (BAPTA) but not a slow one (EGTA) supports the idea that increased distance between IP3R and TMEM16A contributes to the secretory defect in DMXAA-treated cells.

      Weaknesses:

      While the Ca2+ distribution in the cells was less restricted to the apical region in DMXAA-treated cells, it is not clear that this is relevant to the reduced activation of TMEM16a or to pathophysiological changes associated with Sjogren's syndrome.

      Despite the decreased level of secretion, Ca2+ signal amplitudes were higher in the treated cells, raising the question of how much this might compensate for the increased distance between IP3R and TMEM16a. The authors assume that the increased separation of IP3R and TMEM16a (and the resulting decrease in local [Ca2+]) outweighed the effect of higher global [Ca2+], but this point was not addressed directly.

      The description of mitochondrial changes in abundance, morphology, membrane potential, and oxygen consumption rate was not well integrated into the rest of the paper. While these changes may be a facet of the multiple effects of STING activation and may occur during Sjogren's syndrome, their possible role in reducing secretion was not examined. As it stands, the mitochondrial results are largely descriptive and more studies are needed to connect them to the secretory deficits in Sjogren's syndrome.

    4. Reviewer #2 (Public Review):

      Summary:

      This manuscript describes a very eloquent study of disrupted stimulus-secretion coupling in salivary acinar cells in the early stages of an animal model (DMXAA) of Sjogren's syndrome (SS). The study utilizes a range of technically innovative approaches: in vivo imaging of Ca signaling, in vivo salivary secretion, patch clamp electrophysiology to assess TMEM16a activity, immunofluorescence and electron microscopy, and a range of morphological and functional assays of mitochondrial function. Results show that in mice with DMXAA-induced Sjogren's syndrome, nerve stimulation-induced salivary secretion was reduced, yet surprisingly the nerve stimulation-induced Ca signaling was enhanced. There was also a reduced carbachol (CCh)-induced activation of TMEM16a currents in acinar cells from DMXAA-induced SS mice, whereas the intrinsic Ca-activated TMEM16a currents were unaltered, further supporting that stimulus-secretion coupling was impaired. Consistent with this, high resolution STED microscopy revealed that there was a loss of close physical spatial coupling between IP3Rs and TMEM16a, which may contribute to the impaired stimulus-secretion coupling. Furthermore, the authors show that the mitochondria were both morphologically and functionally impaired, suggesting that bioenergetics may be impaired in salivary acinar cells of DMXAA-induced SS mice.

      Strengths:

      Overall, this is an outstanding manuscript that will have a huge impact on the field. The manuscript is beautifully written with a very clear narrative. The experiments are technically innovative, very well executed and logically designed. The data are very well presented and appropriately analyzed and interpreted.

      Review of Revised Manuscript:

      The authors have now addressed all my comments and concerns in the revised manuscript to my satisfaction.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer 1:

      Strengths

      We thank the reviewer for recognizing the strengths of our in vivo Ca2+ measurements, super resolution microscopy and assessment of the secretory dysfunction in the Sjogren's syndrome mouse model.

      Weaknesses

      Point 1: The less restricted Ca2+ signal to the apical region of the acinar cell is not really relevant to the reduced activation of TMEM16a by a local signal at the apical plasma membrane.

      We agree that the spatially averaged Ca2+ signal is not indicative of the local Ca2+ signal that activates TMEM16a. The description of the disordered Ca2+ signal in the disease model was intended to simply convey that the Ca2+ signal is altered in the model. Whether or indeed how the altered spatial characteristics of the signal are deleterious is not known but we speculate in the discussion that this contributes to the ultrastructural damage observed.

      Point 2. Secretion is decreased but the amplitude of the globally averaged Ca2+ signals are increased. No proof is offered that the greater distance between IP3R and TMEM16a is the reason for decreased secretion in the face of this increased peak signal.

      We have now added new data that indicate that the local Ca2+ signal is indeed disrupted in the disease model. We show that in control animals, activation of TMEM16a by application of agonist occurs when the pipette is buffered with the slower buffer EGTA but not with the fast buffer BAPTA. In contrast, in cells isolated from DMXAA-treated animals both EGTA and BAPTA abolish the agonist-induced currents (new Figure 6). These data are consistent with our super resolution data showing that the distance between IP3R and TMEM16a is greater, and thus presumably sufficient to allow buffering of Ca2+ released from IP3R such that it does not effectively activate TMEM16a. These data also suggest that the increased amplitude of the spatially averaged Ca2+ signal is not sufficient to overcome this structural change.

      Point 3. Lack of evidence that the mitochondrial changes are associated with the defect in fluid secretion.

      We agree that a causal link between the decreased secretion and altered mitochondrial morphology and function is not established. Nevertheless, we feel it is reasonable to contend that profound changes in mitochondrial morphology observed at the light and EM level, together with changes in mitochondrial membrane potential and oxygen consumption are consistent with contributing to altered fluid secretion given that this is an energetically costly process. We have altered the discussion to reflect these caveats and ideas.

      Reviewer 2:

      We thank the reviewer for their assessment of our work and constructive comments.

      Reviewer 3:

      We thank the reviewer for their careful appraisal of our manuscript and insightful comments. 

      Point 1: Are all the effects of DMXAA mediated through the STING pathway?

      This is an important point because, as noted, DMXAA has been reported to inhibit NAD(P)H quinone oxidoreductase, which could contribute to the phenotype reported here. In future studies we intend to test other STING pathway agonists such as MSA-2 and perhaps antagonists of the STING pathway. We have added text to the discussion indicating that not all of the effects observed may be a result of activation of the STING pathway.

      Point 2: As noted, and clarified in the text, the driving force for ATP production is the electrochemical H+ gradient which establishes the mitochondrial membrane potential.

      Point 3: The reviewer suggested there was a decrease in mitochondrial membrane potential in the absence of a change in TMRE steady state.

      We apologize for the confusion generated by the presentation of the figure. We normalized TMRE fluorescence against MitoTracker Green fluorescence but, as shown, the figure does not reflect that the absolute TMRE fluorescence was indeed decreased. Supplemental figure 4 now shows the basal TMRE fluorescence.

      Point 4: Indications that the disruption to ER structure seen in Electron Micrographs contributes to the changes in Ca2+ signal and fluid secretion.

      We did not focus on the relative distance between ER and apical PM in the EMs primarily because the ER that projects towards the apical PM is a relatively minor component of the specialized ER expressing IP3R and is difficult to identify. We note that the disruption of the bulk ER as quantitated by altered ER-mitochondrial interfaces and fragmentation is consistent with our super resolution data and thus likely plays a role in the mechanism that results in dysregulated Ca2+ signals and reduced secretion.

      Recommendations to Authors:

      Reviewing Editor:

      (1) The Editor suggests that we should use the activity of TMEM16a to directly measure the [Ca2+] experienced by the channel.

      We now present additional data. First, we show an extended range of pipette [Ca2+] demonstrating identical Ca2+ sensitivity in DMXAA vs vehicle treated cells (Figure 5). Second, and importantly, we now present data evaluating the ability of muscarinic stimulation to activate TMEM16a in the presence of either EGTA (slow Ca2+ buffer) or BAPTA (fast Ca2+ buffer). Notably, currents can be stimulated in control cells when the pipette is buffered with EGTA, but not in DMXAA treated cells. BAPTA inhibits activation in both situations (new Figure 6). These data are consistent with TMEM16a being activated by Ca2+ in a microdomain and with this being disrupted in the disease model.

      (2) The Editor asks whether a decrease in IP3R3 in a subset of the samples could account for the decreased fluid secretion.

      We think this is unlikely given, as noted by the Editor, that a reduction only occurred in a subset of the samples and statistically there was no significant difference to vehicle-treated animals. Moreover, we would note that there is also no difference in the expression of IP3R2 between experimental groups and in studies of transgenic mice where either IP3R2 or IP3R3 were knocked out individually, there was no effect on salivary fluid secretion, indicating that expression of a single subtype can support stimulus-secretion coupling.

      (3) Absolute values for changes in fluorescence (over time) should be included together with SD images.

      These have been added in Figure 3.

      (4) DMXAA has additional effects to STING activation and thus other STING pathway modulators should be used.

      We agree that additional STING agonists should be explored in the future but believe that this is beyond the scope of the present studies. Additional text has been added to the discussion acknowledging the additional targets of DMXAA and that they could contribute to the phenotype.

      (5) No causal link between the observed Ca2+ changes and mitochondrial dysfunction.

      We agree that no experimental evidence is offered to directly support this contention. Nevertheless, dysregulated Ca2+ signals are well-documented to lead to altered mitochondrial structure and function and thus we feel it not unreasonable to speculate that this is a possibility.

      (6) The paper would be improved by directly assessing mechanistic connections between altered Ca2+ signaling and TMEM16a activation.

      We agree, please refer to point 1 and new figure 6.

      Reviewer 1:

      (1) Standard Deviation images should be explained and the location of ROI identified.

      We contend that Standard Deviation images provide an effective visualization (in a single image) of both the magnitude of the Ca2+ increase and the degree of recruitment of cells in the field of view during the entire period of stimulation.  We have added text to describe the utility of this technique. Nevertheless, we now show kinetic traces of the changes in fluorescence over time in both apical and basal regions in Figure 3. We also clarify that the traces shown in Figure 2 are averaged over the entire cell. 

      (2) The Authors should consider that reduced secretion is because cells are dying.

      We believe this is unlikely given the lack of morphological changes in glandular structure and the minor lymphocyte infiltration observed in this model. Nevertheless, we now add data showing that the mass of SMG is not altered in the DMXAA-treated animals compared with vehicle-treated (Figure 1E).

      (3) The role of mitochondria in the DMXAA phenotype is unclear. What is the effect of acutely de-energizing mitochondria on fluid secretion.

      Since fluid secretion is an energetically expensive undertaking, it is not unreasonable to suggest that compromised mitochondrial function may impact secretion. That being said, this could occur at multiple levels: production of ATP to fuel the Na/K pump to establish membrane gradients, or to provide energy to sequester Ca2+, among a multitude of targets. This will be a subject of ongoing experiments. We contend that experiments to acutely disrupt salivary mitochondria in vivo while assessing fluid secretion would be difficult to perform and interpret, given that local administration of agents to SMG would not affect the other major salivary glands and systemic administration would be predicted to have wide-ranging off-target effects.

      (4) Could a subset of cells with low IP3R numbers contribute to reduced fluid secretion?

      Please see the response to the Reviewing Editor's point 2.

      (5) An attempt to estimate the effect of the spatial disruption of IP3R and TMEM16a localization should be made.

      Please see the response to the Reviewing Editor's point 1.

      Minor Points

      We have amended the statement from “Highly expressed” to “increased”.

      Regions of the cell have been labelled for orientation in the line scans.

      The molecular weight markers have been added in Figure 4.
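
      The Standard Deviation images discussed in the response to Reviewer 1, point (1) above can be illustrated with a minimal sketch. It assumes the recording is available as a time series of equally sized 2D frames; the function name `std_projection` and the toy pixel values are hypothetical and are not taken from the manuscript or its analysis software. Each output pixel is the standard deviation of that pixel's fluorescence over time, so pixels in cells that respond strongly appear bright and non-responding pixels stay near zero.

```python
from statistics import pstdev


def std_projection(stack):
    """Collapse a time series of 2D frames into one per-pixel SD image.

    stack: list of frames; each frame is a list of rows of fluorescence
    values, and all frames are assumed to share the same shape.
    """
    if not stack:
        raise ValueError("stack must contain at least one frame")
    n_rows, n_cols = len(stack[0]), len(stack[0][0])
    # For every pixel position, take the population SD across all frames.
    return [
        [pstdev([frame[r][c] for frame in stack]) for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Hypothetical 2x2 field imaged over four time points.
frames = [
    [[10, 10], [10, 5]],
    [[10, 10], [20, 5]],
    [[10, 30], [10, 5]],
    [[10, 30], [20, 5]],
]
sd_image = std_projection(frames)
```

      In this toy field the two constant pixels give an SD of zero, while the pixel that steps from 10 to 30 gives a larger SD than the pixel that alternates between 10 and 20. A single image therefore conveys both which pixels were recruited and how large their fluctuations were, but not their time course, which is why kinetic traces remain a useful complement.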

      Reviewer 2:

      (1) Whether mitochondrial dysfunction is the initiator of the phenotype or a result of the dysregulated Ca2+ signal is unclear.

      We agree that our data do not resolve this classic “Chicken vs Egg” conundrum. We plan further experiments to address this issue. Future plans include repeating the mitochondrial and Ca2+ signaling experiments at earlier time points where we know fluid secretion is not yet impacted. This may potentially reveal the temporal sequence of events. Similarly, we plan experiments to mechanistically address why the global Ca2+ signal is augmented; reduced Ca2+ clearance or enhanced Ca2+ release/influx are possibilities. We speculate that reduced Ca2+ clearance, either because mitochondrial Ca2+ uptake is reduced or as a secondary consequence of reduced ATP levels on SERCA and PMCA, is a likely possibility.

      (2) Measurement of ECAR and direct measurements of ATP and Seahorse methods.

      In a separate series of experiments, we monitored ECAR. These data were unfortunately very variable and difficult to interpret, although no obvious compensatory increase was observed. We plan in the future to directly monitor ATP levels in acinar cells using Mg-Green. To normalize for cell numbers in the Seahorse experiments, following centrifugation, cell pellets of equal volume were resuspended in equal volumes of buffer. Acinar cells were seeded onto Cell-Tak coated dishes. This information has been added to the Methods section.

    1. eLife assessment

      This manuscript provides an important advance in our understanding of the molecular events that promote osteoclast fusion. Compelling data support the conclusion that an oxidized form of the ubiquitous protein La promotes osteoclast fusion following enrichment at the cell surface of osteoclast progenitors. These data improve our understanding of the processes that regulate bone resorption and will be of broad interest to researchers in the fields of cell biology and musculoskeletal physiology.

    2. Reviewer #1 (Public Review):

      In this manuscript, Leikina et al. investigate the role of redox changes in the ubiquitous protein La in promotion of osteoclast fusion. In a recently published manuscript, the investigators found that osteoclast multinucleation and resorptive activity are regulated by a de-phosphorylated and proteolytically cleaved form of the La protein that is present on the cell surface of differentiating osteoclasts. In the present work, the authors build upon these findings to determine the physiologic signals that regulate La trafficking to the cell membrane and ultimately, the ability of this protein to promote fusion. Building upon other published studies that show 1) that intracellular redox signaling can elicit changes in the conformation and localization of La, and 2) that osteoclast formation is dependent on ROS signaling, the authors hypothesize that oxidation of La in response to intracellular ROS underlies the re-localization of La to the cell membrane and that this is necessary for its pro-fusion activity. The authors test this hypothesis in a rigorous manner using antioxidant treatments, recombinant La protein, and modification of cysteine residues predicted to be key sites of oxidation. Osteoclast fusion is then monitored in each condition using fluorescence microscopy. These data strongly support the conclusion that oxidized La is de-phosphorylated, increases in abundance at the cell surface of differentiating osteoclasts, and promotes cell-cell fusion. A strength of this manuscript is the use of multiple complementary approaches to test the hypothesis, especially the use of Cys mutant forms of La to directly tie the observed phenotypes to changes in residues that are key targets for oxidation. The manuscript is also well written and describes a clearly articulated hypothesis based on a precise summation of the existing literature.
The findings of this manuscript will be of interest to researchers in the field of bone biology, but also more generally to cell biologists. The data in this manuscript may also lead to future studies that target La for bone diseases in which there is increased osteoclast activity. Weaknesses of the first version of the manuscript were minor and predominantly related to data presentation choices and some statistical analyses. These weaknesses were comprehensively addressed in the revised manuscript, and therefore the study has increased clarity and rigor.

    3. Reviewer #2 (Public Review):

      Summary:

      Bone resorption by osteoclasts plays an important role in bone modeling and homeostasis. The multinucleated mature osteoclasts have higher bone-resorbing capacity than their mononuclear precursors. Previous work by the authors identified that an increased cell-surface level of La protein promotes fusion of mononuclear osteoclast precursor cells to form fully active multinucleated osteoclasts. In the present study, the authors provide convincing data from cellular and biochemical experiments to demonstrate that La protein, which localizes to the nucleus where it regulates RNA metabolism, is oxidized by redox signaling during osteoclast differentiation, and that the modified La protein is translocated to the osteoclast surface, where it associates with other proteins and phospholipids to trigger the cell-cell fusion process. The work provides novel mechanistic insights into osteoclast biology and provides a potential therapeutic target to suppress excessive bone resorption in metabolic bone diseases such as osteoporosis and arthritis.

      Strengths:

      Increased intracellular ROS induced by the osteoclast differentiation cytokine RANKL has been widely studied in enhancing RANKL signaling during osteoclast differentiation. The work provides novel evidence that redox signaling can post-translationally modify proteins to alter the translocation and functions of critical regulators in the late stage of osteoclastogenesis. The results and conclusions are mostly supported by convincing cellular and biochemical assays.

      Weaknesses:

      Lack of in vivo studies in animal models of bone diseases such as postmenopausal osteoporosis, inflammatory arthritis, and osteoarthritis reduces the translational potential of this work.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):  

      (1) When introducing the different antibody clones recognizing Pan, oxidized, or reduced forms, please clearly indicate which clone number belongs to which form.  

      - We see where the original language could be confusing. Please see our new introduction to the antibodies used.

      “we evaluated the redox state of La in fusing osteoclasts using recently validated monoclonal α-La antibodies that recognize oxidized La (clone 7B6) or reduced La (clone 312B), or do not distinguish between these La species (Pan, clone 5B9)”

      (2) "Finding that the surface La pool, which promotes multinucleation in osteoclasts, is an oxidized species..." I would suggest rewording as "...is enriched in oxidized species".  

      - Agreed. We have edited the sentence as follows.

      “Finding that the surface La pool, which promotes multinucleation in osteoclasts, is enriched in an oxidized species raised the question”

      (3) Although not necessary to support the conclusions of the manuscript, it would be interesting to know if the application of La194-408 to osteoclast progenitors following NAC treatment results in the rescue of La staining at the cell surface, or if this exogenous La is acting independently from cell surface association.  

      - We agree that this is an interesting idea. We previously demonstrated that we could add La 1-375 to osteoclast progenitors following RANKL addition and promote osteoclast fusion. We also demonstrated that La 1-375 under these conditions enriched La surface staining (PMID: 36739273)

      - Therefore, we hypothesize that La 194-408 would act similarly.

      (4) Is the conformation of La modified by the conversion of Cys 232 and 245 to alanine? What about the potential to form oligomers?

      - To directly answer the Reviewer’s question – we simply do not know and do not have a simple way to test this. To speculate, the differential recognition of La that is reduced vs oxidized by the antibodies used here (specifically clone 312b vs clone 7b6) suggests that some conformational change is taking place when redox signaling modifies La in osteoclasts. Moreover, in Supp. Fig. 4b, we show that recombinant La 194-408 does form a small amount of dimer under our conditions while La 194-408 Cys 232 and 245 to Ala does not. These data together weakly support that La, when converted from reduced to oxidized forms or when we artificially mutate Cys 232 and 245 to Ala, undergoes some conformational and oligomeric change. However, we are not comfortable making such claims in the manuscript currently and prefer to investigate this with more rigor and comment on the biological significance of these potential changes in the future.

      (5) "In conclusion, in this study, we identified redox signaling as a molecular switch that redirects La protein away from the nucleus, where it protects precursor tRNAs from exonuclease digestion, and towards its osteoclast-specific function at the cell surface..." I would suggest rewording this sentence given that there is no evidence that the function of oxidized La at the cell surface is osteoclast-specific. This phenomenon could be applicable to other cell types and other biological processes.  

      - The Reviewer makes a good point here, that we very much appreciate. We hoped to communicate that this was a unique function of La that was different from the well-recognized role this protein plays in RNA metabolism, but somewhat overstated past our intention. Please see where we have modified this statement to read:

      “In conclusion, in this study, we identified redox signaling as a molecular switch that redirects La protein away from the nucleus, where it protects precursor tRNAs from exonuclease digestion, and towards its separable function at the osteoclast surface, where La regulates the multinucleation and resorptive functions of these managers of the skeleton.”

      (6) In methods, the definition of TCEP is missing a closed parenthesis sign.  

      - Thank you, corrected.

      (7) In methods under "Cells" there is a missing superscript in 1x106 cells/ml. Presumably, this is 1x10e6.   

      - Thank you, corrected.

      (8) Please provide the sequences of primers used for RT-PCR in this study.  

      - Understood. Please see where a table of all primer sequences used has been added to the Methods under the Transcript Analysis section.

      (9) In methods, "Bone resorption" should be relabeled given that the osteoclasts are plated on calcium phosphate plates and not on a bone surface.  

      - Thank you. Please see where in the Methods both the title and all references to “bone resorption” in the method description have now been changed to “mineral resorption”.

      (10) In several figures, it would be more appropriate to correct for multiple comparisons in the statistical analyses.  

      - We appreciate this concern. Please see where Fig. 2b,c; Fig. 3 b,c; Fig. 4d; Fig. 5b,d; and Fig. 6d have been reanalyzed using paired one-way ANOVAs corrected for multiple comparisons. Now all data where t-tests are used to evaluate statistical significance are only evaluating  differences between 2 values and all experiments considering 3+ values are compared using one-way ANOVAs corrected for multiple comparisons.

      (11) Figure 5: Panels D and E are flipped relative to the legend. Please also define the reagent used for ROS signal in the legend.  

      - Thank you. D and E are now corrected and we added “(Grey = CellRox Dye)” to the end of the legend for Fig. 5a.

      (12) Supplemental Figure 5c: in the control condition, why are some nuclei not staining with the reduced La antibody?  

      - Great question, direct answer – we simply do not know.  

      Longer answer, this image is in fact representative and not exclusive to the reduced La antibody (clone 312b). When we look at La staining in mature, multinucleated osteoclast nuclei at later timepoints post fusion using even pan antibodies, we find that its localization to the nuclei of syncytial osteoclasts is not uniform, but that nuclear La preferentially enriches in some mature osteoclast nuclei and seems to be excluded from others. This may suggest that – akin to myonuclei in skeletal muscle – osteoclast nuclei in a syncytium are not all equal. However, we are far, far away from being able to make any conclusions from the data we have.

      (13) Figure 7 legend: consider breaking this legend up into multiple sentences.  

      - Thank you for the suggestion. The legend for Figure 7 has been rewritten.
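
      The correction for multiple comparisons requested in point (10) above can be illustrated with a minimal sketch. The authors report using one-way ANOVAs corrected for multiple comparisons without naming the correction in this response, so the Holm step-down procedure below is a swapped-in, generic example rather than their method. The function name `holm_adjust` and the three p-values are hypothetical and are not data from the study.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    # Visit the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is scaled by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values from three pairwise comparisons.
raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
significant = [p < 0.05 for p in adjusted]
```

      With these made-up inputs all three raw p-values fall below 0.05, but only the smallest remains below 0.05 after adjustment, which is the kind of difference the reviewer's request is meant to expose when three or more conditions are compared.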

      Reviewer #2 (Recommendations For The Authors):  

      (1) Can the authors use the official name of La protein in NCBI GENE and PROTEIN?  

      - While some in the field refer to lupus La protein as La protein, we choose to refer to it simply as La, as is common throughout the Lupus La Protein literature. It is our opinion that continuously referring to a protein as a name + the word protein throughout the manuscript is unnecessary and alters the flow of our manuscript’s points.

      Thanks. We have included the official name of human La in NCBI GENE (SSB small RNA binding exonuclease protection factor La, Gene ID 6741, NCBI GENE) in the revised text.

      (2) The references 26 and 27 are not representative. The pioneering work from Mundy, Chambers, and Almeida (PMID 2312718, 15528306, and 24781012) should be cited.  

      - Thanks. We have added these 3 references to better acknowledge these significant contributions.

      (3) It is hard to understand Figure 2. What are the white arrows in Figure 2a pointing to? In Figure 2b, what do the columns a-La (Red), a-La (Pan), and a-La (Ox) mean: treatment or staining? In Figure 2c, the legend "conditions where surface proteins are oxidized (TCEP)" seems like it should read "deoxidized".  

      - We agree. We now realize this legend was rather confusing. It has been edited to read

      “(a) Representative fluorescence and DIC confocal micrographs of primary human osteoclasts following synchronized cell-cell fusion where hemifusion inhibitor was left (Inhibition), removed (Wash) or removed but the α-La antibodies indicated were simultaneously added. Cyan=Hoechst Arrows=Multinucleated Osteoclasts (b) Quantification of a.”

      - Thanks. 2c has now been corrected to “reduced” rather than the errant “oxidized”.

      (4) How do authors normalize bone resorption, % of total area?  

      - We normalized to a separate, paired well where monocytes are differentiated to precursors (MCSF), but no RANKL is added. We have added this omitted information to the methods sections for our mineral resorption assay.

      (5) Figure 5. There are two legends (b). In Figure 5c RT-qPCR, the DC-STAMP or OC-STAMP and mature osteoclast marker calcitonin receptor should be included.

      - Thank you. There were several problems with Figure legend 5 that both you and Reviewer #1 brought our attention to. We have now corrected these errors.

      - We understand the Reviewer’s interest in these markers. However, our point is that the steady-state transcript levels of two well-recognized osteoclast differentiation factors and the fusion regulator La, which our manuscript focuses on, are not significantly altered by NAC treatment at these later, fusion-associated timepoints. While DC-STAMP, OC-STAMP, and Calcitonin would be interesting, we believe they are outside the scope of this manuscript.

    1. Reviewer #1 (Public Review):

      Summary:

      In this manuscript the authors investigate the contributions of the long noncoding RNA Snhg3 in liver metabolism and MAFLD. The authors conclude that liver-specific loss or overexpression of Snhg3 impacts hepatic lipid content and obesity through epigenetic mechanisms. More specifically, the authors invoke that nuclear activity of Snhg3 aggravates hepatic steatosis by altering the balance of activating and repressive chromatin marks at the Pparg gene locus. This regulatory circuit is dependent on the transcriptional regulator SND1.

      Strengths:

      The authors developed tissue-specific lncRNA knockout and knock-in (KI) models. This effort is certainly appreciated as few lncRNA knockouts have been generated in the context of metabolism. Furthermore, lncRNA effects can be compensated in a whole organism or show subtle effects in acute versus chronic perturbation, rendering the focus on in vivo function important and highly relevant. In addition, Snhg3 was identified through a screening strategy and, as a general rule, the authors attempt to follow unbiased approaches to decipher the mechanisms of Snhg3.

      Weaknesses:

      Despite efforts at generating a liver-specific knockout, the phenotypic characterization is not focused on the key readouts. Notably missing are rigorous lipid flux studies and targeted gene expression/protein measurement that would underpin why loss of Snhg3 protects from lipid accumulation. Along those lines, claims linking the Snhg3 to MAFLD would be better supported with careful interrogation of markers of fibrosis and advanced liver disease. In other areas, significance is limited since the presented data is either not clear or rigorous enough. Finally, there is an important conceptual limitation to the work since PPARG is not established to play a major role in the liver.

    2. eLife assessment

      This study provides useful evidence substantiating a role for long noncoding RNAs in liver metabolism and organismal physiology. Using murine knockout and knock-in models, the authors invoke a previously unidentified role for the lncRNA Snhg3 in fatty liver. The revised manuscript has improved and most studies are backed by solid evidence and will be of interest to the field of metabolism.

    3. Reviewer #2 (Public Review):

      Through RNA analysis, Xie et al. found that lncRNA Snhg3 was one of the most down-regulated Snhgs by high fat diet (HFD) in mouse liver. Consequently, the authors sought to examine the mechanism through which Snhg3 is involved in the progression of metabolic dysfunction-associated fatty liver diseases (MASLD) in HFD-induced obese (DIO) mice. Interestingly, liver-specific Snhg3 knockout reduced, while Snhg3 over-expression potentiated, fatty liver in mice on an HFD. Using the RNA pull-down approach, the authors identified SND1 as a potential Snhg3-interacting protein. SND1 is a component of the RNA-induced silencing complex (RISC). The authors found that Snhg3 increased SND1 ubiquitination to enhance SND1 protein stability, which then reduced the level of the repressive chromatin mark H3K27me3 on the PPARg promoter. The upregulation of PPARg, a lipogenic transcription factor, thus contributed to hepatic fat accumulation.

      The authors propose a signaling cascade that explains how lncRNA Snhg3 may promote hepatic steatosis. Multiple molecular approaches have been employed to identify molecular targets of the proposed mechanism, which is a strength of the study. There are, however, several potential issues to consider before jumping to a conclusion.

      (1) First of all, it's important to ensure the robustness and rigor of each study. The manuscript was not carefully put together. The image qualities for several figures were poor, making it difficult for the readers to evaluate the results with confidence. The biological replicates and numbers of experimental repeats for cell-based assays were not described. When possible, the entire immunoblot imaging used for quantification should be presented (rather than showing n=1 representative). There were multiple mis-labels in figure panels or figure legends (e.g., Fig. 2I, Fig. 2K and Fig. 3K). The b-actin immunoblot image was reused in Fig. 4J, Fig. 5G and Fig. 7B with different exposure times. These might be from the same cohort of mice. If the immunoblots were run at different times, the loading control should be included on the same blot as well.

      (2) The authors can do a better job in explaining the logic for how they came up with the potential function of each component of the signaling cascade. Snhg3 is down-regulated by HFD. However, the evidence presented indicates its involvement in promoting steatosis. In Fig. 1C, one would expect PPARg expression to be up-regulated (when Snhg3 was down-regulated). If so, the physiological observation conflicts with the proposed mechanism. In addition, SND1 is known to regulate RNA/miRNA processing. How do the authors rule out this potential mechanism? How about the hosted snoRNA, Snord17? Is it involved in the progression of NASLD?

      (3) The role of PPARg in fatty liver diseases might be a rodent-specific phenomenon. PPARg agonist treatment in humans may actually reduce ectopic fat deposition by increasing fat storage in adipose tissues. The relevance of the finding to human diseases should be discussed.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, the authors investigate the contributions of the long noncoding RNA Snhg3 in liver metabolism and MAFLD. The authors conclude that liver-specific loss or overexpression of Snhg3 impacts hepatic lipid content and obesity through epigenetic mechanisms. More specifically, the authors propose that the nuclear activity of Snhg3 aggravates hepatic steatosis by altering the balance of activating and repressive chromatin marks at the Pparg gene locus. This regulatory circuit is dependent on a transcriptional regulator, SND1.

      Strengths:

      The authors developed tissue-specific lncRNA knockout and knock-in (KI) models. This effort is certainly appreciated as few lncRNA knockouts have been generated in the context of metabolism. Furthermore, lncRNA effects can be compensated in a whole organism or show subtle effects in acute versus chronic perturbation, rendering the focus on in vivo function important and highly relevant. In addition, Snhg3 was identified through a screening strategy and, as a general rule, the authors attempt to follow unbiased approaches to decipher the mechanisms of Snhg3.

      Weaknesses:

      Despite efforts at generating a liver-specific knockout, the phenotypic characterization is not focused on the key readouts. Notably missing are rigorous lipid flux studies and targeted gene expression/protein measurement that would underpin why the loss of Snhg3 protects from lipid accumulation. Along those lines, claims linking the Snhg3 to MAFLD would be better supported with careful interrogation of markers of fibrosis and advanced liver disease. In other areas, significance is limited since the presented data is either not clear or rigorous enough. Finally, there is an important conceptual limitation to the work since PPARG is not established to play a major role in the liver.

      We thank the reviewer for the detailed comment. In this study, hepatocyte-specific Snhg3 deficiency decreased body and liver weight and alleviated hepatic steatosis in DIO mice, whereas overexpression induced the opposite effect (Figures 2 and 3). Furthermore, we investigated the hepatic differentially expressed genes (DEGs) between the DIO Snhg3-HKI and control WT mice using RNA-Seq, and GSEA revealed that Snhg3 exerts a global effect on the expression of genes involved in fatty acid metabolism (Figure 4B). We validated the expression of some DEGs involved in fatty acid metabolism by RT-qPCR. The results showed that the hepatic expression levels of some genes involved in fatty acid metabolism, including Cd36, Cidea/c and Scd1/2, were upregulated in Snhg3-HKO mice and downregulated in Snhg3-HKI mice compared to the controls (Figure 4C). Please see the first paragraph on p8.

      PPARγ, a transcriptional regulator of Cd36 and Cidea/c, is well known to play major adipogenic and lipogenic roles in adipose tissue. Although the expression of PPARγ in the liver is very low under healthy conditions, induced expression of PPARγ in both hepatocytes and non-parenchymal cells (Kupffer cells, immune cells, and HSCs) in the liver has a crucial role in the pathophysiology of MASLD (Lee et al., 2023b, Chen et al., 2023, Gross et al., 2017). The activation of PPARγ in the liver induces the adipogenic program to store fatty acids in lipid droplets, as observed in adipocytes (Lee et al., 2018). Moreover, the inactivation of liver PPARγ abolished the rosiglitazone-induced increase in hepatic TG and improved hepatic steatosis in lipoatrophic AZIP mice (Gavrilova et al., 2003). Furthermore, there is a strong correlation between the onset of hepatic steatosis and hepatocyte-specific PPARγ expression. Clinical trials have also indicated that increased insulin resistance and hepatic PPARγ expression were associated with NASH scores in some obese patients (Lee et al., 2023a, Mukherjee et al., 2022). Even though PPARγ’s primary function is in adipose tissue, patients with MASLD have much higher hepatic expression levels of PPARγ, reflecting the fact that PPARγ plays different roles in different tissues and cell types (Mukherjee et al., 2022). Consistent with the studies mentioned above, our results also hinted at the importance of PPARγ in the pathophysiology of MASLD. Snhg3 deficiency or overexpression induced a decrease or increase in hepatic PPARγ, respectively. Moreover, administration of the PPARγ antagonist T0070907 mitigated the hepatic Cd36 and Cidea/c increase and improved Snhg3-induced hepatic steatosis. However, conflicting findings suggest that the expression of hepatic PPARγ is not increased as steatosis develops in humans, and that in clinical studies administration of PPARγ agonists did not aggravate liver steatosis (Gross et al., 2017). Thus, understanding how hepatic PPARγ expression is regulated may provide a new avenue to prevent and treat MASLD (Lee et al., 2018). We also discussed this in the revised manuscript; please refer to the first paragraph of the Discussion section on p13.

      Hepatotoxicity accelerates the development of progressive inflammation, oxidative stress and fibrosis (Roehlen et al., 2020). Chronic liver injury, including MASLD, can progress to liver fibrosis with the formation of a fibrous scar. Injured hepatocytes can secrete fibrogenic factors or exosomes containing miRNAs that activate HSCs, the major source of the fibrous scar in liver fibrosis (Kisseleva and Brenner, 2021). Apart from promoting lipogenesis, PPARγ also has a crucial function in improving inflammation and fibrosis (Chen et al., 2023). In this study, no hepatic fibrosis phenotype was seen in Snhg3-HKO and Snhg3-HKI mice (figure supplements 1D and 2D). Moreover, deficiency and overexpression of Snhg3 respectively decreased and increased the expression of profibrotic genes, such as collagen type I alpha 1/2 (Col1a1 and Col1a2), but had no effects on the pro-inflammatory factors, including transforming growth factor β1 (Tgfβ1), tumor necrosis factor α (Tnfα), interleukin 6 and 1β (Il6 and Il1β) (figure supplement 3A and B). Inflammation is an absolute requirement for fibrosis because factors from injured hepatocytes alone are not sufficient to directly activate HSCs and lead to fibrosis (Kisseleva and Brenner, 2021). Additionally, previous studies indicated that exposure to HFD for more than 24 weeks causes less severe fibrosis (Alshawsh et al., 2022). In the future, the effect of Snhg3 on hepatic fibrosis in mice needs to be elucidated by prolonged high-fat feeding or by adopting methionine- and choline-deficient diet (MCD) feeding. Please see the second paragraph of the Discussion section on p13.

      References

      ALSHAWSH, M. A., ALSALAHI, A., ALSHEHADE, S. A., SAGHIR, S. A. M., AHMEDA, A. F., AL ZARZOUR, R. H. & MAHMOUD, A. M. 2022. A Comparison of the Gene Expression Profiles of Non-Alcoholic Fatty Liver Disease between Animal Models of a High-Fat Diet and Methionine-Choline-Deficient Diet. Molecules, 27. DOI:10.3390/molecules27030858, PMID:35164140

      CHEN, H., TAN, H., WAN, J., ZENG, Y., WANG, J., WANG, H. & LU, X. 2023. PPAR-gamma signaling in nonalcoholic fatty liver disease: Pathogenesis and therapeutic targets. Pharmacol Ther, 245, 108391. DOI:10.1016/j.pharmthera.2023.108391, PMID:36963510

      GAVRILOVA, O., HALUZIK, M., MATSUSUE, K., CUTSON, J. J., JOHNSON, L., DIETZ, K. R., NICOL, C. J., VINSON, C., GONZALEZ, F. J. & REITMAN, M. L. 2003. Liver peroxisome proliferator-activated receptor gamma contributes to hepatic steatosis, triglyceride clearance, and regulation of body fat mass. J Biol Chem, 278, 34268-76. DOI:10.1074/jbc.M300043200, PMID:12805374

      GROSS, B., PAWLAK, M., LEFEBVRE, P. & STAELS, B. 2017. PPARs in obesity-induced T2DM, dyslipidaemia and NAFLD. Nat Rev Endocrinol, 13, 36-49. DOI:10.1038/nrendo.2016.135, PMID:27636730

      KISSELEVA, T. & BRENNER, D. 2021. Molecular and cellular mechanisms of liver fibrosis and its regression. Nat Rev Gastroenterol Hepatol, 18, 151-166. DOI:10.1038/s41575-020-00372-7, PMID:33128017

      LEE, S. M., MURATALLA, J., KARIMI, S., DIAZ-RUIZ, A., FRUTOS, M. D., GUZMAN, G., RAMOS-MOLINA, B. & CORDOBA-CHACON, J. 2023a. Hepatocyte PPARgamma contributes to the progression of non-alcoholic steatohepatitis in male and female obese mice. Cell Mol Life Sci, 80, 39. DOI:10.1007/s00018-022-04629-z, PMID:36629912

      LEE, S. M., MURATALLA, J., SIERRA-CRUZ, M. & CORDOBA-CHACON, J. 2023b. Role of hepatic peroxisome proliferator-activated receptor gamma in non-alcoholic fatty liver disease. J Endocrinol, 257. DOI:10.1530/JOE-22-0155, PMID:36688873

      LEE, Y. K., PARK, J. E., LEE, M. & HARDWICK, J. P. 2018. Hepatic lipid homeostasis by peroxisome proliferator-activated receptor gamma 2. Liver Res, 2, 209-215. DOI:10.1016/j.livres.2018.12.001, PMID:31245168

      MUKHERJEE, A. G., WANJARI, U. R., GOPALAKRISHNAN, A. V., KATTURAJAN, R., KANNAMPUZHA, S., MURALI, R., NAMACHIVAYAM, A., GANESAN, R., RENU, K., DEY, A., VELLINGIRI, B. & PRINCE, S. E. 2022. Exploring the Regulatory Role of ncRNA in NAFLD: A Particular Focus on PPARs. Cells, 11. DOI:10.3390/cells11243959, PMID:36552725

      ROEHLEN, N., CROUCHET, E. & BAUMERT, T. F. 2020. Liver Fibrosis: Mechanistic Concepts and Therapeutic Perspectives. Cells, 9. DOI:10.3390/cells9040875, PMID:32260126

      Reviewer #2 (Public Review):

      Through RNA analysis, Xie et al. found that lncRNA Snhg3 was one of the most down-regulated Snhgs by a high-fat diet (HFD) in mouse liver. Consequently, the authors sought to examine the mechanism through which Snhg3 is involved in the progression of metabolic dysfunction-associated fatty liver diseases (MASLD) in HFD-induced obese (DIO) mice. Interestingly, liver-specific Snhg3 knockout reduced, while Snhg3 over-expression potentiated, fatty liver in mice on an HFD. Using the RNA pull-down approach, the authors identified SND1 as a potential Snhg3-interacting protein. SND1 is a component of the RNA-induced silencing complex (RISC). The authors found that Snhg3 increased SND1 ubiquitination to enhance SND1 protein stability, which then reduced the level of the repressive chromatin mark H3K27me3 on the PPARg promoter. The upregulation of PPARg, a lipogenic transcription factor, thus contributed to hepatic fat accumulation.

      The authors propose a signaling cascade that explains how lncRNA Snhg3 may promote hepatic steatosis. Multiple molecular approaches have been employed to identify molecular targets of the proposed mechanism, which is a strength of the study. There are, however, several potential issues to consider before jumping to a conclusion.

      (1) First of all, it's important to ensure the robustness and rigor of each study. The manuscript was not carefully put together. The image qualities for several figures were poor, making it difficult for the readers to evaluate the results with confidence. The biological replicates and numbers of experimental repeats for cell-based assays were not described. When possible, the entire immunoblot imaging used for quantification should be presented (rather than showing n=1 representative). There were multiple mislabels in figure panels or figure legends (e.g., Figure 2I, Figure 2K, and Figure 3K). The b-actin immunoblot image was reused in Figure 4J, Figure 5G, and Figure 7B with different exposure times. These might be from the same cohort of mice. If the immunoblots were run at different times, the loading control should be included on the same blot as well.

      We thank the reviewer for the detailed comment. We have provided clearer figures in the revised manuscript.

      The numbers of biological replicates and experimental repeats for the cell-based assays have been added to the manuscript.

      The entire immunoblot images used for quantification have been provided in the primary data.

      The original Figure 2I, Figure 2K, and Figure 3K have been revised and replaced with the new Figure 2F, Figure 2H, and Figure 3H, and their corresponding figure legends have also been corrected in the revised manuscript.

      The protein levels of CD36, PPARγ and β-ACTIN were examined at the same time, and we have revised the manuscript accordingly; please see the revised Figure 7B and 7C.

      (2) The authors can do a better job in explaining the logic for how they came up with the potential function of each component of the signaling cascade. Snhg3 is down-regulated by HFD. However, the evidence presented indicates its involvement in promoting steatosis. In Figure 1C, one would expect PPARg expression to be up-regulated (when Snhg3 was down-regulated). If so, the physiological observation conflicts with the proposed mechanism. In addition, SND1 is known to regulate RNA/miRNA processing. How do the authors rule out this potential mechanism? How about the hosted snoRNA, Snord17? Is it involved in the progression of NASLD?

      We thank the reviewer for the detailed comment. Our results showed that the expression of Snhg3 was decreased in DIO mice, which led us to speculate that the downregulation of Snhg3 in DIO mice might be a protective stress response to a high nutritional state, although the specific details remain to be clarified. This is probably similar to fibroblast growth factor 21 (FGF21) and growth differentiation factor 15 (GDF15), whose endogenous expression and circulating levels are elevated in obese humans and mice despite their beneficial effects on obesity and related metabolic complications (Keipert and Ost, 2021). Although FGF21 can be induced by oxidative stress and is activated in obese mice and in NASH patients, elevated FGF21 paradoxically protects against oxidative stress and reduces hepatic steatosis (Tillman and Rolph, 2020). We have added this content to the Discussion section; please see the second paragraph on p12.

      SND1 has multiple roles through its association with different types of RNA molecules, including mRNA, miRNA, circRNA, dsRNA and lncRNA. SND1 binds negative-sense SARS-CoV-2 RNA and promotes viral RNA synthesis (Schmidt et al., 2023). SND1 is also involved in hypoxia by negatively regulating hypoxia-related miRNAs (Saarikettu et al., 2023). Furthermore, a recent study revealed that lncRNA SNAI3-AS1 can competitively bind to SND1 and perturb the m6A-dependent recognition of the Nrf2 mRNA 3'UTR by SND1, thereby reducing the mRNA stability of Nrf2 (Zheng et al., 2023). Huang et al. also reported that circMETTL9 can directly bind to and increase the expression of SND1 in astrocytes, leading to enhanced neuroinflammation (Huang et al., 2023). However, whether SND1/lncRNA-Snhg3 has a histone methylation-independent role in hepatic lipid metabolism needs to be further investigated. We also discussed this limitation in the manuscript; please refer to the third paragraph of the Discussion section on p17.

      Snhg3 serves as the host gene for the intronic U17 snoRNAs, which are H/ACA snoRNAs. A previous study found that a cholesterol trafficking phenotype was not due to reduced Snhg3 expression, but rather to haploinsufficiency of U17 snoRNA. Upregulation of hypoxia-upregulated mitochondrial movement regulator (HUMMR) in U17 snoRNA-deficient cells promoted the formation of ER-mitochondrial contacts, resulting in decreased cholesterol esterification and facilitated cholesterol trafficking to mitochondria (Jinn et al., 2015). Additionally, disruption of U17 snoRNA caused resistance to lipid-induced cell death and general oxidative stress in cultured cells. Furthermore, knockdown of U17 snoRNA in vivo protected against hepatic steatosis and lipid-induced oxidative stress and inflammation (Sletten et al., 2021). We determined the expression of hepatic U17 snoRNA and its effect on SND1 and PPARγ. The results showed that the expression of U17 snoRNA was decreased in the liver of DIO Snhg3-HKO mice and unchanged in the liver of DIO Snhg3-HKI mice, but overexpression of U17 snoRNA had no effect on the expression of SND1 and PPARγ (figure supplement 5A-C), indicating that Snhg3-induced hepatic steatosis was independent of U17 snoRNA. We also discussed this in the revised manuscript; please refer to the Discussion section on p15.

      References

      HUANG, C., SUN, L., XIAO, C., YOU, W., SUN, L., WANG, S., ZHANG, Z. & LIU, S. 2023. Circular RNA METTL9 contributes to neuroinflammation following traumatic brain injury by complexing with astrocytic SND1. J Neuroinflammation, 20, 39. DOI:10.1186/s12974-023-02716-x, PMID:36803376

      JINN, S., BRANDIS, K. A., REN, A., CHACKO, A., DUDLEY-RUCKER, N., GALE, S. E., SIDHU, R., FUJIWARA, H., JIANG, H., OLSEN, B. N., SCHAFFER, J. E. & ORY, D. S. 2015. snoRNA U17 regulates cellular cholesterol trafficking. Cell Metab, 21, 855-67. DOI:10.1016/j.cmet.2015.04.010, PMID:25980348

      KEIPERT, S. & OST, M. 2021. Stress-induced FGF21 and GDF15 in obesity and obesity resistance. Trends Endocrinol Metab, 32, 904-915. DOI:10.1016/j.tem.2021.08.008, PMID:34526227

      SAARIKETTU, J., LEHMUSVAARA, S., PESU, M., JUNTTILA, I., PARTANEN, J., SIPILA, P., POUTANEN, M., YANG, J., HAIKARAINEN, T. & SILVENNOINEN, O. 2023. The RNA-binding protein Snd1/Tudor-SN regulates hypoxia-responsive gene expression. FASEB Bioadv, 5, 183-198. DOI:10.1096/fba.2022-00115, PMID:37151849

      SCHMIDT, N., GANSKIH, S., WEI, Y., GABEL, A., ZIELINSKI, S., KESHISHIAN, H., LAREAU, C. A., ZIMMERMANN, L., MAKROCZYOVA, J., PEARCE, C., KREY, K., HENNIG, T., STEGMAIER, S., MOYON, L., HORLACHER, M., WERNER, S., AYDIN, J., OLGUIN-NAVA, M., POTABATTULA, R., KIBE, A., DOLKEN, L., SMYTH, R. P., CALISKAN, N., MARSICO, A., KREMPL, C., BODEM, J., PICHLMAIR, A., CARR, S. A., CHLANDA, P., ERHARD, F. & MUNSCHAUER, M. 2023. SND1 binds SARS-CoV-2 negative-sense RNA and promotes viral RNA synthesis through NSP9. Cell, 186, 4834-4850 e23. DOI:10.1016/j.cell.2023.09.002, PMID:37794589

      SLETTEN, A. C., DAVIDSON, J. W., YAGABASAN, B., MOORES, S., SCHWAIGER-HABER, M., FUJIWARA, H., GALE, S., JIANG, X., SIDHU, R., GELMAN, S. J., ZHAO, S., PATTI, G. J., ORY, D. S. & SCHAFFER, J. E. 2021. Loss of SNORA73 reprograms cellular metabolism and protects against steatohepatitis. Nat Commun, 12, 5214. DOI:10.1038/s41467-021-25457-y, PMID:34471131

      TILLMAN, E. J. & ROLPH, T. 2020. FGF21: An Emerging Therapeutic Target for Non-Alcoholic Steatohepatitis and Related Metabolic Diseases. Front Endocrinol (Lausanne), 11, 601290. DOI:10.3389/fendo.2020.601290, PMID:33381084

      ZHENG, J., ZHANG, Q., ZHAO, Z., QIU, Y., ZHOU, Y., WU, Z., JIANG, C., WANG, X. & JIANG, X. 2023. Epigenetically silenced lncRNA SNAI3-AS1 promotes ferroptosis in glioma via perturbing the m(6)A-dependent recognition of Nrf2 mRNA mediated by SND1. J Exp Clin Cancer Res, 42, 127. DOI:10.1186/s13046-023-02684-3, PMID:37202791

      (3) The role of PPARg in fatty liver diseases might be a rodent-specific phenomenon. PPARg agonist treatment in humans may actually reduce ectopic fat deposition by increasing fat storage in adipose tissues. The relevance of the findings to human diseases should be discussed.

      We thank the reviewer for the detailed comment. PPARγ, a transcriptional regulator of Cd36 and Cidea/c, is well known to play major adipogenic and lipogenic roles in adipose tissue. Although the expression of PPARγ in the liver is very low under healthy conditions, induced expression of PPARγ in both hepatocytes and non-parenchymal cells (Kupffer cells, immune cells, and hepatic stellate cells (HSCs)) in the liver has a crucial role in the pathophysiology of MASLD (Lee et al., 2023b, Chen et al., 2023, Gross et al., 2017). The activation of PPARγ in the liver induces the adipogenic program to store fatty acids in lipid droplets, as observed in adipocytes (Lee et al., 2018). Moreover, the inactivation of liver PPARγ abolished the rosiglitazone-induced increase in hepatic TG and improved hepatic steatosis in lipoatrophic AZIP mice (Gavrilova et al., 2003). Apart from promoting lipogenesis, PPARγ also has a crucial function in improving inflammation and fibrosis (Chen et al., 2023). Furthermore, there is a strong correlation between the onset of hepatic steatosis and hepatocyte-specific PPARγ expression. Clinical trials have also indicated that increased insulin resistance and hepatic PPARγ expression were associated with NASH scores in some obese patients (Lee et al., 2023a, Mukherjee et al., 2022). Even though PPARγ’s primary function is in adipose tissue, patients with MASLD have much higher hepatic expression levels of PPARγ, reflecting the fact that PPARγ plays different roles in different tissues and cell types (Mukherjee et al., 2022). Consistent with the studies mentioned above, our results also hinted at the importance of PPARγ in the pathophysiology of MASLD. Snhg3 deficiency or overexpression induced a decrease or increase in hepatic PPARγ, respectively. Moreover, administration of the PPARγ antagonist T0070907 mitigated the hepatic Cd36 and Cidea/c increase and improved Snhg3-induced hepatic steatosis. However, conflicting findings suggest that the expression of hepatic PPARγ is not increased as steatosis develops in humans, and that in clinical studies administration of PPARγ agonists did not aggravate liver steatosis (Gross et al., 2017). Thus, understanding how hepatic PPARγ expression is regulated may provide a new avenue to prevent and treat MASLD (Lee et al., 2018). We also discussed this in the revised manuscript; please refer to the first paragraph of the Discussion section on p13.

      References

      CHEN, H., TAN, H., WAN, J., ZENG, Y., WANG, J., WANG, H. & LU, X. 2023. PPAR-gamma signaling in nonalcoholic fatty liver disease: Pathogenesis and therapeutic targets. Pharmacol Ther, 245, 108391. DOI:10.1016/j.pharmthera.2023.108391, PMID:36963510

      GAVRILOVA, O., HALUZIK, M., MATSUSUE, K., CUTSON, J. J., JOHNSON, L., DIETZ, K. R., NICOL, C. J., VINSON, C., GONZALEZ, F. J. & REITMAN, M. L. 2003. Liver peroxisome proliferator-activated receptor gamma contributes to hepatic steatosis, triglyceride clearance, and regulation of body fat mass. J Biol Chem, 278, 34268-76. DOI:10.1074/jbc.M300043200, PMID:12805374

      GROSS, B., PAWLAK, M., LEFEBVRE, P. & STAELS, B. 2017. PPARs in obesity-induced T2DM, dyslipidaemia and NAFLD. Nat Rev Endocrinol, 13, 36-49. DOI:10.1038/nrendo.2016.135, PMID:27636730

      LEE, S. M., MURATALLA, J., KARIMI, S., DIAZ-RUIZ, A., FRUTOS, M. D., GUZMAN, G., RAMOS-MOLINA, B. & CORDOBA-CHACON, J. 2023a. Hepatocyte PPARgamma contributes to the progression of non-alcoholic steatohepatitis in male and female obese mice. Cell Mol Life Sci, 80, 39. DOI:10.1007/s00018-022-04629-z, PMID:36629912

      LEE, S. M., MURATALLA, J., SIERRA-CRUZ, M. & CORDOBA-CHACON, J. 2023b. Role of hepatic peroxisome proliferator-activated receptor gamma in non-alcoholic fatty liver disease. J Endocrinol, 257. DOI:10.1530/JOE-22-0155, PMID:36688873

      LEE, Y. K., PARK, J. E., LEE, M. & HARDWICK, J. P. 2018. Hepatic lipid homeostasis by peroxisome proliferator-activated receptor gamma 2. Liver Res, 2, 209-215. DOI:10.1016/j.livres.2018.12.001, PMID:31245168

      MUKHERJEE, A. G., WANJARI, U. R., GOPALAKRISHNAN, A. V., KATTURAJAN, R., KANNAMPUZHA, S., MURALI, R., NAMACHIVAYAM, A., GANESAN, R., RENU, K., DEY, A., VELLINGIRI, B. & PRINCE, S. E. 2022. Exploring the Regulatory Role of ncRNA in NAFLD: A Particular Focus on PPARs. Cells, 11. DOI:10.3390/cells11243959, PMID:36552725

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      As a general strategy for the revision, I would advise the authors to focus on strengthening the analysis of the liver with the two most important figures being Figure 2 and Figure 3. The mechanism as it stands is problematic which reduces the impact of the animal studies despite substantial efforts from the authors. Consider removing or toning down some of the studies focused on mechanisms in the nucleus, including changing the title.

      We thank the reviewer for the detailed comment. In this study, hepatocyte-specific Snhg3 deficiency decreased body and liver weight, alleviated hepatic steatosis and promoted hepatic fatty acid metabolism in DIO mice, whereas overexpression induced the opposite effect. We investigated the hepatic differentially expressed genes (DEGs) between the DIO Snhg3-HKI and control WT mice using RNA-Seq, and GSEA revealed that Snhg3 exerts a global effect on the expression of genes involved in fatty acid metabolism (Figure 4B). RT-qPCR analysis confirmed that the hepatic expression levels of some genes involved in fatty acid metabolism, including Cd36, Cidea/c and Scd1/2, were upregulated in Snhg3-HKO mice and downregulated in Snhg3-HKI mice compared to the controls (Figure 4C). Moreover, deficiency and overexpression of Snhg3 respectively decreased and increased the expression of profibrotic genes, such as Col1a1 and Col1a2, but had no effects on the pro-inflammatory factors, including Tgfβ1, Tnfα, Il6 and Il1β (figure supplement 3A and B). The results indicated that Snhg3 is involved in hepatic steatosis through regulating fatty acid metabolism. Furthermore, PPARγ was selected for study of its role in Snhg3-induced hepatic steatosis through integrated analysis of the CUT&Tag-Seq, ATAC-Seq and RNA-Seq data. Finally, inhibition of PPARγ with T0070907 alleviated the Snhg3-induced increases in Cd36 and Cidea/c and improved Snhg3-aggravated hepatic steatosis. In summary, we confirmed that SND1/H3K27me3/PPARγ is partially responsible for Snhg3-induced hepatic steatosis. As the reviewer suggested, we replaced the title with “LncRNA-Snhg3 Aggravates Hepatic Steatosis via PPARγ Signaling”.

      (1) How is steatosis changing in the liver? Is this due to a change in fatty acid uptake, lipogenesis/synthesis, beta-oxidation, triglyceride secretion, etc.? The analysis in Figures 2 and 3 is mostly focused on metabolic chamber studies which seem distracting, particularly in the absence of a mechanism and given a liver-specific perturbation. The authors should use a combination of targeted gene expression, protein blots, and lipid flux measurements to provide better insights here. The histology in Figure 2H suggests a very dramatic effect but does match with lipid measurements in 2I.

      We thank the reviewer for the detailed comment. The pathogenesis of MASLD has not been entirely elucidated. Multiple factors, such as genetic and epigenetic factors, nutritional factors, insulin resistance, lipotoxicity, the microbiome, fibrogenesis and hormones secreted from adipose tissue, are recognized to be involved in the development and progression of MASLD (Buzzetti et al., 2016, Lee et al., 2017, Rada et al., 2020, Sakurai et al., 2021, Friedman et al., 2018). In this study, we investigated the hepatic differentially expressed genes (DEGs) between the DIO Snhg3-HKI and control WT mice using RNA-Seq, and GSEA revealed that Snhg3 exerts a global effect on the expression of genes involved in fatty acid metabolism (Figure 4B). We validated the expression of some DEGs involved in fatty acid metabolism by RT-qPCR. The results showed that the hepatic expression levels of some genes involved in fatty acid metabolism, including Cd36, Cidea/c and Scd1/2, were upregulated in Snhg3-HKO mice and downregulated in Snhg3-HKI mice compared to the controls (Figure 4C). Additionally, we re-analyzed the metabolic chamber data using CalR, and the results showed that there were no obvious differences in heat production, total oxygen consumption, carbon dioxide production or RER between DIO Snhg3-HKO or DIO Snhg3-HKI mice and the corresponding control mice (figure supplements 1C and 2C). Unfortunately, we could not measure lipid flux due to limited experimental conditions. In summary, however, our results indicated that Snhg3 is involved in hepatic steatosis by regulating fatty acid metabolism. Please see the first paragraph on p8.

      Additionally, we determined the hepatic TC levels in other batch of DIO Snhg3-HKO and control mice and found there was no difference in hepatic TC (as below) between DIO Snhg3-HKO and control mice fed HFD 18 weeks. Perhaps the apparent difference in TC requires a prolonged high-fat diet feeding time.

      Author response image 1.

      Hepatic TC contents in DIO Snhg3-Flox and Snhg3-HKO mice.

      References

      BUZZETTI, E., PINZANI, M. & TSOCHATZIS, E. A. 2016. The multiple-hit pathogenesis of non-alcoholic fatty liver disease (NAFLD). Metabolism, 65, 1038-48. DOI:10.1016/j.metabol.2015.12.012, PMID:26823198

      FRIEDMAN, S. L., NEUSCHWANDER-TETRI, B. A., RINELLA, M. & SANYAL, A. J. 2018. Mechanisms of NAFLD development and therapeutic strategies. Nat Med, 24, 908-922. DOI:10.1038/s41591-018-0104-9, PMID:29967350

      LEE, J., KIM, Y., FRISO, S. & CHOI, S. W. 2017. Epigenetics in non-alcoholic fatty liver disease. Mol Aspects Med, 54, 78-88. DOI:10.1016/j.mam.2016.11.008, PMID:27889327

      RADA, P., GONZALEZ-RODRIGUEZ, A., GARCIA-MONZON, C. & VALVERDE, A. M. 2020. Understanding lipotoxicity in NAFLD pathogenesis: is CD36 a key driver? Cell Death Dis, 11, 802. DOI:10.1038/s41419-020-03003-w, PMID:32978374

      SAKURAI, Y., KUBOTA, N., YAMAUCHI, T. & KADOWAKI, T. 2021. Role of Insulin Resistance in MAFLD. Int J Mol Sci, 22. DOI:10.3390/ijms22084156, PMID:33923817

      (2) Throughout the manuscript the authors make claims about liver disease models, but this is not well supported since markers of advanced liver disease are not examined. The authors should stain and show expression for fibrosis and inflammation.

      We thank the reviewer for the detailed comment. Metabolic dysfunction-associated steatotic liver disease (MASLD) is characterized by excess liver fat in the absence of significant alcohol consumption. It can progress from simple steatosis to metabolic dysfunction-associated steatohepatitis (MASH) and fibrosis, and eventually to chronic progressive diseases such as cirrhosis, end-stage liver failure, and hepatocellular carcinoma (Loomba et al., 2021). As the reviewer suggested, we examined the effect of Snhg3 on liver fibrosis and inflammation. No hepatic fibrosis phenotype was seen in Snhg3-HKO or Snhg3-HKI mice (figure supplement 1D and 2D). Moreover, deficiency and overexpression of Snhg3 respectively decreased and increased the expression of profibrotic genes, such as collagen type I alpha 1/2 (Col1a1 and Col1a2), but had no effect on pro-inflammatory factors including Tgf-β, Tnf-α, Il-6 and Il-1β (figure supplement 3A and 3B). Inflammation is an absolute requirement for fibrosis because factors from injured hepatocytes alone are not sufficient to directly activate HSCs and lead to fibrosis (Kisseleva and Brenner, 2021). Additionally, previous studies indicated that exposure to an HFD for more than 24 weeks causes less severe fibrosis (Alshawsh et al., 2022). In the future, the effect of Snhg3 on hepatic fibrosis in mice needs to be elucidated by prolonged high-fat feeding or by feeding a methionine- and choline-deficient (MCD) diet. Please see the second paragraph of the Discussion section on p13.

      References

      ALSHAWSH, M. A., ALSALAHI, A., ALSHEHADE, S. A., SAGHIR, S. A. M., AHMEDA, A. F., AL ZARZOUR, R. H. & MAHMOUD, A. M. 2022. A Comparison of the Gene Expression Profiles of Non-Alcoholic Fatty Liver Disease between Animal Models of a High-Fat Diet and Methionine-Choline-Deficient Diet. Molecules, 27. DOI:10.3390/molecules27030858, PMID:35164140

      KISSELEVA, T. & BRENNER, D. 2021. Molecular and cellular mechanisms of liver fibrosis and its regression. Nat Rev Gastroenterol Hepatol, 18, 151-166. DOI:10.1038/s41575-020-00372-7, PMID:33128017

      LOOMBA, R., FRIEDMAN, S. L. & SHULMAN, G. I. 2021. Mechanisms and disease consequences of nonalcoholic fatty liver disease. Cell, 184, 2537-2564. DOI:10.1016/j.cell.2021.04.015, PMID:33989548

      (3) Publicly available datasets show that PPARG protein is not expressed in the liver (Science 2015 347(6220):1260419, PMID: 25613900). Are the authors sure this is not an effect on another PPAR isoform like alpha? ChIP and RNA-seq pathway readouts do not distinguish between different isoforms.

      We thank the reviewer for the detailed comment. PPARγ, a transcriptional regulator of Cd36 and Cidea/c, is well known to play major adipogenic and lipogenic roles in adipose tissue. Although the expression of PPARγ in the liver is very low under healthy conditions, induced expression of PPARγ in both hepatocytes and non-parenchymal cells (Kupffer cells, immune cells, and hepatic stellate cells (HSCs)) in the liver has a crucial role in the pathophysiology of MASLD (Lee et al., 2023b, Chen et al., 2023, Gross et al., 2017). The activation of PPARγ in the liver induces the adipogenic program to store fatty acids in lipid droplets, as observed in adipocytes (Lee et al., 2018). Moreover, the inactivation of liver PPARγ abolished the rosiglitazone-induced increase in hepatic TG and improved hepatic steatosis in lipoatrophic AZIP mice (Gavrilova et al., 2003). Apart from promoting lipogenesis, PPARγ also has a crucial function in improving inflammation and fibrosis (Chen et al., 2023). Furthermore, there is a strong correlation between the onset of hepatic steatosis and hepatocyte-specific PPARγ expression. Clinical trials have also indicated that increased insulin resistance and hepatic PPARγ expression were associated with NASH scores in some obese patients (Lee et al., 2023a, Mukherjee et al., 2022). Even though PPARγ’s primary function is in adipose tissue, patients with MASLD have much higher hepatic expression levels of PPARγ, reflecting the fact that PPARγ plays different roles in different tissues and cell types (Mukherjee et al., 2022). Consistent with the studies mentioned above, our results also hint at the importance of PPARγ in the pathophysiology of MASLD. Snhg3 deficiency or overexpression respectively decreased or increased hepatic PPARγ. Moreover, administration of the PPARγ antagonist T0070907 mitigated the increase in hepatic Cd36 and Cidea/c and improved Snhg3-induced hepatic steatosis. However, conflicting findings suggest that the expression of hepatic PPARγ is not increased as steatosis develops in humans and that, in clinical studies, administration of PPARγ agonists did not aggravate liver steatosis (Gross et al., 2017). Thus, understanding how hepatic PPARγ expression is regulated may provide a new avenue to prevent and treat MASLD (Lee et al., 2018). We also discuss this in the revised manuscript; please refer to the first paragraph of the Discussion section on p13.

      PPARα, most highly expressed in the liver, transcriptionally regulates lipid catabolism by regulating the expression of genes mediating triglyceride hydrolysis, fatty acid transport, and β-oxidation. Activators of PPARα decrease plasma triglycerides by inhibiting their synthesis and accelerating their hydrolysis (Chen et al., 2023). Mice with deletion of the Pparα gene exhibited more hepatic steatosis under HFD induction. As the reviewer suggested, we investigated the effect of Snhg3 on Pparα expression. The results showed that neither deficiency nor overexpression of Snhg3 affected the mRNA level of Pparα (shown below), indicating that Snhg3-induced lipid accumulation is independent of PPARα. Additionally, the exon, upstream 2k, 5’-UTR and intron regions of Pparγ, but not Pparα, were enriched with the H3K27me3 mark (fold_enrichment = 4.15697) in the liver of DIO Snhg3-HKO mice in the CUT&Tag assay (table supplement 8), which was further confirmed by ChIP (Figure 6F and G). Therefore, we chose to study the role of PPARγ in Snhg3-induced hepatic steatosis by integrated analysis of the data from CUT&Tag-Seq, ATAC-Seq and RNA-Seq.

      Author response image 2.

      The mRNA levels of hepatic Pparα expression in DIO Snhg3-HKO mice and Snhg3-HKI mice compared to the controls.

      References

      CHEN, H., TAN, H., WAN, J., ZENG, Y., WANG, J., WANG, H. & LU, X. 2023. PPAR-gamma signaling in nonalcoholic fatty liver disease: Pathogenesis and therapeutic targets. Pharmacol Ther, 245, 108391. DOI:10.1016/j.pharmthera.2023.108391, PMID:36963510

      GAVRILOVA, O., HALUZIK, M., MATSUSUE, K., CUTSON, J. J., JOHNSON, L., DIETZ, K. R., NICOL, C. J., VINSON, C., GONZALEZ, F. J. & REITMAN, M. L. 2003. Liver peroxisome proliferator-activated receptor gamma contributes to hepatic steatosis, triglyceride clearance, and regulation of body fat mass. J Biol Chem, 278, 34268-76. DOI:10.1074/jbc.M300043200, PMID:12805374

      GROSS, B., PAWLAK, M., LEFEBVRE, P. & STAELS, B. 2017. PPARs in obesity-induced T2DM, dyslipidaemia and NAFLD. Nat Rev Endocrinol, 13, 36-49. DOI:10.1038/nrendo.2016.135, PMID:27636730

      LEE, S. M., MURATALLA, J., KARIMI, S., DIAZ-RUIZ, A., FRUTOS, M. D., GUZMAN, G., RAMOS-MOLINA, B. & CORDOBA-CHACON, J. 2023a. Hepatocyte PPARgamma contributes to the progression of non-alcoholic steatohepatitis in male and female obese mice. Cell Mol Life Sci, 80, 39. DOI:10.1007/s00018-022-04629-z, PMID:36629912

      LEE, S. M., MURATALLA, J., SIERRA-CRUZ, M. & CORDOBA-CHACON, J. 2023b. Role of hepatic peroxisome proliferator-activated receptor gamma in non-alcoholic fatty liver disease. J Endocrinol, 257. DOI:10.1530/JOE-22-0155, PMID:36688873

      LEE, Y. K., PARK, J. E., LEE, M. & HARDWICK, J. P. 2018. Hepatic lipid homeostasis by peroxisome proliferator-activated receptor gamma 2. Liver Res, 2, 209-215. DOI:10.1016/j.livres.2018.12.001, PMID:31245168

      MUKHERJEE, A. G., WANJARI, U. R., GOPALAKRISHNAN, A. V., KATTURAJAN, R., KANNAMPUZHA, S., MURALI, R., NAMACHIVAYAM, A., GANESAN, R., RENU, K., DEY, A., VELLINGIRI, B. & PRINCE, S. E. 2022. Exploring the Regulatory Role of ncRNA in NAFLD: A Particular Focus on PPARs. Cells, 11. DOI:10.3390/cells11243959, PMID:36552725

      (4) Previous work suggests that SNHG3 regulates its neighboring gene MED18 which is an important regulator of global transcription. Could some of the observed effects be due to changes in MED18 or other neighboring genes?

      We thank the reviewer for the detailed comment. Previous work suggested that human SNHG3 promotes progression of gastric cancer by regulating methylation of the neighboring MED18 gene (Xuan and Wang, 2019). Here, we studied the effect of mouse Snhg3 on Med18, and the results showed that Snhg3 had no effect on the mRNA level of Med18 (see below). Additionally, we tested the effect of mouse Snhg3 on another neighboring gene, regulator of chromosome condensation 1 (Rcc1). Although deficiency of Snhg3 reduced the mRNA level of Rcc1, overexpression of Snhg3 did not affect it (see below). RCC1, the only known guanine nucleotide exchange factor in the nucleus for Ran, a nuclear Ras-like G protein, directly participates in cellular processes such as nuclear envelope formation, nucleocytoplasmic transport, and spindle formation (Ren et al., 2020). RCC1 also regulates chromatin condensation in the late S and early M phases of the cell cycle. Many studies have found that RCC1 plays an important role in tumors. Whether Rcc1 mediates the alleviation of MASLD seen with Snhg3 deficiency needs to be further investigated.

      Author response image 3.

      The mRNA levels of hepatic Rcc1 and Med18 expression in DIO Snhg3-HKO mice and Snhg3-HKI mice compared to the controls.

      References

      REN, X., JIANG, K. & ZHANG, F. 2020. The Multifaceted Roles of RCC1 in Tumorigenesis. Front Mol Biosci, 7, 225. DOI:10.3389/fmolb.2020.00225, PMID:33102517

      XUAN, Y. & WANG, Y. 2019. Long non-coding RNA SNHG3 promotes progression of gastric cancer by regulating neighboring MED18 gene methylation. Cell Death Dis, 10, 694. DOI:10.1038/s41419-019-1940-3, PMID:31534128

      (5) The claim that Snhg3 regulates SND1 protein stability seems subtle. There is data inconsistency between different panels regarding this regulation including Figure 5I, Figure 6A, and Figure 7E. In addition, is ubiquitination happening in the nucleus where Snhg3 is expressed?

      We thank the reviewer for the detailed comment. The effect of Snhg3 on SND1 expression was confirmed by western blotting; please see Figure 5I, Figure 6A, Figure 7E and the corresponding primary data. Additionally, the effect of Snhg3 on SND1 protein stability seemed subtle, indicating that there may be other mechanisms by which Snhg3 promotes SND1, such as riboregulation. We have added this to the Discussion section; please see the second paragraph on p16.

      Additionally, we did not identify the sites at which SND1 is modified by ubiquitination. Our results showed that Snhg3 was localized mainly in the nucleus (Figure 1D) and that Snhg3 also promoted the nuclear localization of SND1 (Figure 5O). We have revised the diagram of Snhg3 action in Figure 8G. Please see the revised manuscript.

      (6) The authors show that the loss of Snhg3 changes the global H3K27me3 level. Few enzymes modify H3K27me3 levels. Did the authors check for an interaction between EZH2, Jmjd3, UTX, and Snhg3/SND1?

      We thank the reviewer for the detailed comment. It is crucial to ascertain whether SND1 itself functions as a new demethylase or whether it influences other enzymes that modify H3K27me3, such as Jmjd3, enhancer of zeste homolog 2 (EZH2), and ubiquitously transcribed tetratricopeptide repeat on chromosome X (UTX). The precise mechanism by which SND1 regulates H3K27me3 is still unclear and requires further investigation. We have added this limitation to the Discussion section; please see the third paragraph on p17.

      (7) Can the authors speculate if the findings related to Snhg3/SND1 extend to humans?

      We thank the reviewer for the detailed comment. Since the sequence of Snhg3 is not conserved between mice and humans, the findings in this manuscript may not be applicable to humans, but this needs to be explored further.

      (8) As a general rule the figures are too small or difficult to read with limited details in the figure legends which limits evaluation. For example, Figure 1B and almost all of 4 cannot read labels. Figure 2, cannot see the snapshots show of mice or livers. What figure is supporting the claim that snhg3KI are more 'hyper-accessible'? Can the authors clarify what Figure 4H is referring to?

      We thank the reviewer for the detailed comment. We have provided high-quality figures in our revised manuscript.

      The ‘hyper-accessible’ state in the liver of Snhg3-HKI mice was inferred from the differentially accessible regions (DARs): we identified 4305 DARs that were more accessible in Snhg3-HKI mice and only 2505 DARs that were more accessible in control mice (please refer to table supplement 3).

      The heatmap for Cd36 in Figure 4H was derived from hepatic RNA-seq of DIO Snhg3-HKI and control WT mice. To avoid ambiguity, we have removed it.

      (9) Authors stated that upon Snhg3 knock out, more genes are upregulated(1028) than downregulated(365). This description does not match Figure 4A. It seems in Figure 4A there are equal numbers of up and downregulated genes.

      We thank the reviewer for the detailed question. We apologize for this mistake and have corrected it.

      (10) Provide a schematic of the knockout and KI strategy in the supplement.

      We thank the reviewer for the detailed comment. We have included the knockout and KI strategy in figure supplement 1A and B, and 2A.

      Reviewer #2 (Recommendations For The Authors):

      (1) Metabolic cage data need to be reanalyzed with CalR (particularly when the body weights are significantly different).

      We thank the reviewer for the detailed comment. We reanalyzed the metabolic cage data using CalR (Mina et al., 2018). The results showed no obvious differences in heat production, total oxygen consumption, carbon dioxide production or the respiratory exchange ratio (RER) between DIO Snhg3-HKO and control mice. Similarly, there were no differences in heat production, total oxygen consumption, carbon dioxide production or RER between DIO Snhg3-HKI mice and WT mice. Please see figure supplement 1C and 2C, and Mouse Calorimetry in Materials and Methods.

      Reference

      MINA, A. I., LECLAIR, R. A., LECLAIR, K. B., COHEN, D. E., LANTIER, L. & BANKS, A. S. 2018. CalR: A Web-Based Analysis Tool for Indirect Calorimetry Experiments. Cell Metab, 28, 656-666 e1. DOI:10.1016/j.cmet.2018.06.019, PMID:30017358

      (2) ITT in Figure 2F should also be presented as % of the initial glucose level, which would reveal that there is no difference between WT and KO.

      We thank the reviewer for the detailed comment. We repeated the ITT experiment and included the new data in the revised manuscript; please see Figure 2C.
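
      The normalization the reviewer asks for, ITT glucose expressed as a percentage of each animal's initial (time 0) value, can be sketched as below. This is only an illustrative sketch: the function name and the example readings are hypothetical and are not taken from the manuscript.

```python
def percent_of_baseline(glucose_series):
    """Express each glucose reading as a percentage of the first (time 0) reading.

    glucose_series: readings for one animal, ordered by time, first entry = baseline.
    """
    baseline = glucose_series[0]
    if baseline <= 0:
        raise ValueError("baseline glucose must be positive")
    return [100.0 * value / baseline for value in glucose_series]


# Hypothetical readings (any consistent unit) at successive ITT time points.
example = percent_of_baseline([8.0, 4.0, 6.0])
```

      Normalizing per animal removes differences in starting glucose, so curves from groups with different fasting levels can be compared on the same scale.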

      (3) The fasting glucose results are inconsistent between ITT and GTT. Is there any difference in fasting glucose?

      We thank the reviewer for the questions. The difference between GTT and ITT was due to the different fasting times: mice were fasted for 6 h before the ITT and for 16 h before the GTT. It seems that Snhg3 does not affect fasting glucose levels after either shorter or longer fasting; please refer to Figures 2C and 3C.

    1. eLife assessment

      This study presents important findings linking circHMGCS1 and miR-4521 in diabetes-induced vascular endothelial dysfunction. Overall, the evidence supporting the claims of the authors is convincing. The work will be of interest to biomedical scientists working with cardiovascular and/or RNA biology, particularly those studying diabetes.

    2. Reviewer #1 (Public Review):

      This study presents a valuable finding on the expression levels of circHMGCS1 regulating arginase-1 by sponging miR-4521 observed in diabetes-induced vascular endothelial dysfunction, leading to a decrease in vascular nitric oxide secretion and inhibition of endothelial nitric oxide synthase activity. Further, an increase in the expression of adhesion molecules and generation of cellular reactive oxygen species reduced vasodilation and accelerated the impairment of vascular endothelial function. Modulating the circHMGCS1/miR-4521/ARG1 axis could serve as a potential strategy to prevent diabetes-associated cardiovascular diseases.

      Comments on revised version:

      The authors answered all questions satisfactorily.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors observed an aggravated vascular endothelial dysfunction upon overexpressing circHMGCS1 and inhibiting miR-4521. This study discovered that circHMGCS1 promotes arginase 1 expression by sponging miR-4521, which accelerated the impairment of vascular endothelial function.

      Strengths:

      The study is systematic and establishes the regulatory role of the circHMGCS1-miR-4521 axis in diabetes-induced cardiovascular diseases.

      Weaknesses:

      (1) The authors show direct evidence of interaction between circHMGCS1 and miR-4521 by pulldown assay. However, the changes in miRNA expression opposite to the levels of target circRNA could be through Target RNA-Directed MicroRNA Degradation. Since the miRNA level is downregulated, the downstream target gene is expected to be upregulated even in the absence of circRNA.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      HMGCS1, 3-hydroxy-3-methylglutaryl-CoA synthase1 is predicted to be involved in Acetyl-CoA metabolic process and mevalonate-cholesterol pathway. To induce diet-induced diabetes, they fed wild-type littermates either a standard chow (Control) or a high fat-high sucrose (HFHG) diet, where the diet composition consisted of 60% fat, 20% protein, and 20% carbohydrate (H10060, Hfkbio, China). The dietary regimen was maintained for 14 weeks. Throughout this period, body weight and fasting blood glucose (FBG) levels were measured on a weekly basis. Although the authors induced diabetes with a diet also rich in fat, the cholesterol concentration or metabolism was not investigated. After the treatment, were the animals with endothelial dysfunction? How was the blood pressure of the animals?

      Thank you for your comments and kind suggestions. We examined the impact of the HFHG diet on serum total cholesterol (T-CHO) levels in mice over a 14-week period. Our findings indicated that the HFHG diet significantly elevated serum T-CHO levels (Supplementary Figure 5E). Additionally, the HFHG diet was associated with an increase in blood pressure (Figure 5F) and exacerbated the progression of endothelial dysfunction in mice (Figure 5H-L).

      Strengths:

      To explore the potential role of circHMGCS1 in regulating endothelial cell function, the authors cloned exons 2-7 of HMGCS1 into lentiviral vectors for ectopic overexpression of circHMGCS1 (Figure S2). The authors could use this experiment as a concept proof and investigate the glucose concentration in the cell culture medium. Is the pLV-circ HMGCS1 transduction in HUVEC increasing the glucose release? (Line 163)

      In the manuscript, we utilized a DMEM culture medium containing 4500 mg/L glucose. Given that the HUVEC cell culture is glucose-dependent for its metabolic processes, it was challenging to precisely evaluate the relationship between pLV-circHMGCS1 transduction and the glucose concentration in the medium.

      Weaknesses:

      (1) Pg 20. The cells were transfected with miR-4521 mimics, miR-inhibitor, or miR-NC and incubated for 24 hours. Subsequently, the cells were treated with PAHG for another 24 hours. Were the cells transfected with Lipofectamine? The protocol or the Lipofectamine kit used should be described. The Lipofectamine protocol suggests using an incubation time of 72 hours. Why did the authors incubate for only 24 hours? If the authors did the mimic and inhibitor curves, these should be added to the supplementary figures. Please describe the miRNA mimic and antagomir concentration used in cell culture.

      For detailed transfection methods for the miRNA mimic and its inhibitor, please refer to “Transfection of miRNA mimic or inhibitor” (Line 587) in the revised Experimental Section. We employed the Hieff Trans® siRNA/miRNA in vitro transfection reagent (Yeasen, China, 40806ES03), with a transfection duration of 48 h. The miR-4521 content in HUVECs post-transfection was quantified using qRT-PCR. Transfection of the miR-4521 mimic for 48 h notably enhanced its expression in HUVECs (Supplementary Figure 3B), whereas transfection of the miR-4521 inhibitor for the same duration significantly suppressed its expression (Supplementary Figure 3C). The concentration used for both miRNA mimic and inhibitor transfection was 50 nM. In the revised manuscript, we have corrected the transfection time and clarified that we did not use miRNA antagomirs in our experiments.

      (2) Pg 20, line 507. What was the miR-4521 agomiR used for treatment of the animals?

      A miRNA agomir is a valuable experimental tool for elucidating miRNA function and is used to simulate overexpression of a specific miRNA. It is a chemically modified RNA molecule identical in sequence to the target miRNA, engineered for enhanced stability and transfection efficacy. Using a miRNA agomir enables overexpression of the target miRNA, facilitating the investigation of miRNA functions and mechanisms in vivo. In our study, we employed the miRNA mimic for cellular studies and the miRNA agomir for in vivo applications to achieve high expression of the miRNA (Fu et al, 2019).

      (3) Figure 1B. The results are showing the RT-qPCR for only 5 circRNA, however, the results show 48 circRNAs were upregulated, and 18 were downregulated (Figure S1D). Why were the other circRNAs not confirmed? The circRNAs upregulated with high expression are not necessarily with the best differential expression comparing control vs. PAHG groups. Furthermore, Figure 1A and S1D show circRNAs downregulated also with high expression. Why were these circRNAs not confirmed?

      Our study aims to identify potential biomarkers for endothelial dysfunction in type 2 diabetes. To this end, we focused on circRNAs that exhibited significant upregulation following PAHG treatment. In our sequencing data, the p-values for these top upregulated circRNAs were notably below the threshold of 0.001, prompting their selection for further validation. We employed qRT-PCR to ascertain the consistency of their expression levels with the RNA-sequencing findings. Among these, circHMGCS1 was identified as a promising candidate with regulatory potential in endothelial dysfunction. The circRNAs that were significantly downregulated will be the subject of our ongoing research.

      (4) Figure 1B shows the relative circRNAs expression. Were host genes expressed in the same direction?

      circRNAs are generated from specific exons or introns of their host genes, either individually or in combination, and the main function of a circRNA depends on its non-coding RNA characteristics. The expression levels of circRNAs are not necessarily correlated with those of their host genes, and similarly, the functions of circRNAs do not inherently relate to the functions of the host genes (Kristensen et al, 2019; Liu & Chen, 2022). Consequently, the data presented in Figure 1B were primarily aimed at validating the accuracy of circRNA-seq. Although we did not conduct host gene expression analysis for the identified circRNAs, our subsequent results indicated that the overexpression of circHMGCS1 did not influence the expression levels of HMGCS1 (Figure 2A).

      (5) Line 128. The circRNA RT-qPCR methodology was not described. The methodology should be described in detail in the Methods Section.

      The only difference between the circRNA RT-qPCR method and that used for other genes is that random primers must be used for reverse transcription. Unlike linear RNAs, which possess a 3' polyA tail that allows the use of oligo(dT) primers, circRNAs require random primers to initiate reverse transcription. Beyond this distinction, the procedure is the same as the common qRT-PCR process. We have revised the method “Isolation of RNA and miRNA for quantitative Real Time-PCR (qRT-PCR) analysis” in the revised version (Line 695).

      (6) Line 699. The relative gene expression was calculated using the 2-ΔΔCt method. This is not correct, the expression for miRNA and gene expression are represented in percentage of control.

      We initially employed the 2^-ΔΔCt method to ascertain the relative gene expression levels. Subsequently, we scaled all values by a factor of 100 to amplify the visual representation of the observed variations, thereby enhancing the visualization of the data.
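
      The calculation described above (2^-ΔΔCt followed by scaling to a percentage of control) can be sketched as follows. This is a minimal illustration of the standard ΔΔCt arithmetic, not the authors' analysis script; the function names and Ct values are hypothetical.

```python
def fold_change_ddct(ct_target_sample, ct_ref_sample, ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt method.

    dCt  = Ct(target) - Ct(reference gene), computed separately for sample and control
    ddCt = dCt(sample) - dCt(control)
    fold change = 2 ** (-ddCt)
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


def percent_of_control(fold_change):
    """Scale a fold change so that the control group equals 100%."""
    return fold_change * 100.0


# Hypothetical Ct values: the target crosses threshold one cycle earlier in the sample.
fold = fold_change_ddct(24.0, 18.0, 25.0, 18.0)
percent = percent_of_control(fold)
```

      Because the scaling is a constant factor of 100, the percentage-of-control presentation and the 2^-ΔΔCt fold change carry the same information; only the axis units differ.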

      (7) Line 630. Detection of ROS for tissue and cells. The methodology for tissue was described, but not for cells.

      We have added the detailed description of the cellular ROS detection methods in the revised manuscript as follows:

      For ROS detection in cells, the treated cells were washed once with PBS, incubated with 20 μM DHE at 37°C for 30 min protected from light, and washed three times with PBS; colorless DMEM medium was then added and the cells were observed by fluorescence microscopy (Line 640-643).

      (8) Line 796. RNA Fluorescent In Situ Hybridization (RNA-FISH). Figure 1F shows that the RNA-Fluorescence in situ hybridization (RNA-FISH) confirmed the robust expression of cytoplasmic circHMGCS1 in HUVECs (Figure 1F). However, in the methods, lines 804 and 805 described the probes targeting circMAP3K5 and miR-4521 were applied to the sections. Hybridization was performed in a humid chamber at 37°C overnight. Is it correct?

      We have made a correction in the revised manuscript. The corrected description is "the probes targeting circHMGCS1 and miR-4521 were applied to the sections" (Line 816).

      (9) Line 14. Fig 1-H. The authors discuss qRT-PCR demonstrated that circHMGCS1 displayed a stable half-life exceeding 24 h, whereas the linear transcript HMGCS1 mRNA had a half-life less than 8 h (Figure 1H). Several of the antibodies may contain trace amounts of RNases that could degrade target RNA and could result in loss of RNA hybridization signal or gene expression. Thus, all of the solutions should contain RNase inhibitors. The HMGCS1 mRNA expression could be degraded over the incubation time (0-24hs) leading to incorrect results. Moreover, in the methods is not mentioned if the RNAse inhibitor was used. Please, could the authors discuss and provide information?

      This experiment was performed in cell culture as described in our Experimental Methods (Line 753): we added actinomycin D directly to the cell culture plates, and the cells remained in a healthy state during this treatment. No antibody incubation of extracted RNA was involved in this experiment. Additionally, all solutions used throughout the experiment were prepared with RNase-free water to ensure the integrity of the mRNA.
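
      For readers unfamiliar with actinomycin D chase experiments, one common way to summarize such data is to assume first-order decay and estimate a half-life from the fraction of transcript remaining over time. The sketch below illustrates that idea only; it is an assumption for illustration, not necessarily how the half-lives in Figure 1H were derived, and the function name and numbers are hypothetical.

```python
import math


def estimate_half_life(times_h, remaining_fraction):
    """Estimate RNA half-life (hours), assuming first-order decay.

    Fits ln(fraction remaining) = intercept - k * t by ordinary least squares
    and returns ln(2) / k. Returns infinity if no decay is detected.
    """
    logs = [math.log(f) for f in remaining_fraction]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, logs))
    slope = sxy / sxx
    if slope >= 0:
        return math.inf
    return math.log(2) / -slope


# Hypothetical chase: a transcript that halves every 8 h after transcription is blocked.
times = [0.0, 4.0, 8.0, 12.0]
fractions = [0.5 ** (t / 8.0) for t in times]
half_life = estimate_half_life(times, fractions)
```

      A stable transcript gives a slope near zero and therefore a very long (or undefined) half-life under this model, while a rapidly degraded linear mRNA gives a steep negative slope.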

      (10) Further experiments demonstrated that the overexpression of circHMGCS1 stimulated the expression of adhesion molecules (VCAM1, ICAM1, and ET-1) (Figures 2B and 2C), suggesting that circHMGCS1 is involved in VED. How were these genes expressed in the RNA-seq?

      In the manuscript, we focused exclusively on circRNA and miRNA sequencing and did not perform mRNA sequencing. Consequently, we employed qRT-PCR and Western blot to assess changes in the expression of ET-1, ICAM1, and VCAM1 at the gene and protein levels. The findings revealed that overexpression of circHMGCS1 significantly upregulated the expression of adhesion molecules (VCAM1, ICAM1, and ET-1).

      (11) Line 256. By contrast, the combined treatment of circHMGCS1 and miR-4521 agomir did not significantly affect the body weight and blood glucose levels. OGTT and ITT experiments demonstrated that miR-4521 agomir considerably enhanced glucose tolerance and insulin resistance in diabetic mice (Figures 5C, 5D, and Figures S5B and S5C). Why did the miR-4521 agomir treatment considerably enhance glucose tolerance and insulin resistance in diabetic mice, but not the blood glucose levels?

      Our results showed that miR-4521 agomir could effectively suppress the increase of body weight and blood glucose in mice (Figure 5A-B).

      (12) In the experiments related to pull-down, the authors performed Biotin-coupled miR-4521 or its mutant probe, which was employed for circHMGCS1 pull-down. This result only confirms the Luciferase experiments shown in Figure 4A. The experiment that the authors need to perform is pull-down using a biotin-labeled antisense oligo (ASO) targeting the circHMGCS1 backsplice junction sequence followed by pulldown with streptavidin-conjugated magnetic beads to capture the associated miRNAs and RNA binding proteins (RBPs). Also, the ASO pulldown assay can be coupled to miRNA RT-qPCR and western blotting analysis to confirm the association of miRNAs and RBPs predicted to interact with the target circRNA.

      This point is correct. As suggested, we used a biotin-labeled circHMGCS1 probe for pull-down experiments. Because circRNA-miRNA interactions are mainly mediated by the RNA-induced silencing complex, which includes Argonaute 2 (AGO2), we examined the levels of miR-4521 and AGO2 in the captured material. Our results demonstrated that circHMGCS1 significantly captured miR-4521 in the cells, with concomitant capture of AGO2. These findings have been integrated into the revised manuscript (Supplementary Figures 4D and 4E).

      (13) In Figure 5, the authors showed that the results suggest that miR-4521 can inhibit the occurrence of diabetes, whereas circHMGCS1 specifically dampens the function of miR-4521, weakening its protective effect against diabetes. In this context, what are the endogenous target genes for the miR-4521 that could be regulating diabetes?

      In this study, we focused on the role of miR-4521 in endothelial function. Our animal experiments involving ARG1 knockdown revealed that the reduction of ARG1 expression resulted in the inability of miR-4521 to modulate the progression of type 2 diabetes. Consequently, ARG1 is likely an endogenous target gene of miR-4521, potentially implicated in the regulation of diabetes.

      (14) In the western blot of Figure 5, the β-actin band appears to be different from the genes analyzed. Was the same membrane used for the four proteins? The Ponceau S membrane should be provided.

      As described in our experimental methodology (Western blot analysis), we utilized PVDF membranes for our Western blot experiments. β-actin, a highly expressed housekeeping gene, yields distinct bands with minimal background noise. Because of this high abundance, the β-actin signal can spread laterally from the sample wells during electrophoresis, so that its bands do not align with the lanes seen for the target proteins. The other three proteins show clearly separated lanes because their expression is lower than that of β-actin. In the revised manuscript, we replaced the β-actin blot with one that has a similar background (Figure 5L).

      (15) Why did the authors use AAV9, since the AAV9 has a tropism for the liver, heart, skeletal muscle, and not to endothelial vessels?

      AAV9 has garnered significant interest as a gene delivery vector due to its extensive tissue penetration, minimal immunogenicity, and stable gene expression profile. Its application in cardiovascular disease research and therapy has been widely reported (Barbon et al, 2023; Yao et al, 2018; Zincarelli et al, 2008). In addition, we employed AAV9 for gene delivery via tail vein injection in mice, and as shown in Figure 5J and Figure 7Q, we observed GFP signals carried by AAV9 in the thoracic aorta of mice. These findings suggest that AAV9 is capable of infecting endothelial cells effectively.

      Reviewer #2 (Public Review):

      Summary:

      The authors observed an aggravated vascular endothelial dysfunction upon overexpressing circHMGCS1 and inhibiting miR-4521. This study discovered that circHMGCS1 promotes arginase 1 expression by sponging miR-4521, which accelerated the impairment of vascular endothelial function.

      Strengths:

      The study is systematic and establishes the regulatory role of the circHMGCS1-miR-4521 axis in diabetes-induced cardiovascular diseases.

      Weaknesses:

      (1) The authors selected the miR-4521 as the target based on their reduced expression upon circHMGCS1 overexpression. Since the miRNA level is downregulated, the downstream target gene is expected to be upregulated even in the absence of circRNA. The changes in miRNA expression opposite to the levels of target circRNA could be through Target RNA-Directed MicroRNA Degradation. In addition, miRNA can also be stabilized by circRNAs. Hence, selecting miRNA targets based on opposite expression patterns and concluding miRNA sponging by circRNA needs further evidence of direct interactions.

      Thank you for your positive comments and kind suggestions.

      As suggested by Public Reviewer #1 (12), we employed a biotin-tagged circHMGCS1 probe to capture miR-4521 and AGO2 in HUVECs (Supplementary Figures 4D and 4E), and dual-luciferase assays confirmed that miR-4521 can bind to circHMGCS1 directly. Furthermore, RNA pull-down and RIP assays demonstrated the direct binding capability of circHMGCS1 for miR-4521. Collectively, these findings underscore the direct interaction between circHMGCS1 and miR-4521.

      (2) The majority of the experiments were performed with an overexpression vector which can generate a lot of linear RNAs along with circRNAs. The linear RNAs produced by the overexpression vectors can have a similar effect to the circRNA due to sequence identity.

      In our manuscript, the employed vectors incorporate inverted repeat sequences that facilitate efficient circularization of circRNAs. This design ensures robust circularization upon the insertion of circRNA sequences into the multiple cloning site, thereby enhancing the overexpression of circRNAs (Supplementary Figure 2). Moreover, we used lentivirus as the vector for circRNA overexpression, not direct plasmid transfection. As demonstrated in Figure 2A, upon overexpression of circHMGCS1, we observed a significant upregulation in circHMGCS1 levels compared to the pLV-circNC and Control groups. Notably, the expression levels of the linear HMGCS1 mRNA did not exhibit significant alterations.

      (3) There is a lack of data of circHMGCS1 silencing and its effect on target miRNA & mRNAs.

      According to your suggestion, we employed shRNA to knock down circHMGCS1 in HUVECs, and qRT-PCR was used to assess the expression levels of miR-4521 and ARG1. The knockdown significantly reduced the expression of circHMGCS1 in HUVECs without obviously affecting the levels of HMGCS1 mRNA. We then selected circHMGCS1 shRNA1 for further investigation. We observed that the knockdown of circHMGCS1 resulted in an upregulation of miR-4521 and a downregulation of ARG1 expression.

      Author response image 1.

      The impact of circHMGCS1 knockdown on ARG1 and miR-4521 expression levels in HUVECs. The cells were transfected with either circHMGCS1 shRNA1 or circHMGCS1 shRNA2, and the expression levels of circHMGCS1 and HMGCS1 (A), miR-4521 (B) and ARG1 (C and D) in HUVECs were detected by qRT-PCR and Western blot. n=3 in each group. *p < 0.05, **p < 0.01. Statistical significance was determined by one-way ANOVA followed by the Bonferroni multiple comparison post hoc test; error bars indicate SD.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      I suggest improving the discussion based on the literature.

      (1) Line 131. .... (hsa_circ_0008621, 899 nt in length, identified as circHMGCS1 in subsequent studies because of its host gene being HMGCS1). Please, provide the reference.

      We appreciate the valuable comments. We have added the reference at Line 133 (Liang et al, 2021).

      (2) The authors conclude that both in vitro and in vivo data suggest that the miR-4521 or circHMGCS1 fails to regulate the effect of diabetes-induced VED in the absence of ARG1. Therefore, ARG1 may serve as a promising VED biomarker, and circHMGCS1 and miR-4521 play a key role in regulating diabetes-induced VED by ARG1. In this context, they should re-evaluate whether this is the best title. "Circular RNA HMGCS1 sponges miR-4521 to aggravate type 2 diabetes-induced vascular endothelial dysfunction"

      This manuscript begins with circRNA as the focal point of study (Figure 1 and Figure 2). It then examines the miRNAs associated with the circRNA and elucidates their interactions (Figure 3, Figure 4 and Figure 5). Subsequently, the manuscript identifies the target genes of the miRNA and validates the regulatory effects of circRNA and miR-4521 on ARG1 (Figure 6). The study culminates with the application of the ceRNA theory to confirm the significance of ARG1 in the functional interplay between circHMGCS1 and miR-4521 (Figure 7). The findings throughout the manuscript are dedicated to uncovering the pivotal roles of circHMGCS1 and miR-4521 in modulating vascular endothelial function. Notably, the interaction between circHMGCS1 and miR-4521 is a novel discovery of our research. Therefore, we aim to emphasize the critical function of circHMGCS1 and miR-4521 in the regulation of vascular endothelial dysfunction in type 2 diabetes within the manuscript.

      Reviewer #2 (Recommendations For The Authors):

      I have a few suggestions for improving the study further.

      (1) Although the experiments suggest the role of circHMGCS1, miR-4521 in vascular endothelial function, the direct regulation or interaction of circHMGCS1-miR-4521-ARG1 is unclear. A rescue experiment that checks the effect of circHMGCS1 silencing with/without inhibition of miR-4521 on ARG1 expression must be performed to prove the circHMGCS1- miR-4521 regulatory axis.

      Thank you very much for your constructive comments.

      According to your suggestion, we utilized shRNA to knock down circHMGCS1 in HUVECs. Subsequent expression analysis via qRT-PCR was conducted to assess the levels of miR-4521 and ARG1. The knockdown of circHMGCS1 significantly reduced the expression of circHMGCS1 in HUVECs without influencing the expression of the host gene HMGCS1. Concurrently, the knockdown of circHMGCS1 resulted in an upregulation of miR-4521 (Supplementary Figure 4B) and a downregulation of ARG1 (Figure 6P and 6Q). In our manuscript, the upregulation in ARG1 expression caused by circHMGCS1 overexpression was reduced by miR-4521, and the downregulation in ARG1 expression caused by miR-4521 overexpression was also reversed by circHMGCS1. When miR-4521 was knocked down, the expression of ARG1 increased, and circHMGCS1 abrogated its regulatory effect on the expression of ARG1. Collectively, these findings indicate that the interplay between circHMGCS1 and miR-4521 significantly influences ARG1 expression.

      Author response image 2.

      The impact of circHMGCS1 knockdown on ARG1 and miR-4521 expression levels in HUVECs. The cells were transfected with either circHMGCS1 shRNA1 or circHMGCS1 shRNA2, and the expression levels of circHMGCS1 and HMGCS1 (A), miR-4521 (B) and ARG1 (C and D) in HUVECs were detected by qRT-PCR and Western blot. n=3 in each group. *p < 0.05, **p < 0.01. Statistical significance was determined by one-way ANOVA followed by the Bonferroni multiple comparison post hoc test; error bars indicate SD.

      (2) It is unclear how the authors arrived at the circHMGCS1-miR-4521 pair. The pull down of circHMGCS1 followed by qPCR enrichment analysis of all target miRNAs must be performed to select the target miRNA.

      In this manuscript, we identified miRNA expression under PAHG treatment through miRNA sequencing, and then screened out 4 miRNAs with potential binding sites for circHMGCS1 using the miRanda database. Subsequently, we employed qRT-PCR and Western blot analysis to confirm the regulatory influence of miR-4521 on endothelial function (Figure 3). Following this, RIP, RNA pull-down, dual-luciferase and RNA-FISH experiments were conducted to map the interaction between circHMGCS1 and miR-4521 (Figure 4). The direct interaction between circHMGCS1 and miR-4521 was further substantiated through overexpression and knockdown studies (Figures 5-7). While the reviewer's method may offer a more direct validation, our methodology involved an initial database-driven screening of candidate miRNAs with the potential to target and bind circHMGCS1, followed by experimental validation of these interactions. Both methodologies are capable of establishing the interaction sites between circHMGCS1 and miR-4521.

      (3) Since the back splicing is not that efficient, the linear RNA from the overexpression construct may produce many linear RNAs with miRNA binding sites. The effect seen in the case of overexpression experiments needs to consider the level of linear and circular HMGCS1 produced by the vector.

      In this manuscript, the vector's multiple cloning site is flanked by inverted repeat sequences that facilitate efficient circRNA looping. This design enables the inserted sequence to form a stable loop and undergo circularization upon transcription, leading to the overexpression of circRNA (Supplementary Figure 2). For the validation of circular RNA, we employed divergent primers that straddle the circRNA splicing junction. These primers are specific for circRNA amplification and do not amplify the corresponding linear RNA, as demonstrated in Figure 2A. Upon overexpression of circHMGCS1, we observed a significant increase in circHMGCS1 levels compared to the empty vector and Control groups, while there was no significant change in the expression level of HMGCS1 mRNA.

      (4) As miR-4521 has multiple miRNA binding sites on circHMGCS1, it is not very clear which sites were mutated in circHMGCS1-MUT.

      We have made corrections to Supplementary Figure 4C. Utilizing the miRanda algorithm, we identified 10 potential binding sites for miR-4521 on circHMGCS1. Subsequently, we selected the site with the highest binding affinity for mutational analysis (miR-4521 binding positions 3-15, circHMGCS1 binding positions 260-281, binding rate 91.67%, binding energy -17.30 kcal/mol). We employed a dual-luciferase assay to confirm the direct interaction between circHMGCS1 and miR-4521.

      (5) Since the ceRNA network works efficiently in an equimolar concentration of the regulatory molecules, providing the copy number of circHMGCS1, miR-4521, and target mRNAs would be helpful.

      We employed qRT-PCR for absolute quantification of copy numbers, following established methodologies (Nolan et al, 2006; Wagatsuma et al, 2005; Zhang et al, 2009). Our qRT-PCR data reveal that the circHMGCS1 copy number is 2343±529. In comparison, the ARG1 mRNA copy number stands at 88±27, while the miR-4521 copy number is significantly higher, recorded at 36277±9407.

      Author response image 3.

      The distribution of copy numbers for circHMGCS1, miR-4521 and ARG1 in HUVECs.
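
      As a rough illustration of the stoichiometry question raised in point (5), the reported mean copy numbers can be turned into simple ratios. The sketch below uses only the means quoted in the response (2343, 88 and 36277) and the 10 miRanda-predicted sites mentioned in point (4). Counting every predicted site as functional is an assumption made purely for illustration, and the variable and function names are hypothetical.

```python
# Mean copy numbers as quoted in the author response (error terms omitted).
COPIES = {"circHMGCS1": 2343, "ARG1": 88, "miR-4521": 36277}

# Number of miR-4521 sites predicted on circHMGCS1 by miRanda (point 4).
# Assumption for illustration only: every predicted site is counted as functional.
PREDICTED_SITES_PER_CIRC = 10


def ratio(numerator: str, denominator: str) -> float:
    """Return the ratio of two reported mean copy numbers."""
    return COPIES[numerator] / COPIES[denominator]


mirna_per_circ = ratio("miR-4521", "circHMGCS1")
circ_per_target = ratio("circHMGCS1", "ARG1")
total_sites = COPIES["circHMGCS1"] * PREDICTED_SITES_PER_CIRC
site_fraction = total_sites / COPIES["miR-4521"]

print(f"miR-4521 per circHMGCS1: {mirna_per_circ:.1f}")
print(f"circHMGCS1 per ARG1 mRNA: {circ_per_target:.1f}")
print(f"predicted sites per miR-4521 copy: {site_fraction:.2f}")
```

      Under these assumptions, miR-4521 outnumbers circHMGCS1 roughly 15 to 1, which is the kind of figure a reader would weigh when judging whether sponging is plausible at endogenous levels.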

      (6) The yellow highlighted "cyclization-mediated sequence-F & R" does not seem to be complementary sequences. The method section may include the details of the vectors and cloning strategies for the overexpression constructs.

      The figure below shows a schematic of the complementary structure between the upstream and downstream sequences that facilitate circRNA circularization. This pairing is designed to enhance the circularization efficiency of circRNA while concurrently suppressing mRNA synthesis (Liang & Wilusz, 2014). Details of this design have been integrated into the experimental method (Line 539). The specific additions are as follows:

      The circHMGCS1 sequence [NM_001098272: 43292575-43297268], the splice site AG/GT and ALU elements were inserted into the pCDH-circRNA-GFP vector (upstream ALU: AAAGTGCTGAGATTACAGGCGTGAGCCACCACCCCCGGCCCACTTTTTGTAAAGGTACGTACTAATGACTTTTTTTTTATACTTCAG, downstream ALU: GTAAGAAGCAAGGAAAAGAATTAGGCTCGGCACGGTAGCTCACACCTGTAATCCCAGCA). The restriction enzyme sites selected were EcoRI and NotI.

      Author response image 4.

      (7) Since circHMGCS1 is a multi-exonic circRNA that can undergo alternative splicing and divergent primers only validate the backsplice junction, the full-length sequence of mature circHMGCS1 needs to be checked by circRNA-RCA PCR followed by Sanger sequencing.

      In compliance with your guidance, we have enriched the revised manuscript with additional data. Specifically, we have included the full-length nucleic acid electrophoresis diagram of circHMGCS1 in Supplementary Figure 1F, the Sanger sequencing results in Supplementary Figure 1G, and a comparative analysis of the circHMGCS1 sequences obtained from Sanger sequencing with those referenced in the circBase database, presented in Supplementary Figure 1H.

      Reference:

      Barbon, E., C. Kawecki, S. Marmier, A. Sakkal, F. Collaud, S. Charles, G. Ronzitti, C. Casari, O.D. Christophe, C.V. Denis, P.J. Lenting, and F. Mingozzi. 2023. Development of a dual hybrid AAV vector for endothelial-targeted expression of von Willebrand factor. Gene Ther. 30: 245-254.

      Fu, Y., J. Chen, and Z. Huang. 2019. Recent progress in microRNA-based delivery systems for the treatment of human disease. ExRNA. 1: 24.

      Kristensen, L.S., M.S. Andersen, L.V.W. Stagsted, K.K. Ebbesen, T.B. Hansen, and J. Kjems. 2019. The biogenesis, biology and characterization of circular RNAs. Nat Rev Genet. 20: 675-691.

      Liang, D., and J.E. Wilusz. 2014. Short intronic repeat sequences facilitate circular RNA production. Genes Dev. 28: 2233-2247.

      Liang, J., X. Li, J. Xu, G.M. Cai, J.X. Cao, and B. Zhang. 2021. hsa_circ_0072389, hsa_circ_0072386, hsa_circ_0008621, hsa_circ_0072387, and hsa_circ_0072391 aggravate glioma via miR-338-5p/IKBIP. Aging (Albany NY). 13: 25213-25240.

      Liu, C.X., and L.L. Chen. 2022. Circular RNAs: Characterization, cellular roles, and applications. Cell. 185: 2016-2034.

      Nolan, T., R.E. Hands, and S.A. Bustin. 2006. Quantification of mRNA using real-time RT-PCR. Nat Protoc. 1: 1559-1582.

      Wagatsuma, A., H. Sadamoto, T. Kitahashi, K. Lukowiak, A. Urano, and E. Ito. 2005. Determination of the exact copy numbers of particular mRNAs in a single cell by quantitative real-time RT-PCR. J Exp Biol. 208: 2389-2398.

      Yao, C., T. Veleva, L. Scott, Jr., S. Cao, L. Li, G. Chen, P. Jeyabal, X. Pan, K.M. Alsina, I.D. Abu-Taha, S. Ghezelbash, C.L. Reynolds, Y.H. Shen, S.A. Lemaire, W. Schmitz, F.U. Müller, A. El-Armouche, N. Tony Eissa, C. Beeton, S. Nattel, X.H.T. Wehrens, D. Dobrev, and N. Li. 2018. Enhanced Cardiomyocyte NLRP3 Inflammasome Signaling Promotes Atrial Fibrillation. Circulation. 138: 2227-2242.

      Zhang, X.X., T. Zhang, M. Zhang, H.H. Fang, and S.P. Cheng. 2009. Characterization and quantification of class 1 integrons and associated gene cassettes in sewage treatment plants. Appl Microbiol Biotechnol. 82: 1169-1177.

      Zincarelli, C., S. Soltys, G. Rengo, and J.E. Rabinowitz. 2008. Analysis of AAV serotypes 1-9 mediated gene expression and tropism in mice after systemic injection. Mol Ther. 16: 1073-1080.

    1. eLife assessment

      This study investigates the mechanistic connection between glycosylation at the N162 site of the Fc gamma receptor FcγRIIIa and the regulation of NK cell-mediated antibody-dependent cytotoxicity. The compelling findings, derived from novel isotope labeling approaches and state-of-the-art NMR spectroscopy techniques, underscore the impact of glycan composition on receptor stability and immune function. This research offers fundamental insights that could aid in the development of more effective therapeutic antibodies. The manuscript will be of interest to researchers in the fields of immunology and therapeutic antibody development.

    2. Reviewer #1 (Public Review):

      Summary:

      In this work, the authors continue their investigations on the key role of glycosylation to modulate the function of a therapeutic antibody. As a follow-up to their previous demonstration on how ADCC was heavily affected by the glycans at the Fc gamma receptor (FcγR)IIIa, they now dissect the contributions of the different glycans that decorate the diverse glycosylation sites. Using a well-designed mutation strategy, accompanied by exhaustive biophysical measurements, with extensive use of NMR, using both standard and newly developed methodologies, they demonstrate that there is one specific locus, N162, which is heavily involved in the stabilization of (FcγR)IIIa and that the concomitant NK function is regulated by the glycan at this site.

      Strengths:

      The methodological aspects are carried out at the maximum level.

      Weaknesses:

      The exact (or the best possible assessment) of the glycan composition at the N162 site is not defined.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors set out to demonstrate a mechanistic link between Fcgamma receptor (IIIA) glycosylation and IgG binding affinity and signaling - resulting in antibody-dependent cellular cytotoxicity - ADCC. The work builds off prior findings from this group about the general impact of glycosylation on FcR (Fc receptor)-IgG binding.

      Strengths:

      The structural data (NMR) is highly compelling and very significant to the field. A demonstration of how IgG interacts with FcgRIIIA in a manner sensitive to glycosylation of both the IgG and the FcR fills a critical knowledge gap. The approach to demonstrate the selective impact of glycosylation at N162 is also excellent and convincing. The manuscript/study is, overall, very strong.

      Weaknesses:

      There are a number of minor weaknesses that should be addressed.

      (1) Since S164A is the only mutant in Figure 1 that seems to improve affinity, even if minimally, it would be a nice reference to highlight that residue in the structural model in panel B.

      (2) It is confusing why some of the mutants in the study are not represented in Figure 1 panel A. Those affinities and mutants should be incorporated into panel A so the reader can easily see where they all fall on the scale. T167Y in particular needs to be shown, as it is one of few mutants that fall between what seems to be ADCC+ and ADCC- lines. Also, that mutant seems to have a stronger affinity compared to wt (judged by panel D), yet less ADCC than wt. This would imply that the relationship between affinity and activity is not as clean as stated, though it is clearly important. Comments about this would strengthen the overall manuscript.

      (3) This statement feels out of place: "In summary, this result demonstrates that the sensitivity to antibody fucosylation may be eliminated through FcγRIIIa engineering while preserving antibody-binding affinity." In Figure 2, the authors do indeed show that mutations in FcgRIIIa can alter the impact of IgG core fucosylation, but implying that receptor engineering is somehow translatable or as impactful therapeutically as engineering the antibody itself deflates the real basic science/biochemical impact of understanding these interactions in molecular detail. Not everything has to be immediately translatable to be important.

      (4) The findings reported in Figure 2, panel C are exciting. Controls for the quality of digestion at each step should be shown (perhaps in supplementary data).

      (5) Figure 3 is confusing (mislabeled?) and does not show what is described in the Results. First, there is a F158V variant in the graph but a V158F variant in the text. Please correct this. Second, this variant (V158F/F158V) does not show the 2-fold increase in ADCC with kifunensine as stated. Finally, there are no statistical evaluations between the groups (+/- kif; +/- fucose). The differences stated are not clearly statistically significant given the wide spread of the data. This is true even for the wt variant.

      (6) The kifunensine impact is somewhat confusing. They report a major change in ADCC, yet similar large changes with trimming only occur once most of the glycan is nearly gone (Figure 2). Kifunensine will tend to generate high mannose and possibly a few hybrid glycans. It is difficult to understand what glycoforms are truly important outside of stating that multi-branched complex-type N-glycans decrease affinity.

      (7) This is outside of the immediate scope, but I feel that the impact would be increased if differences in NK cell (and thus FcgRIIIA) glycosylation are known to occur during disease, inflammation, age, or some other factor - and then to demonstrate those specific changes impact ADCC activity via this mechanism.

    4. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this work, the authors continue their investigations on the key role of glycosylation to modulate the function of a therapeutic antibody. As a follow-up to their previous demonstration on how ADCC was heavily affected by the glycans at the Fc gamma receptor (FcγR)IIIa, they now dissect the contributions of the different glycans that decorate the diverse glycosylation sites. Using a well-designed mutation strategy, accompanied by exhaustive biophysical measurements, with extensive use of NMR, using both standard and newly developed methodologies, they demonstrate that there is one specific locus, N162, which is heavily involved in the stabilization of (FcγR)IIIa and that the concomitant NK function is regulated by the glycan at this site.

      Strengths:

      The methodological aspects are carried out at the maximum level.

      Weaknesses:

      The exact (or the best possible assessment) of the glycan composition at the N162 site is not defined.

      We will revise the Introduction to include previous findings from our laboratory regarding processing on YTS cells:

      “YTS cells, a key cytotoxic human NK cell line used for these studies, express FcγRIIIa with extensive glycan processing, including the N162 site with predominantly hybrid and complex-type glycoforms {Patel 2021}.”  

      Reviewer #2 (Public Review):

      Summary:

      The authors set out to demonstrate a mechanistic link between Fcgamma receptor (IIIA) glycosylation and IgG binding affinity and signaling - resulting in antibody-dependent cellular cytotoxicity - ADCC. The work builds off prior findings from this group about the general impact of glycosylation on FcR (Fc receptor)-IgG binding.

      Strengths:

      The structural data (NMR) is highly compelling and very significant to the field. A demonstration of how IgG interacts with FcgRIIIA in a manner sensitive to glycosylation of both the IgG and the FcR fills a critical knowledge gap. The approach to demonstrate the selective impact of glycosylation at N162 is also excellent and convincing. The manuscript/study is, overall, very strong.

      Weaknesses:

      There are a number of minor weaknesses that should be addressed.

      (1) Since S164A is the only mutant in Figure 1 that seems to improve affinity, even if minimally, it would be a nice reference to highlight that residue in the structural model in panel B.

      We will revise Figure 1B to include the S164 site.

      (2) It is confusing why some of the mutants in the study are not represented in Figure 1 panel A. Those affinities and mutants should be incorporated into panel A so the reader can easily see where they all fall on the scale.

      We thank the reviewer for this comment. We will restructure the Results section to highlight that a primary outcome of the experiment referenced was to map the contribution of interface residues to antibody binding affinity. These data were not previously available, highlighting hotspots at the interface. Figure 1A and B report these results.

      We then used a subset of mutations from this experiment, as well as a subset of mutations from an additional library containing mutations proximal to the interface, to build a small library for evaluation using ADCC. The complete binding data for all variants, binding to two different IgG1 Fc glycoforms, is presented in Supplemental Table 1. 

      T167Y in particular needs to be shown, as it is one of few mutants that fall between what seems to be ADCC+ and ADCC- lines. Also, that mutant seems to have a stronger affinity compared to wt (judged by panel D), yet less ADCC than wt. This would imply that the relationship between affinity and activity is not as clean as stated, though it is clearly important. Comments about this would strengthen the overall manuscript.

      We thank the reviewer for this particular insight. We agree that the lack of a clean correlation between ADCC potency and affinity implies additional factors that could have affected these experimental results. We will add the following sentence to the discussion. 

      “Notably, the ADCC potency for those high-affinity variants does not fall cleanly on a line, indicating that other factors affect our observations, which may include organization at the cell surface, changes to glycan composition, or receptor trafficking.”

      (3) This statement feels out of place: "In summary, this result demonstrates that the sensitivity to antibody fucosylation may be eliminated through FcγRIIIa engineering while preserving antibody-binding affinity." In Figure 2, the authors do indeed show that mutations in FcgRIIIa can alter the impact of IgG core fucosylation, but implying that receptor engineering is somehow translatable or as impactful therapeutically as engineering the antibody itself deflates the real basic science/biochemical impact of understanding these interactions in molecular detail. Not everything has to be immediately translatable to be important. 

      We agree and will remove the highlighted sentence.   

      (4) The findings reported in Figure 2, panel C are exciting. Controls for the quality of digestion at each step should be shown (perhaps in supplementary data).

      We agree. We will add an example of the digestions as Figure S2.

      (5) Figure 3 is confusing (mislabeled?) and does not show what is described in the Results. First, there is a F158V variant in the graph but a V158F variant in the text. Please correct this.

      Thank you for identifying this typo. We will correct Figure 3.

      Second, this variant (V158F/F158V) does not show the 2-fold increase in ADCC with kifunensine as stated.

      Thank you for drawing our attention to this rounding error. We will revise the text to report a statistically significant 1.4-fold increase.

      Finally, there are no statistical evaluations between the groups (+/- kif; +/- fucose). 

      We provide the p values for +/-fuc and +/- Kifunensine for each YTS cell line in the figure. We did not provide a global comparison of p values that included all cell lines due to some cell lines experiencing a significant change and others not. However, we will add the raw data as Supplemental Table 2 should readers wish to perform these analyses.

      The differences stated are not clearly statistically significant given the wide spread of the data. This is true even for the wt variant.

      We agree that there are points that overlap in this figure between the different treatments. However, our use of Student's t-test (two-tailed) on three experiments collected on three different days (each with three technical replicates) provides enough resolution to determine whether the means for the different treatments differ significantly. This is, by our estimation, a highly rigorous manner to collect and analyze the data.
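
      To make the comparison described above concrete, here is a minimal sketch of a two-tailed, equal-variance Student's t-test on two groups of three independent experiments. It assumes that each experimental day contributes one value (for example, the mean of its three technical replicates); the response does not state how replicates were pooled, so that choice is an assumption. The numbers are placeholders, not data from the study, and the function name is hypothetical.

```python
import math
import statistics

# Two-tailed critical t value for alpha = 0.05 with 4 degrees of freedom
# (two groups of three independent experiments: 3 + 3 - 2 = 4).
T_CRIT_DF4 = 2.776


def student_t(a, b):
    """Return (t, df) for an equal-variance two-sample Student's t-test."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * statistics.variance(a)
                  + (n_b - 1) * statistics.variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(b) - statistics.mean(a)) / std_err, df


# Placeholder values, NOT data from the study: one value per experimental day.
untreated = [30.0, 32.0, 34.0]
treated = [42.0, 45.0, 48.0]

t_stat, df = student_t(untreated, treated)
significant = abs(t_stat) > T_CRIT_DF4
print(f"t = {t_stat:.2f}, df = {df}, significant at 0.05: {significant}")
```

      With only four degrees of freedom the critical value is large, so a difference must be sizeable relative to the day-to-day spread to reach significance, which is why overlapping individual points do not by themselves rule out a significant difference in means.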

      (6) The kifunensine impact is somewhat confusing. They report a major change in ADCC, yet similar large changes with trimming only occur once most of the glycan is nearly gone (Figure 2). Kifunensine will tend to generate high mannose and possibly a few hybrid glycans. It is difficult to understand what glycoforms are truly important outside of stating that multi-branched complex-type N-glycans decrease affinity.

      Note that Figure 2 does not evaluate the kifunensine-treated glycan, which is mostly Man8 and Man9 structures. In our previous work, these structures likewise provided increased binding affinity (see PubMed ID 30016589). We believe the most important message is that composition of the N162 glycan (removed with the S164A mutation) regulates NK cell ADCC. On cells, we are not able to modulate N162 glycan composition without affecting potentially every other N-glycan on the surface, so we do not have an ADCC experiment that is directly comparable to Figure 2. Thus, the increased ADCC resulting from kifunensine treatment is consistent with previously observed increases in binding affinity measurements.

      (7) This is outside of the immediate scope, but I feel that the impact would be increased if differences in NK cell (and thus FcgRIIIA) glycosylation are known to occur during disease, inflammation, age, or some other factor - and then to demonstrate those specific changes impact ADCC activity via this mechanism.

      We agree completely. As mentioned in the Introduction, we know that N162 glycan composition varies substantially from donor to donor based on previous work from our lab. Curiously, little variability appeared between donors at the other four N-glycosylation sites. Thus, there is the potential that different NK cell N162 glycan compositions are coincident with different indications. This is an area we are quite interested in pursuing.

    1. eLife assessment

      This paper provides an important assessment of competition dynamics allowing coexistence of the carnivore guild within a large national park in China. Multiple surveying techniques (camera traps and DNA metabarcoding) provide convincing evidence that spatial segregation represents the main strategy of coexistence, while species have a certain degree of temporal and dietary overlap. Altogether, the manuscript provides information critical to the conservation and management agenda of the park.

    1. eLife assessment

      This paper makes fundamental contributions to understanding the mechanisms by which the conserved guidance cue UNC-6/Netrin controls the long-range growth and targeting of axons. Using state-of-the-art genetics and in vivo imaging, the authors provide solid support for the finding that UNC-6/Netrin can act via both chemotaxis and haptotaxis, though additional studies would be necessary to make these findings stronger. The paper's insights will be of interest to a variety of cell and developmental biologists and neuroscientists.

    1. eLife assessment

      This is a valuable study in the Jurkat T cell line that calls attention to the phosphorylation of formin-like 1 β and its role in the polarization of CD63-positive extracellular vesicles (referred to as exosomes). The evidence presented in the Jurkat model is solid, but concerns have been raised about the statistical analysis and more details would be required to fully assess the significance of the results. For example, ANOVA is the method described, but it requires large amounts of normally distributed data in multiple groups and cannot be used to make pairwise comparisons within groups, which would require a post-hoc method (which is not discussed). In addition, the data showing formin-like 1 β in primary human T cells with and without a CAR are provided without quantification and do not investigate any of the novel claims, so they do not address the relevance of formin-like 1 β beyond the Jurkat model. Nonetheless, the consistent trends in the body of the study do provide reliable support for the claims.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer 1 (Public Review):

      Summary:

      The authors propose that the energy landscape of animals can be thought of in the same way as the fundamental versus realized niche concept in ecology. Namely, animals will use a subset of the fundamental energy landscape due to a variety of factors. The authors then show that the realized energy landscape of eagles increases with age as the animals are better able to use the energy landscape.

      Strengths:

      This is a very interesting idea that adds significantly to the energy landscape framework. They provide convincing evidence that the available regions used by birds increase with size.

      Weaknesses:

      Some of the measures used in the manuscript are difficult to follow and there is no mention of the morphometrics of birds or how these change with age (other than that they don’t change which seems odd as surely they grow). Also, there may need to be more discussion of other ontogenetic changes such as foraging strategies, home range size etc.

      We thank reviewer 1 for their interest in our study and for their constructive recommendations. We have included further discussions of these points in the manuscript and outline these changes in our responses to the detailed recommendations below.

      Reviewer 2 (Public Review):

      Summary:

      With this work, the authors tried to expand and integrate the concept of realized niche in the context of movement ecology by using fine-scale GPS data of 55 juvenile Golden eagles in the Alps. Authors found that ontogenic changes influence the percentage of area flyable to the eagles as individuals exploit better geographic uplifts that allow them to reduce the cost of transport.

      Strengths:

      Authors made insightful work linking changes in ontogeny and energy landscapes in large soaring birds. It may not only advance the understanding of how changes in the life cycle affect the exploitability of aerial space but also offer valuable tools for the management and conservation of large soaring species in the changing world.

      Weaknesses:

      Future research may test the applicability of the present work by including more individuals and/or other species from other study areas.

      We are thankful to reviewer 2 for their encouragement and positive assessment of our work. We have addressed their specific recommendations below.

      Recommendations for the authors:

      Reviewer 1 (Recommendations For The Authors):

      I found this to be a very interesting paper which adds some great concepts and ideas to the energy landscape framework. The paper is also concise and well-written. While I am enthusiastic about the paper there are areas that need clarifying or need to be made clearer. Specific comments below:

      Line 64: I disagree that competition is the fundamental driver of the realized niche. In some cases, it may be but in others, predation may be far more important (as an example).

      We agree with this point and have now clarified that competition is an example of a driver of the realized niche. We have also included predation as another example:

      "However, just as animals do not occupy the entirety of their fundamental Hutchinsonian niche in reality [1], for example due to competition or predation risk, various factors can contribute to an animal not having access to the entirety of its fundamental movement niche."

      Intro: I think the authors should emphasize that morphological changes with ontogeny will change the energy landscape for many animals. It may not be the case specifically with eagles but that won’t be true for other animals. For example, in many sharks, buoyancy increases with age.

      We agree and have now clarified that the developmental processes that we are interested in happen in addition to morphological changes:

      "In addition to morphological changes, as young animals progress through their developmental stages, their movement proficiency [2] and cognitive capabilities [3] improve and memory manifests [4]."

      Line 91-93: The idea that birds fine-tune motor performance to take advantage of updrafts is a very important one to the manuscript and should be discussed in a bit more detail. How? At the moment there is a single sentence and it doesn’t even have a citation yet this is the main crux of the changes in realized energy landscape with age. This point should be emphasized because, by the end of the introduction, it is not clear to me why the landscape should be cheaper as the birds age?

      Thank you for pointing out this missing information. We have now added examples to clarify how soaring birds fine-tune their motor performance when soaring. These include for example adopting high bank angles in narrow and weak thermals [5] and reducing gliding airspeed when the next thermal has not been detected [6]:

      "Soaring flight is a learned and acquired behavior [7, 8], requiring advanced cognitive skills to locate uplifts as well as fine-tuned locomotor skills for optimal adjustment of the body and wings to extract the most energy from them, for example by adopting high bank angles in narrow and weak thermals [5] and reducing gliding airspeed when the next thermal has not been detected [6]."

      Results:

      Line 106: explain the basics of the life history of the birds in the introduction. I have no idea what emigration refers to or the life history of these animals.

      Thank you for pointing out the missing background information. We have now added this information to the introduction:

      "We analyzed 46,000 hours of flight data collected from bio-logging devices attached to 55 wild-ranging golden eagles in the Central European Alps. These data covered the transience phase of natal dispersal (hereafter post-emigration). In this population, juveniles typically achieve independence by emigrating from the parental territory within 4-10 months after fledging. However, due to the high density of eagles and consequently the scarcity of available territories, the transience phase between emigration and settling by eventually winning over a territory is exceptionally long at well over 4 years. Our hypothesis posited that the realized energy landscape during this transience phase gradually expands as the birds age."

      What I still am having a hard time understanding is the flyability index. Is this just a measure of the area animals actively select and then the assumption that it’s a good region to fly within?

      We have modified our description of the flyability index for more clarity. In short, we built a step-selection model and made predictions using this model. The predictions estimate the probability of use of an area based on the predictors of the model. For the purpose of our study and what our predictors were (proxies for uplift + movement capacity), we interpreted the predicted values as the "flyability index". We have now clarified this in the methods section:

      "We made the predictions on the scale of the link function and converted them to values between 0 and 1 using the inverse logit function [9]. These predicted values estimated the probability of use of an area for flying based on the model. We interpreted these predicted values as the flyability index, representing the potential energy available in the landscape to support flight, based on the uplift proxies (TRI and distance to ridge line) and the movement capacity (step length) of the birds included in the model."

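      The conversion described in the quoted methods text can be illustrated with a minimal sketch. The response cites the gtools R package for this step; the Python function below is only a stand-in, and the link-scale values are hypothetical numbers chosen for illustration, not model output from the study.

```python
import math


def inverse_logit(eta: float) -> float:
    """Map a link-scale prediction to a value strictly between 0 and 1."""
    # Branch on the sign so that exp() is only ever called on a
    # non-positive argument, which avoids overflow for large |eta|.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


# Hypothetical link-scale predictions for three raster cells.
link_scale = [-2.0, 0.0, 1.5]
flyability = [inverse_logit(eta) for eta in link_scale]
```

      Under this reading, a link-scale prediction of 0 maps to a flyability index of 0.5, and increasingly negative predictions approach 0.
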
      It might also be useful to simply show the changes in the area the animals use with age as well (i.e. a simple utilization distribution). This should increase in age for many animals but would also be a reflection of the resources animals need to acquire as they get older.

      We have now added the figure S2 to the supplementary material. This plot was created by calculating the cumulative area used by the birds in each week after emigration. This was done by extracting the commuting flights for each week, converting these to line objects, overlapping the lines with a raster of 100*100 m cell size, counting the number of overlapping cells and calculating the area that they covered. We did not calculate UDs or MCPs because the eagles seem to be responding to linear features of the landscape, e.g. preferring ridgelines and avoiding valleys. Using polygons to estimate used areas would have made it difficult to ensure that decision-making with regards to these linear features was captured.
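
      The cumulative-area calculation described above might look roughly like the following sketch. It is not the authors' code: the function names and the example tracks are hypothetical, coordinates are assumed to be in metres in a projected coordinate system, and the line-raster overlay is approximated by sampling each segment every 10 m rather than by an exact intersection.

```python
import math

CELL_SIZE_M = 100.0  # raster resolution stated in the response


def cells_touched(track, cell_size=CELL_SIZE_M, step=10.0):
    """Return the set of (col, row) raster cells crossed by a polyline.

    `track` is a list of (x, y) positions in metres. Each segment is
    sampled roughly every `step` metres, so a cell that a segment clips
    for less than `step` metres can be missed.
    """
    cells = set()
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        n = max(1, math.ceil(length / step))
        for i in range(n + 1):
            t = i / n
            x = x0 + t * (x1 - x0)
            y = y0 + t * (y1 - y0)
            cells.add((math.floor(x / cell_size), math.floor(y / cell_size)))
    return cells


def cumulative_area_km2(weekly_tracks, cell_size=CELL_SIZE_M):
    """Area (km^2) of all cells visited up to and including each week."""
    seen = set()
    areas = []
    for tracks in weekly_tracks:
        for track in tracks:
            seen |= cells_touched(track, cell_size)
        areas.append(len(seen) * cell_size ** 2 / 1e6)
    return areas


# Hypothetical data: one commuting flight per week.
weekly_tracks = [
    [[(50.0, 50.0), (450.0, 50.0)]],  # week 1: five cells along y = 50
    [[(50.0, 50.0), (50.0, 250.0)]],  # week 2: two cells not seen before
]
areas = cumulative_area_km2(weekly_tracks)
```

      Counting cells along the flight lines, rather than drawing a polygon around them, keeps the estimate tied to linear features such as ridgelines, which is the motivation given in the response.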

      In a follow-up project, a PhD student in the golden eagle consortium is exploring the individuals’ space use after emigration considering different environmental and social factors. The outcome of that study will further complete our understanding of the post-emigration behavior of juvenile golden eagles in the Alps.

      How much do the birds change in size over the ontogeny measured? This is never discussed.

      Thank you for bringing up this question. The morphometrics of juvenile golden eagles are not significantly different from the adults, except in the size of culmen and claws [10]. Body mass changes after fledging, because of the development of the pectoral muscles as the birds start flying. Golden eagles typically achieve adult-like size and mass within their natal territory before emigration, at which time we started quantifying the changes in energy landscape. Given our focus on post-emigration flight behavior, we do not expect any significant changes in size and body mass during our study period. We now cover this in the discussion:

      "Juvenile golden eagles complete their morphological development before gaining independence from their parents, with their size and wing morphology remaining stable during the post-emigration phase [10, 11]. Consequently, variations in flyability of the landscape for these birds predominantly reflect their improved mastery of soaring flight, rather than changes in their morphology."

      Discussion:

      Line 154: Could the increase in step length also be due to changes in search strategies with age? e.g. from more Brownian motion when scavenging to Levy search patterns when actively hunting?

      This is a very good point and we tried to look for evidence of this transition in the tracking data. We explored the first passage time for two individuals with a radius of 50 km to see if there is a clear transition from a Brownian to a Levy motion. The patterns that emerge are inconclusive and seem to point to seasonality rather than a clear transition in foraging strategy (Author response image 1). We have modified our statement in the discussion about the change in preference of step lengths indicating improved flight ability, to clarify that it is speculative:

      Author response image 1.

      First passage times using a 50 km radius for two randomly selected individuals.

      "Our findings also reveal that as the eagles aged, they adopted longer step lengths, which could indicate an increasing ability to sustain longer uninterrupted flight bouts."

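      For readers unfamiliar with the first-passage-time analysis mentioned in this response, the sketch below shows one simplified way to compute it. It is an assumption-laden illustration, not the authors' implementation: it uses a forward-only definition without interpolation between fixes, whereas published first-passage-time analyses often combine forward and backward crossings, and the track is hypothetical.

```python
import math


def first_passage_times(times, xs, ys, radius):
    """Forward first-passage time for each fix.

    For fix i, return the elapsed time until the first later fix that
    lies more than `radius` from fix i, or None if the track never
    leaves the circle. Without interpolation, the result is limited by
    the sampling interval.
    """
    n = len(times)
    result = []
    for i in range(n):
        fpt = None
        for j in range(i + 1, n):
            if math.hypot(xs[j] - xs[i], ys[j] - ys[i]) > radius:
                fpt = times[j] - times[i]
                break
        result.append(fpt)
    return result


# Hypothetical track: hourly fixes (time in hours, positions in km),
# with a slow phase followed by a fast phase.
times = [0, 1, 2, 3, 4, 5]
xs = [0.0, 10.0, 20.0, 30.0, 90.0, 150.0]
ys = [0.0] * 6
fpt = first_passage_times(times, xs, ys, radius=50.0)
```

      Long first-passage times indicate slow, area-restricted movement and short ones indicate fast directed movement, so a shift in search strategy with age would be expected to appear as a change in these values over time.
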
      Methods:

      Line 229: What is the cutoff for high altitude or high speed?

      We used the Expectation-maximization binary clustering (EMbC) method to identify commuting flights. The EMbC method does not use hard cutoffs to cluster the data. Each data point was assigned to the distribution to which it most likely belonged based on the final probabilities after multiple iterations of the algorithm. Author response image 2 shows the distribution of points that were either used or not used based on the EMbC classification.

      Author response image 2.

      Golden eagle tracking points were either retained (used) or discarded (not used) for further data analysis based on the EMbC algorithm. The points were clustered based on ground speed and height above ground.

      Figure 1: The figure captions should stand on their own but in this case there is no information as to what the tests are actually showing.

      We have now updated the caption to provide information about the model:

      "Coefficient estimates of the step selection function predicting probability of use as a function of uplift proxies, week since emigration, and step length. All variables were z-transformed prior to modeling. The error bars show 95% confidence intervals."
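
      The z-transformation mentioned in the caption puts predictors on a common scale so that their coefficients can be compared directly. A minimal sketch follows; the step lengths are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the response does not state which form was used.

```python
import statistics


def z_transform(values):
    """Centre values on their mean and scale them by their standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return [(v - mean) / sd for v in values]


# Hypothetical step lengths in metres.
step_lengths_m = [120.0, 340.0, 560.0, 780.0]
z = z_transform(step_lengths_m)
```

      After the transformation, a coefficient describes the effect of a one-standard-deviation change in the predictor, whatever its original units.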

      Reviewer 2 (Recommendations For The Authors):

      First, I want to congratulate you on this fantastic work. I enjoyed reading it. The manuscript is clear and well-written, and the findings are sound and relevant to the field of movement ecology. Also, the figures are neatly presented and easy to follow.

      I particularly liked expanding the old concept of fundamental vs realized niche into a movement ecology context. I believe that adds a fresh view into these widely accepted ecological assumptions on species niche, which may help other researchers build upon them to better understand movement "realms" on highly mobile animals in a rapidly changing world.

      I made some minor comments to the manuscript since it was hard to find important weaknesses in it, given the quality of your work. However, there was a point in the discussion that I feel deserves your attention (or rather a reflection) on how major biological events such as moulting could also influence birds to master the flying and exploitation of the energy landscape. You may find my suggestion quite subjective, but I think it may help expand your idea for future works and, what is more, link concepts such as energy landscapes, ontogeny, and important life cycle events such as moulting in large soaring birds. I consider this relevant from a mechanistic perspective to understand better how individuals negotiate all three concepts to thrive and persist in changing environments and to maximise their fitness.

      Once again, congratulations on this excellent piece of research.

      We thank the reviewer for their enthusiasm about our work and for bringing up important points about the biology of the species. Our detailed responses are below.

      MINOR COMMENTS:

      (Note: Line numbers refer to those in the PDF version provided by the journal).

      Line 110: Distinguished (?)

      corrected

      Line 131: Overall, I agree with the authors’ discussion and very much liked how they addressed crucial points. However, I have a point about some missing non-discussed aspects of bird ecology that had not been mentioned.

      The authors argue that morphological traits are less important in explaining birds’ mastery of flight (thus exploiting all available options in the landscape). However, I think the authors are missing some fundamental aspects of bird biology that are known to affect birds’ flying skills, such as moult.

      The moulting process affects species’ flying capacity. Although previous works have not assessed moults’ impact on movement capacity, I think it is worth including the influence of flyability on this ecologically relevant process.

      For instance, golden eagles change their juvenile plumage to intermediate, sub-adult plumage in two or three moult cycles. During this process, the moulting process is incomplete and affects the birds’ aerodynamics, flying capacity, and performance (see Tomotani et al. 2018; Hedenström 2023). Thus, one could expect this process to be somewhat indirectly linked to the extent to which birds can exploit available resources.

      Hedenström, A. (2023). Effects of wing damage and moult gaps on vertebrate flight performance. Journal of Experimental Biology, 226(9), jeb227355.

      Tomotani, B. M., Muijres, F. T., Koelman, J., Casagrande, S., & Visser, M. E. (2018). Simulated moult reduces flight performance, but overlap with breeding does not affect breeding success in a long-distance migrant. Functional Ecology, 32(2), 389-401.

      We thank the reviewer for bringing up this relevant topic. We explored the literature listed by the reviewer and also other sources. We came to the conclusion that moulting does not impact our findings. In our study, we included data for eagles that had emigrated from the natal territories, with their fully grown feathers in juvenile plumage. The moulting schedule in juvenile birds is similar to that of adults: the timing, intensity, and sequence of feathers being replaced is consistent every year (Author response image 3). For these reasons, we do not believe that moulting stage noticeably impacts flight performance at the scale of our study (hourly flights). Fine details of soaring flight performance (aerodynamics within and between thermals) could differ during moulting of different primary and secondary feathers, but this is something that would occur every time the eagle replaces these feathers and we do not expect it to be any different for juveniles. Such fine-scale investigations are outside the scope of this study.

      Author response image 3.

      Moulting schedule of golden eagles [12]

      Lines 181-182: I don’t think trophic transitions rely only on individual flying skill changes. Furthermore, despite its predominant role, scavenging is not necessarily the primary means of food acquisition in golden eagles. This also depends on prey availability, and scavenging is an auxiliary source of easy-to-catch food.

      Scavenging implies detecting carcasses. Should this carcass appearance occur in highly rugged areas, the likelihood of detection also reduces notably. This is not to say that there are not more specialized carrion consumers, such as vultures, that may outcompete eagles in searching for such resources more efficiently.

      In summary, I don’t think such a transition relies only on flying skills but also on other non-discussed factors such as knowledge accumulation of the area or even the presence of conspecifics.

      Line 183: This is precisely what I meant with my earlier comment.

      Thank you for the discussion on the interaction between flight development and foraging strategy. We explored the transition from scavenging to hunting above as a response to Reviewer 1, but did not find a clear transition. This is in line with your comment that the birds probably use both scavenging and hunting methods opportunistically.

      Lines 193-195: I will locate this sentence somewhere in this paragraph. As it is now, it seems a bit out of context. It could be a better fit at the end of the first point in line 203.

      Thank you for pointing out the issue with the flow. We have now added a transitional sentence before this one to improve the paragraph. The beginning of the conclusion now reads as follows, with the new sentence shown in boldface.

      "Spatial maps serve as valuable tools in informing conservation and management strategies by showing the general distribution and movement patterns of animals. These tools are crucial for understanding how animals interact with their environment, including human-made structures. Within this context, energy landscapes play an important role in identifying potential areas of conflict between animals and anthropogenic infrastructures such as wind farms. The predictability of environmental factors that shape the energy landscape has facilitated the development of these conservation tools, which have been extrapolated to animals belonging to the same ecological guild traversing similar environments."

      References

      (1) Colwell, R. K. & Rangel, T. F. Hutchinson’s duality: The once and future niche. Proceedings of the National Academy of Sciences 106, 19651–19658. doi:10.1073/pnas.0901650106 (2009).

      (2) Corbeau, A., Prudor, A., Kato, A. & Weimerskirch, H. Development of flight and foraging behaviour in a juvenile seabird with extreme soaring capacities. Journal of Animal Ecology 89, 20–28. doi:10.1111/1365-2656.13121 (2020).

      (3) Fuster, J. M. Frontal lobe and cognitive development. Journal of neurocytology 31, 373–385. doi:10.1023/A:1024190429920 (2002).

      (4) Ramsaran, A. I., Schlichting, M. L. & Frankland, P. W. The ontogeny of memory persistence and specificity. Developmental Cognitive Neuroscience 36, 100591. doi:10.1016/j.dcn.2018.09.002 (2019).

      (5) Williams, H. J., Duriez, O., Holton, M. D., Dell’Omo, G., Wilson, R. P. & Shepard, E. L. C. Vultures respond to challenges of near-ground thermal soaring by varying bank angle. Journal of Experimental Biology 221, jeb174995. doi:10.1242/jeb.174995 (Dec. 2018).

      (6) Williams, H. J., King, A. J., Duriez, O., Börger, L. & Shepard, E. L. C. Social eavesdropping allows for a more risky gliding strategy by thermal-soaring birds. Journal of The Royal Society Interface 15, 20180578. doi:10.1098/rsif.2018.0578 (2018).

      (7) Harel, R., Horvitz, N. & Nathan, R. Adult vultures outperform juveniles in challenging thermal soaring conditions. Scientific reports 6, 27865. doi:10.1038/srep27865 (2016).

      (8) Ruaux, G., Lumineau, S. & de Margerie, E. The development of flight behaviours in birds. Proceedings of the Royal Society B: Biological Sciences 287, 20200668. doi:10.1098/rspb.2020.0668 (2020).

      (9) Bolker, B., Warnes, G. R. & Lumley, T. Package gtools. R Package "gtools" version 3.9.4 (2022).

      (10) Bortolotti, G. R. Age and sex size variation in Golden Eagles. Journal of Field Ornithology 55, 54–66 (1984).

      (11) Katzner, T. E., Kochert, M. N., Steenhof, K., McIntyre, C. L., Craig, E. H. & Miller, T. A. Birds of the World (eds Rodewald, P. G. & Keeney, B. K.) chap. Golden Eagle (Aquila chrysaetos), version 2.0. doi:10.2173/bow.goleag.02 (Cornell Lab of Ornithology, Ithaca, NY, USA, 2020).

      (12) Bloom, P. H. & Clark, W. S. Molt and sequence of plumages of Golden Eagles and a technique for in-hand ageing. North American Bird Bander 26, 2 (2001).

    2. Reviewer #2 (Public Review):

      Summary:

      With this work, the authors tried to expand and integrate the concept of realized niche in the context of movement ecology by using fine-scale GPS data of 55 juvenile Golden eagles in the Alps. Authors found that ontogenic changes influence the percentage of area flyable to the eagles as individuals exploit better geographic uplifts that allow them to reduce the cost of transport.

      Strengths:

      Authors made insightful work linking changes in ontogeny and energy landscapes in large soaring birds that may not only advance the understanding of how changes in the life cycle affect the exploitability of aerial space but also offer valuable tools for the management and conservation of large soaring species in the changing world.

      Weaknesses:

      Future research may test the applicability of the present work by including more individuals and/or other species from other study areas.

    3. eLife assessment

      This important study substantially advances our understanding of energy landscapes and their link to animal ontogeny. The evidence supporting the conclusions is compelling, with high-throughput telemetry data and advanced track segmentation methods used to develop and map energy landscapes. The work will be of broad interest to animal ecologists.

    4. Reviewer #1 (Public Review):

      Summary:

      The authors propose that the energy landscape of animals can be thought of in the same way as the fundamental versus realized niche concept in ecology. Namely, animals will use a subset of the fundamental energy landscape due to a variety of factors. The authors then show that the realized energy landscape of eagles increases with age as the animals are better able to use the energy landscape.

      Strengths:

      This is a very interesting idea that adds significantly to the energy landscape framework. They provide convincing evidence that the available regions used by birds increase with size.

      Review of revised version:

      The authors have addressed all my comments and concerns. This is a really nice and important manuscript. I have one minor suggestion: Line 74-85: when discussing the effect of ontogeny, the authors give examples of how these may change due to improved cognition and memory. I would recommend they also give examples of how these may change with morphology (e.g. change in relative wing or fin area, buoyancy in sharks, etc.). Most growth in fish, for example, is allometric, so the area of fins relative to body size should also change.

      This is of course up to the authors but it would highlight how their study is applicable to many other systems beyond just birds (even though morphology is of little importance for their eagles).

    1. Author response:

      (1) Clarification and Detailed Explanation in the Methods Section:

      - Regarding Reviewer 1's comments about the unclear explanation of the update process for pseudotime, T, and the selection of important genes/features at bifurcation points in the methods, we will provide a detailed description of the update process for pseudotime T and how high-weight genes important to the bifurcation process are selected.

      - Regarding Reviewer 2's comments concerning the impact of the initial pseudotime prediction method and the insufficient description of various parameters, we will add information about the differences in the initially used pseudotime prediction methods and provide detailed information on the techniques and parameters used in each analysis.

      - Regarding Reviewer 2's comments on the choice of kernel functions, we will explain the rationale for selecting rbf and polynomial kernels and why other options were discarded.

      (2) Performance Comparison and Data Presentation:

      - Regarding Reviewer 1's comments about using a few trajectory plots of the real-world data to visualize the results, we will include 1-2 trajectory plots of real-world datasets in the benchmark analysis to better visualize the results and assess accuracy.

      - Regarding Reviewer 2's comments concerning the lack of comparison results and discussion related to trajectory prediction methods based on deep learning, we will include a comparison with deep learning methods such as scTour and Tigon in the revision. Additionally, we will discuss the latest deep learning methods for bifurcation analysis and alternative trajectory inference methods such as CellRank.

      - Regarding Reviewer 2's comments on the impact of MURP, we will include an analysis on whether the number of MURPs affects the performance of the method and compare it with the random subsampling approach.

      (3) Article Calibration and Refinement:

      - Regarding Reviewer 2's comments on the discussion section, we will simplify the first three paragraphs to succinctly convey the background and implications of our contributions. Additionally, we will explain why HVG is considered as the entire feature space in our comparisons and analyses.

      - Regarding Reviewer 2's comments concerning the regulons in the microglia analysis, we will review the correct explanations and revise the article accordingly.

      - In response to the issues raised by both reviewers regarding grammatical errors, spelling mistakes, and inconsistencies between text and figures, we will review and correct any errors in the article. This includes providing explanations for all abbreviations upon their first appearance, ensuring the accuracy of text and figure descriptions, correcting equation numbering, improving image quality, and revising descriptions such as "the current manifold learning methods face two major challenges."

      (4) Enhancing Descriptions and Readability:

      - Regarding Reviewer 1's comments about the synthetic data, we will add a brief description in the main text on how synthetic data were generated.

      - Regarding Reviewer 1's comments on the survival analysis, we will provide a more detailed description of the computational steps and clarify whether key confounding factors such as age, clinical stage, and tumor purity were controlled.

      - Regarding Reviewer 2's comments on evaluation metrics, we will add detailed descriptions of the evaluation metrics and provide intuitive explanations of how different methods perform across various metrics in the comparison results.

      - Regarding Reviewer 2's comments on CD8+ T cells, we plan to compare MGPfact with Monocle3, in addition to Monocle2. This will help clarify the added value of MGPfact and provide a more comprehensive evaluation of its performance.

      - Regarding Reviewer 2's comments about consensus trajectories, we will add detailed descriptions of the process of generating consensus trajectories.

      - Regarding Reviewer 2's comments on regulons, we will include additional information on the process of downstream trajectory analysis and clarify the roles of SCENIC, GENIE3, RCisTarget, and AUCell in the bifurcation analysis.

    2. eLife assessment

      This manuscript describes a novel computational method to investigate cell evolutionary trajectory for scRNA-seq samples. This is an important tool for estimating pseudotime in the evolutionary path through modelling the bifurcations in a Gaussian process. While the evaluation of the method is extensive and compelling, the reviewers suggested further analyses to ensure that the method is indeed robust. When these issues are addressed, this will be of substantive value to biologists interested in scRNA-seq bioinformatic methods.

    3. Reviewer #1 (Public Review):

      Summary:

      Ren et al developed a novel computational method to investigate cell evolutionary trajectory for scRNA-seq samples. This method, MGPfact, estimates pseudotime and potential branches in the evolutionary path by explicitly modeling the bifurcations in a Gaussian process. They benchmarked this method using synthetic as well as real-world samples and showed superior performance for some of the tasks in cell trajectory analysis. They further demonstrated the utilities of MGPfact using single-cell RNA-seq samples derived from microglia or T cells and showed that it can accurately identify the differentiation timepoint and uncover biologically relevant gene signatures.

      Strengths:

      Overall I think this is a useful new tool that could deliver novel insights for the large body of scRNA-seq data generated in the public domain. The manuscript is written in a logical way and most parts of the method are well described.

      Weaknesses:

      Some parts of the methods are not clear.

      The Methods should outline in detail how pseudotime T is updated. This is currently unclear in both the description and Algorithm 1.

      There should be a brief description in the main text of how synthetic data were generated, under what hypothesis, and specifically how bifurcation is embedded in the simulation.

      Please explain what the abbreviations mean at their first occurrence.

      In the benchmark analysis (Figures 2/3), it would be helpful to include a few trajectory plots of the real-world data to visualize the results and to evaluate the accuracy.

      It is not clear how this method selects important genes/features at bifurcation. This should be elaborated on in the main text.

      It is not clear how survival analysis was performed in Figure 5. Specifically, were critical confounders, such as age, clinical stage, and tumor purity controlled?

      I recommend that the authors perform some sort of 'robustness' analysis for the consensus tree built from the bifurcation Gaussian process. For example, subsample 80% of the cells to see if the bifurcations are similar between each bootstrap.
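
      The subsampling check suggested above could be sketched as follows. This is only an illustrative sketch, not MGPfact's procedure: `toy_branch_labels` is a hypothetical stand-in for whatever trajectory method assigns each cell to a branch, and the Rand index over shared cells is an assumed choice of similarity measure between the full-data result and each 80% subsample.

```python
import random
from itertools import combinations


def toy_branch_labels(cells):
    """Hypothetical stand-in for a trajectory method: one branch label per cell."""
    return {cell_id: int(value > 0.5) for cell_id, value in cells.items()}


def pair_agreement(labels_a, labels_b):
    """Rand index over the cells present in both labelings.

    A pair of cells counts as agreeing when both runs put the two cells
    on the same branch, or both runs put them on different branches.
    """
    shared = sorted(set(labels_a) & set(labels_b))
    pairs = list(combinations(shared, 2))
    if not pairs:
        return float("nan")
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def subsample_robustness(cells, infer, fraction=0.8, n_rounds=20, seed=0):
    """Re-run `infer` on random subsamples and score each against the full run."""
    rng = random.Random(seed)
    full_labels = infer(cells)
    ids = sorted(cells)
    k = int(round(fraction * len(ids)))
    scores = []
    for _ in range(n_rounds):
        subset_ids = rng.sample(ids, k)  # subsample without replacement
        sub_labels = infer({i: cells[i] for i in subset_ids})
        scores.append(pair_agreement(full_labels, sub_labels))
    return scores
```

      Calling `subsample_robustness(cells, toy_branch_labels)` returns one agreement score per round; scores consistently near 1 would suggest that the inferred bifurcations are stable under subsampling, whereas widely varying scores would point to a fragile consensus tree.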

    4. Reviewer #2 (Public Review):

      Summary of the manuscript:

      The authors present MGPfactXMBD, a novel model-based manifold-learning framework designed to address the challenges of interpreting complex cellular state spaces from single-cell RNA sequences. To overcome current limitations, MGPfactXMBD factorizes complex development trajectories into independent bifurcation processes of gene sets, enabling trajectory inference based on relevant features. As a result, it is expected that the method provides a deeper understanding of the biological processes underlying cellular trajectories and their potential determinants.

      MGPfactXMBD was tested across 239 datasets, and its performance in key quality-control metrics was similar or slightly superior to that of state-of-the-art methods. When applied to case studies, MGPfactXMBD successfully identified critical pathways and cell types in microglia development, validating experimentally identified regulons and markers. Additionally, it uncovered evolutionary trajectories of tumor-associated CD8+ T cells, revealing new subtypes with gene expression signatures that predict responses to immune checkpoint inhibitors in independent cohorts.

      Overall, MGPfactXMBD represents a relevant tool in manifold learning for scRNA-seq data, enabling feature selection for specific biological processes and enhancing our understanding of the biological determinants of cell fate.

      Summary of the outcome:

      The novel method addresses core state-of-the-art questions in biology related to trajectory identification. The design and the case studies are of relevance.

      However, in my opinion, the manuscript requires several clarifications and updates.

      Also, it remains unclear how the method compares with existing deep-learning-based approaches such as TIGON. If a comparison is possible, it should be conducted; if not, the authors should clarify why.

      Strengths:

      (1) Relevant methodology for a current field of research.

      (2) Relevant case studies with relevant outcomes.

      Weaknesses:

      (1) In general, the manuscript may be improved by making the text more accessible to the Journal's audience: (i) intuitive explanation of some concepts; (ii) review the flow of some explanations.

      (2) Additionally, several parts require more details on how the methods work, especially the case studies.

      (3) Finally, there are missing references to published work and possibly some additional comparisons to make.

    1. Reviewer #1 (Public Review):

      Summary:

      The authors report an inability to reproduce a transgenerational memory of avoidance of the pathogen PA14 in C. elegans. Instead, the authors demonstrate intergenerational inheritance for a single F1 generation, in embryos of mothers exposed to OP50 and PA14, where embryos isolated from these mothers by bleaching are capable of remembering to avoid PA14 in a manner that is dependent on systemic RNAi proteins sid-1 and sid-2. This could reflect systemic sRNAs generated by neuronal daf-7 signaling that are transmitted to F1 embryos. The authors note that transgenerational memory of PA14 was reported by the Murphy group at Princeton, but that environmental or strain variation (worms or bacteria) might explain the single generation of inheritance observed at Harvard. The Hunter group tried different bacterial growth conditions and different worm growth temperatures for independent PA14 strains, which they showed to be strongly pathogenic. However, the authors could not reproduce a transgenerational effect at Harvard. This important data will allow members of the scientific community to focus on the robust and reproducible inheritance of PA14 avoidance transmitted to F1 embryos of mothers exposed to PA14, which the authors demonstrate depends on small RNAs in a manner that is downstream of or in parallel to daf-7. This paper honestly and importantly alters expectations and questions the model that avoidance of PA14 is mediated by a bacterial ncRNA whose siRNAs target a C. elegans gene. Instead, endogenous C. elegans sRNAs that affect pathogen response may be the culprit that explains sRNA-mediated avoidance.

      Overall, this is an important paper that demonstrates that one model for transgenerational inheritance in C. elegans is not reproducible. This is important because it is not clear how many of the reported models of transgenerational inheritance reported in C. elegans are reproducible. The authors do demonstrate a memory for F1 embryos that could be a maternal effect, and the authors confirm that this is mediated by a systemic small RNA response. There are several points in the manuscript where a more positive tone might be helpful.

      Strengths:

      The authors note that the high copy number daf-7::GFP transgene used by the Murphy group displayed variable expression and evidence for somatic silencing or transgene breakdown in the Hunter lab, as confirmed by the Murphy group. The authors nicely use single copy daf-7::GFP to show that neuronal daf-7::GFP is elevated in F1 but not F2 progeny with regards to the memory of PA14 avoidance, speaking to an intergenerational phenotype.

      The authors nicely confirm that sid-1 and sid-2 are generally required for intergenerational avoidance of F1 embryos of moms exposed to PA14. However, these small RNA proteins did not affect daf-7::GFP elevation in the F1 progeny. This result is unexpected given previous reports that single copy daf-7::GFP is not elevated in F1 progeny of sid mutants. Because the Murphy group reported that daf-7 mutation abolishes avoidance for F1 progeny, this means that the sid genes function downstream of daf-7 or in parallel, rather than upstream as previously suggested.

      The authors studied antisense small RNAs that change in Murphy data sets, identifying 116 mRNAs that might be regulated by sRNAs in response to PA14. Importantly, the authors show that the maco-1 gene, putatively targeted by piRNAs according to the Kaletsky 2020 paper, displays few siRNAs that change in response to PA14. The authors conclude that the P11 ncRNA of PA14, which was proposed to promote interkingdom RNA communication by the Murphy group, is unlikely to affect maco-1 expression by generating sRNAs that target maco-1 in C. elegans. The authors define 8 genes based on their analysis of sRNAs and mRNAs that might promote resistance to PA14, but they do not further characterize these genes' role in pathogen avoidance. The Murphy group might wish to consider following up on these genes and their possible relationship with P11.

      Weaknesses:

      This very thorough and interesting manuscript is at times pugnacious.

      Please explain more clearly what is High Growth media for E. coli in the text and methods, conveying why it was used by the Murphy lab, and if Normal Growth or High Growth is better for intergenerational heritability assays.

    2. eLife assessment

      This important study reports numerous attempts to replicate reports on transgenerational inheritance of a learned behavior, pathogen avoidance, in C. elegans. While the authors observe parental effects that are limited to a single generation (also called intergenerational inheritance), the authors failed to find any evidence for transmission over multiple generations, or transgenerational inheritance. The experiments presented are meticulously described, making for compelling evidence that in the authors' hands transgenerational inheritance cannot be observed, although there remains the possibility that subtle differences in culture conditions or lab environment explain the failure to reproduce previous observations. Given the prominence of the original reports of transgenerational inheritance, the present study is of broad interest to anyone studying genetics, epigenetics, or learned behavior.

    3. Reviewer #2 (Public Review):

      This paper examines the reproducibility of results reported by the Murphy lab regarding transgenerational inheritance of a learned avoidance behavior in C. elegans. It has been well established by multiple labs that worms can learn to avoid the pathogen Pseudomonas aeruginosa (PA14) after a single exposure. The Murphy lab has reported that learned avoidance is transmittable to 4 generations and dependent on a small RNA expressed by PA14 that elicits the transgenerational silencing of a gene in C. elegans. The Hunter lab now reports that although they can reproduce inheritance of the learned behavior by the first generation (F1), they cannot reproduce inheritance in subsequent generations.

      This is an important study that will be useful for the community. Although they fail to identify a "smoking gun", the study examines several possible sources for the discrepancy, and their findings will be useful to others interested in using these assays. The preference assay appears to work in their hands inasmuch as they are able to detect the learned behavior in the P0 and F1 generations, suggesting that the failure to reproduce the transgenerational effect is not due to trivial mistakes in the protocol. An obvious reason, however, to account for the differing results is that the culture conditions used by the authors are not permissive for the expression of the small RNA by PA14 that the Murphy lab identified as required for transgenerational inheritance. It would seem prudent for the authors to determine whether this small RNA is present in their cultures, or at least acknowledge this possibility. The authors should also note that their protocol was significantly different from the Murphy protocol (see comments below) and therefore it remains possible that protocol differences cumulatively account for the different results.

    4. Reviewer #3 (Public Review):

      Summary:

      It has been previously reported in many high-profile papers that C. elegans can learn to avoid pathogens. Moreover, this learned pathogen avoidance can be passed on to future generations - up to the F5 generation in some reports. In this paper, Gainey et al. set out to replicate these findings. They successfully replicated pathogen avoidance in the exposed animals, as well as a strong increase in daf-7 expression in ASI neurons in F1 animals, as determined by a daf-7::GFP reporter construct. However, they failed to see strong evidence for pathogen avoidance or daf-7 overexpression in the F2 generation. The failure of replication is the major focus of this work.

      Given their failure to replicate these findings, the authors embark on a thorough test of various experimental confounders that may have impacted their results. They also re-analyze the small RNA sequencing and mRNA sequencing data from one of the previously published papers and draw some new conclusions, extending this analysis.

      Strengths:

      (1) The authors provide a thorough description of their methods, and a marked-up version of a published protocol that describes how they adapted the protocol to their lab conditions. It should be easy to replicate the experiments.

      (2) The authors test the source of bacteria, growth temperature (of both C. elegans and bacteria), and light/dark husbandry conditions. They also supply all their raw data, so that the sample size for each testing plate can be easily seen (in the supplementary data). None of these variations appears to have a measurable effect on pathogen avoidance in the F2 generation, with all but one of the experiments failing to exhibit learned pathogen avoidance.

      (3) The small RNA seq and mRNA seq analysis is well performed and extends the results shown in the original paper. The original paper did not give many details of the small RNA analysis, which was an oversight. Although not a major focus of this paper, it is a worthwhile extension of the previous work.

      (4) It is rare that negative results such as these are accessible. Although the authors were unable to determine the reason that their results differ from those previously published, it is important to document these attempts in detail, as has been done here. Behavioral assays are notoriously difficult to perform and public discourse around these attempts may give clarity to the difficulties faced by a controversial field.

      Weaknesses:

      (1) Although the "standard" conditions have been tested over multiple biological replicates, many of the potential confounders that may have altered the results have been tested only once or twice. For example, changing the incubation temperature to 25°C was tested in only two biological replicates (Exp 5.1 and 5.2) - and one of these experiments actually resulted in apparent pathogen avoidance inheritance in the F2 generation (but not in the F1). An alternative pathogen source was tested in only one biological replicate (Exp 3). Given the variability observed in the F2 generation, increasing biological replicates would have added to the strengths of the report.

      (2) A key difference between the methods used here and those published previously is an increase in the age of the animals used for training - from mostly L4 to mostly young adults. I was unable to find a clear example of an experiment in which these two conditions were compared, although the authors state that it made no difference to their results.

      (3) The original paper reports a transgenerational avoidance effect up to the F5 generation. Although in this work the authors failed to see avoidance in the F2 generation, it would have been prudent to extend their tests for more generations in at least a couple of their experiments to ensure that the F2 generation was not an aberration (although this reviewer acknowledges that this seems unlikely to be the case).

    5. Author response:

      eLife assessment

      This important study reports numerous attempts to replicate reports on transgenerational inheritance of a learned behavior, pathogen avoidance, in C. elegans. While the authors observe parental effects that are limited to a single generation (also called intergenerational inheritance), the authors failed to find any evidence for transmission over multiple generations, or transgenerational inheritance. The experiments presented are meticulously described, making for compelling evidence that in the authors' hands transgenerational inheritance cannot be observed, although there remains the possibility that subtle differences in culture conditions or lab environment explain the failure to reproduce previous observations. Given the prominence of the original reports of transgenerational inheritance, the present study is of broad interest to anyone studying genetics, epigenetics, or learned behavior.

      Thank you for your considered reviews and advice on how to improve our manuscript. We appreciate that the editors and reviewers felt that our manuscript addressed an important issue and acknowledged the difficulty of publishing negative results. We will revise the manuscript and consider all the concerns raised by the editor and referees.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors report an inability to reproduce a transgenerational memory of avoidance of the pathogen PA14 in C. elegans. Instead, the authors demonstrate intergenerational inheritance for a single F1 generation, in embryos of mothers exposed to OP50 and PA14, where embryos isolated from these mothers by bleaching are capable of remembering to avoid PA14 in a manner that is dependent on systemic RNAi proteins sid-1 and sid-2. This could reflect systemic sRNAs generated by neuronal daf-7 signaling that are transmitted to F1 embryos. The authors note that transgenerational memory of PA14 was reported by the Murphy group at Princeton, but that environmental or strain variation (worms or bacteria) might explain the single generation of inheritance observed at Harvard. The Hunter group tried different bacterial growth conditions and different worm growth temperatures for independent PA14 strains, which they showed to be strongly pathogenic. However, the authors could not reproduce a transgenerational effect at Harvard. This important data will allow members of the scientific community to focus on the robust and reproducible inheritance of PA14 avoidance transmitted to F1 embryos of mothers exposed to PA14, which the authors demonstrate depends on small RNAs in a manner that is downstream of or in parallel to daf-7. This paper honestly and importantly alters expectations and questions the model that avoidance of PA14 is mediated by a bacterial ncRNA whose siRNAs target a C. elegans gene. Instead, endogenous C. elegans sRNAs that affect pathogen response may be the culprit that explains sRNA-mediated avoidance.

      Overall, this is an important paper that demonstrates that one model for transgenerational inheritance in C. elegans is not reproducible. This is important because it is not clear how many of the reported models of transgenerational inheritance reported in C. elegans are reproducible. The authors do demonstrate a memory for F1 embryos that could be a maternal effect, and the authors confirm that this is mediated by a systemic small RNA response. There are several points in the manuscript where a more positive tone might be helpful.

      We would like to correct the statement made in the second to last sentence. The demonstration of an F1 response to PA14 was first reported by Moore et al., (2019) and then by Pereira et al., (2020) using a different behavioral assay. We merely confirmed these results in our hands, and confirmed the observation, first reported by Kaletsky et al., (2020), that sid-1 and sid-2 are required for this F1 response; although we did find that sid-1 and sid-2 are not required for the PA14-induced increase in daf-7p::gfp expression in ASI neurons in the F1 progeny of trained adults, which had not been addressed in the published work.

      Yes, the intergenerational F1 response could be a maternal effect, but the in utero F1 embryos and their precursor germ cells were directly exposed to PA14 metabolites and toxins (non-maternal effect) as well as any parental response, whether mediated by small RNAs, prions, hormones, or other unknown information carriers. While the F1 aversion response does require sid-1 and sid-2, we would not presume that the substrate is therefore an RNA molecule, particularly because the systemic RNAi response supported by sid-1 and sid-2 is via long double-stranded RNA. To date, no evidence suggests that either protein transports small RNAs, particularly single-stranded RNAs.

      Strengths:

      The authors note that the high copy number daf-7::GFP transgene used by the Murphy group displayed variable expression and evidence for somatic silencing or transgene breakdown in the Hunter lab, as confirmed by the Murphy group. The authors nicely use single copy daf-7::GFP to show that neuronal daf-7::GFP is elevated in F1 but not F2 progeny with regards to the memory of PA14 avoidance, speaking to an intergenerational phenotype.

      The authors nicely confirm that sid-1 and sid-2 are generally required for intergenerational avoidance of F1 embryos of moms exposed to PA14. However, these small RNA proteins did not affect daf-7::GFP elevation in the F1 progeny. This result is unexpected given previous reports that single copy daf-7::GFP is not elevated in F1 progeny of sid mutants. Because the Murphy group reported that daf-7 mutation abolishes avoidance for F1 progeny, this means that the sid genes function downstream of daf-7 or in parallel, rather than upstream as previously suggested.

      The authors studied antisense small RNAs that change in Murphy data sets, identifying 116 mRNAs that might be regulated by sRNAs in response to PA14. Importantly, the authors show that the maco-1 gene, putatively targeted by piRNAs according to the Kaletsky 2020 paper, displays few siRNAs that change in response to PA14. The authors conclude that the P11 ncRNA of PA14, which was proposed to promote interkingdom RNA communication by the Murphy group, is unlikely to affect maco-1 expression by generating sRNAs that target maco-1 in C. elegans. The authors define 8 genes based on their analysis of sRNAs and mRNAs that might promote resistance to PA14, but they do not further characterize these genes' role in pathogen avoidance. The Murphy group might wish to consider following up on these genes and their possible relationship with P11.

      Weaknesses:

      This very thorough and interesting manuscript is at times pugnacious.

      We reiterate that we never claimed that Moore et al., (2019) did not obtain their reported results. We simply stated that we could not replicate their results using the published methods and then failed in our search to identify variable(s) that might account for our results. We will do better when revising the manuscript to make clear, unmuddied statements of facts and state that future investigations may provide independent evidence that supports the original claims and explains our divergent results.

      Please explain more clearly what is High Growth media for E. coli in the text and methods, conveying why it was used by the Murphy lab, and if Normal Growth or High Growth is better for intergenerational heritability assays.

      We used the standard recipes as described in Moore et al., (2021), and will add the recipes and some of the relevant commentary from the paragraphs below to the methods and text as appropriate.

      Normal Growth (NG) media minimally supports OP50 growth, resulting in a thin lawn that minimally obscures viewing larvae and embryos. High Growth (HG) media contains 8X more peptone, which supports much higher OP50 growth, resulting in a thick bacterial lawn that supports larger worm populations. The thicker bacterial lawn can also compromise agar integrity, and the higher worm density encourages worm burrowing behavior, thus the HG plates also have 75% more agar to inhibit worm burrowing. 

      Our results (Figure 4) show that worms grown on OP50 seeded NG or HG plates show different choice responses (PA14 vs OP50). As for experimental “advice”, we would caution our colleagues to not assume that OP50 is a neutral food and to be aware that how you grow and store OP50 (or any bacterial culture that is to be used as food for worms) may have a significant effect on the phenotype you are studying. 

      Reviewer #2 (Public Review):

      This paper examines the reproducibility of results reported by the Murphy lab regarding transgenerational inheritance of a learned avoidance behavior in C. elegans. It has been well established by multiple labs that worms can learn to avoid the pathogen Pseudomonas aeruginosa (PA14) after a single exposure. The Murphy lab has reported that learned avoidance is transmittable to 4 generations and dependent on a small RNA expressed by PA14 that elicits the transgenerational silencing of a gene in C. elegans. The Hunter lab now reports that although they can reproduce inheritance of the learned behavior by the first generation (F1), they cannot reproduce inheritance in subsequent generations.

      This is an important study that will be useful for the community. Although they fail to identify a "smoking gun", the study examines several possible sources for the discrepancy, and their findings will be useful to others interested in using these assays. The preference assay appears to work in their hands inasmuch as they are able to detect the learned behavior in the P0 and F1 generations, suggesting that the failure to reproduce the transgenerational effect is not due to trivial mistakes in the protocol. An obvious reason, however, to account for the differing results is that the culture conditions used by the authors are not permissive for the expression of the small RNA by PA14 that the Murphy lab identified as required for transgenerational inheritance. It would seem prudent for the authors to determine whether this small RNA is present in their cultures, or at least acknowledge this possibility.

      We note that Kaletsky et al., (2020) (Figure 3L) showed that PA14 ΔP11 bacteria failed to induce an F1 avoidance response. Thus, the fact that we observed F1 avoidance implies that our culture conditions successfully induced P11 expression. We believe that this addresses the concern raised here. We thank the reviewer for raising this issue and we will add a statement to this effect in the revised manuscript.

      The authors should also note that their protocol was significantly different from the Murphy protocol (see comments below) and therefore it remains possible that protocol differences cumulatively account for the different results.

      We disagree. Our adjustments to the core protocol were minor and, where possible, were explicitly tested in side-by-side experiments. To discover the source(s) of discrepancy between our results and the published results we subsequently introduced variations to this core protocol to exclude likely variables (worm and bacteria growth temperatures, assay conditions, worm handling methods, bacterial culture and storage conditions, and some minor developmental timing issues). To substantiate these assertions, we will, upon revision, add the precise protocol we followed for the aversion assay to the supplemental documents, provide some additional experimental results supporting these claims, and further clarify which presented experiments included protocol variations (e.g. sodium azide or cold immobilization). It remains possible that we misunderstood the published protocol, but we were highly motivated to replicate the results and read every published version with extreme care.

      Reviewer #3 (Public Review):

      Summary:

      It has been previously reported in many high-profile papers that C. elegans can learn to avoid pathogens. Moreover, this learned pathogen avoidance can be passed on to future generations - up to the F5 generation in some reports. In this paper, Gainey et al. set out to replicate these findings. They successfully replicated pathogen avoidance in the exposed animals, as well as a strong increase in daf-7 expression in ASI neurons in F1 animals, as determined by a daf-7::GFP reporter construct. However, they failed to see strong evidence for pathogen avoidance or daf-7 overexpression in the F2 generation. The failure of replication is the major focus of this work.

      Given their failure to replicate these findings, the authors embark on a thorough test of various experimental confounders that may have impacted their results. They also re-analyze the small RNA sequencing and mRNA sequencing data from one of the previously published papers and draw some new conclusions, extending this analysis.

      Strengths:

      (1) The authors provide a thorough description of their methods, and a marked-up version of a published protocol that describes how they adapted the protocol to their lab conditions. It should be easy to replicate the experiments.

      (2) The authors test the source of bacteria, growth temperature (of both C. elegans and bacteria), and light/dark husbandry conditions. They also supply all their raw data, so that the sample size for each testing plate can be easily seen (in the supplementary data). None of these variations appears to have a measurable effect on pathogen avoidance in the F2 generation, with all but one of the experiments failing to exhibit learned pathogen avoidance.

      (3) The small RNA seq and mRNA seq analysis is well performed and extends the results shown in the original paper. The original paper did not give many details of the small RNA analysis, which was an oversight. Although not a major focus of this paper, it is a worthwhile extension of the previous work.

      (4) It is rare that negative results such as these are accessible. Although the authors were unable to determine the reason that their results differ from those previously published, it is important to document these attempts in detail, as has been done here. Behavioral assays are notoriously difficult to perform and public discourse around these attempts may give clarity to the difficulties faced by a controversial field.

      Thank you for your support. Choosing to pursue publication of these negative results was not an easy decision, and we thank members of the community for their support and encouragement.

      Weaknesses:

      (1) Although the "standard" conditions have been tested over multiple biological replicates, many of the potential confounders that may have altered the results have been tested only once or twice. For example, changing the incubation temperature to 25°C was tested in only two biological replicates (Exp 5.1 and 5.2) - and one of these experiments actually resulted in apparent pathogen avoidance inheritance in the F2 generation (but not in the F1). An alternative pathogen source was tested in only one biological replicate (Exp 3). Given the variability observed in the F2 generation, increasing biological replicates would have added to the strengths of the report.

      We agree that our study was not exhaustive in our exploration of variables that might be interfering with our ability to detect F2 avoidance. We also note that some of these variables also failed (with many more independent experiments) to induce elevated daf-7p::gfp expression in ASI neurons in F2 progeny. Our goal was not to show that variation in some growth or assay condition would generate reproducible negative results; rather, the exploration was designed to tweak conditions to enable detection of a robust F2 response. Given the strength of the data presented in Moore et al., (2019), we expected that adjustment of the problematic variable would produce positive results apparent in a single replicate, which could then be followed up. If we had succeeded, then we would have documented the conditions that enabled robust F2 inheritance and would have explored molecular mechanisms that support this important but mysterious process.

      (2) A key difference between the methods used here and those published previously, is an increase in the age of the animals used for training - from mostly L4 to mostly young adults. I was unable to find a clear example of an experiment when these two conditions were compared, although the authors state that it made no difference to their results.

      We can state firmly that the apparent time delay did not affect P0 learned avoidance or, as documented in Table S1, daf-7p::gfp expression in ASI neurons. In our experience, training mostly L4’s on PA14 frequently failed to produce sufficient F1 embryos for both F1 avoidance assays or daf-7p::gfp measurements in ASI neurons and collection of F2 progeny. Indeed, in early attempts to detect heritable PA14 aversion, trained P0 and F1 progeny were not assayed in order to obtain sufficient F2’s for a choice assay. These animals failed to display aversion, but without evidence of successful P0 training or an F1 intergenerational response this was deemed a non-fruitful trouble-shooting approach. We will add to our supplemental figures P0 choice results from experiments using younger trained animals that failed to produce sufficient F1’s to continue the inheritance experiments. 

      The different timing between the two protocols may reflect the age of the recovered bleached P0 embryos. It is reasonable to assume that bleaching day 1 adults vs day 2 adults from the P-1 population could shift the average age of recovered P0 embryos by several hours. The Murphy protocol only states that P0 embryos were obtained by bleaching healthy adults. Regardless, if the hypothesis entertained here is true, that a several hour difference in larval/adult age during 24 hours of training affects F2 inheritance of learned aversion but does not affect P0 learned avoidance, then we would argue that this paradigm for heritable learned avoidance, as described in Moore et al, (2019, 2021), is not sufficiently robust for mechanistic investigations. 

      (3) The original paper reports a transgenerational avoidance effect up to the F5 generation. Although in this work the authors failed to see avoidance in the F2 generation, it would have been prudent to extend their tests for more generations in at least a couple of their experiments to ensure that the F2 generation was not an aberration (although this reviewer acknowledges that this seems unlikely to be the case).

      Citations

      Moore, R.S., Kaletsky, R., and Murphy, C.T. (2019). Piwi/PRG-1 Argonaute and TGF-beta Mediate Transgenerational Learned Pathogenic Avoidance. Cell 177, 1827-1841 e1812.

      Pereira, A.G., Gracida, X., Kagias, K., and Zhang, Y. (2020). C. elegans aversive olfactory learning generates diverse intergenerational effects. J Neurogenet 34, 378-388.

      Kaletsky, R., Moore, R.S., Vrla, G.D., Parsons, L.R., Gitai, Z., and Murphy, C.T. (2020). C. elegans interprets bacterial non-coding RNAs to learn pathogenic avoidance. Nature 586, 445-451.

      Moore, R.S., Kaletsky, R., and Murphy, C.T. (2021). Protocol for transgenerational learned pathogen avoidance behavior assays in Caenorhabditis elegans. STAR Protoc 2, 100384.

    1. eLife assessment

      The intrinsic chirality of actin filaments (F-actin) is implicated in the chiral arrangement and movement of cellular structures, but it was unknown how opposite chiralities can arise when the chirality of F-actin is invariant. Kwong et al. present evidence that two actin filament-based cytoskeletal structures, transverse actin arcs and radial stress fibers, drive clockwise and anti-clockwise rotation, respectively. This fundamental work, which has broad implications for cell biology, is supported by compelling data.

    2. Reviewer #3 (Public Review):

      Summary:

      In this work, the authors plate different types of cells on circular micropatterns and question how the organization and dynamics of the actin cytoskeleton correlate with particular actin chiral properties and the rotational direction of the nucleus. They observe that cell spreading on large patterns correlates with the emergence of anti-clockwise (ACW) rotations, while spreading on small patterns leads preferentially to clockwise (CW) rotations. ACW rotations originate, as previously demonstrated, from the polymerization of radial fibers, while CW rotations are observed when radial fibers are disorganized or absent and transverse arcs take over to power them. These data are supported by a large number of observations, and the use of multiple drugs leads to observations that are consistent with the proposed model.

      Strengths:

      This is a beautiful work in which the authors rely on a large number of high-quality microscopic observations and use a full arsenal of drugs to test their model as thoroughly as possible.

      This study examines the influence of multiple actin networks. This is a challenging task in that the assembly and dynamics of different actin networks are interdependent, making it difficult to unambiguously analyze the importance of any specific network.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Preliminary note from the Reviewing Editor:

      The evaluations of the two Reviewers are provided for your information. As you can see, their opinions are very different.

      Reviewer #1 is very harsh in his/her evaluation. Clearly, we don't expect you to be able to affect one type of actin network without affecting the other, but rather to change the balance between the two. However, he/she also raises some valid points, in particular that more rationale should be added for the perturbations (also mentioned by Reviewer #2). Both Reviewers have also excellent suggestions for improving the presentation of the data.

      We sincerely appreciate your and the reviewers’ suggestions. The manuscript has been amended accordingly.

      On another point, I was surprised when reading your manuscript that a molecular description of chirality change in cells is presented as a completely new one. Alexander Bershadsky's group has identified several factors (including alpha-actinin) as important regulators of the direction of chirality. The articles are cited, but these important results are not specifically mentioned. Highlighting them would not call into question the importance of your work, but might even provide additional arguments for your model.

      We appreciate the editor’s comment. Alexander Bershadsky's group has done marvelous work in cell chirality. They introduced the stair-stepping and screw theory, which suggested how radial fiber polymerization generates ACW force and drives the actin cytoskeleton into the ACW pattern. Moreover, they have identified chiral regulators like alpha-actinin 1, mDia1, capZB, and profilin 1, which can reverse or neutralize the chiral expression.

      It is worth noting that Bershadsky's group primarily focuses on radial fibers. In our manuscript, instead, we primarily focused on the contractile unit in the transverse arcs and CW chirality in our investigation. Our manuscript incorporates our findings in the transverse arcs and the radial fibers theory by Bershadsky's group into the chirality balance hypothesis, providing a more comprehensive understanding of the chirality expression.

      We have included relevant articles from Alexander Bershadsky's group, and we agree that highlighting these important results on chiral regulators would further strengthen our manuscript. The manuscript was revised as follows:

      “ACW chirality can be explained by the right-handed axial spinning of radial fibers during polymerization, i.e. the ‘stair-stepping’ mode proposed by Tee et al. (Tee et al. 2015) (Figure 8A; Video 4). As an actin filament is formed in a right-handed double helix, it possesses an intrinsic chiral nature. During the polymerization of a radial fiber, the barbed end capped by formin at the focal adhesion was found to recruit new actin monomers to the filament. The tethering by formin during the recruitment of actin monomers contributes to the right-handed tilting of radial fibers, leading to ACW rotation. Supporting this model, Jalal et al. (Jalal et al. 2019) showed that the silencing of mDia1, capZB, and profilin 1 would abolish the ACW chiral expression or reverse the chirality into the CW direction. Specifically, the silencing of mDia1, capZB or profilin-1 would attenuate the recruitment of actin monomers into the radial fiber, with mDia1 acting as the nucleator of actin filaments (Tsuji et al. 2002), CapZB promoting actin polymerization as a capping protein (Mukherjee et al. 2016), and profilin-1 facilitating the delivery of ATP-bound G-actin to the barbed ends (Haarer and Brown 1990; Witke 2004). The silencing resulted in a decrease in the elongation velocity of radial fibers, driving the cell into neutral or CW chirality. These results support our finding that a reduction of radial fiber elongation can invert the balance of chirality expression, changing an ACW-expressing cell into a neutral or CW-expressing cell.”

      By incorporating their findings into our revision and discussion, we provide additional support for our radial fiber-transverse arc balance model for chirality expression. The revision is made on pages 8 to 9, 13, lines 253 to 256, 284, 312 to 313, 443, 449 to 459.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Kwong et al. present evidence that two actin-filament based cytoskeletal structures regulate the clockwise and anticlockwise rotation of the cytoplasm. These claims are based on experiments using cells plated on micropatterned substrates (circles). Previous reports have shown that the actomyosin network that forms on the dorsal surface of a cell plated on a circle drives a rotational or swirling pattern of movement in the cytoplasm. This actin network is composed of a combination of non-contractile radial stress fibers (AKA dorsal stress fibers) which are mechanically coupled to contractile transverse actin arcs (AKA actin arcs). The authors claim that directionality of the rotation of the cytoplasm (i.e., clockwise or anticlockwise) depends on either the actin arcs or radial fibers, respectively. While this would be interesting, the authors are not able to remove either actin-based network without affecting the other. This is not surprising, as it is likely that the radial fibers require the arcs to elongate them, and the arcs require the radial fibers to stop them from collapsing. As such, it is difficult to make simple interpretations such as the clockwise bias is driven by the arcs and anticlockwise bias is driven by the radial fibers.

      Weaknesses:

      (1) There are also multiple problems with how the data is displayed and interpreted. First, it is difficult to compare the experimental data with the controls as the authors do not include control images in several of the figures. For example, Figure 6 has images showing myosin IIA distribution, but Figure 5 has the control image. Each figure needs to show controls. Otherwise, it will be difficult for the reader to understand the differences in localization of the proteins shown. This could be accomplished by either adding different control examples or by combining figures.

      We appreciate the reviewer’s comment. We agree with the reviewer that it is difficult to compare our results in the current arrangement. The controls are included in the new Figure 6.

      (2) It is important that the authors label the range of gray values of the heat maps shown. It is difficult to know how these maps were created. I could not find a description in the methods, nor have previous papers laid out a standardized way of doing it. As such, the reader needs some indication as to whether the maps showing different cells were created the same way and show the same range of gray levels. In general, heat maps showing the same protein should have identical gray levels. The authors already show color bars next to the heat maps indicating the range of colors used. It should be a simple fix to label the minimum (blue on the color bar) and the maximum (red on the color bar) gray levels on these color bars. The profiles of actin shown in Figure 3 and Figure 3- figure supplement 3 were useful for interpreting the distribution of actin filaments. Why did the authors not show the same for the myosin IIa distributions?

      We appreciate the reviewer’s comment. For generating the distribution heatmap, the images were taken under the same setting (e.g., fluorescent staining procedure, excitation intensity, or exposure time). The prerequisite of cells for image stacking was that they had to be fully spread on either 2500 µm2 or 750 µm2 circular patterns. Then, the location for image stacking was determined by identifying the center of each cell spread in a perfect circle. Finally, the images were aligned at the cell center to calculate the averaged intensity to show the distribution heatmap on the circular pattern. Revision is made on pages 19 to 20, lines 668 to 677.

      It is important to note that the individual heatmaps represent the normalized distribution generated using unique color intensity ranges. This approach was chosen to emphasize the proportional distribution of protein within cells and its variations among samples, especially for samples with generally lower expression levels. Additionally, a differential heatmap with its own range was employed to demonstrate the normalized differences compared to the control sample. Furthermore, to provide additional insight, we plotted the intensity profile of the same protein with the same size for comparative analysis. Revision is made on pages 20, lines 679 to 682.
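
      The stacking procedure described above can be sketched as follows. This is a minimal illustration only: it assumes each image is a 2D grid of pixel intensities, that the cell centers have already been identified, and that every crop fits inside its image. The function names, the square crop, and the min-max rescaling in `normalize` are assumptions made for the example and are not taken from the authors' pipeline.

```python
def stack_images(images, centers, half_width):
    """Average equally sized crops, each centered on one cell's center.

    images: list of 2D lists (rows of pixel intensities), one per cell.
    centers: list of (row, col) integer cell centers, one per image.
    half_width: crop radius in pixels; each crop is (2*half_width + 1) square.
    Assumes every crop lies fully inside its image (no bounds checking).
    """
    size = 2 * half_width + 1
    total = [[0.0] * size for _ in range(size)]
    for img, (cr, cc) in zip(images, centers):
        for r in range(size):
            for c in range(size):
                # Shift coordinates so that all cell centers coincide.
                total[r][c] += img[cr - half_width + r][cc - half_width + c]
    n = len(images)
    return [[v / n for v in row] for row in total]


def normalize(heatmap):
    """Rescale one heatmap to [0, 1] using its own minimum and maximum.

    Hypothetical stand-in for a per-heatmap color range.
    """
    flat = [v for row in heatmap for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in heatmap]
    return [[(v - lo) / (hi - lo) for v in row] for row in heatmap]
```

      Because `normalize` rescales each heatmap independently, two normalized heatmaps show proportional distributions and cannot be compared in absolute intensity unless the minimum and maximum values are reported alongside them.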

      The labels of the heatmap are included to show the intensity in the revised Figure 3, Figure 5, Figure 6, and Figure 3 —figure supplement 4.

      To better illustrate the myosin IIa distribution, the myosin intensity profiles were plotted for Y27 treatment and gene silencing. The figures are included as Figure 5—figure supplement 2 and Figure 6—figure supplement 2. Revisions are made on pages 10, lines 332 to 334 and pages 11, lines 377 to 379.

      (3) Line 189 "This absence of radial fibers is unexpected". The authors should clarify what they mean by this statement. The claim that the cell in Figure 3B has reduced radial stress fiber is not supported by the data shown. Every actin structure in this cell is reduced compared to the cell on the larger micropattern in Figure 3A. It is unclear if the radial stress fibers are reduced more than the arcs. Are the authors referring to radial fiber elongation?

      We appreciate the reviewer’s comment. To better illustrate the reduction of radial fibers or transverse arcs, we calculated the pixel number of each structure and its percentage in the image. As radial fibers emerge from the cell boundary and point towards the cell center, and the transverse arcs are parallel to the cell edge, the actin filaments can be identified by their angle with respect to the cell center. We found that the pixel number of radial fibers is greatly reduced, by 91.98 %, on the 750 µm2 pattern compared to the 2500 µm2 pattern, while the pixel number of transverse arcs is reduced by 70.58 % (Figure 3- figure supplement 3A). Additionally, we compared the percentage of actin structures on different pattern sizes (Figure 3- figure supplement 3B). On the 2500 µm2 pattern, radial fibers account for 61.76 ± 2.77 % of the actin structure, but only for 31.13 ± 2.76 % on the 750 µm2 pattern. These results provide evidence of the structural reduction on the smaller pattern.

      Regarding the radial fiber elongation, we only discussed the reduction of radial fiber on 750 µm2 compared to the 2500 µm2 pattern in this part. For more understanding of the radial fiber contribution to chirality, we compared the radial fiber elongation rate in the LatA treatment and control on 2500 µm2 pattern (Figure 4). This result suggests the potential role of radial fiber in cell chirality. Revisions are made on page 6, lines 186 to 194; pages 17 to 18, 601 to 606; and the new Figure 3- figure supplement 3.
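
      The angle-based identification of radial fibers and transverse arcs, and the percentages derived from it, can be illustrated with a short sketch. It assumes that each fiber pixel comes with a position relative to the cell center and a local fiber orientation in radians. The 45° cut-off and all function names are hypothetical choices for the example; the threshold actually used in the analysis is not stated here.

```python
import math


def angle_to_radial(x, y, fiber_angle):
    """Acute angle in degrees between a fiber segment and the radial direction.

    (x, y) is the pixel position relative to the cell center and
    fiber_angle is the local fiber orientation in radians.
    """
    radial = math.atan2(y, x)
    # Orientations are axial (a fiber has no head or tail), so the period is pi.
    diff = abs(fiber_angle - radial) % math.pi
    if diff > math.pi / 2:
        diff = math.pi - diff
    return math.degrees(diff)


def classify(x, y, fiber_angle, threshold_deg=45.0):
    """Label a pixel 'radial' or 'transverse' (threshold is an assumption)."""
    if angle_to_radial(x, y, fiber_angle) < threshold_deg:
        return "radial"
    return "transverse"


def structure_percentages(pixels, threshold_deg=45.0):
    """Percentage of radial and transverse pixels among (x, y, angle) tuples."""
    counts = {"radial": 0, "transverse": 0}
    for x, y, angle in pixels:
        counts[classify(x, y, angle, threshold_deg)] += 1
    total = sum(counts.values())
    if total == 0:
        return {"radial": 0.0, "transverse": 0.0}
    return {k: 100.0 * v / total for k, v in counts.items()}


def percent_reduction(before, after):
    """Percentage drop of a pixel count from one condition to another."""
    return 100.0 * (before - after) / before
```

      A fiber pointing along the line from the cell center to the pixel gives an angle near 0° and is counted as radial; a fiber running parallel to the cell edge gives an angle near 90° and is counted as transverse.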

      (4) The choice of the small molecule inhibitors used in this study is difficult to understand, and their results are also confusing. For example, sequestering G actin with Latrunculin A is a complicated experiment. The authors use a relatively low concentration (50 nM) and show that actin filament-based structures are reduced and there are more in the center of the cell than in controls (Figure 3E). What was the logic of choosing this concentration?

      We appreciate the reviewer’s comment. The concentration of each drug was selected based on the literature and its known effects on actin arrangement or chiral expression.

      For example, Latrunculin A was used at 50 nM concentration, which has been proven effective in reversing the chirality at or below 50 nM (Bao et al., 2020; Chin et al., 2018; Kwong et al., 2019; Wan et al., 2011). Similarly, the 2 µM A23187 treatment concentration was selected to initiate the actin remodeling (Shao et al., 2015). Furthermore, NSC23677 at 100 µM was found to efficiently inhibit the Rac1 activation and resulted in a distinct change in actin structure (Chen et al., 2011; Gao et al., 2004), enhancing ACW chiral expression. The revision is made on pages 6 to 7, lines 202 to 211.

      (5) Using a small molecule that binds the barbed end (e.g., cytochalasin) could conceivably be used to selectively remove longer actin filaments, which the radial fibers have compared to the lamellipodia and the transverse arcs. The authors should articulate how the actin cytoskeleton is being changed by latrunculin treatment and the impact on chirality. Is it just that the radial stress fibers are not elongating? There seem to be more radial stress fibers than in controls, rather than an absence of radial stress fibers.

      We appreciate the reviewer’s comment. Our results showed that Latrunculin A treatment reversed the cell chirality. To compare the amounts of radial fibers and transverse arcs, we calculated the pixel percentage of each structure. We found that the percentage of radial fiber pixels with LatA treatment was reduced compared to that of the control, while the percentage of transverse arc pixels increased (Figure 3— figure supplement 5). This result suggests that radial fibers are inhibited under Latrunculin A treatment.

      Furthermore, the elongation rate of radial fibers is reduced by Latrunculin A treatment (Figure 4). This result, along with the reduction of radial fiber percentage under Latrunculin A treatment suggests the significant impact of radial fiber on the ACW chirality.  Revisions are made on pages 7 to 8, lines 244 to 250 and the new Figure 3— figure supplement 5 and Figure 3— figure supplement 6.

      (6) Similar problems arise from the other small molecules as well. LPA has more effects than simply activating RhoA. Additionally, many of the quantifiable effects of LPA treatment are apparent only after the cells are serum starved, which does not seem to be the case here.

      We appreciate the reviewer’s comment. The reviewer mentioned that the quantifiable effects of LPA treatments were seen after the cells were serum-starved. LPA is known to be a serum component and has an affinity to albumin in serum (Moolenaar, 1995). Serum starvation is often employed to better observe the effects of LPA by comparing conditions with and without LPA. We agree with the reviewer that the effect of LPA cannot be fully seen under the current setting. Based on the reviewer’s comment and after careful consideration, we have decided to remove the data related to LPA from our manuscript. Revisions are made on pages 6 to 7, 17 and Figure 3— figure supplement 4.

      (7) Furthermore, inhibiting ROCK with Y-27632 affects myosin light chain phosphorylation and is not specific to myosin IIA. Are the two other myosin II paralogs expressed in these cells (myosin IIB and myosin IIC)? If so, the authors’ statements about this experiment should refer to myosin II, not myosin IIa.

      We appreciate the reviewer’s comment. We agree that ensuring accuracy and clarity in our statements is important. The terminology is revised to myosin II regarding the Y27632 experiment for a more concise description. Revision is made on pages 9 to 10 and 29, lines 317 to 341, 845 and 848.  

      (8) None of the uses of the small molecules above have supporting data using a different experimental method. For example, backing up the LPA experiment by perturbing RhoA tho.

      We appreciate the reviewer’s comment. After careful consideration, we have decided to remove the data related to LPA from our manuscript. Revisions are made on pages 6 to 7, 17 and Figure 3— figure supplement 4.

      (9) The use of SMIFH2 as a "formin inhibitor" is also problematic. SMIFH2 also inhibits myosin II contractility, making interpreting its effects on cells difficult to impossible. The authors present data of mDia2 knockdown, which would be a good control for this SMIFH2.

      We appreciate the reviewer’s comment. We agree that there is potential interference of SMIFH2 with myosin II contractility, which could introduce confounding factors to the results. Based on your comment and further consideration, we have decided to remove the data related to SMIFH2 from our manuscript. Revisions are made on pages 6 to 7, 10, 17 and Figure 3— figure supplement 4.

      (10) However, the authors claim that mDia2 "typically nucleates tropomyosin-decorated actin filaments, which recruit myosin II and anneal endwise with α-actinin- crosslinked actin filaments."

      There is no reference to this statement and the authors' own data show that both arcs and radial fibers are reduced by mDia2 knockdown. Overall, the formin data do not support the conclusions the authors report.

      We appreciate the reviewer’s comment. We apologize for the lack of citation for this claim. To address this, we have added a reference to support this claim in the revised manuscript (Tojkander et al., 2011). Revision is made on page 10, line 345 to 347.

      Regarding the actin structure after mDia2 gene silencing, our results showed that myosin II was dissociated from the actin filaments compared to the control. At the same time, there are no considerable differences in the actin structure of radial fibers and transverse arcs between the mDia2 gene silencing and the control.

      (11) The data in Figure 7 does not support the conclusion that myosin IIa is exclusively on top of the cell. There are clear ventral stress fibers in A (actin) that have myosin IIa localization. The authors simply chose to not draw a line over them to create a height profile.

      We appreciate the reviewer’s comment. To better illustrate myosin IIa distribution in a cell, we have included a video showing the myosin IIa staining from the base to the top of the cell (Video 7). At the cell base, the intensity of myosin IIa is relatively low at the center. However, when the focal plane elevates, we can clearly see the myosin II localizes near the top of the cell (Figure 7B and Video 7). Revision is made on page 12, lines 421 to 424, and the new Video 7. 

      Reviewer #2 (Public Review):

      Summary:

      Chirality of cells, organs, and organisms can stem from the chiral asymmetry of proteins and polymers at a much smaller lengthscale. The intrinsic chirality of actin filaments (F-actin) is implicated in the chiral arrangement and movement of cellular structures including F-actin-based bundles and the nucleus. It is unknown how opposite chiralities can be observed when the chirality of F-actin is invariant. Kwong, Chen, and co-authors explored this problem by studying chiral cell-scale structures in adherent mammalian cultured cells. They controlled the size of adhesive patches, and examined chirality at different timepoints. They made various molecular perturbations and used several quantitative assays. They showed that forces exerted by antiparallel actomyosin bundles on parallel radial bundles are responsible for the chirality of the actomyosin network at the cell scale.

      Strengths:

      Whereas previously, most effort has been put into understanding radial bundles, this study makes an important distinction that transverse or circumferential bundles are made of antiparallel actomyosin arrays. A minor point that was nice for the paper to make is that between the co-existing chirality of nuclear rotation and radial bundle tilt, it is the F-actin driving nuclear rotation and not the other way around. The paper is clearly written.

      Weaknesses:

      The paper could benefit from grammatical editing. Once the following Major and Minor points are addressed, which may not require any further experimentation and do not entail additional conditions, this manuscript would be appropriate for publication in eLife.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      Major:

      (1) The binary classification of cells as exhibiting clockwise or anticlockwise F-actin structures does not capture the instances where there is very little chirality, such as in the mDia2-depleted cells on small patches (Figure 6B). Such reports of cell chirality throughout the cell population need to be reported as the average angle of F-actin structures on a per cell basis as a rose plot or scatter plot of angle. These changes to cell-scoring and data display will be important to discern between conditions where chirality is random (50% CW, 50% ACW) from conditions where chirality is low (radial bundles are radial and transverse arcs are circumferential).

      We appreciate the reviewer’s comment. We apologize if we did not convey our analysis method clearly enough. Throughout the manuscript, unless mentioned otherwise, the chirality analysis was based on the chiral nucleus rotation within a period of observation. The only exception is the F-actin structure chirality in Figure 3—figure supplement 1, in which we analyzed the angle of radial fibers of the control cells on the 2500 µm2 pattern. It was described on pages 5 to 6, lines 169-172, and in the method section “Analysis of fiber orientation and actin structure on circular pattern” on page 17.

      Based on the feedback, we attempted to use a scatter plot to present the mDia2 overexpression and silencing to show the randomness of the result. However, because scatter plots primarily focus on visualizing the distribution, they become cluttered and visually overwhelming, as shown below.

      Author response image 1.

      (A) Percentage of ACW nucleus rotational bias on 2500 µm2 with untreated control (reused data from Figure 3D, n = 57), mDia2 silencing (n = 48), and overexpression (n = 25). (B) Probability of ACW/CW rotation on 750 µm2 pattern with untreated control (reused data from Figure 3E, n = 34), mDia2 silencing (n = 53), and overexpressing (n = 22). Mean ± SEM. Two-sample equal variance two-tailed t-test.

      Therefore, in our manuscript, the data are primarily presented as column bar charts with statistical analysis by Student's t-test. A column bar chart makes it easier to understand and compare values. In brief, Student's t-test is commonly used to evaluate whether the means of two groups are significantly different, assuming equal variance. As such, Student's t-test is able to discern the randomness of the chirality.
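
      For reference, the statistic behind a two-sample equal-variance t-test can be written out in a few lines. This is a minimal sketch using only the Python standard library; the function name `pooled_t` and the toy inputs in the usage note are illustrative and are not data from the study. Turning the statistic into a two-tailed p-value additionally requires the t distribution, for which a statistics package would normally be used.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Two-sample t statistic with pooled (equal) variance.

    a, b: sequences of per-sample values, each with at least two entries.
    Returns (t, degrees_of_freedom).
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(sp2 * (1.0 / na + 1.0 / nb))
    return t, df
```

      For example, `pooled_t([1, 2, 3], [2, 3, 4])` compares two toy groups whose means differ by one unit. The test compares group means only, so the quantity entered for each sample (for instance, a per-experiment percentage of ACW rotation) determines what a significant difference means.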

      (2) The authors need to discuss the likely nucleator of F-actin in the radial bundles, since it is apparently not mDia2 in these cells.

      We appreciate the reviewer’s comment. In our manuscript, we originally focused on mDia2 and Tpm4 as they are the transverse arc nucleator and the mediator of myosin II motion. However, we agree with the reviewer that discussing the radial fiber nucleator would provide more insight into radial fiber polymerization in ACW chirality and improve the completeness of the story.

      Radial fibers polymerize at the focal adhesion. Several proteins are involved in actin nucleation or stress fiber formation at the focal adhesion, such as the Arp2/3 complex (Serrels et al., 2007), Ena/VASP (Applewhite et al., 2007; Gateva et al., 2014), and formins (Dettenhofer et al., 2008; Sahasrabudhe et al., 2016; Tsuji et al., 2002). Within the formin family, mDia1 is the likely nucleator of F-actin in the radial bundle. The presence of mDia1 facilitates the elongation of actin bundles at focal adhesions (Hotulainen and Lappalainen, 2006). Studies by Jalal et al. (2019) and Tee et al. (2023) have demonstrated that silencing of mDia1 abolished the ACW actin expression. Silencing of other nucleation proteins like the Arp2/3 complex or Ena/VASP would only reduce the ACW actin expression without abolishing it.

      Based on these findings, the attenuation of radial fiber elongation would abolish the ACW chiral expression, providing more support for our model in explaining chirality expression.

      This part is incorporated into the Discussion. The revision is made on page 13, lines 443, 449 to 459.

      Minor:

      (1) In the introduction, additional observations of handedness reversal need to be referenced (line 79), including Schonegg, Hyman, and Wood 2014 and Zaatri, Perry, and Maddox 2021.

      We appreciate the reviewer’s comment. The references reporting handedness reversal are now cited on page 3, lines 78 to 79.

      (2) For clarity of logic, the authors should share the rationale for choosing, and results from administering, the collection of compounds as presented in Figure 3 one at a time instead of as a list.

      We appreciate the reviewer’s comment. The concentration of drugs was determined based on existing literature and their known outcomes on actin arrangement or chiral expression.

      To elucidate, the use of Latrunculin A was based on previous studies, which demonstrated that it reverses the chirality at or below 50 nM (Bao et al., 2020; Chin et al., 2018; Kwong et al., 2019; Wan et al., 2011). Because inhibiting F-actin assembly can lead to the expression of CW chirality, we hypothesized that the opposite treatment might enhance ACW chirality. Therefore, we chose A23187 treatment at 2 µM, as it could initiate actin remodeling and stress fiber formation (Shao et al., 2015).

      Furthermore, in the attempt to replicate the reversal of chirality by inhibiting F-actin assembly through other pathways, we explored NSC23677 at 100 µM, which was found to inhibit the Rac1 activation (Chen et al., 2011; Gao et al., 2004) and reduce cortical F-actin assembly (Head et al., 2003). However, it failed to reverse the chirality but enhanced the ACW chirality of the cell.

      We carefully selected the drugs and the applied concentration to investigate various pathways and mechanisms that influence actin arrangement and might affect the chiral expression. We believe that this clarification strengthens the rationale behind our choice of drug. The revision is made on pages 6 to 7, lines 202 to 211.

      (3) "Image stacking" isn't a common term to this referee. Its first appearance in the main text (line 183) should be accompanied by a call-out to the Methods section. The authors could consider referring to this approach more directly. Related issue: Image stacking fails to report the prominent enrichment of F-actin at the very cell periphery (see Figure 3 A and F), except in images of cells on small islands (Figure 3H). Since this data display approach seems to be adding the intensity from all images together, and since cells on circular adhesive patches are relatively radially symmetric, it is unclear how to align cells, but perhaps cells could be aligned based on a slight asymmetry such as the peripheral location with the highest F-actin intensity or the apparent location of the centrosome.

      We appreciate the reviewer’s comment. We fully acknowledge the uncommon use of “image stacking” and the insufficient description of image stacking in the Methods section. First, we have added a call-out to the Methods section at its first appearance (Page 6, Lines 182 to 183). The method of image stacking is as follows. To generate the distribution heatmap, the images were taken under the same settings (e.g., staining procedure, fluorescent intensity, exposure time, etc.). To be included in image stacking, cells had to be fully spread on either 2500 µm² or 750 µm² circular patterns. A consistent position for image stacking could then be found by identifying the center of each cell spread in a perfect circle. Finally, the images were aligned at the center, and the averaged intensity was calculated to show the distribution heatmap on the circular pattern.

      We agree with the reviewer that our image alignment and stacking are based on cells that are radially symmetric. As such, the intensity distribution of the stacked image serves to compare differences in F-actin along the radial direction. Revision is made on page 19, lines 668 to 682.
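
      The stacking procedure described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the square crop window, and the integer-radius binning are assumptions made for illustration, and it assumes every crop fits inside its source image.

```python
import numpy as np

def stack_centered(images, centers, half_size):
    """Average same-size crops taken around each cell's center.

    images    : list of 2D arrays acquired under identical settings
    centers   : list of (row, col) cell centers, one per image
    half_size : half-width of the square crop, in pixels (assumed to fit)
    """
    crops = []
    for img, (cy, cx) in zip(images, centers):
        crop = img[cy - half_size: cy + half_size + 1,
                   cx - half_size: cx + half_size + 1]
        crops.append(crop.astype(float))
    # Pixel-wise mean over all aligned crops gives the distribution heatmap.
    return np.mean(crops, axis=0)

def radial_profile(heatmap):
    """Mean intensity as a function of integer distance from the center."""
    h, w = heatmap.shape
    cy, cx = h // 2, w // 2
    yy, xx = np.indices(heatmap.shape)
    r = np.rint(np.hypot(yy - cy, xx - cx)).astype(int)
    sums = np.bincount(r.ravel(), weights=heatmap.ravel())
    counts = np.bincount(r.ravel())
    n = min(cy, cx) + 1  # keep only radii that lie fully inside the crop
    return sums[:n] / counts[:n]
```

      Because the average is taken after centering only, any angular feature is smeared out, which is why such a heatmap is best read along the radial direction.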

      (4) The authors need to be consistent with wording about chirality, avoiding "right" and left (e.g. lines 245-6) since if the cell periphery were oriented differently in the cropped view, the tilt would be a different direction side-to-side but the same chirality. This section is confusing since the peripheral radial bundles are quite radial, and the inner ones are pointing from upper left to lower right, pointing (to the right) more downward over time, rather than more right-ward, in the cropped images.

      We appreciate the reviewer’s comment. We apologize for the confusion caused by our description of the tilting direction. For consistency in our later description, we define the “right” or “left” direction of the radial fibers with reference to the direction of radial fiber elongation, so that “rightward tilting” corresponds to the ACW rotation of the chiral pattern. To retain the term “rightward tilting”, we added this description to ensure accurate communication in our writing. We also rearranged the images in the new Figure 4A and Video 2 for better observation. Revision is made on page 8, lines 262 to 263.

      (5) Why are the cells in Figure 4A dominated by radial (and more-central, tilting) fibers, while control cells in 4D show robust circumferential transverse arcs? Have these cells been plated for different amounts of time or is a different optical section shown?

      We appreciate the reviewer’s comment. The cells in Figure 4A and Figure 4D are prepared with similar conditions, such as incubation time and optical setting. Actin organization is a dynamic process, and cells can exhibit varied actin arrangements, transitioning between different forms such as circular, radial, chordal, chiral, or linear patterns, as they spread on a circular island (Tee et al., 2015). In Figure 4A, the actin is arranged in a chiral pattern, whereas in Figure 4D, the actin exhibits a radial pattern. These variations reflect the natural dynamics of actin organization within cells during the imaging process.

      (6) All single-color images (such as Fig 5 F-actin) need to be black-on-white, since it is far more difficult to see F-actin morphology with red on black.

      We appreciate the reviewer’s comment. We have changed all F-actin images (single color) into black and white for better image clarity. Revisions are made in the new Figure 5, Figure 6 and Figure 7.

      (7) Figure 5A, especially the F-actin staining, is quite a bit blurrier than other micrographs. These images should be replaced with images of comparable quality to those shown throughout.

      We appreciate the reviewer’s comment. We agree that the F-actin staining in Figure 5 is difficult to observe. To improve image clarity, the F-actin staining images have been replaced with more zoomed-in images. Revision is made in the new Figure 5.

      (8) F-actin does not look unchanged by Y27632 treatment, as the authors state in line 306. This may be partially due to image quality and the ambiguities of communicating with the blue-to-red colormap. Similarly, I don't agree that mDia2 depletion did not change F-actin distribution (line 330) as cells in that condition had a prominent peripheral ring of F-actin missing from cells in other conditions.

      We appreciate the reviewer’s comment. We agree with the reviewer’s observation that the F-actin distribution is indeed changed under Y27632 treatment compared to the control in Figure 5A-B. Here, we would like to emphasize that the actin ring persists despite the actin structure being altered under the Y27632 treatment. The actin ring refers to the darker red circle in the distribution heatmap. It represents the condensed actin structure, including radial fibers and transverse arcs. This important structure remains unaffected despite the disruption of myosin II, the key component in radial fibers.

      Furthermore, we agree with the reviewer that mDia2 depletion does change F-actin distribution. Similar to the Y27632 treatment, the actin ring persists despite the actin structure being altered under mDia2 gene silencing. Moreover, compared to other treatments, mDia2 depletion has a less significant impact on actin distribution. To address these points more comprehensively, we have revised the Y27632 treatment and mDia2 sections. The revisions for Y27632 and mDia2 are made on page 10, lines 324-327 and 352-353, respectively.

      (9) The colormap shown for intensity coding should be reconsidered, as dark red is harder to see than the yellow that is sub-maximal. Viridis is a colormap ranging from cooler and darker blue, through green, to warmer and lighter yellow as the maximum. Other options likely exist as well.

      We appreciate the reviewer’s comment. We carefully considered the reviewer’s concern and explored other color scales available through the colormap function in MATLAB. After evaluating different options, including the “viridis” color scale, we found that “jet” provides a wide range of colors, allowing effective visual presentation of the intensity variation in our data. Using “jet” allows us to appropriately visualize the actin ring distribution, which is represented in red or dark red. While we understand that dark red could be harder to see than the sub-maximal yellow, we believe that “jet” serves our purpose of presenting the intensity information.

      (10) For Figure 6, why doesn't average distribution of NMMIIa look like the example with high at periphery, low inside periphery, moderate throughout lamella, low perinuclear, and high central?

      We appreciate the reviewer’s comment. We understand the reviewer’s concern that the average distribution of NMMIIa does not appear the same as the example. The chosen image is the best representation of the NMMIIa disruption from the transverse arcs after mDia2 silencing. Additionally, it is important to note that the average distribution result is a stacked image which includes other images. As such, the NMMIIa example and the distribution heatmap might not necessarily appear identical.

      (11) In 2015, Tee, Bershadsky and colleagues demonstrated that transverse bundles are dorsal to radial bundles, using correlative light and electron microscopy. While it is important for Kwong and colleagues to show that this is true in their cells, they should reference Tee et al. in the rationale section of text pertaining to Figure 7.

      We appreciate the reviewer’s comment. Tee et al. (2015) demonstrated that the transverse fiber is at the same height as the radial fiber based on correlative light and electron microscopy. Here, using the position of myosin IIa, a transverse arc component, our results show the dorsal positioning of transverse arcs with connection to the extension of radial fibers (Figure 7C), which is consistent with their findings. It is included in our manuscript, page 12, lines 421 to 424, and page 14, lines 477 to 480.

      References

      Applewhite, D.A., Barzik, M., Kojima, S.-i., Svitkina, T.M., Gertler, F.B., and Borisy, G.G. (2007). Ena/Vasp Proteins Have an Anti-Capping Independent Function in Filopodia Formation. Mol. Biol. Cell. 18, 2579-2591. DOI: https://doi.org/10.1091/mbc.e06-11-0990

      Bao, Y., Wu, S., Chu, L.T., Kwong, H.K., Hartanto, H., Huang, Y., Lam, M.L., Lam, R.H., and Chen, T.H. (2020). Early Committed Clockwise Cell Chirality Upregulates Adipogenic Differentiation of Mesenchymal Stem Cells. Adv. Biosyst. 4, 2000161. DOI: https://doi.org/10.1002/adbi.202000161

      Chen, Q.-Y., Xu, L.-Q., Jiao, D.-M., Yao, Q.-H., Wang, Y.-Y., Hu, H.-Z., Wu, Y.-Q., Song, J., Yan, J., and Wu, L.-J. (2011). Silencing of Rac1 Modifies Lung Cancer Cell Migration, Invasion and Actin Cytoskeleton Rearrangements and Enhances Chemosensitivity to Antitumor Drugs. Int. J. Mol. Med. 28, 769-776. DOI: https://doi.org/10.3892/ijmm.2011.775

      Chin, A.S., Worley, K.E., Ray, P., Kaur, G., Fan, J., and Wan, L.Q. (2018). Epithelial Cell Chirality Revealed by Three-Dimensional Spontaneous Rotation. Proc. Natl. Acad. Sci. U.S.A. 115, 12188-12193. DOI: https://doi.org/10.1073/pnas.1805932115

      Dettenhofer, M., Zhou, F., and Leder, P. (2008). Formin 1-Isoform IV Deficient Cells Exhibit Defects in Cell Spreading and Focal Adhesion Formation. PLoS One 3, e2497. DOI: https://doi.org/10.1371/journal.pone.0002497

      Gao, Y., Dickerson, J.B., Guo, F., Zheng, J., and Zheng, Y. (2004). Rational Design and Characterization of a Rac GTPase-Specific Small Molecule Inhibitor. Proc. Natl. Acad. Sci. U.S.A. 101, 7618-7623. DOI: https://doi.org/10.1073/pnas.0307512101

      Gateva, G., Tojkander, S., Koho, S., Carpen, O., and Lappalainen, P. (2014). Palladin Promotes Assembly of Non-Contractile Dorsal Stress Fibers through Vasp Recruitment. J. Cell Sci. 127, 1887-1898. DOI: https://doi.org/10.1242/jcs.135780

      Haarer, B., and Brown, S.S. (1990). Structure and Function of Profilin.

      Head, J.A., Jiang, D., Li, M., Zorn, L.J., Schaefer, E.M., Parsons, J.T., and Weed, S.A. (2003). Cortactin Tyrosine Phosphorylation Requires Rac1 Activity and Association with the Cortical Actin Cytoskeleton. Mol. Biol. Cell. 14, 3216-3229. DOI: https://doi.org/10.1091/mbc.e02-11-0753

      Hotulainen, P., and Lappalainen, P. (2006). Stress Fibers are Generated by Two Distinct Actin Assembly Mechanisms in Motile Cells. J. Cell Biol. 173, 383-394. DOI: https://doi.org/10.1083/jcb.200511093

      Jalal, S., Shi, S., Acharya, V., Huang, R.Y., Viasnoff, V., Bershadsky, A.D., and Tee, Y.H. (2019). Actin Cytoskeleton Self-Organization in Single Epithelial Cells and Fibroblasts under Isotropic Confinement. J. Cell Sci. 132. DOI: https://doi.org/10.1242/jcs.220780

      Kwong, H.K., Huang, Y., Bao, Y., Lam, M.L., and Chen, T.H. (2019). Remnant Effects of Culture Density on Cell Chirality after Reseeding. J. Cell Sci. 132. DOI: https://doi.org/10.1242/jcs.220780

      Moolenaar, W.H. (1995). Lysophosphatidic Acid, a Multifunctional Phospholipid Messenger. J. Cell Sci. 132. DOI: https://doi.org/10.1242/jcs.220780

      Mukherjee, K., Ishii, K., Pillalamarri, V., Kammin, T., Atkin, J.F., Hickey, S.E., Xi, Q.J., Zepeda, C.J., Gusella, J.F., and Talkowski, M.E. (2016). Actin Capping Protein Capzb Regulates Cell Morphology, Differentiation, and Neural Crest Migration in Craniofacial Morphogenesis. Hum. Mol. Genet. 25, 1255-1270. DOI: https://doi.org/10.1093/hmg/ddw006

      Sahasrabudhe, A., Ghate, K., Mutalik, S., Jacob, A., and Ghose, A. (2016). Formin 2 Regulates the Stabilization of Filopodial Tip Adhesions in Growth Cones and Affects Neuronal Outgrowth and Pathfinding In Vivo. Development 143, 449-460. DOI: https://doi.org/10.1242/dev.130104

      Serrels, B., Serrels, A., Brunton, V.G., Holt, M., McLean, G.W., Gray, C.H., Jones, G.E., and Frame, M.C. (2007). Focal Adhesion Kinase Controls Actin Assembly via a Ferm-Mediated Interaction with the Arp2/3 Complex. Nat. Cell Biol. 9, 1046-1056. DOI: https://doi.org/10.1038/ncb1626

      Shao, X., Li, Q., Mogilner, A., Bershadsky, A.D., and Shivashankar, G. (2015). Mechanical Stimulation Induces Formin-Dependent Assembly of a Perinuclear Actin Rim. Proc. Natl. Acad. Sci. U.S.A. 112, E2595-E2601. DOI: https://doi.org/10.1073/pnas.1504837112

      Tee, Y.H., Goh, W.J., Yong, X., Ong, H.T., Hu, J., Tay, I.Y.Y., Shi, S., Jalal, S., Barnett, S.F., and Kanchanawong, P. (2023). Actin Polymerisation and Crosslinking Drive Left-Right Asymmetry in Single Cell and Cell Collectives. Nat. Commun. 14, 776. DOI: https://doi.org/10.1038/s41467-023-35918-1

      Tee, Y.H., Shemesh, T., Thiagarajan, V., Hariadi, R.F., Anderson, K.L., Page, C., Volkmann, N., Hanein, D., Sivaramakrishnan, S., Kozlov, M.M., and Bershadsky, A.D. (2015). Cellular Chirality Arising from the Self-Organization of the Actin Cytoskeleton. Nat. Cell Biol. 17, 445-457. DOI: https://doi.org/10.1038/ncb3137

      Tojkander, S., Gateva, G., Schevzov, G., Hotulainen, P., Naumanen, P., Martin, C., Gunning, P.W., and Lappalainen, P. (2011). A Molecular Pathway for Myosin II Recruitment to Stress Fibers. Curr. Biol. 21, 539-550. DOI: https://doi.org/10.1016/j.cub.2011.03.007

      Tsuji, T., Ishizaki, T., Okamoto, M., Higashida, C., Kimura, K., Furuyashiki, T., Arakawa, Y., Birge, R.B., Nakamoto, T., Hirai, H., and Narumiya, S. (2002). Rock and mdia1 Antagonize in Rho-Dependent Rac Activation in Swiss 3T3 Fibroblasts. J. Cell Biol. 157, 819-830. DOI: https://doi.org/10.1083/jcb.200112107

      Wan, L.Q., Ronaldson, K., Park, M., Taylor, G., Zhang, Y., Gimble, J.M., and Vunjak-Novakovic, G. (2011). Micropatterned Mammalian Cells Exhibit Phenotype-Specific Left-Right Asymmetry. Proc. Natl. Acad. Sci. U.S.A. 108, 12295-12300. DOI: https://doi.org/10.1073/pnas.1103834108

      Witke, W. (2004). The Role of Profilin Complexes in Cell Motility and Other Cellular Processes. Trends Cell Biol. 14, 461-469. DOI: https://doi.org/10.1016/j.tcb.2004.07.003

    1. eLife assessment

      This important study presents work on the molecular mechanism driving asymmetric cell division and fate decisions during embryonic development of echinoids. The evidence supporting the claims of the authors is solid overall but with some concerns about quantification and a lack of explanation for some of the findings. The work will be of interest to developmental biologists and cell biologists working in the field of self-renewal.

    1. eLife assessment

      This manuscript describes the characterization of the conformational dynamics of two chemokine receptors at the single-molecule level using FRET. The authors make a convincing case for attributing the distinct interaction and pharmacology of the two receptors to differences in their conformational energy landscape. These important findings will be of interest to scientists working on activation mechanisms of GPCRs and signal transduction.

    2. Reviewer #1 (Public Review):

      Summary:

      This paper uses single-molecule FRET to investigate the molecular basis for the distinct activation mechanisms of two GPCRs that respond to the chemokine CXCL12: CXCR4, which couples to G proteins, and ACKR3, which is G protein-independent and displays a higher basal activity.

      Strengths:

      It nicely combines the state-of-the-art techniques used in the studies of the structural dynamics of GPCR. The receptors are produced from eukaryotic cells, mutated, and labeled with single molecule compatible fluorescent dyes. They are reconstituted in nanodiscs, which maintain an environment as close as possible to the cell membrane, and immobilized through the nanodisc MSP protein, to avoid perturbing the receptor's structural dynamics by the use of an antibody for example.

      The smFRET data are analysed using the HMM (hidden Markov model) technique, and the number of states to be taken into account is evaluated using a Bayesian Information Criterion, which constitutes the state of the art for this task.
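
      As a rough illustration of how such a criterion selects the number of states, here is a minimal sketch. It is not the authors' analysis pipeline: the parameter count assumes a hidden Markov model with one Gaussian emission per state, and the log-likelihood values in the usage note are invented purely for illustration.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion; lower values are preferred."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

def hmm_param_count(n_states):
    """Free parameters of a K-state HMM with Gaussian emissions (assumed model).

    K*(K-1) transition probabilities, K-1 initial probabilities,
    and a mean plus a variance for each state.
    """
    k = n_states
    return k * (k - 1) + (k - 1) + 2 * k

def select_n_states(log_likelihoods, n_obs):
    """Return the state count with the lowest BIC, plus all scores.

    log_likelihoods : dict mapping number of states -> fitted log-likelihood
    """
    scores = {k: bic(ll, hmm_param_count(k), n_obs)
              for k, ll in log_likelihoods.items()}
    best = min(scores, key=scores.get)
    return best, scores
```

      For example, `select_n_states({2: -1500.0, 3: -1400.0, 4: -1395.0}, n_obs=1000)` uses made-up log-likelihoods in which the fourth state improves the fit only slightly, so the penalty term outweighs the gain.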

      The data show convincingly that the activation of the CXCR4 and ACKR3 by an agonist leads to a shift from an ensemble of high FRET states to an ensemble of lower FRET states, consistent with an increase in distance between the TM4 and TM6. The two receptors also appear to explore a different conformational space. A wider distribution of states is observed for ACKR3 as compared to CXCR4, and it shifts in the presence of agonists toward the active states, which correlates well with ACKR3's tendency to be constitutively active. This interpretation is confirmed by the use of the mutation of Y254 to leucine (the corresponding residue in CXCR4), which leads to a conformational distribution that resembles the one observed with CXCR4. It is correlated with a decrease in constitutive activity of ACKR3.

      Weaknesses:

      Although the data overall support the claims of the authors, there are however some details in the data analysis and interpretation that should be modified, clarified, or discussed in my opinion.

      Concerning the amplitude of the changes in FRET efficiency: the authors do not provide any structural information on the amplitude of the FRET changes that are expected. To me, a FRET change from ~0.9 to ~0.1 looks very large for a distance change that is expected to be only a few angstroms for the movement of TM6. Can the authors give an explanation for that? How does this FRET change relate to those observed with other GPCRs modified at the same or equivalent positions on TM4 and TM6?
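
      The scale of this concern can be made concrete with a minimal sketch based on the standard Förster relation, E = 1 / (1 + (r/R0)^6). The Förster radius used in the usage note is a placeholder, not a value taken from the manuscript, and the relation ignores dye linkers and orientation effects.

```python
def fret_distance(efficiency, r0_nm):
    """Donor-acceptor distance implied by a FRET efficiency.

    Inverts E = 1 / (1 + (r / R0)**6), so r = R0 * (1/E - 1)**(1/6).
    r0_nm is the Forster radius in nm (an assumed input, dye-pair dependent).
    """
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)

def distance_change(e_start, e_end, r0_nm):
    """Change in implied distance when the efficiency goes from e_start to e_end."""
    return fret_distance(e_end, r0_nm) - fret_distance(e_start, r0_nm)
```

      Under this relation, going from E = 0.9 to E = 0.1 corresponds to a distance increase of about 0.75 R0 regardless of the dye pair. With a placeholder R0 of a few nanometers, `distance_change(0.9, 0.1, r0_nm=5.0)` therefore gives a change on the nanometer scale rather than a few angstroms.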

      Concerning the intermediate states: the authors observe several intermediate states.

      (1) First I am surprised, looking at the time traces, by the dwell times of the transitions between the states, which often last several seconds. Is such a long transition time compatible with what is known about the kinetic activation of these receptors?

      (2) Second, is it possible that these “intermediate” states correspond to differences in FRET efficiencies that arise from different photophysical states of the dyes? Alexa555 and Cy5 are cyanines, which are known to be very sensitive to their local environment. This could lead to different quantum yields and therefore different FRET efficiencies for a similar distance. In addition, the authors use statistical labeling of two cysteines, and therefore have in their experiment a mixture of receptors where the donor and acceptor are switched and can experience different environments. The authors do not speculate structurally on what these intermediate states could be, which is appreciated, but I think they should nevertheless discuss the potential issue of fluorophore photophysics effects.

      (3) It would also have been nice to discuss whether these types of intermediate states have been observed in other studies by smFRET on GPCR labeled at similar positions.

      On line 239: the authors talk about the R↔R' transitions that are more probable. In fact, it is more striking that the R'↔R* transition appears in the plot. This transition is a signature of the behaviour observed in the presence of an agonist, although IT1t is supposed to be an inverse agonist. This observation is consistent with the unexpected (for an inverse agonist) shift in the FRET histogram distribution. In fact, it appears that all CXCR4 antagonists or inverse agonists have an effect similar to (although smaller than) that of the agonist. Is this related to the fact that these (antagonist or inverse agonist) ligands lead to a conformation that is similar to that induced by the agonists, but cannot interact with the G protein? A very interesting experiment here might be to repeat these measurements in the presence of purified G protein. G protein has been shown to shift the conformational space explored by GPCRs toward the active state (using smFRET on class A and class C GPCRs). It would be interesting to explore its role on CXCR4 in the presence of these various ligands. Although I am aware that this experiment might go beyond the scope of this study, I think this point should be discussed nevertheless.

      The authors also mentioned in Figure 6 that the energetic landscape of the receptors is relatively flat ... I do not really agree with this statement. For me, a flat conformational landscape would be one where the receptors are able to switch very rapidly between the states (typically in the submillisecond timescale, which is the timescale of protein domain dynamics). Here, the authors observed that the transition between states is in the second timescale, which for me implies that the transition barrier between the states is relatively high to preclude the fast transitions.
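
      The link between dwell time and barrier height in this argument can be sketched with a simple Arrhenius-type relation, tau = tau0 * exp(dG / kT). This is only an order-of-magnitude illustration: the prefactor time tau0 is an assumption and is poorly known for proteins, so only differences between barriers are robust.

```python
import math

def barrier_in_kT(dwell_time_s, prefactor_time_s):
    """Barrier height in units of kT implied by a mean dwell time.

    Assumes tau = tau0 * exp(dG / kT), so dG / kT = ln(tau / tau0).
    prefactor_time_s (tau0) is a hypothetical attempt time chosen by the user.
    """
    return math.log(dwell_time_s / prefactor_time_s)

def barrier_difference_in_kT(slow_dwell_s, fast_dwell_s):
    """Difference between two barriers; independent of the assumed prefactor."""
    return math.log(slow_dwell_s / fast_dwell_s)
```

      For instance, `barrier_difference_in_kT(1.0, 1e-3)` compares second-scale and millisecond-scale dwell times. The result equals ln(1000), or roughly 7 kT, whatever prefactor is assumed, which is the sense in which second-scale transitions imply a substantially higher barrier than submillisecond ones.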

    3. Reviewer #2 (Public Review):

      Summary:

      This manuscript uses single-molecule fluorescence resonance energy transfer (smFRET) to identify differences in the molecular mechanisms of CXCR4 and ACKR3, two 7-transmembrane receptors that both respond to the chemokine CXCL12 but otherwise have very different signaling profiles. CXCR4 is highly selective for CXCL12 and activates heterotrimeric G proteins. In contrast, ACKR3 is quite promiscuous and does not couple to G proteins, but like most G protein-coupled receptors (GPCRs), it is phosphorylated by GPCR kinases and recruits arrestins. By monitoring FRET between two positions on the intracellular face of the receptor (which highlights the movement of transmembrane helix 6 [TM6], a key hallmark of GPCR activation), the authors show that CXCR4 remains mostly in an inactive-like state until CXCL12 binds and stabilizes a single active-like state. ACKR3 rapidly exchanges among four different conformations even in the absence of ligands, and agonists stabilize multiple activated states.

      Strengths:

      The core method employed in this paper, smFRET, can reveal dynamic aspects of these receptors (the breadth of conformations explored and the rate of exchange among them) that are not evident from static structures or many other biophysical methods. smFRET has not been broadly employed in studies of GPCRs. Therefore, this manuscript makes important conceptual advances in our understanding of how related GPCRs can vary in their conformational dynamics.

      Weaknesses:

      (1) The cysteine mutations in ACKR3 required to site-specifically install fluorophores substantially increase its basal and ligand-induced activity. If, as the authors posit, basal activity correlates with conformational heterogeneity, the smFRET data could greatly overestimate the conformational heterogeneity of ACKR3.

      (2) The probes used cannot reveal conformational changes in other positions besides TM6. GPCRs are known to exhibit loose allosteric coupling, so the conformational distribution observed at TM6 may not fully reflect the global conformational distribution of receptors. This could mask important differences that determine the ability of intracellular transducers to couple to specific receptor conformations.

      (3) While it is clear that CXCR4 and ACKR3 have very different conformational dynamics, the data do not definitively show that this is the main or only mechanism that contributes to their functional differences. There is little discussion of alternative potential mechanisms.

      (4) The extent to which conformational heterogeneity is a characteristic feature of ACKRs that contributes to their promiscuity and arrestin bias is unclear. The key residue the authors find promotes ACKR3 conformational heterogeneity is not conserved in most other ACKRs, but alternative mechanisms could generate similar heterogeneity.

      (5) There are no data to confirm that the two receptors retain the same functional profiles observed in cell-based systems following in vitro manipulations (purification, labeling, nanodisc reconstitution).

    4. Reviewer #3 (Public Review):

      Summary:

      This is a well-designed and rigorous comparative study of the conformational dynamics of two chemokine receptors, the canonical CXCR4 and the atypical ACKR3, using single-molecule fluorescence spectroscopy. These receptors play a role in cell migration and may be relevant for developing drugs targeting tumor growth in cancers. The authors use single-molecule FRET to obtain distributions of a specific intermolecular distance that changes upon activation of the receptor and track differences between the two receptors in the apo state, and in response to ligands and mutations. The picture emerging is that more dynamic conformations promote more basal activity and more promiscuous coupling of the receptor to effectors.

      Strengths:

      The study is well designed to test the main hypothesis, the sample preparation and the experiments conducted are sound and the data analysis is rigorous. The technique, smFRET, allows for the detection of several substates, even those that are rarely sampled, and it can provide a "connectivity map" by looking at the transition probabilities between states. The receptors are reconstituted in nanodiscs to create a native-like environment. The examples of raw donor/acceptor intensity traces and FRET traces look convincing and the data analysis is reliable to extract the sub-states of the ensemble. The role of specific residues in creating a more flat conformational landscape in ACKR3 (e.g., Y257 and the C34-C287 bridge) is well documented in the paper.

      Weaknesses:

      The kinetics side of the analysis is mentioned, but not described and discussed. I am not sure why since the data contains that information. For instance, it is not clear if greater conformational flexibility is accompanied by faster transitions between states or not.

      The method to choose the number of states seems reasonable, but the "similarity" of states argument (Figures S4 and S6) is not that clear.

      Also, the "dynamics" explanation offered for ACKR3's failure to couple and activate G proteins is not very convincing. In other studies, it was shown that activation of GPCRs by agonists leads to an increase in local dynamics around the TM6 labelling site, but that did not prevent G protein coupling and activation.

    5. Author response:

      The authors intend to submit a revised manuscript that addresses the questions raised in the public reviews.

    1. eLife assessment

      This valuable contribution studies factors that impact molecular exchange between dense and dilute phases of biomolecular condensates through continuum models and coarse-grained simulations. The authors provide convincing evidence that interfacial resistance can cause molecules to bounce off the interface and limit mixing. Results like these can inform how experimental results in the field of biological condensates are interpreted.

    2. Reviewer #1 (Public Review):

      Summary:

      In this paper by Zhang, the authors build a physical framework to probe the mechanisms that underlie exchange of molecules between coexisting dense and dilute liquid-like phases of condensates. They first propose a continuum model, in the context of a FRAP-like experiment where the fluorescently labeled molecules inside the condensate are bleached at t=0 and the recovery of fluorescence is measured. Through this model, they identify how the key timescales of internal molecular mixing, replenishment from dilute phase, and interface transfer contribute to molecular exchange timescale. Motivated by a recent experiment reported by some of the co-authors previously (Brangwynne et al. in 2019) finding strong interfacial resistance in in vitro protein droplets of LAF-1, they seek to understand the microscopic features contributing to the interfacial conductance (inversely proportional to the resistance). To check, they perform coarse-grained MD simulations of sticker-spacer self-associative polymers and report how conductance varies significantly even across the few explored sequences. Further, by looking at individual trajectories, they postulate that "bouncing", i.e., molecules that approach the interface but are not successfully absorbed, is a strong contributor to this mass transfer limitation. Consistent with their predictions, sequences that have more free unbound stickers (i.e., for example through imbalance sequence sticker stoichiometries) have higher conductances and they show a simple linear scaling between number of unbound stickers and conductance. Finally, they predict a droplet-size-dependent transition in recovery time behavior.

      Strengths:

      (1) This paper is overall well-written and clear to understand.

      (2) By combining coarse-grained simulations, continuum modeling, and comparison to published data, the authors provide a solid picture of how their proposed framework relates to molecular exchange mechanisms that are dominated by interface resistance and LAF-1 droplets.

      (3) The choice of different ways to estimate conductance from simulation and reported data are thoughtful and convincing on their near-agreement (although a little discussion of why and when they differ would be merited as well).

      Updated re-review:

      This revised update by Zhang et al. is improved and addresses many of the concerns raised by myself and the other reviewer, especially with the expanded discussion, contextualized text in model description, and the addition of a nice example case-study in revised Fig. 4. I believe the paper provides solid evidence of how "bouncing" may contribute to interfacial resistance/exchange dynamics in biomolecular condensates and is a useful study for the community.

      Note:

      In their response, the authors bring up an important point about references for LAF-1 mutant FRAP data. While I found a few papers, for example https://www.pnas.org/doi/abs/10.1073/pnas.2000223117 and https://www.cell.com/biophysj/fulltext/S0006-3495(23)00464-2 , these are likely not whole-droplet bleaches. I wonder whether it may be possible to approximately predict the conductance from other parameters (such as from the effective expressions in eq 14) to roughly estimate what the effect may be, since LAF-1 has fairly "known" stickers and spacers. Note that this is not required at all, but I just bring this up in case it may be of interest to the authors!

    3. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This valuable contribution studies factors that impact molecular exchange between dense and dilute phases of biomolecular condensates through continuum models and coarse-grained simulations. The authors provide solid evidence that interfacial resistance can cause molecules to bounce off the interface and limit mixing. Results like these can inform how experimental results in the field of biological condensates are interpreted.

      We would like to sincerely thank the editors for spending time on our manuscript and for the very positive assessment of our work. We have carefully considered and addressed the reviewers’ comments in the point-by-point response below and have revised our manuscript accordingly.

      Reviewer #1 (Public Review):

      Summary:

      In this paper by Zhang, the authors build a physical framework to probe the mechanisms that underlie the exchange of molecules between coexisting dense and dilute liquid-like phases of condensates. They first propose a continuum model, in the context of a FRAP-like experiment where the fluorescently labeled molecules inside the condensate are bleached at t=0 and the recovery of fluorescence is measured. Through this model, they identify how the key timescales of internal molecular mixing, replenishment from dilute phase, and interface transfer contribute to molecular exchange timescale. Motivated by a recent experiment reported by some of the co-authors previously (Brangwynne et al. in 2019) finding strong interfacial resistance in in-vitro protein droplets of LAF-1, they seek to understand the microscopic features contributing to the interfacial conductance (inversely proportional to the resistance). To check, they perform coarse-grained MD simulations of sticker-spacer self-associative polymers and report how conductance varies significantly even across the few explored sequences. Further, by looking at individual trajectories, they postulate that "bouncing" - i.e., molecules that approach the interface but are not successfully absorbed - is a strong contributor to this mass transfer limitation. Consistent with their predictions, sequences that have more free unbound stickers (i.e., for example through imbalance sequence sticker stoichiometries) have higher conductances and they show a simple linear scaling between the number of unbound stickers and conductance. Finally, they predict a droplet-size-dependent transition in recovery time behavior.

      Strengths:

      (1) This paper is well-written overall and clear to understand.

      (2) By combining coarse-grained simulations, continuum modeling, and comparison to published data, the authors provide a solid picture of how their proposed framework relates to molecular exchange mechanisms that are dominated by interface resistance and LAF-1 droplets.

      (3) The choice of different ways to estimate conductance from simulation and reported data are thoughtful and convincing in their near agreement (although a little discussion of why and when they differ would be merited as well).

      We would like to thank the reviewer for the positive evaluation of our work. Indeed, we are grateful to the reviewer for this thoughtful, detailed, and constructive report, which has helped us strengthen the manuscript.

      Weaknesses:

      (1) Almost the entirety of this paper is motivated by a previously reported FRAP experiment on a particular LAF-1 droplet in vitro. There are a few major concerns I have with how the original data is used, how these results may generalize, and the lack of connection of predictions with any other experiments (published or new).

      a. The mean values of cdense, cdilute, diffusivities, etc. are taken from Taylor et al. to rule in the importance of interfacial mass transfer limits. While this may be true, the values originally inferred (in the 2019 paper that this paper is strongly built off) report extremely large confidence intervals/inferred standard errors. The authors should accordingly report all their inferences with correct standardized errors or confidence intervals, which in turn, allow us to better understand these data.

      Yes, agreed. We have now included the standard errors of the parameters from Taylor et al. (2019), and reported the corresponding standard errors for the timescales and interface conductance using error propagation. We have modified Fig. 1C right panel as well as the text in the figure caption:

      “(Right) Expected recovery times and if the slowest recovery process was either the flux from the dilute phase or diffusion within the droplet, respectively, with and taken from Taylor et al. (2019). While the timescale associated with interface resistance is unknown, the measured recovery time is much longer than and , suggesting the recovery is limited by flux through the interface, with an interface conductance of  (Below Figure 1)”
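      The error propagation mentioned in this response can be illustrated with a minimal sketch. It assumes a quantity that is a product of powers of independent inputs (for example, a diffusive timescale of the generic form R²/D with the numerical prefactor omitted) and applies first-order propagation. The function name and all numbers are hypothetical placeholders; they are not the values from Taylor et al. (2019), and this is not necessarily the exact procedure the authors used.

```python
import math

def propagate_relative_error(value, terms):
    """First-order error propagation for value = prod(x_i ** p_i).

    terms: iterable of (x, sigma_x, power) tuples, assumed independent.
    Returns the standard error of `value`.
    """
    rel_var = sum((power * sigma / x) ** 2 for x, sigma, power in terms)
    return abs(value) * math.sqrt(rel_var)

# Hypothetical placeholder inputs (NOT taken from Taylor et al. 2019)
R, sigma_R = 1.0, 0.1   # droplet radius and its standard error
D, sigma_D = 2.0, 0.4   # diffusion coefficient and its standard error

tau = R**2 / D          # generic diffusive timescale, prefactor omitted
sigma_tau = propagate_relative_error(tau, [(R, sigma_R, 2), (D, sigma_D, -1)])
```

      When the input errors are large, as the reviewer notes they are here, first-order propagation is only approximate, and resampling the inputs may give a more faithful interval.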

      b. The generalizability of this model is hard to gauge when all comparisons are made to a single experiment reported in a previous paper.

      i. Conceptually, the model is limited to single-component sticker-spacer polymers undergoing phase separation, which is already a very simplified model of condensates. For example, LAF1 droplets in the cell have no perceptible interfacial mass limitations, as also reported in Taylor et al. 2019, so it is unclear how these mechanisms relate to living systems as opposed to specific biochemistry experiments. The authors therefore need to discuss the implications and limitations of their model in the living context, where there are multiple species, finite-size effects, and active processes at play.

      We thank the reviewer for the critical comment. To address this point, we have included a paragraph in the Discussion regarding in vivo situations:

      “In this work, we focused on the exchange dynamics of in vitro single-component condensates. How is the picture modified for condensates inside cells? It has been shown that Ddx4-YFP droplets in the cell nucleus exhibit negligible interface resistance Taylor et al. (2019), which raises the question whether interface resistance is relevant to natural condensates in vivo. Future quantitative FRAP and single-molecule tracking experiments on different types of droplets in the cell will address this question. One complication is that condensates in cells are almost always multi-component, which can increase the complexity of the exchange dynamics. Interestingly, formation of multiple layers or the presence of excess molecules of one species coating the droplet is likely to increase interface resistance. A notable example is the Pickering effect, in which adsorbed particles partially cover the interface, thereby reducing the accessible area and the overall condensate surface tension, slowing down the exchange dynamics Folkmann et al. (2021). The development of theory and modeling for the exchange dynamics of multi-component condensates is currently underway. (Lines 323-334)”

      ii. Second, can the authors connect their model to make predictions of the impact of perturbations to LAF-1 on exchange timescales? For example, are mutants (which change the number or positioning of "stickers") expected to show particular trends in conductances or FRAP timescales? Since LAF-1 is a relatively well-studied protein in vitro, can the authors further contrast their expectations with already published datasets that explore these perturbations, even if they don't generate new data?

      Our model is intended to address interface exchange dynamics at the conceptual level. The underlying mechanism for the large interface resistance of LAF-1 droplets could be more complicated than explored in our work. To study the impact of perturbations to LAF-1 on exchange timescales likely requires substantially more sophisticated molecular dynamics simulations. We undertook an extensive search for FRAP experiments on LAF-1 droplets where the whole droplet is photobleached, but were not able to find another dataset. We would be grateful if the reviewer is aware of such data and can point us to it.

      iii. A key prediction of the interface limitation model is the size-dependent crossover in FRAP dynamics. Can the authors reanalyze published data on LAF-1 (albeit of different-size droplets) to check their predictions? At the least, is the crossover radius within experimentally testable limits?

      Based on our prediction, the crossover radius for LAF-1 droplets is around 70 𝜇m. We have added a sentence in the text to point this out:

      “We also predict the crossover for LAF-1 droplets to be around 𝑅 = 71 𝜇m, which in principle can be tested experimentally. (Lines 285-286)”

      Unfortunately, most of the FRAP experiments in Taylor et al. (2019) are partial FRAP experiments, in which only part of the dense phase is photobleached. The recovery time for such experiments primarily reflects the internal mixing speed of the dense phase rather than the exchange dynamics at the interface or transport from the dilute phase.
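      The size-dependent crossover discussed above can be sketched under a simplifying assumption: diffusion-limited recovery times scale as R², while an interface-limited recovery time scales as R (surface flux over droplet volume). The lumped coefficients `a` and `b` below are hypothetical stand-ins for the model parameters, and the numbers are placeholders, not LAF-1 values.

```python
def recovery_time(R, a, b):
    """Assumed generic form: tau(R) = a * R**2 (diffusion) + b * R (interface)."""
    return a * R**2 + b * R

def crossover_radius(a, b):
    """Radius at which the two contributions are equal: a * R**2 == b * R."""
    return b / a

# Hypothetical lumped coefficients in arbitrary units
a, b = 2.0, 6.0
R_c = crossover_radius(a, b)
```

      Below `R_c` the linear (interface) term dominates, and above it the quadratic (diffusion) term does, which matches the qualitative picture described in the response.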

      c. The authors nicely relate the exchange timescale to various model parameters. Is LAF-1 the only protein for which the various dilute/dense concentrations/diffusivities are known? Given the large number of FRAP and other related studies, can the authors report on a few other model condensate protein systems? This will help broaden the reach of this model in the context of other previously reported data. If such data are lacking, a discussion of this would be important.

      Yes, indeed, we have found numerous publications with FRAP experiments performed on whole droplets of various proteins. However, none of these have provided a complete set of parameters to allow a quantitative analysis. Part of the reason is that it is nontrivial to accurately measure the partition coefficient (cden/cdil). We have added a sentence in the Discussion to promote future quantitative experiments and analyses of condensate exchange dynamics:

      “We hope that our study will motivate further experimental investigations into the anomalous exchange dynamics of LAF-1 droplets and potentially other condensates, and the mechanisms underlying interface resistance. (Lines 320-322)”

      To broaden the audience for this work in the hope of stimulating such studies, we have also modified the title and abstract so that they will be more visible to the FRAP community:

      “The exchange dynamics of biomolecular condensates (Line 1)”

      “A hallmark of biomolecular condensates formed via liquid-liquid phase separation is that they dynamically exchange material with their surroundings, and this process can be crucial to condensate function. Intuitively, the rate of exchange can be limited by the flux from the dilute phase or by the mixing speed in the dense phase. Surprisingly, a recent experiment suggests that exchange can also be limited by the dynamics at the droplet interface, implying the existence of an “interface resistance”. Here, we first derive an analytical expression for the timescale of condensate material exchange, which clearly conveys the physical factors controlling exchange dynamics. We then utilize sticker-spacer polymer models to show that interface resistance can arise when incident molecules transiently touch the interface without entering the dense phase, i.e., the molecules “bounce” from the interface. Our work provides insight into condensate exchange dynamics, with implications for both natural and synthetic systems. (Lines 16-26)”

      (2) The reported sticker-spacer simulations, while interesting, represent a very small portion of the parameter space. Can the authors - through a combination of simulation, analyses, or physical reasoning, comment on how the features of their underlying microscopic model (sequence length, implicit linker length, relative stoichiometry of A/B for a given length, overall concentration, sequence pattern properties like correlation length) connect to conductance? This will provide more compelling evidence relating their studies beyond the cursory examination of handpicked sequences. A more verbose description of some of the methods would be appreciated as well, including specifically how to (a) calculate the bond lifetime of isolated A-B pair, and (b) how equilibration/convergence of MD simulations is established.

      In our simulation, the interface conductance is essentially controlled by the fraction of unbound stickers, the encounter rate of a pair of unbound stickers, the dilute- and dense-phase concentrations, and the width of the interface. As a result, weaker binding strength and/or deviation of A:B stoichiometry from 1:1 result in a higher interface conductance. A6B6 polymers with long blocks of stickers of the same type (compared to (A2B2)3 and (A3B3)2) have a lower dilute-phase concentration and thinner interface width, so lower conductance. Sequence length and implicit linker length can have more complex effects, which are beyond the scope of the current study. We have now provided an explicit expression for 𝜅 in Equation (14) and added a discussion sentence in the text:

      “More generally, we find that the interface conductance of the sticker-spacer polymers is controlled by the encounter rate of a pair of unbound stickers and the availability of these stickers, which in turn depends on the sticker-sticker binding strength, the dilute- and dense-phase polymer concentrations, and the width of the interface:

      where 𝓃 is the number of monomers in a polymer,  is the global stoichiometry (i.e., ), and are the fractions of unbound A/B monomers in the dilute and dense phases. (Lines 208-214)”

      We have also added a few sentences in Appendix 2 to describe how we calculate the bond lifetime of an isolated A-B pair and how equilibration in simulations is established.

      “Briefly, the bond lifetime of an isolated pair is obtained by simulating a bound pair of A-B stickers in a box and recording the time when they first separate by the cutoff distance of the attractive interaction nm. The mean bond lifetime 𝜏 is found by averaging results of 1000 replicates with different random seeds. (Lines 642-645)”
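      The averaging step described in this passage can be sketched as follows, assuming each replicate has already been reduced to a time series of A-B separations sampled at a fixed interval. The function names, the sampling interval, and the toy trajectories are hypothetical; the actual molecular dynamics simulation that generates the trajectories is not shown.

```python
from statistics import mean

def first_separation_time(distances, dt, cutoff):
    """Time at which the pair separation first reaches `cutoff`, or None if it never does."""
    for step, d in enumerate(distances):
        if d >= cutoff:
            return step * dt
    return None

def mean_bond_lifetime(replicates, dt, cutoff):
    """Average first-separation time over replicates.

    Returns (mean lifetime, number of replicates in which the bond never broke).
    """
    times = [first_separation_time(r, dt, cutoff) for r in replicates]
    observed = [t for t in times if t is not None]
    if not observed:
        raise ValueError("no replicate reached the cutoff distance")
    return mean(observed), len(times) - len(observed)

# Toy separations (arbitrary units), three replicates instead of 1000
toy = [[0.1, 0.2, 0.6], [0.1, 0.7], [0.1, 0.1]]
tau, unbroken = mean_bond_lifetime(toy, dt=2.0, cutoff=0.5)
```

      Replicates in which the bond never breaks are dropped from the average here, which biases the estimate low; in practice each replicate would need to run long enough that `unbroken` is zero.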

      “To test if the system has reached equilibrium, we compare the dense- and dilute-phase concentrations derived from the first and second halves of the recorded data. The agreement indicates that the system has reached equilibrium. (Lines 586-589)”
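      A minimal sketch of this half-versus-half check, assuming the recorded concentration is available as a one-dimensional time series. The 5% relative tolerance is an illustrative assumption; the quoted text does not state what level of agreement the authors required.

```python
import numpy as np

def halves_agree(series, rel_tol=0.05):
    """Compare the means of the first and second halves of a recorded time series.

    Returns (agree, first_half_mean, second_half_mean).
    """
    x = np.asarray(series, dtype=float)
    half = x.size // 2
    first, second = x[:half].mean(), x[half:].mean()
    scale = max(abs(first), abs(second))
    if scale == 0.0:
        return True, first, second
    rel_diff = abs(first - second) / scale
    return bool(rel_diff <= rel_tol), first, second

# Toy dense-phase concentration traces (arbitrary units)
steady = [1.00, 1.02, 0.98, 1.00, 1.01, 0.99]
drifting = [0.5, 0.6, 0.9, 1.0]
steady_ok, _, _ = halves_agree(steady)
drifting_ok, _, _ = halves_agree(drifting)
```

      The same function would be applied separately to the dense-phase and dilute-phase concentrations, since the dilute phase is typically noisier and may need a looser tolerance.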

      (3) A lot of the main text repeats previously published models (continuum ones in Taylor et al. 2019 and Hubatsch et al., 2021, amongst others) and the idea of interface resistance being limiting was already explored quantitatively in Taylor 2019 (including approximate estimates of mass transfer limitations) - this is fine in context. While the authors do a good job of referring to past work in context, the main results of this paper, in my reading, are:

      - a simplified physical form relating conductance timescales.

      - sticker-spacer simulations probing microscopic origins.

      - analysis of size-dependent FRAP scaling.

      I am stating this not as a major weakness, but, rather - I would recommend summarizing and categorizing the sections to make the distinctions between previously reported work and current advances sufficiently clear.

      We thank the reviewer for a clear summary of the contributions of our work. We have highlighted our main contributions in multiple places:

      “Here, we first derive an analytical expression for the timescale of condensate material exchange, which clearly conveys the physical factors controlling exchange dynamics. We then utilize sticker-spacer polymer models to show that interface resistance can arise when incident molecules transiently touch the interface without entering the dense phase, i.e., the molecules “bounce” from the interface. (Lines 21-25)”

      “In the following, we first derive an analytical expression for the timescale of condensate material exchange, which conveys a clear physical picture of what controls this timescale. We then utilize a “sticker-spacer” polymer model to investigate the mechanism of interface resistance. We find that a large interface resistance can occur when molecules bounce off the interface rather than being directly absorbed. We finally discuss characteristic features of the FRAP recovery pattern of droplets when the exchange dynamics is limited by different factors. (Lines 65-70)”

      “Specifically, we first derived an analytical expression for the exchange rate, which conveys the clear physical picture that this rate can be limited by the flux of molecules from the dilute phase, by the speed of mixing inside the dense phase, or by the dynamics of molecules at the droplet interface. Motivated by recent FRAP measurements Taylor et al. (2019) that the exchange rate of LAF-1 droplets can be limited by interface resistance, which contradicts predictions of conventional mean-field theory, we investigated possible physical mechanisms underlying interface resistance using a “sticker-spacer” model. Specifically, we demonstrated via simulations a notable example in which incident molecules have formed all possible internal bonds, and thus bounce from the interface, giving rise to a large interface resistance. Finally, we discussed the signatures in FRAP recovery patterns of the presence of a large interface resistance. (Lines 291-300)”

      Reviewer #2 (Public Review):

      Summary:

      In this paper, the authors have obtained an analytical expression that provides intuition about regimes of interfacial resistance that depend on droplet size. Additionally, through simulations, the authors provide microscopic insight into the arrangement of sticky and non-sticky functional groups at the interface. The authors introduce bouncing dynamics for rationalizing quantity recovery timescales.

      I found several sections that felt incomplete or needed revision and additional data to support the central claim and make the paper self-contained and coherent.

      We thank the reviewer for spending time on our manuscript and for the helpful critical comments.

      First, the analytical theory operates with diffusion coefficients for dilute and dense phases. For the dilute phase, this is fine. For the dense phase, I have doubts that dynamics can be described as diffusive. Most likely, dynamics is highly subdiffusive due to crowded, entangled, and viscoelastic environments of densely packed interactive biomolecules. Some explanation and justification are in order here.

      The reviewer is correct in noting that molecules within a condensate can move subdiffusively due to the viscoelastic nature of the condensate. However, subdiffusion only occurs at short time and small length scales; the motion of molecules becomes diffusive at longer time and larger length scales. The crossover time here is the terminal relaxation time, which has been measured to be on the order of milliseconds to seconds for typical condensates (see Alshareedah, Ibraheem, et al. "Determinants of viscoelasticity and flow activation energy in biomolecular condensates" Science Advances 10.7, 2024). We have also previously found that, for sticker-spacer polymers, this relaxation time is determined by the time it takes for a sticker to switch to a new partner (see Ronceray et al. (2022) in References), and is therefore largely determined by the bond lifetime of a sticker pair. The crossover length scale is expected to be comparable to the size of a molecule based on the theory of polymer disentanglement. Importantly, in order for the bleached droplet to recover its fluorescence, the bleached molecules must travel for a much longer time and over a much larger distance than the crossover time and length. It is therefore expected that the molecules move diffusively on the relevant timescale of a FRAP experiment, albeit with a diffusion coefficient that reflects crowding and entanglement on short time and length scales.

      The second major issue is that I did not find a clean comparison of simulations with the derived analytical expression. Simulations test various microscopic properties on the value of k, which is important. But how do we know that it is the same quantity that appears in the expressions? Also, how can we be sure that analytical expressions can guide simulations and experiments as claimed? The authors should provide sound evidence of the predictive aspect of their derived expressions.

      We thank the reviewer for raising this critical issue. We agree with the reviewer that we did not perform an explicit simulation to validate the developed theory, which leaves a gap between our theory and simulations. The main reason is that simulating an in silico “FRAP experiment” on a 3D droplet is computationally very costly. Nevertheless, following the reviewer’s suggestion, we have now performed such a simulation in which we “bleached” a small A6B6 droplet and measured its recovery time. The good agreement between simulation and theory helps validate our overall combined computational and analytical approach. We have incorporated the new simulation and results into the manuscript. Two new sections, including new figures (Figure 4 and Appendix 2 Figure 4), have been added: “Direct simulation of droplet FRAP” in the main text (lines 232-261) and “Details of simulation and theory of FRAP recovery of an A6B6 droplet” in Appendix 2 (lines 665-715).

      Are the plots in Figure 4 coming from experiment, theory, and simulation? I could not find any information either in the text or in the caption.

      Figure 4 (now Figure 5) is from theory which uses parameters of the A6B6 system in simulation. We have added the following sentences to clarify:

      “We compare the measured FRAP recovery time for the small droplet (green circle) to theoretical predictions from Equation (6) (gray) and Equations (1) - (4) (black) in Figure 5A. (Lines 255-257)”

      “Figure 5. FRAP recovery patterns for large versus small droplets can be notably different for condensates with a sufficiently large interface resistance. (A) Expected relaxation time as a function of droplet radius for in silico “FRAP experiments” on the A6B6 system. The interface resistance dominates recovery times for smaller droplets, whereas dense-phase diffusion dominates recovery times for larger droplets. Green circle: FRAP recovery time obtained from direct simulation of an A6B6 droplet of radius 37 nm. Black curve: the recovery time as a function of droplet radius from a single exponential fit of the exact solution of the recovery curve from Equations (1) - (4). Gray curve: the recovery time predicted by Equation (6). Yellow, blue, and red curves: the recovery time when dense-phase, dilute-phase, and interface flux limit the exchange dynamics, i.e., the first, second, and last term in Equation (6), respectively. Parameters matched to the simulated A6B6 system in the slab geometry: (B) Time courses of fluorescence profiles for A6B6 droplets of radius  (top) and  (bottom); red is fully bleached, green is fully recovered. These concentration profiles are the numerical solutions of Equations (1) - (3) using the parameters in (A). (Below Figure 5)”
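      The "single exponential fit" of a recovery curve mentioned in the caption can be sketched as below. This assumes a normalized recovery of the form f(t) = f_inf (1 - exp(-t/τ)) and uses a log-linear least-squares fit; the authors' actual fitting procedure is not described in this response, so the function name and method are illustrative only.

```python
import numpy as np

def fit_recovery_time(t, f, f_inf=1.0):
    """Estimate tau from f(t) = f_inf * (1 - exp(-t / tau)) by a log-linear fit.

    Only points with f < f_inf are used, since log(f_inf - f) must be defined.
    """
    t = np.asarray(t, dtype=float)
    f = np.asarray(f, dtype=float)
    unrecovered = f_inf - f
    mask = unrecovered > 0
    slope, _intercept = np.polyfit(t[mask], np.log(unrecovered[mask]), 1)
    return -1.0 / slope

# Synthetic, noise-free recovery curve with a known timescale (arbitrary units)
t = np.linspace(0.0, 10.0, 50)
f = 1.0 - np.exp(-t / 2.5)
tau_fit = fit_recovery_time(t, f)
```

      With noisy data, the late-time points where the curve is nearly saturated dominate the error in log space, so a nonlinear fit or a truncated time window may be preferable.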

    1. Author response:

      We thank the reviewers for their insightful comments on our model and manuscript. In this provisional response, we would like to comment on some of the issues raised and how we plan to address them.

      First, the reviewers correctly pointed out that only a small part of the full model was openly available. We have now rectified this and the full model is available at: https://dataverse.harvard.edu/dataverse/sscx.

      Next, we would like to comment on the perceived lack of clarity of certain descriptions in the manuscript. We note that individual techniques and parts of the model have been developed, justified, and validated in previous publications. This left us with the question of how much of the contents of those papers we should re-describe. Too much, and the manuscript becomes overly long; too little, and the reader cannot gain a sufficient understanding of the model building process. The reviewers' comments made it clear that some aspects of the model should be described in more detail and we plan to address this in a revision. Crucially, one missing item raised by all reviewers was a comparison of local connection probabilities to the literature. This will be provided in the revision. Additionally, the reviewers questioned our decision to use a connectivity algorithm that is not based on direct parameterization of target connection probabilities. While this is a limitation of the algorithm we employed, it also has unique strengths, providing non-random aspects of connectivity that have been proven to be impossible to model with algorithms that enforce given connection probabilities or degree distributions. We plan to explain this better in a revision.

      We will also comment on the challenges associated with the interpretation of experimentally measured connection probabilities and employing them for the parameterization of a biophysically detailed model spanning millimeters.

      The reviewers also suggested several aspects of the model that could be improved. Whilst we see merit in all of them, we would like to briefly comment on model completeness in general. First, this model, like any model, can probably never be considered complete. Instead, the model has to be continuously refined, which one reviewer phrased as the "live nature" of the model. However, to demonstrate the model's utility and justify the expense of modeling, we also have to use the model in projects that explore specific scientific questions. To undertake and complete such a project, one must select and "freeze" a given version of the model; otherwise the project will never conclude. Further, we believe that it is advantageous if several projects use the same version of the model. In that case, a reader who is already familiar with the model from one paper may find it easier to understand other papers using the same model. The goal of this manuscript is to describe the version of the model that we used in several ongoing and concluded follow-up projects, including its limitations and opportunities for refinement. As such, we do not plan to add further improvements to the model for this reviewed pre-print. We will, however, continue to refine the model outside of the scope of this publication. Since we believe the development of bottom-up models is best done in a community-driven manner, we encourage interested parties to participate.

      We invite anyone with ideas of how the model could be refined to contact us to discuss how we could integrate these changes into the model together using our tools.

    2. eLife assessment

      This manuscript reports a detailed model of juvenile rat somatosensory cortex, consisting of 4.2 million morphologically and biophysically detailed neuron models, arranged in space and connected according to diverse experimental data - a valuable tool for the field. The construction of the model is based on a solid methodology, but the supporting evidence is incomplete, as it is currently not emulating known local connection probabilities and variations in cortical thickness. It should be noted that, by necessity, such a large-scale model development involves many assumptions, interpolations, and decisions that could have compounding downstream effects on further analyses that may be difficult to disambiguate.

    3. Reviewer #1 (Public Review):

      Summary:

      In this study, the authors describe the construction of an extremely large-scale anatomical model of juvenile rat somatosensory cortex (excluding the barrel region), which extends earlier iterations of these models by expanding across multiple interconnected cortical areas. The models are constructed in such a way as to maintain biological detail from a granular scale - for example, individual cell morphologies are maintained, and synaptic connectivity is founded on anatomical contacts. The authors use this model to investigate a variety of properties, from cell-type specific targeting (where the model results are compared to findings from recent large-scale electron microscopy studies) to network metrics. The model is also intended to serve as a platform and resource for the community by being a foundation for simulations of neuronal circuit activity and for additional anatomical studies that rely on the detailed knowledge of cellular identity and connectivity.

      Strengths:

      As the authors point out, the combination of scale and granularity of their model is what makes this study valuable and unique. The comparisons with recent electron microscopy findings are some of the most compelling results presented in the study, showing that certain connectivity patterns can arise directly from the anatomical configuration, while other discrepancies highlight where more selective targeting rules (perhaps based on molecular cues) are likely employed. They also describe intriguing effects of cortical thickness and curvature on circuit connectivity and characterize the magnitude of those effects on different cortical layers.

      The detailed construction of the model is drawn on a wide range of data sources (cellular and synaptic density measures, neuronal morphologies, cellular composition measures, brain geometry, etc.) that are integrated together; other data sources are used for comparison and validation. This consolidation and comparison also represent a valuable contribution to the overall understanding of the modeled system.

      Weaknesses:

      The scale of the model, which is a primary strength, can also carry some drawbacks. In order to integrate all the diverse data sources together, many specific decisions must be made about, for example, translating findings from different species or regions to the modeled system, or deciding which aspects of the system can be assumed to be the same and which should vary. All these decisions will have effects on the predicted results from the model, which could limit the types of conclusions that can be made (both by the authors and by others in the community who may wish to use the model for their own work).

      As an example, while it is interesting that broad brain geometry has effects on network structure (Figure 7), it is not clear how those effects are actually manifested. I am not sure if some of the effects could be due to the way the model is constructed - perhaps there may be limited sets of morphologies that fit into columns of particular thicknesses, and those morphologies may have certain idiosyncrasies that could produce different statistics of connectivities where they are heavily used. That may be true to biology, but it may also be somewhat artifactual if, for example, the only neurons in the library that fit into that particular part of the cortex differ from the typical neurons that are actually found in that region (but may not have been part of the morphological sampling). I also wonder how much the assumption that the layers have the same relative thicknesses everywhere in the cortex affects these findings, since layer thicknesses do in fact vary across the cortex.

      In addition, the complexity of the model means that some complicated analyses and decisions are only presented in this manuscript with perhaps a single panel and not much textual explanation. I find, for example, that the panels of Figure S2 seem to abstract or simplify many details to the point where I am not clear about what they are actually illustrating - how does Figure S2D represent the results of "the process illustrated in B"? Why are there abrupt changes in connectivity at region borders (shown as discontinuous colors), when dendrites and axons span those borders and so would imply interconnectivity across the borders? What do the histograms in E1 and E2 portray, and how are they related to each other?

      Overall, the model presented in this study represents an enormous amount of work and stands as a unique resource for the community, but also is made somewhat unwieldy for the community to employ due to the weight of its manifold specific construction decisions, size, and complexity.

    4. Reviewer #2 (Public Review):

      Summary:

      The authors build a colossal anatomical model of juvenile rat non-barrel primary somatosensory cortex, including inputs from the thalamus. This enhances past models by incorporating information on the shape of the cortex and estimated densities of various types of excitatory and inhibitory neurons across layers. This is intended to enable an analysis of the micro- and mesoscopic organisation of cortical connectivity and to be a base anatomical model for large-scale simulations of physiology.

      Strengths:

      • The authors incorporate many diverse data sources on morphology and connectivity.

      • This paper takes on the challenging task of linking micro- and mesoscale connectivity.

      • By building in the shape of the cortex, the authors were able to link cortical geometry to connectivity. In particular, they make an unexpected prediction that cortical conicality affects the modularity of local connectivity, which should be testable.

      • The authors' analysis of the model led to the interesting prediction that layer 5 neurons connect local modules, which may be testable in the future and provides a basis to link detailed anatomy to functional computations.

      • The visualisation of the anatomy in various forms is excellent.

      • A subnetwork of the model is openly shared (but see question below).

      Weaknesses:

      • Why was non-barrel S1 of the juvenile rat cortex selected as the target for this huge modelling effort? This is not explained.

      • There is no effort to determine how specific or generalisable the findings here are to other parts of the cortex.

      • Although there is a link to physiological modelling in another paper, there is no clear pathway to go from this type of model to understand how the specific function of the modelled areas may emerge here (and not in other cortical areas).

      • In a few places the manuscript could be improved by being more specific in the language, for example:
        - "our anatomy-based approach has been shown to be powerful", I would prefer instead to read about specific contributions of past papers to the field, and how this builds on them.
        - similarly: "ensuring that the total number of synapses in a region-to-region pathway matches biology." Biology here is a loose term and implies too much confidence in the matching to some ground truth. Please instead describe the source of the data, including the type of experiment.

      • Some of the decisions seem a little ad-hoc, and the means to assess those decisions are not always available to the reader e.g.<br /> - pg. 10. "Based on these results, we decided that the local connectome sufficed to model connectivity within a region.". What is the basis for this decision? Can it be formalised?<br /> - "In the remaining layers the results of the objective classification were used to validate the class assignments of individual pyramidal cells. We found the objective classification to match the expert classification closely (i.e., for 80-90% of the morphologies). Consequently, we considered the expert classification to be sufficiently accurate to build the model." The description of the validation is a little informal. How many experts were there? What are their initials? Was inter-rater or intra-rater reliability assessed? What are these numbers? The match with Kanari's classification accuracy should be reported exactly. There are clearly experts among the author list, but we are all fallible without good controls in place, and they should be more explicit about those controls here, in my opinion.<br /> - "Morphology selection was then performed as previously (Markram et al., 2015), that is, a morphology was selected randomly from the top 10% scorers for a given position." A lot of the decisions seem a little ad-hoc, without justification other than this group had previously done the same thing. For example, why 10% here? Shouldn't this be based on selecting from all of the reasonable morphologies?

      • I would like to know if one of the key results relating to modularity and cortical geometry can be further explored. In particular, there seem to be sharp changes in the data at the end of the modelled cortical regions, which need to be explored or explained further.

      • The shape of the juvenile cortex - a key novelty of this work - was based on merely a scalar reduction of the adult cortex. This is very surprising, and surely an oversimplification. Huge efforts have gone into modelling the complex nonlinear development of the cortex, by teams including the developing Human Connectome Project. For such a fundamental aspect of this work, why isn't it possible to reconstruct the shape of this relatively small part of the juvenile rat cortex?

      • The same relative laminar depths are used for all subregions. This will have a large impact on the model. However, relative laminar depths can change drastically across the cortex (see e.g. many papers by Palomero-Gallagher, Zilles, and colleagues). The authors should incorporate the real laminar depths, or, failing that, provide evidence that the laminar depth differences across the subregions included in the model are negligible.

      • The authors perform an affine mapping between mouse and rat cortex. This is again surprising. In human imaging, affine mappings are insufficient to map between two individual brains of the same species and nonlinear transformations are instead used. That an affine transformation should be considered sufficient to map between two different species is then very surprising. For some models, this may be fine, but there is a supposed emphasis here on biological precision in terms of anatomical location.

      • One of the most interesting conclusions, that the connectivity pattern observed is in part due to cooperative synapse formation, is based on analyses that are unfortunately not shown.

      • Open code:<br /> - Why is only a subvolume available to the community?<br /> - Live nature of the model. This is such a colossal model, and effort, that I worry that it may be quite difficult to update in light of new data. For example, how much person and computer time would it take to update the model to account for different layer sizes across subregions? Or to more precisely account for the shape of the juvenile rat cortex?
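
      To make the question about the "top 10% scorers" rule concrete, the sketch below contrasts two selection rules: drawing at random from a fixed top fraction of candidates, and drawing at random from every candidate whose score clears a threshold. This is only an illustration under stated assumptions. The morphology names, the scores, and the threshold of 0.5 are hypothetical, and the code is not the authors' placement pipeline.

```python
import math
import random

def select_top_fraction(scores, fraction, rng):
    """Pick one candidate at random from the best `fraction` of scorers."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_keep = max(1, math.ceil(fraction * len(ranked)))
    return rng.choice(ranked[:n_keep])

def select_above_threshold(scores, threshold, rng):
    """Pick one candidate at random from all scoring at least `threshold`."""
    eligible = sorted(name for name, s in scores.items() if s >= threshold)
    if not eligible:
        raise ValueError("no candidate reaches the threshold")
    return rng.choice(eligible)

# Hypothetical placement scores for 20 candidate morphologies at one position.
scores = {f"morph_{i:02d}": i / 19 for i in range(20)}
rng = random.Random(0)

top_pick = select_top_fraction(scores, 0.10, rng)        # pool of 2 candidates
threshold_pick = select_above_threshold(scores, 0.5, rng)  # pool of 10 candidates
```

      With these toy scores the fixed-fraction rule draws from only two morphologies, whereas the threshold rule draws from ten. The choice of rule therefore affects morphological diversity, which is why a justification for the 10% cut-off would be helpful.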

    5. Reviewer #3 (Public Review):

      This manuscript reports a detailed model of the rat non-barrel somatosensory cortex, consisting of 4.2 million morphologically and biophysically detailed neuron models, arranged in space and connected according to highly sophisticated rules informed by diverse experimental data. Due to its breadth and sophistication, the model will undoubtedly be of interest to the community, and the reporting of anatomical details of modeling in this paper is important for understanding all the assumptions and procedures involved in constructing the model. While a useful contribution to this field, the model and the manuscript could be improved by employing data more directly and comparing simple features of the model's connectivity - in particular, connection probabilities - with relevant experimental data.

      The manuscript is well-written overall but contains a substantial number of confusing or unclear statements, and some important information is not provided.

      Below, major concerns are listed, followed by more specific but still important issues.

      MAJOR ISSUES

      (1) Cortical connectivity.

      Section 2.3, "Local, mid-range and extrinsic connectivity modeled separately", and Figure 4: I am confused about what is done here and why. The authors have target data for connectivity (Figure 4B1). But then they use an apposition-based algorithm that results in connectivity that is quite different from the data (Figure 4B2, C). They then use a correction based on the data (Figure 4E) to arrive at a more realistic connectivity. Why not set the connectivity based on the data right away then? That would seem like a more straightforward approach.

      The same comment applies to Section 2.4., "Specificity of axonal targeting": the distributions of synapses on different types of target cell compartments were not well captured by the original model based on axon-dendrite overlap and pruning, so the authors introduced further pruning to match data specificity. While details of this process and what worked and what didn't may be interesting to some, overall it is not surprising, as it has been well known that cell types exhibit connectivity that is much more specific than "Peters rule" or its simple variations. The question is, since one has the data, why not use the data in the first place to set up the connectivity, instead of using the convoluted process of employing axon-dendrite overlap followed by multiple corrections?

      Most importantly, what is missing from the whole paper is the characterization of connection probabilities, at least for the local circuit within one area. Such connection probabilities can be obtained from the data that the authors already use here, such as the MICRONS dataset. Another good source of such data is Campagnola et al., Science, 2022. Both datasets are for mouse V1, but they provide a comprehensive characterization across all cortical layers, thus offering a good benchmark for comparison of the model with the data. It would be important for the authors to show how the connection probabilities realized in their model for different cell types compare with these data.
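
      As a rough illustration of the requested summary, the sketch below computes a connection probability for each ordered pair of cell types from a binary adjacency matrix. It is a simplification under stated assumptions: the type labels and the toy matrix are hypothetical, and the dependence on intersomatic distance, which matters when comparing with paired-recording or electron-microscopy data, is ignored.

```python
import numpy as np

def connection_probabilities(adjacency, cell_types):
    """Fraction of ordered (pre, post) neuron pairs that are connected, per type pair.

    adjacency  : (N, N) boolean array; adjacency[i, j] is True if neuron i
                 makes at least one synapse onto neuron j.
    cell_types : length-N sequence of type labels (hypothetical names).
    Self-pairs (i == j) are excluded from the possible pairs.
    """
    adjacency = np.asarray(adjacency, dtype=bool)
    cell_types = np.asarray(cell_types)
    labels = sorted(set(cell_types.tolist()))
    probs = {}
    for pre in labels:
        pre_idx = np.flatnonzero(cell_types == pre)
        for post in labels:
            post_idx = np.flatnonzero(cell_types == post)
            block = adjacency[np.ix_(pre_idx, post_idx)]
            n_pairs = len(pre_idx) * len(post_idx)
            n_connected = int(block.sum())
            if pre == post:
                # Remove the diagonal (a neuron paired with itself).
                n_pairs -= len(pre_idx)
                n_connected -= int(adjacency[pre_idx, pre_idx].sum())
            probs[(pre, post)] = n_connected / n_pairs if n_pairs else float("nan")
    return probs

# Toy network: two pyramidal cells ("PC") and two PV interneurons ("PV").
toy_adjacency = np.array([
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
], dtype=bool)
toy_types = ["PC", "PC", "PV", "PV"]
toy_probs = connection_probabilities(toy_adjacency, toy_types)
```

      A table of this kind, computed for the model and restricted to distance ranges matching the experimental sampling, could be set side by side with the published values.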

      (2) Section 2.5, "Structure of thalamic inputs" and Figure 6.

      The text in section 2.5 should provide more details on what was done - namely, that the thalamic axons were generated based on the axon density profiles and then synapses were established based on their overlap with cortical dendrites. Figure S10 where the target axon densities from data and the model axon densities are compared is not even mentioned here. Now, Figure S10 only shows that the axon densities were generated in a way that matches the data reasonably well. However, how can we know that it results in connectivity that agrees with data? Are there data sources that can be used for that purpose? For example, the authors show that in their model "the peaks of the mean number of thalamic inputs per neuron occur at lower depths than the peaks of the synaptic density". Is this prediction of the model consistent with any available data?

      Most importantly, the authors should show how the different cell types in their model are targeted by the thalamic inputs in each layer. Experimental studies suggest specificity in the targeting of interneuron types by thalamic axons, with PV cells being targeted strongly and SST and VIP cells less so.

      (3) "We have therefore made not only the model but also most of our tool chain openly available to the public (Figure 1; step 7)."<br /> In fact it is not the whole model that is made publicly available, but only about 5% of it (211,000 out of 4,200,000 neurons). Also, why is "most" of the tool chain made openly available, and not the whole tool chain?

      OTHER ISSUES

      "At each soma location, a reconstruction of the corresponding m-type was chosen based on the size and shape of its dendritic and axonal trees (Figure S6). Additionally, it was rotated to according to the orientation towards the cortical surface at that point."

      After this procedure, were cells additionally rotated around the white matter-pia axis? If yes, then how much and randomly or not? If not, then why not? Such rotations would seem important because otherwise additional order potentially not present in the real cortex is introduced in the model affecting connectivity and possibly also in vivo physiology (such as the dynamics of the extracellular electric field).

      The term "new in vivo reconstructions" for the 58 neurons used in this paper in addition to "in vitro reconstructions" is a misnomer. It is not straightforward to see where the procedure is described, but then one finds that the part of Methods that describes experimental manipulations is mostly about that (so, a clearer pointer to that part of Methods could be useful). However, the description in Methods makes it clear that it is only labeling that is done in vivo; the microscopy and reconstruction are done subsequently in vitro. I would recommend changing the terminology here, as it is confusing. Also, can the authors show reconstructions of these neurons in the supplementary figures? Is the reconstruction shown in Figure 4A representative?

      In the Discussion, "This was taken into account during the modeling of the anatomical composition, e.g. by using three-dimensional, layer-specific neuron density profiles that match biological measurements, and by ensuring the biologically correct orientation of model neurons with respect to the orientation towards the cortical surface. As local connectivity was derived from axo-dendritic appositions in the anatomical model, it was strongly affected by these aspects.<br /> However, this approach alone was insufficient at the large spatial scale of the model, as it was limited to connections at distances below 1000μm."

      As mentioned above, it is not clear that this approach was sufficient for local connectivity either. It would be great if the authors showed a systematic comparison of local connection probabilities between different cell types in their model with experimental data and commented here in the Discussion about how well the model agrees with the data.

      In the Discussion: "The combined connectome therefore captures important correlations at that level, such as slender-tufted layer 5 PCs sending strong non-local cortico-cortical connections, but thick-tufted layer 5 PCs not." (Also the corresponding findings in Results.)

      If I understand this statement correctly, it may not agree with biological data. See analysis from MICRONS dataset in Bodor et al., https://www.biorxiv.org/content/10.1101/2023.10.18.562531v1.

      Table 2 is confusing. What do pluses and minuses mean? What does it mean that some entries have two pluses? This table is not mentioned anywhere else in the text. If pluses mean some meaningful predictions of the model, then their distribution in the table seems quite liberal and arbitrary. It is not clear to me that the model makes that many predictions, especially for type-specificity and plasticity. Also, why is the hippocampus mentioned in this table? I don't see anything about the hippocampus anywhere else in the paper.

      In the Discussion, "Thus, we made the tools to improve our model also openly available (see Data and Code availability section)."<br /> As mentioned before, the authors themselves write that they made "most of our tool chain openly available to the public", but not all of it.

      Table S2 has multiple question marks. It is not clear whether the "predictions" listed in that table are truly well-thought-out and/or whether experimental confirmations are real.

      Introduction: It would be quite appropriate to cite here Einevoll et al., Neuron, 2019 ("The Scientific Case for Brain Simulations").

    1. eLife assessment

      The authors proposed a novel deep learning framework to estimate posterior distributions of tissue microstructure parameters. This provides a valuable methodology with practical implications for automatically estimating parameter distributions from different biophysical models. The experiments show solid evidence for generalizing the method to use data from different protocol acquisitions and work with models of varying complexity.

    2. Reviewer #1 (Public Review):

      The authors proposed a framework to estimate the posterior distribution of parameters in biophysical models. The framework has two modules: the first MLP module is used to reduce data dimensionality and the second NPE module is used to approximate the desired posterior distribution. The results show that the MLP module can capture additional information compared to manually defined summary statistics. By using the NPE module, the repetitive evaluation of the forward model is avoided, thus making the framework computationally efficient. The results show the framework has promise in identifying degeneracy. This is an interesting work.

    3. Reviewer #2 (Public Review):

      Summary:<br /> The authors improve the work of Jallais et al. (2022) by including a novel module capable of automatically learning feature selection from different acquisition protocols inside a supervised learning framework. Combining the module above with an estimation framework for estimating the posterior distribution of model parameters, they obtain rich probabilistic information (uncertainty and degeneracy) on the parameters in a reasonable computation time.

      The main contributions of the work are:<br /> (1) The whole framework allows the user to avoid manually defining summary statistics, which may be slow and tedious and affect the quality of the results.<br /> (2) The authors tested the proposal by tackling three different biophysical models for brain tissue and using data with characteristics commonly used by the diffusion-MR-microstructure research community.<br /> (3) The authors validated their method well with the state-of-the-art.

      The main weakness is:<br /> (1) The methodology was tested only on scenarios with a signal-to-noise ratio (SNR) equal to 50. It would be interesting to show results with lower SNR and without noise, to demonstrate that the method can detect the model's inherent degeneracies and how degeneracy increases when strong noise is present. I suggest expanding the Figure in Appendix 1 to include this information.
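
      A minimal sketch of how such a noise sweep could be set up is given below. It assumes a magnitude signal normalised to 1 at b = 0, Rician noise, and SNR defined as 1/sigma. These are assumptions of the sketch rather than details confirmed by the manuscript, and the b-values and decay constant are hypothetical.

```python
import numpy as np

def add_rician_noise(signal, snr, rng):
    """Return a magnitude signal corrupted by Rician noise.

    Assumes `signal` is normalised so that the non-diffusion-weighted value
    is 1 and that SNR = 1 / sigma (an assumption of this sketch).
    snr=None or np.inf returns the noise-free signal unchanged.
    """
    signal = np.asarray(signal, dtype=float)
    if snr is None or np.isinf(snr):
        return signal.copy()
    sigma = 1.0 / snr
    real = signal + rng.normal(0.0, sigma, signal.shape)
    imag = rng.normal(0.0, sigma, signal.shape)
    return np.sqrt(real**2 + imag**2)

rng = np.random.default_rng(0)
b_values = np.linspace(0.0, 3.0, 7)   # hypothetical b-values
clean = np.exp(-b_values * 0.8)       # toy mono-exponential decay

# One noisy realisation per SNR level, from noise-free down to SNR = 10.
noisy_by_snr = {snr: add_rician_noise(clean, snr, rng)
                for snr in (np.inf, 50, 20, 10)}
```

      Posterior distributions estimated from each of these signals could then be compared to see how the number and width of the modes change as the noise level increases.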

      The authors showed the utility of their proposal by computing complex parameter descriptors automatically in an achievable time for three different and relevant biophysical models.

      Importantly, this proposal promotes tackling, analyzing, and considering the degenerate nature of the most used models in brain microstructure estimation.

    4. Author response:

      We appreciate the time and effort that you and the reviewers have dedicated to providing valuable feedback on our manuscript. We are grateful to the reviewers for their insightful comments.

      Reviewer #1:<br /> We thank the reviewer for the positive comments made on our manuscript.

      Reviewer #2:<br /> We thank the reviewer for these positive remarks.

      Concerning the main weakness highlighted by the reviewer:

      We presented results in our submitted work both without noise and with a signal-to-noise ratio (SNR) equal to 50. Figure 5 shows exemplar posterior distributions obtained in a noise-free scenario, and Table 1 reports the number of degeneracies for each model on 10000 noise-free simulations. These results highlight that the presence of degeneracies is inherent to the model definition. Figures 3, 6 and 7 present results considering an SNR of 50. Results with lower SNR have indeed not been included in this work. We agree that a figure showing the impact of noise on the posterior distributions will be a good addition to this work. We will include an additional figure in the second version, as suggested.

    1. eLife assessment

      This study shows that Znhit1, a regulator of chromatin and of the histone variant H2A.Z, is required for progression through meiotic prophase. It is an important observation that describes the role of epigenetics and gene expression during meiosis. The analysis is based on complementary approaches at the cytological, single-cell, and genomic levels that provide solid evidence for the role of Znhit1 in the control of gene expression and in the loading of H2A.Z in mouse spermatocytes.

    2. Reviewer #1 (Public Review):

      Summary:

      Sun et al. generated germline-specific cKO mice for the Znhit1 gene and examined its effect on male meiosis. The authors found that the loss of Znhit1 affects the transcriptional activation of pachytene. Znhit1 is a subunit of the SRCAP chromatin remodeling complex and a depositor of H2A.Z, and in cKO spermatocytes, H2A.Z is not deposited into the gene region. The authors claim that this is why pachytene genome activation (PGA) did not occur. These findings provide important insights into the mechanisms of transcriptional regulation during the meiotic prophase.

      Strengths:

      The authors used samples from their original mouse model, analyzing both the epigenome and the transcriptome in detail using diverse NGS analyses to gain new insights into PGA. The quality of the results appeared excellent.

      Weaknesses:

      Overall, the data is inconsistent with the authors' claims and does not support their final conclusions. In addition, the sample used may not be the most suitable for the analysis, but a more suitable sample would dramatically improve the overall quality of the paper.

    3. Reviewer #2 (Public Review):

      Summary:

      The study demonstrates that Znhit1 regulates male meiosis, with deletion causing pachytene failure associated with defective expression of pachytene genes and subtle effects on X-Y pairing and DSB repair. The authors attribute this phenotype to the defective incorporation of the Znhit1 target H2A.Z into chromatin.

      Strengths:

      The paper and the figures are well presented and the narrative is clear. Evidence that the conditional deletion strategy removes Znhit1 is strong, with multiple orthogonal approaches used. Most of the meiotic phenotyping is well performed, and the omics analysis clearly identifies a dramatic effect on the meiotic gene expression program. The link to H2A.Z and A-MYB adds a mechanistic angle to the study.

      Weaknesses:

      (1) Current literature demonstrates that meiotic mutants arrest at one of two stages: midpachytene (stage IV of the seminiferous cycle) or metaphase I (stage XII of the seminiferous cycle). This study documents that in the Znhit1 KO the midpachytene marker H1t appears normally, but that cells arrest before diplotene. If this is true, then arrest must occur during late pachytene, which based on my knowledge has never been documented for a meiotic KO. To resolve this, the authors should present stronger histological substaging evidence to support their claim.

      (2) The authors overlooked the possible effects of Znhit1 deletion on MSCI. Defective MSCI is a well-established cause of pachytene arrest. Actually, the fact that they see X-Y pairing failure should alert them even more strongly to this possibility because MSCI failure is often associated with defective X-Y pairing. This could be easily addressed by examination of their RNAseq data.

      (3) The recombination assays need attention.<br /> - In the text the authors state that they studied RPA2 and DMC1, but the figures show RPA2 and RAD51.<br /> - The RPA counts are not quantitated.<br /> - The conclusion that crossover formation fails (based on MLH1 staining) is not justified. This marker does not appear in wt males until late pachytene, so if cells in this mutant are dying before that stage, MLH1 cannot be assessed.<br /> - The authors state that gH2AX persists in the KO, but I'm not convinced that they are comparing equivalent stages in the wt and KO. In Figure 3C, the pachytene cell is late, whereas in the mutant the pachytene cell is early or mid (when residual gH2AX is expected, even in wt males).<br /> - Previous work (PMID: 23824539) has shown that antibodies reportedly detecting pATM in the sex body are non-specific. I therefore advise caution with the data shown in Figure 3D.

      (4) RNAseq data. The authors show convincingly that Znhit1 activates genes that are normally upregulated at the zyg-pachytene transition. They should repeat the analysis for genes normally upregulated at the prelep-lep and lep-zyg transitions to show that this effect is really pachytene-gene specific.

      (5) I am puzzled that the title and overall gist of the study focuses on H2A.Z, when it is Znhit1 that has been deleted.

    4. Reviewer #3 (Public Review):

      Summary:

      Sun et al. present a manuscript detailing the phenotypic characterization of loss of Znhit1 in male germ cells. Znhit1 is a subunit of the chromatin regulating complex SRCAP that functions to deposit the histone variant H2A.Z. Given that meiosis, and specifically meiotic recombination, occurs in the context of the dynamic condensing of chromosomes, the role of chromatin regulators in general, and histone variants specifically, in mammalian meiosis is an active area of research. Previous work has shown that H2A.Z is found at the locations of recombination in plants, although H2A.Z was previously not found at recombination sites in mammalian meiosis. Here the authors use a conditional approach to ablate Znhit1 in spermatocytes and characterize a block in meiosis in prophase I in the transition from pachytene to diplotene stage.

      Strengths:

      The authors combine current methods in immunohistochemistry and functional genomics to provide strong evidence of meiotic block upon the loss of Znhit1. They find that loss of Znhit1 leads to reduced incorporation of the histone variant H2A.Z, specifically at promoters and enhancers. Further, RNA sequencing found more genes are down-regulated upon loss of Znhit1 compared to upregulated, suggesting that incorporation of H2A.Z is critical for the expression of genes necessary for successful meiotic progression.

      A strength of the manuscript is tying the locations of changes in H2A.Z deposition with binding of the transcription factor A-MYB, providing a mechanism that can potentially combine the changes in chromatin regulation with variable binding of a transcription factor in gene expression in pachytene stage spermatocytes.

      Weaknesses:

      A weakness is the single-cell RNA experiment using cells from 16-day-old male mice. The authors suggest that the rationale for the experiment was to determine where the Znhit1-sKO mutant showed an arrest in meiosis, and claim that this is the pachytene stage. However, in the 'first wave' of meiosis 16-day-old mice are just beginning to enter pachytene, so cells from later meiotic stages will be largely absent in these tubules. This is clear from the UMAP showing a similar pattern of cell distributions between wild-type and mutant mice. Using older mice would have better demonstrated where the mutant and wild-type mice differ in cell-type composition.

      The authors use the term pachytene genome activation (PGA) in the manuscript to suggest a novel process by which genes are specifically increased in expression in the pachytene stage of meiotic prophase I, without reference to literature that establishes the term. If the authors are putting forward a new concept defined by this term, it would strengthen the manuscript to describe it further, delineate which genes are activated, and discuss potential mechanisms.

      Generally speaking, the authors present solid evidence for a pachytene block in male germ cell development in mice lacking Znhit1 in spermatocytes. The change in gene expression during pachytene, with more genes downregulated than upregulated in the mutant, together with the changes in histone modification dynamics and placement of H2A.Z, supports a role in the alteration of meiotic gene regulation. However, the claim that changes in H2A.Z impact meiotic recombination (as suggested in the manuscript title) is less well supported than the alternative of a general cell arrest in the pachytene stage leading to cell death. The conclusions about Znhit1 directly influencing meiotic recombination could use further justification or a mechanistic hypothesis.

    1. eLife assessment

      This work describes a convincingly validated non-invasive tool for in vivo metabolic phenotyping of aggressive brain tumors in mouse brains. The analysis provides a valuable technique that tackles the unmet need for patient stratification and hence for early assessment of therapeutic efficacy. However, wider clinical applicability of the findings can be attained by expanding the work to include more diverse tumor models.

    2. Reviewer #1 (Public Review):

      Summary:

      This work introduces a new imaging tool for profiling tumor microenvironments through glucose conversion kinetics. Using GL261 and CT2A intracranial mouse models, the authors demonstrated that tumor lactate turnover mimicked the glioblastoma phenotype, and differences in peritumoral glutamate-glutamine recycling correlated with tumor invasion capacity, aligning with histopathological characterization. This paper presents a novel method to image and quantify glucose metabolites, reducing background noise and improving the predictability of multiple tumor features. It is, therefore, a valuable tool for studying glioblastoma in mouse models and enhances the understanding of the metabolic heterogeneity of glioblastoma.

      Strengths:

      By combining novel spectroscopic imaging modalities and recent advances in noise attenuation, Simões et al. improve upon their previously published Dynamic Glucose-Enhanced deuterium metabolic imaging (DGE-DMI) method to resolve spatiotemporal glucose flux rates in two commonly used syngeneic GBM mouse models, CT2A and GL261. This method can be standardized and further enhanced by using tensor PCA for spectral denoising, which improves kinetic modeling performance. It enables the glioblastoma mouse model to be assessed and quantified with higher accuracy using imaging methods.

      The study also demonstrated the potential of DGE-DMI by providing spectroscopic imaging of glucose metabolic fluxes in both the tumor and tumor border regions. By comparing these results with histopathological characterization, the authors showed that DGE-DMI could be a powerful tool for analyzing multiple aspects of mouse glioblastoma, such as cell density and proliferation, peritumoral infiltration, and distant migration.

      Weaknesses:

      Although the paper provides clear evidence that DGE-DMI is a potentially powerful tool for the mouse glioblastoma model, it fails to use this new method to discover novel features of tumors. The data presented mainly confirm tumor features that have been previously reported. While this demonstrates that DGE-DMI is a reliable imaging tool in such circumstances, it also diminishes the novelty of the study.

      When using DGE-DMI to quantitatively map glycolysis and mitochondrial oxidation fluxes, there is no comparison with other methods to directly identify the changes. This makes it difficult to assess how sensitive DGE-DMI is in detecting differences in glycolysis and mitochondrial oxidation fluxes, which undermines the claim of its potential for in vivo GBM phenotyping.

      The study only used intracranial injections of two mouse glioblastoma cell lines, which limits the application of DGE-DMI in detecting and characterizing de novo glioblastomas. A de novo mouse model can show tumor growth progression and is more heterogeneous than a cell line injection model. Demonstrating that DGE-DMI performs well in a more clinically relevant model would better support its claimed potential usage in patients.