  1. Apr 2024
    1. Reviewer #1 (Public Review):

      Summary

      A new method, tCFS, is introduced to offer richer and more efficient measurement of interocular suppression. It generates a new index, the suppression depth, based on the contrast difference between the up-ramped contrast at which the target breaks through suppression and the down-ramped contrast at which the target disappears into suppression. A uniform suppression depth regardless of image type (e.g., faces, gratings and scrambles) was discovered in the paper, favoring an early-stage mechanism involving CFS. The paper also discusses claims of unconscious processing and the related mechanisms.

      Strength

      The tCFS method adds to the existing bCFS paradigms by providing the (re-)suppression threshold and thereby the suppression depth. Benefiting from adaptive procedures with continuous trials, tCFS is able to give fast and efficient measurements. It also provides a new opportunity to test theories and models about how information is processed outside visual awareness.

      Weakness:

      This paper reports the surprising finding of uniform suppression depth over a variety of stimuli. This is novel and interesting. But given the limited samples tested, the claim of uniform suppression depth needs to be further examined with respect to different complexities and semantic meanings.

      Intuitively, the results challenge previous views about "preferential processing" for certain categories, though they invite further research to explore what exactly suppression depth could tell us about unconscious visual processing.

    2. Reviewer #3 (Public Review):

      Summary:

      In the 'bCFS' paradigm, a monocular target gradually increases in contrast until it breaks interocular suppression by a rich monocular suppressor in the other eye. The present authors extend the bCFS paradigm by allowing the target to reduce back down in contrast until it becomes suppressed again. The main variable of interest is the contrast difference between breaking suppression and (re-)entering suppression. The authors find this difference to be constant across a range of target types, even ones that differ substantially in the contrast at which they break interocular suppression (the variable conventionally measured in bCFS). They also measure how the difference changes as a function of other manipulations. Interpretation is in terms of the processing of unconscious visual content, as well as in terms of the mechanism of interocular suppression.
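
      As a rough illustration of the index described above, suppression depth can be sketched as the difference between the mean breakthrough threshold and the mean re-suppression threshold. This is a minimal sketch and not the authors' analysis code: the function name, the conversion of contrast to decibels (20 x log10 of contrast), and the example values are all assumptions made for illustration.

```python
import math
from statistics import mean


def suppression_depth_db(breakthrough_contrasts, suppression_contrasts):
    """Mean breakthrough threshold minus mean re-suppression threshold, in dB.

    Hypothetical helper: contrasts are assumed to be positive proportions
    (e.g. 0.2 for 20% contrast) and are converted to dB as 20 * log10(c).
    """
    if not breakthrough_contrasts or not suppression_contrasts:
        raise ValueError("both threshold lists must be non-empty")

    def to_db(contrast):
        if contrast <= 0:
            raise ValueError("contrast must be positive")
        return 20 * math.log10(contrast)

    breakthrough_db = mean(to_db(c) for c in breakthrough_contrasts)
    suppression_db = mean(to_db(c) for c in suppression_contrasts)
    return breakthrough_db - suppression_db


# Made-up thresholds: each breakthrough contrast is ten times the
# corresponding re-suppression contrast, i.e. a 20 dB difference.
breakthrough = [0.20, 0.25, 0.22]
suppression = [0.020, 0.025, 0.022]
depth = suppression_depth_db(breakthrough, suppression)
```

      Under these assumptions, two target types could differ in their breakthrough thresholds yet yield the same depth, which is the pattern the authors report.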

      Strengths:

      Interpretation of bCFS findings is mired in controversy, and this is an ingenious effort to move beyond the paradigm's exclusive focus on breaking suppression. The notion of using the contrast difference between breaking and entering suppression as an index of suppression depth is interesting. The finding that this difference is similar for a range of target types that do differ in the contrast at which they break suppression suggests a common mechanism of suppression across those target types.

    1. eLife assessment

      This important study shows that distinct midbrain dopaminergic axons in the medial prefrontal cortex respond to aversive and rewarding stimuli and suggests that they are biased toward aversive processing. The use of innovative microprism-based two-photon calcium imaging to study single axon heterogeneity is convincing, although the experimental design makes it difficult to definitively distinguish aversive valence from stimulus salience in this dopamine projection. This work will be of interest to neuroscientists working on neuromodulatory systems, cortical function and decision making.

    2. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Abe and colleagues employ in vivo 2-photon calcium imaging of dopaminergic axons in the mPFC. The study reveals that these axons primarily respond to unconditioned aversive stimuli (US) and enhance their responses to initially-neutral stimuli after classical association learning. The manuscript is well-structured and presents results clearly. The utilization of a refined prism-based imaging technique, though not entirely novel, is well-implemented. The study's significance lies in its contribution to the existing literature by offering single-axon resolution functional insights, supplementing prior bulk measurements of calcium or dopamine release. Given the current focus on neuromodulator neuron heterogeneity, the work aligns well with current research trends and will greatly interest researchers in the field.

      Comment on the revised version:

      In my opinion, the authors did a great job with the revision of the manuscript.

    3. Reviewer #3 (Public Review):

      Summary:

      The authors image dopamine axons in medial prefrontal cortex (mPFC) using microprism-mediated two-photon calcium imaging. They image these axons as mice learn that two auditory cues predict two distinct outcomes: tail shock or water delivery. They find that some axons show a preference for encoding of the shock and some show a preference for encoding of water. The authors report a greater number of dopamine axons in mPFC that respond to shock. Across time, the shock-preferring axons begin to respond preferentially to the cue predicting shock, while there is a less pronounced increase in the water-responsive axons that acquire a response to the water-predictive cue (these axons also increase non-significantly to the shock-predictive cue). These data lead the authors to argue that dopamine axons in mPFC preferentially encode aversive stimuli.

      Strengths:

      The experiments are beautifully executed and the authors have mastered an impressively complex technique. Specifically, they are able to image and track individual dopamine axons in mPFC across days of learning. And this technique is used the way it should be: the authors isolate distinct dopamine axons in mPFC and characterize their encoding preferences and how this evolves across learning of cue-shock and cue-water contingencies. Thus, these experiments are revealing novel information about how aversive and rewarding stimuli are encoded at the level of individual axons, in a way that has not been done before. This is timely and important.

      Weaknesses:

      The overarching conclusion of the paper is that dopamine axons preferentially encode aversive stimuli. However, this is confounded by differences in the strength of the aversive and appetitive outcomes. As the authors point out, the axonal response to stimuli is sensitive to outcome magnitude (Supp Fig 3). That is, if you increase the magnitude of water or shock that is delivered, you increase the change in fluorescence that is seen in the axons. Unsurprisingly, the change in fluorescence that is seen to shock is considerably higher than to water reward. Further, over 40% of the axons respond to water early in training [yet just a few lines below the authors write: "Previous studies have demonstrated that the overall dopamine release at the mPFC or the summed activity of mPFC dopamine axons exhibits a strong response to aversive stimuli (e.g., tail shock), but little to rewards", which seems inconsistent with their own data]. Given these aspects of the data, it could be the case that the dopamine axons in mPFC encode different types of information and delegate preferential processing to the most salient outcome across time. The use of two similar-sounding tones (9 kHz and 12 kHz) for the reward- and aversive-predicting cues is likely to enhance this, as it requires a fine-grained distinction between the two cues in order to learn effectively. That is not to say that the mice cannot distinguish between these cues, rather that they may require additional processes to resolve the similarity, which are known to be dependent on the mPFC.

      There is considerable literature on mPFC function across species that would support such a view. Specifically, theories of mPFC function (in particular prelimbic cortex, which is where the axon images are mostly taken) generally center around resolution of conflict in what to respond to, learn about, and attend to. That is, mPFC is important for devoting the most resources (learning, behavior) to the most relevant outcomes in the environment. These data, then, provide a mechanism for this to occur in mPFC. That is, dopamine axons signal to the mPFC the most salient aspects of the environment, which should be preferentially learnt about and responded towards. This is also consistent with the absence of a negative prediction error during omission: the dopamine axons show increases in responses during receipt of unexpected outcomes but do not encode negative errors. This supports a role for this projection in helping to allocate resources to the most salient outcomes and their predictors, and not learning per se. Below are just a few references from the rich literature on mPFC function (some consider rodent mPFC analogous to DLPFC, some mPFC), which advocate for a role for this region in allocating attention and cognitive resources to the most relevant stimuli, and do not indicate preferential processing of aversive stimuli.

      References:

      1. Miller, E. K., & Cohen, J. D. (2001). An integrative theory of prefrontal cortex function. Annual Review of Neuroscience, 24(1), 167-202.
      2. Bissonette, G. B., Powell, E. M., & Roesch, M. R. (2013). Neural structures underlying set-shifting: roles of medial prefrontal cortex and anterior cingulate cortex. Behavioural Brain Research, 250, 91-101.
      3. Desimone, R., & Duncan, J. (1995). Neural mechanisms of selective visual attention. Annual Review of Neuroscience, 18(1), 193-222.
      4. Sharpe, M. J., Stalnaker, T., Schuck, N. W., Killcross, S., Schoenbaum, G., & Niv, Y. (2019). An integrated model of action selection: distinct modes of cortical control of striatal decision making. Annual Review of Psychology, 70, 53-76.
      5. Ridderinkhof, K. R., Ullsperger, M., Crone, E. A., & Nieuwenhuis, S. (2004). The role of the medial frontal cortex in cognitive control. Science, 306(5695), 443-447.
      6. Nee, D. E., Kastner, S., & Brown, J. W. (2011). Functional heterogeneity of conflict, error, task-switching, and unexpectedness effects within medial prefrontal cortex. NeuroImage, 54(1), 528-540.
      7. Isoda, M., & Hikosaka, O. (2007). Switching from automatic to controlled action by monkey medial frontal cortex. Nature Neuroscience, 10(2), 240-248.

    1. eLife assessment

      This magnetoencephalography study reports important new findings regarding the nature of memory reactivation during cued recall. It replicates previous work showing that such reactivation can be sequential or clustered, with sequential reactivation being more prevalent in low performers. It adds convincing evidence, even though based on limited amounts of data, that high memory performers tend to show simultaneous (i.e., clustered) reactivation, varying in strength with item distance in the learned graph structure. The study will be of interest to scientists studying memory replay.

    2. Reviewer #1 (Public Review):

      Summary:

      Previous work in humans and non-human animals suggests that during offline periods following learning, the brain replays newly acquired information in a sequential manner. The present study uses an MEG-based decoding approach to investigate the nature of replay/reactivation during a cued recall task directly following a learning session, where human participants are trained on a new sequence of 10 visual images embedded in a graph structure. During retrieval, participants are then cued with two items from the learned sequence, and neural evidence is obtained for the simultaneous or sequential reactivation of future sequence items. The authors find evidence for both sequential and clustered (i.e., simultaneous) reactivation. Replicating previous work, low-performing participants tend to show sequential, temporally segregated reactivation of future items, whereas high-performing participants show more clustered reactivation. Adding to previous work, the authors show that an image's reactivation strength varies depending on its proximity to the retrieval cue within the graph structure.

      Strengths:

      As the authors point out, work on memory reactivation has largely been limited to the retrieval of single associations. Given the sequential nature of our real-life experiences, there is clearly value in extending this work to structured, sequential information. State-of-the-art decoding approaches for MEG are used to characterize the strength and timing of item reactivation. The manuscript is very well written with helpful and informative figures in the main sections. The task includes an extensive localizer with 50 repetitions per image, allowing for stable training of the decoders and the inclusion of several sanity checks demonstrating that on-screen items can be decoded with high accuracy.

      Weaknesses:

      Of major concern, the experiment is not optimally designed for analysis of the retrieval task phase, where only 4 min of recording time and a single presentation of each cue item are available for the analyses of sequential and non-sequential reactivation. In their revision, the authors include data from the learning blocks in their analysis. These blocks follow the same trial structure as the retrieval task, and apart from adding more data points could also reveal a possible shift from sequential to clustered reactivation as learning of the graph structure progresses. The new analyses are not entirely conclusive, perhaps because of the variability in the number of learning blocks that participants require to reach criterion. In principle, they suggest that reactivation strength increases from learning (pre-rest) to final retrieval (post-rest).

      On a more conceptual note, the main narrative of the manuscript implies that sequential and clustered reactivation are mutually exclusive, such that a single participant would show either one or the other type. With the analytic methods used here, however, it seems possible to observe both types of reactivation. For example, the observation that mean reactivation strength (across the entire trial, or in a given time window of interest) varies with graph distance does not exclude the possibility that this reactivation is also sequential. In fact, the approach of defining one peak time window of reactivation may bias towards simultaneous, graded reactivation. It would be helpful if the authors could clarify this conceptual point. A strong claim that the two types of reactivation are mutually exclusive would need to be substantiated by further evidence, for instance a suitable metric contrasting "sequenceness" vs "clusteredness".

      On the same point, the non-sequential reactivation analyses use a time window of peak decodability that is determined based on the average reactivation of all future items, irrespective of graph distance. In a sequential forward cascade of reactivations, it could be assumed that the reactivation of near items would peak earlier than the reactivation of far items. In the revised manuscript, the authors now show the "raw" timecourses of item decodability at different graph distances, clearly demonstrating their peak reactivation times, which show convincingly that reactivation for near and far items occurs at very similar time points. The question that remains, therefore, is whether the method of pre-selecting a time window of interest described above could exert a bias towards finding clustered reactivation.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors investigate replay (defined as sequential reactivation) and clustered reactivation during retrieval of an abstract cognitive map. Replay and clustered reactivation were analysed based on MEG recordings combined with a decoding approach. While the authors report evidence for both replay and clustered reactivation during retrieval, replay was exclusively present in low performers. Further, the authors show that reactivation strength declined with increasing graph distance.

      Strengths:

      The paper raises interesting research questions, i.e., replay vs. clustered reactivation and how that supports retrieval of cognitive maps. The paper is well written, well structured and easy to follow. The methodological approach is convincing and definitely suited to address the proposed research questions.

      The paper combines a replication of previous findings (Wimmer et al. 2020) using a new experimental approach with novel evidence (reactivation strength declines as a function of graph distance).

      What I also want to positively highlight is their general transparency. For example, they pre-registered this study but with a focus on a different part of the data and outlined this explicitly in the paper.

      The paper has very interesting findings. However, there are some shortcomings, especially in the experimental design. These are briefly outlined below and are also discussed openly and in detail by the authors.

      Weaknesses:

      The individual findings are interesting. However, due to some shortcomings in the experimental design they cannot be profoundly related to each other. For example, the authors show that replay is present in low but not in high performers with the assumption that high performers tend to simultaneously reactivate items. But then, the authors do not investigate clustered reactivation (= simultaneous reactivation) as a function of performance due to a low number of retrieval trials and ceiling performance in most participants.

      As a consequence of the experimental design, some analyses are underpowered (very low number of trials, n = ~10, and for some analyses, very low number of participants, n = 14).

    1. eLife assessment

      This useful study reports the behavioural and physiological effects of the longitudinal activation of neurons associated with negative experiences. The main claims of the paper are supported by solid experimental evidence, but the specificity of the long-term manipulation requires additional validation. This study will be of interest to neuroscientists working on memory.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study, Jellinger et al. performed engram-specific sequencing and identified genes that were selectively regulated in positive/negative engram populations. In addition, they performed chronic activation of the negative engram population over 3 months and observed several effects on fear/anxiety behavior and cellular events such as upregulation of glial cells and decreased GABA levels.

      Strengths:

      They provide useful engram-specific GSEA data and the main concept of the study, linking negative valence/memory encoding to cellular level outcomes including upregulation of glial cells, is interesting and valuable.

      Weaknesses:

      A number of experimental shortcomings make the conclusion of the study largely unsupported. In addition, the observed differences in behavioral experiments are rather small, inconsistent, and the interpretation of the differences is not compelling.

      Major points for improvement:

      (1) Lack of essential control experiments

      With the current set of experiments, it is not certain that the DREADD system they used was potent and stable throughout the 3 months of manipulations. Basic confirmatory experiments (e.g., slice physiology at 1m vs. 3m) showing that the DREADD effects on these vHP cells are stable would be essential to make these manipulation experiments convincing.

      Furthermore, although the authors use the mCherry vector as a control, they did not have a vehicle/saline control for the hM3Dq AAV. Thus, the long-term effects such as the increase in glial cells could simply be due to the toxicity of DREADD expression, rather than an induced activity of these cells.

      (2) Figure 1 and the rest of the study are disconnected

      The authors used the cFos-tTA system to label positive/negative engram populations, while the TRAP2 system was used for the chronic activation experiments. Although both genetic tools are based on the same IEG Fos, the sensitivity of the tools needs to be validated. In particular, the sensitivity of the TRAP2 system can be arbitrarily altered by the amount of tamoxifen (or 4OHT) and the administration protocols. The authors should at least compare and show the percentage of labeled cells in both methods and discuss that the two experiments target (at least slightly) different populations. In addition, the use of TRAP2 for vHP is relatively new; the authors should confirm that this method actually captures negative engram populations by checking for reactivation of these cells during recall by overlap analysis of Fos staining or by artificial activation.

      (3) Interpretation of the behavior data

      In Figures 3a and b, the authors show that the experimental group showed higher anxiety based on time spent in the center/open area. However, there were no differences in distance traveled and center entries, which are often reduced in highly anxious mice. Thus, it is not clear what the exact effect of the manipulation is. The authors may want to visualize the trajectories of the mice's locomotion instead of just showing bar graphs.

      In addition, the data shown in Figure 4b is somewhat surprising - the 14MO control showed more freezing than the 6MO control, which can be interpreted as "better memory in old". As this is highly counterintuitive, the authors may want to discuss this point. The authors stated that "Mice typically display increased freezing behavior as they age, so these effects during remote recall are expected" without any reference. This is nonsense, as just above in Figure 4a, older mice actually show less freezing than young mice.

      Overall, the behavioral effects are rather small and random. I would suggest that these data be interpreted more carefully.

      (4) Lack of citation and discussion of relevant study

      Khalaf et al. 2018 from Gräff lab showed that experimental activation of recall-induced populations leads to fear attenuation. Despite the differences in experimental details, the conceptual discrepancy should be discussed.

    3. Reviewer #2 (Public Review):

      Summary:

      Jellinger, Suthard, et al. investigated the transcriptome of positive and negative valence engram cells in the ventral hippocampus, revealing anti- and pro-inflammatory signatures of these respective valences. The authors further reactivated the negative valence engram ensembles to assay the effects of chronic negative memory reactivation in young and old mice. This chronic re-activation resulted in differences in aspects of working memory, and fear memory, and caused morphological changes in glia. Such reactivation-associated changes are putatively linked to GABA changes and behavioral rumination.

      Strengths:

      Much of the content of this manuscript is of benefit to the community, such as the discovery of differential engram transcriptomes dependent on memory valence. The chronic activation of neurons, and the resultant effects on glial cells and behavior, also provide the community with important data. Laudable points of this manuscript include the comprehensiveness of behavioral experiments, as well as the cross-disciplinary approach.

      Weaknesses:

      There are several key claims made that are unsubstantiated by the data, particularly regarding the anthropomorphic framing of "rumination" on a mouse model and the role of GABA. The conclusions and inferences in these areas need to be carefully considered.

      (1) There are many issues regarding the arguments for the behavioural data's human translation as "rumination." There is no definition of rumination provided in the manuscript, nor how rumination is similar/different to intrusive thoughts (which are psychologically distinct but used relatively interchangeably in the manuscript), nor how rumination could be modelled in the rodent. The authors mention that they are attempting to model rumination behaviours by chronically reactivating the negative engram ("To understand if our experimental model of negative rumination..."), but this occurs almost at the very end of the results section, and no concrete evidence from the literature is provided to attempt to link the behavioural results (decreased working memory, increased fear extinction times) to rumination-like behaviours. The arguments in the final paragraph of the Discussion section about human rumination appear to be unrelated to the data presented in the manuscript and contain some uncited statements. Finally, the rumination claims seem to be based largely upon a single data figure that needs to be further developed (Figure 6, see also point 2 below).

      (2) The staining and analysis in Figure 6 are challenging to interpret, and require more evidence to substantiate the conclusions of these results. The histological images are zoomed out, and at this resolution, it appears that only the pyramidal cell layer is being stained. A GABA stain should also label the many sparsely spaced inhibitory interneurons existing across all hippocampal layers, yet this is not apparent here. Moreover, both example images in the treatment group appear to have lower overall fluorescence intensity in both DAPI and GABA. The analysis is also unclear: the authors mention "ROIs" used to measure normalized fluorescence intensity but do not specify what the ROI encapsulates. Presumably, the authors have segmented each DAPI-positive cell body and assessed fluorescence - however, this is not explicated nor demonstrated, making the results difficult to interpret.

      (3) A smaller point, but more specific detail is needed on how genes were selected for GSEA analysis. As GSEA relies on genes being specified a priori, these genes need to be selected in a blind/unbiased manner to avoid a circular analysis that biases downstream results and conclusions. It's likely the authors have done this, but explicitly noting how genes were selected is important context for this analysis.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors note that negative ruminations can lead to pathological brain states and mood/anxiety dysregulation. They test this idea by using mouse engram-tagging technology to label dentate gyrus ensembles activated during a negative experience (fear conditioning). They show that chronic chemogenetic activation of these ensembles leads to behavioral (increased anxiety, increased fear generalization, reduced fear extinction) and neural (increases in neuroinflammation, microglia, and astrocytes) changes.

      Strengths:

      The question the authors ask here is an intriguing one, and the engram activation approach is a powerful way to address the question. Examination of a wide range of neural and behavioral dependent measures is also a strength.

      Weaknesses:

      The major weakness is that the authors have found a range of changes that are correlates of chronic negative engram reactivation. However, they do not manipulate these outcomes to test whether microglia, astrocytes, or neuroinflammation are causally linked to the dysregulated behaviors.

    1. eLife assessment

      This important work provides insights into the neural mechanisms regulating specific parental behaviors. By identifying a key role for oxytocin synthesizing cells in the paraventricular nucleus of the hypothalamus and their projections to the medial prefrontal cortex in promoting pup care and inhibiting infanticide, the study advances our understanding of the neurobiological basis of these contrasting behaviors in male and female mandarin voles. The evidence supporting the authors' conclusions is solid but lacks some critical methodological detail. The work should be of interest to researchers studying neuropeptide control of social behaviors in the brain.

    2. Reviewer #1 (Public Review):

      Summary:

      This important study investigated the role of oxytocin (OT) neurons in the paraventricular nucleus (PVN) and their projections to the medial prefrontal cortex (mPFC) in regulating pup care and infanticide behaviors in mandarin voles. The researchers used techniques like immunofluorescence, optogenetics, OT sensors, and peripheral OT administration. Activating OT neurons in the PVN reduced the time it took pup-caring male voles to approach and retrieve pups, facilitating pup-care behavior. However, this activation had no effect on females. Interestingly, this same PVN OT neuron activation also reduced the time for both male and female infanticidal voles to approach and attack pups, suggesting PVN OT neuron activity can promote pup care while inhibiting infanticide behavior. Inhibition of these neurons promoted infanticide. Stimulating PVN->mPFC OT projections facilitated pup care in males; in infanticide-prone voles, activation of these terminals prolonged the latency to approach and attack. Inhibition of PVN->mPFC OT projections promoted infanticide. Peripheral OT administration increased pup care in males and reduced infanticide in both sexes. However, some results differed in females, suggesting other mechanisms may regulate female pup care.

      Strengths:

      This multi-faceted approach provides converging evidence, strengthens the conclusions drawn from the study, and makes them very convincing. Additionally, the study examines both pup care and infanticide behaviors, offering insights into the mechanisms underlying these contrasting behaviors. The inclusion of both male and female voles allows for the exploration of potential sex differences in the regulation of pup-directed behaviors. The peripheral OT administration experiments also provide valuable information for potential clinical applications and wildlife management strategies.

      Weaknesses:

      While the study presents exciting findings, there are several weaknesses that should be addressed. The sample sizes used in some experiments, such as the Fos study and optogenetic manipulations, appear to be small, which may limit the statistical power and generalizability of the results. Effect sizes are not reported, making it difficult to evaluate the practical significance of the findings. The imaging parameters and analysis details for the Fos study are not clearly described, hindering the interpretation of these results (i.e., was the entire PVN counted?). Also, does the Fos colocalization align with previous studies that look at PVN Fos and maternal/paternal care? Additionally, the study lacks electrophysiological data to support the optogenetic findings, which could provide insights into the neural mechanisms underlying the observed behaviors.

      The study has several limitations that warrant further discussion. Firstly, the potential effects of manipulating OT neurons on the release of other neurotransmitters (or the influence of other neurochemicals or brain regions) on pup-directed behaviors, especially in females, are not fully explored. Additionally, it is unclear whether back-propagation of action potentials during optogenetic manipulations causes the same behavioral effect as direct stimulation of PVN OT cells. Moreover, the authors do not address whether the observed changes in behavior could be explained by overall increases or decreases in locomotor activity.

      The authors do not specify the percentage of PVN->mPFC neurons labeled that were OT-positive, nor do they directly compare the sexes in their behavioral analysis (or if they did, it is not clear statistically). While the authors propose that the sex difference in pup-directed behaviors is due to females having greater OT expression, they do not provide evidence to support this claim from their labeling data. It is also uncertain whether more OT neurons were manipulated in females compared to males. The study could benefit from a more comprehensive discussion of other factors that could influence the neural circuit under investigation, especially in females.

    3. Reviewer #2 (Public Review):

      Summary:

      This series of experiments studied the involvement of PVN OT neurons and their projection to the mPFC in pup-care and attack behavior in virgin male and female Mandarin voles. Using Fos visualization, optogenetics, fiber photometry, and IP injection of OT the results converge on OT regulating caregiving and attacks on pups. Some sex differences were found in the effects of the manipulations.

      Strengths:

      Major strengths are the modern multi-method approaches and involving both sexes of Mandarin vole in every experiment.

      Weaknesses:

      Weaknesses include the lack of some specific details in the methods that would help readers interpret the results. These include:

      (1) No description of diffusion of centrally injected agents.

      (2) Whether all central targets were consistent across animals included in the data analyses. For example, it is not stated whether the medial prelimbic mPFC was the target in all optogenetic study animals, as shown in Figure 4, and if that is the case, there is no discussion of that subregion's function compared to other mPFC subregions.

      (3) How groups of pup-care and infanticidal animals were created. No pre-test is mentioned, so perhaps a large number of animals were tested until there were enough subjects in each group.

      (4) The apparent use of a 20-minute baseline data collection period for photometry that started right after the animals were stressed from handling and placement in the novel testing chamber.

      (5) A weakness in the results reporting is that it is unclear which statistics are reported (2 x 2 ANOVA main effect or interaction results, t-test results), and the degrees of freedom expected for the 2 x 2 ANOVAs in some cases do not appear to match the numbers of subjects shown in the graphs. Including sample sizes in each group would be helpful because the graph panels are very small and data points overlap.
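
      The degrees-of-freedom check raised in point (5) can be sketched as follows. This is a minimal illustration that assumes a fully between-subjects factorial design; the function name and the cell sizes are hypothetical and are not taken from the manuscript. A design with repeated measures would give different error terms.

```python
def between_subjects_anova_df(cell_sizes):
    """Expected degrees of freedom for a between-subjects factorial ANOVA.

    cell_sizes is a nested list: rows are levels of factor A,
    columns are levels of factor B, entries are subjects per cell.
    """
    levels_a = len(cell_sizes)
    levels_b = len(cell_sizes[0])
    n_total = sum(sum(row) for row in cell_sizes)
    return {
        "factor_a": levels_a - 1,
        "factor_b": levels_b - 1,
        "interaction": (levels_a - 1) * (levels_b - 1),
        "error": n_total - levels_a * levels_b,
    }


# Hypothetical 2 x 2 design with 8 animals per cell.
df = between_subjects_anova_df([[8, 8], [8, 8]])
print(df)
```

      With these assumed cell sizes, every effect would be reported as F(1, 28); a reported error term that differs from N minus 4 would suggest excluded animals or a different model.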

      The additional context that could help readers of this study is that the authors overlook some important mPFC and pup caregiving and infanticide studies in the introduction which would help put this work in better context in terms of what is known about the mPFC and these behaviors. These previous studies include Febo et al., 2010; Febo 2012; Pereira and Morrell, 2011 and 2020; and a very relevant study by Alsina-Llanes and Olazábal, 2021 on mPFC lesions and infanticide in virgin male and female mice. The introduction states that nothing is known about the mPFC and infanticide. In the introduction and discussion, stating the species and sex of the animals tested in all the previous studies mentioned would be useful. The authors also discuss PVN OT cell stimulation findings seen in other rodents, so the work seems less conceptually novel. Overall, the findings add to the knowledge about OT regulation of pup-directed behavior in male and female rodents, especially the PVN-mPFC OT projection.

    4. Reviewer #3 (Public Review):

      Summary:

      Here Li et al. examine pup-directed behavior in virgin Mandarin voles. Some males and females tend towards infanticide, others tend towards pup care. c-Fos staining showed more oxytocin cells activated in the paraventricular nucleus (PVN) of the hypothalamus in animals expressing pup care behaviors than in infanticidal animals. Optogenetic stimulation of PVN oxytocin neurons (with an oxytocin-specific virus to express the opsin transgene) increased pup-care, or in infanticidal voles increased latency towards approach and attack.

      Suppressing the activity of PVN oxytocin neurons promoted infanticide. The use of a recent oxytocin GRAB sensor (OT1.0) showed changes in medial prefrontal cortex (mPFC) signals as measured with photometry in both sexes. Activating mPFC oxytocin projections increased latency to approach and attack in infanticidal females and males (similar to the effects of peripheral oxytocin injections), whereas in pup-caring animals only males showed a decrease in approach. Inhibiting these projections increased infanticidal behaviors in both females and males and had no effect on pup caretaking.

      Strengths:

      Adopting these methods for Mandarin voles is an impressive accomplishment, especially the valuable data provided by the oxytocin GRAB sensor. This is a major achievement and helps promote systems neuroscience in voles.

      Weaknesses:

      The study would be strengthened by an initial figure summarizing the behavioral phenotypes of voles expressing pup care vs infanticide: the percentages and behavioral scores of individual male and female nulliparous animals for the behaviors examined here. Do the authors have data about the housing or life history/experiences of these animals? How bimodal and robust are these behavioral tendencies in the population?

      Optogenetics with the oxytocin promoter virus is a nice advance here. More details about their preparation and methods should be in the main text, and not simply relegated to the methods section. For optogenetic stimulation in Figure 2, how were the stimulation parameters chosen? There is a worry that oxytocin neurons can co-release other factors. Are the authors sure that oxytocin is being released by optogenetic stimulation as opposed to other transmitters or peptides, and acting through the oxytocin receptor (as opposed to a vasopressin receptor)?

      Given that they are studying changes in latency to approach/attack, having some controls for motion when oxytocin neurons are activated or suppressed might be nice. Oxytocin is reported to be an anxiolytic and a sedative at high levels.

      The OT1.0 sensor is also amazing, these data are quite remarkable. However, photometry is known to be susceptible to motion artifacts and I didn't see much in the methods about controls or correction for this. It's also surprising to see such dramatic, sudden, and large-scale suppression of oxytocin signaling in the mPFC in the infanticidal animals - does this mean there is a substantial tonic level of oxytocin release in the cortex under baseline conditions?

      Figure 5 is difficult to parse as-is, and relates to an important consideration for this study: how extensive is the oxytocin neuron projection from PVN to mPFC?

      In Figures 6 and 7, the authors use the phrase 'projection terminals'; however, to my knowledge, there have not been terminals (i.e., presynaptic formations apposed to a target postsynaptic site) observed in oxytocin neuron projections into target central regions.

      Projection-based inhibition as in Figure 7 remains a controversial issue, as it is unclear if the opsin activation can be fast enough to reduce the fast axonal/terminal action potential. Do the authors have confirmation that this works, perhaps with the oxytocin GRAB OT sensor?

      As females and males had similar GRAB OT1.0 responses in mPFC, why would the behavioral effects of increasing activity be different between the sexes?

    1. eLife assessment

      This method paper proposes a valuable Oscillation Component Analysis (OCA) approach, in analogy to Independent Component Analysis (ICA), in which source separation is achieved through biophysically inspired generative modeling of neural oscillations. The empirical evidence justifying the approach's advantage is incomplete. This work will be of interest to cognitive neuroscience, neural oscillation, and MEG/EEG.

    2. Reviewer #1 (Public Review):

      Summary:

      The present paper introduces Oscillation Component Analysis (OCA), in analogy to ICA, where source separation is underpinned by a biophysically inspired generative model. It puts the emphasis on oscillations, which is a prominent characteristic of neurophysiological data.

      Strengths:

      Overall, I find the idea of disambiguating data-driven decompositions by adding biophysical constraints useful, interesting and worth pursuing. The model incorporates both a component modelling of oscillatory responses that is agnostic about the frequency content (e.g., doesn't need bandpass filtering or predefinition of bands) and a component to map between sensor and latent-space. I feel these elements can be useful in practice.

      Weaknesses:

      Lack of empirical support: I am missing empirical justification of the advantages that are theoretically claimed in the paper. I feel the method needs to be compared to existing alternatives.

    1. eLife assessment

      The manuscript looks at how dysregulated purine metabolism in mutants for the Aprt gene impacts survival, motor and sleep behavior in the fruit fly. Interestingly, although several deficits arise from dopaminergic neurons, dopamine levels are increased in Aprt mutants. Instead the biochemical change responsible for Aprt mutant neurobehavioural phenotypes appears to be a reduction in levels of adenosine. This valuable study suggests that Drosophila Aprt mutants may serve as a model for understanding Lesch-Nyhan Disease (LND), caused by mutations in the human HPRT1 gene, and may also potentially serve as a model to screen for drugs for the neurobehavioural deficits observed in LND. The strength of evidence is solid.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This important study advances our understanding of how past and future information is jointly considered in visual working memory by studying gaze biases in a memory task that dissociates the locations during encoding and memory tests. The evidence supporting the conclusions is convincing, with state-of-the-art gaze analyses that build on a recent series of experiments introduced by the authors. This work, with further improvements incorporating the existing literature, will be of broad interest to vision scientists interested in the interplay of vision, eye movements, and memory.

      We thank the Editors and the Reviewers for their enthusiasm and appreciation of our task, our findings, and our article. We also wish to thank the Reviewers for their constructive comments that we have embraced to improve our article. Please find below our point-by-point responses to this valuable feedback, where we also state relevant revisions that we have made to our article.

      In addition, please note that we have now also made our data and code publicly available.

      Reviewer 1, Comments:

      In this study, the authors offer a fresh perspective on how visual working memory operates. They delve into the link between anticipating future events and retaining previous visual information in memory. To achieve this, the authors build upon their recent series of experiments that investigated the interplay between gaze biases and visual working memory. In this study, they introduce an innovative twist to their fundamental task. Specifically, they disentangle the location where information is initially stored from the location where it will be tested in the future. Participants are tasked with learning a novel rule that dictates how the initial storage location relates to the eventual test location. The authors leverage participants' gaze patterns as an indicator of memory selection. Intriguingly, they observe that microsaccades are directed toward both the past encoding location and the anticipated future test location. This observation is noteworthy for several reasons. Firstly, participants' gaze is biased towards the past encoding location, even though that location lacks relevance to the memory test. Secondly, there's a simultaneous occurrence of an increased gaze bias towards both the past and future locations. To explore this temporal aspect further, the authors conduct a compelling analysis that reveals the joint consideration of past and future locations during memory maintenance. Notably, microsaccades biased towards the future test location also exhibit a bias towards the past encoding location. In summary, the authors present an innovative perspective on the adaptable nature of visual working memory. They illustrate how information relevant to the future is integrated with past information to guide behavior.

      Thank you for your enthusiasm for our article and findings as well as for your constructive suggestions for additional analyses that we respond to in detail below.

      This short manuscript presents one experiment with straightforward analyses, clear visualizations, and a convincing interpretation. For their analysis, the authors focus on a single time window in the experimental trial (i.e., 0-1000 ms after retro cue onset). While this time window is most straightforward for the purpose of their study, other time windows are similarly interesting for characterizing the joint consideration of past and future information in memory. First, assessing the gaze biases in the delay period following the cue offset would allow the authors to determine whether the gaze bias towards the future location is sustained throughout the entire interval before the memory test onset. Presumably, the gaze bias towards the past location may not resurface during this delay period, but it is unclear how the bias towards the future location develops in that time window. Also, the disappearance of the retro cue constitutes a visual transient that may leave traces on the gaze biases which speaks again for assessing gaze biases also in the delay period following the cue offset.

      Thank you for raising this important point. We initially focused on the time window during the cue given that our central focus was on gaze-biases associated with mnemonic item selection. By zooming in on this window, we could best visualize our main effects of interest: the joint selection (in time) of past and future memory attributes.

      At the same time, we fully agree that examining the gaze biases over a more extended time window yields a more comprehensive view of our data. To this end, we have now also extended our analysis to include a wider time range that includes the period between cue offset (1000 ms after cue onset) and test onset (1500 ms after cue onset). We present these data below. Because we believe our future readers are likely to be interested in this as well, we have now added this complementary visualization as Supplementary Figure 4 (while preserving the focus in our main figure on the critical mnemonic selection period of interest).

      Author response image 1.

      Supplementary Figure 4. Gaze biases in extended time window as a complement to Figure 1 and Supplementary Figure 2. This extended analysis reveals that while the gaze bias towards the past location disappears around 600 ms after cue onset, the gaze bias towards the future location persists (panel a) and that while the early (joint) future bias occurs predominantly in the microsaccade range below 1 degree visual angle, the later bias to the future location incorporates larger eye movements that likely involve preparing for optimally perceiving the anticipated test stimulus (panel b).

      This extended analysis reveals that while the gaze bias towards the past location disappears around 600 ms after cue onset (consistent with our prior reports of this bias), the gaze bias towards the future location persists. Moreover, as revealed by the data in panel b above, while the early (joint) future bias occurs predominantly in the microsaccade range below 1 degree visual angle, the later bias to the future location incorporates larger eye movements that likely involve preparing for optimally perceiving the anticipated test stimulus.

      We now also call out these additional findings and figure in our article:

      Page 2 (Results): “Gaze biases in both axes were driven predominantly by microsaccades (Supplementary Fig. 2) and occurred similarly in horizontal-to-vertical and vertical-to-horizontal trials (Supplementary Fig. 3). Moreover, while the past bias was relatively transient, the future bias continued to increase in anticipation of the test stimulus and increasingly incorporated eye-movements beyond the microsaccade range (see Supplementary Fig. 4 for a more extended time range)”.

      Moreover, assessing the gaze bias before retro-cue onset allows the authors to further characterize the observed gaze biases in their study. More specifically, the authors could determine whether the future location is considered already during memory encoding and the subsequent delay period (i.e., before the onset of the retro cue). In a trial, participants encode two oriented gratings presented at opposite locations. The future rule indicates the test locations relative to the encoding locations. In their example (Figure 1a), the test locations are shifted clockwise relative to the encoding location. Thus, there are two pairs of relevant locations (each pair consists of one stimulus location and one potential test location) facing each other at opposite locations and therefore forming an axis (in the illustration the axis would go from bottom left to top right). As the future rule is already known to the participants before trial onset it is possible that participants use that information already during encoding. This could be tested by assessing whether more microsaccades are directed along the relevant axis as compared to the orthogonal axis. The authors should assess whether such a gaze bias exists already before retro cue onset and discuss the theoretical consequences for their main conclusions (e.g., is the future location only jointly used if the test location is implicitly revealed by the retro cue).

      Thank you – this is another interesting point. We fully agree that additional analysis looking at the period prior to retrocue onset may also prove informative. In accordance with the suggested analysis, we have therefore now also analysed the distribution of saccade directions (including in the period from encoding to retrocue) as a function of the future rule (presented below, and now also included as Supplementary Fig. 5). Complementary recent work from our lab has shown how microsaccade directions can align to the axis of memory contents during retention (see de Vries & van Ede, eNeuro, 2024). Based on this finding, one may predict that if participants retain the items in a remapped fashion, their microsaccades may align with the axis of the future rule, and this could potentially already happen prior to cue onset.

      These complementary analyses show that saccade directions are predominantly influenced by the encoding locations rather than the test locations, as seen most clearly by the saccade distribution plots in the middle row of the figure below. To obtain time-courses, we categorized saccades as occurring along the axis of the future rule or along the orthogonal axis (bottom row of the figure below). Like the distribution plots, these time course plots also did not reveal any sign of a bias along the axis of the future rule itself.
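
      The axis categorization described above can be sketched as follows. This is a minimal illustration rather than the authors' analysis code: the function names, the 90-degree quadrant width around each axis, and the example displacements are all assumptions made for the sketch.

```python
import math
from collections import Counter


def classify_saccade_axis(dx, dy, rule_axis_deg):
    """Label a saccade as lying along the future-rule axis or the orthogonal axis.

    dx, dy: horizontal and vertical saccade displacement (any consistent unit).
    rule_axis_deg: orientation of the future-rule axis in degrees (assumed input).
    """
    direction = math.degrees(math.atan2(dy, dx))
    # An axis has no sign, so fold the angular difference into [0, 180).
    diff = (direction - rule_axis_deg) % 180.0
    # Angular distance to the axis, between 0 and 90 degrees.
    distance = min(diff, 180.0 - diff)
    # Saccades within 45 degrees of the axis fall in its two quadrants;
    # a saccade exactly on the 45-degree boundary is assigned to the orthogonal axis.
    return "rule_axis" if distance < 45.0 else "orthogonal_axis"


def count_saccades_by_axis(saccades, rule_axis_deg):
    """Tally saccades per axis for one future rule."""
    return Counter(classify_saccade_axis(dx, dy, rule_axis_deg) for dx, dy in saccades)


# Hypothetical displacements with a rule axis running bottom-left to top-right.
counts = count_saccades_by_axis([(1, 1), (-1, -1), (1, -1), (-1, 1), (2, 1)], 45.0)
print(counts)
```

      Binning such counts over time would give time courses of the kind shown in the bottom row of the figure, with a bias along the future rule showing up as an excess of saccades on the rule axis.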

      Importantly, note how this does not argue against our main findings of joint selection of past and future memory attributes, as for that central analysis we focused on saccade biases that were specific to the selected memory item, whereas the analyses we present below focus on biases in the axes in which both memory items are defined; not only the cued/selected memory item.

      Author response image 2.

      Supplementary Figure 5. Distribution of saccade directions relative to the future rule from encoding onset. (Top panel) The spatial layouts in the four future rules. (Middle panel) Polar distributions of saccades during 0 to 1500 ms after encoding onset (i.e., the period between encoding onset and cue onset). The purple quadrants represent the axis of the future rule and the grey quadrants the orthogonal axis. (Bottom panel) Time courses of saccades along the above two axes. We did not observe any sign of a bias along the axis of the future rule itself.

      We agree that these additional results are important to bring forward when we interpret our findings. Accordingly, we now mention these findings at the relevant section in our Discussion:

      Page 5 (Discussion): “First, memory contents could have directly been remapped (cf. 4,24–26) to their future-relevant location. However, in this case, one may have expected to exclusively find a future-directed gaze bias, unlike what we observed. Moreover, using a complementary analysis of saccade directions along the axis of the future rule (cf. 24), we found no direct evidence for remapping in the period between encoding and cue (Supplementary Fig. 5)”.

      Reviewer 2, Comments:

      The manuscript by Liu et al. reports a task that is designed to examine the extent to which "past" and "future" information is encoded in working memory that combines a retro cue with rules that indicate the location of an upcoming test probe. An analysis of microsaccades on a fine temporal scale shows the extent to which shifts of attention track the location of the encoded item (past) and the location of the future item (test probe). The locations of the encoded grating and of the test probe were always on orthogonal axes (horizontal, vertical) so that biases in microsaccades could be used to track shifts of attention to one or the other axis (or mixtures of the two). The overall goal here was then to (1) create a methodology that could tease apart memory for the past and future, respectively, (2) to look at the time-course attention to past/future, and (3) to test the extent to which microsaccades might jointly encode past and future memoranda. Finally, some remarks are made about the plausibility of various accounts of working memory encoding/maintenance based on the examination of these time courses.

      Strengths:

      This research has several notable strengths. It has a clear statement of its aims, is lucidly presented, and uses a clever experimental design that neatly orthogonalizes "past" and "future" as operationalized by the authors. Figure 1b-d shows fairly clearly that saccade directions have an early peak (around 300ms) for the past and a "ramping" up of saccades moving in the forward direction. This seems to be a nice demonstration that the method can measure shifts of attention at a fine temporal resolution and differentiate past from future-oriented saccades due to the orthogonal cue approach. The second analysis, shown in Figure 2, reveals a dependency in saccade direction such that saccades toward the future probe location were more likely also to be toward the encoded location than away from it. This suggests saccades are jointly biased by both locations "in memory".

      Thank you for your overall appreciation of our work and for highlighting the above strengths. We also thank you for your constructive comments and call for clarifications that we respond to below.

      Weaknesses:

      (1) The "central contribution" (as the authors characterize it) is that "the brain simultaneously retains the copy of both past and future-relevant locations in working memory, and (re)activates each during mnemonic selection", and that: "... while it is not surprising that the future location is considered, it is far less trivial that both past and future attributes would be retained and (re)activated together. This is our central contribution." However, to succeed at the task, participants must retain the content (grating orientation, past) and probe location (future) in working memory during the delay period. It is true that the location of the grating is functionally irrelevant once the cue is shown, but if we assume that features of a visual object are bound in memory, it is not surprising that location information of the encoded object would bias processing as indicated by microsaccades. Here the authors claim that joint representation of past and future is "far less trivial"; this needs to be evaluated from the standpoint of prior empirical data on memory decay in such circumstances, or some reference to the time-course of the "unbinding" of features in an encoded object.

      Thank you. We agree that our participants have to use the future rule – as otherwise they do not know to which test stimulus they should respond. This was a deliberate decision when designing the task. Critically, however, this does not require (nor imply) that participants have to incorporate and apply the rule to both memory items already prior to the selection cue. It is at least as conceivable that participants would initially retain the two items at their encoded (past) locations, then wait for the cue to select the target memory item, and only then consider the future location associated with the target memory item. After all, in every trial, there is only 1 relevant future location: the one associated with the cued memory item. The time-resolved nature of our gaze markers argues against such a scenario, by virtue of our observation of the joint (simultaneous) consideration of past and future memory attributes (as opposed to selection of past-before-future). These temporal dynamics are central to the insights provided by our study.

      In our view, it is thus not obvious that the rule would be applied at encoding. In this sense, we do not assume that the future location is part of both memory objects from encoding, but rather ask whether this is the case – and, if so, whether the future location takes over the role of the past location, or whether past and future locations are retained jointly.

      Our statements regarding what is “trivial” and what is “less trivial” regard exactly this point: it is trivial that the future is considered (after all, our task demanded it). However, it is less trivial that (1) the future location was already available at the time of initial item selection (as reflected in the simultaneous engagement of past and future locations), and (2) that in presence of the future location, the past location was still also present in the observed gaze biases.

      Having said that, we agree that an interesting possibility is that participants remap both memory items to their future-relevant locations ahead of the cue, but that the past location is not yet fully “unbound” by the time of the cue. This may trigger a gaze bias not only to the new future location but also to the “sticky” (unbound) past location. We now acknowledge this possibility in our discussion (also in response to comment 3 below) where we also suggest how future work may be able to tap into this:

      Page 6 (Discussion): “In our study, the past location of the memory items was technically irrelevant for the task and could thus, in principle, be dropped after encoding. One possibility is that participants remapped the two memory items to their future locations soon after encoding, and had started – but not finished – dropping the past location by the time the cue arrived. In such a scenario, the past signal is merely a residual trace of the memory items that serves no purpose but still pulls gaze. Alternatively, however, the past locations may be utilised by the brain to help individuate/separate the two memory items. Moreover, by storing items with regard to multiple spatial frames (cf. 37) – here with regard to both past and future visual locations – it is conceivable that memories may become more robust to decay and/or interference. Also, while in our task past locations were never probed, in everyday life it may be useful to remember where you last saw something before it disappeared behind an occluder. In future work, it will prove interesting to systematically vary the delay between encoding and cue to assess whether the reliance on the past location gradually dissipates with time (consistent with dropping an irrelevant feature), or whether the past trace remains preserved despite longer delays (consistent with preserving utility for working memory).”

      (2) The authors refer to "future" and "past" information in working memory and this makes sense at a surface level. However, once the retrocue is revealed, the "rule" is retrieved from long-term memory, and the feature (e.g. right/left, top/bottom) is maintained in memory like any other item representation. Consider the classic test of digit span. The digits are presented and then recalled. Are the digits of the past or future? The authors might say that one cannot know, because past and future are perfectly confounded. An alternative view is that some information in working memory is relevant and some is irrelevant. In the digit span task, all the digits are relevant. Relevant information is relevant precisely because it is thought to be necessary in the future. Irrelevant information is irrelevant precisely because it is not thought to be needed in the immediate future. In the current study, the orientation of the grating is relevant, but its location is irrelevant; and the location of the test probe is also relevant.

      Thank you for this stimulating reflection. We agree that in our set-up, past location is technically “task-irrelevant” while future location is certainly “task-relevant”. At the same time, the engagement of the past location suggests to us that the brain uses past location for the selection – presumably because the brain uses spatial location to help individuate/separate the items, even if encoded locations are never asked about. Therefore, whether something is relevant or irrelevant ultimately depends on how one defines relevance (past location may be relevant/useful for the brain even if technically irrelevant from the perspective of the task). In comparison, the use of “past” and “future” may be less ambiguous.

      It is also worth noting how we interpret our findings in relation to demands on visual working memory, inspired by dynamic situations whereby visual stimuli may be last seen at one location but expected to re-appear at another, such as a bird disappearing behind a building (the example in our introduction). Thus, past for us does not refer to the memory item per se (like in the digit span analogue) but, rather, quite specifically to the past location of a dynamic visual stimulus in memory (which, in our experiment, was operationalised by the future rule, for convenience).

      (3) It is not clear how the authors interpret the "joint representation" of past and future. Put aside "future" and "past" for a moment. If there are two elements in memory, both of which are associated with spatial bindings, the attentional focus might be a spatial average of the associated spatial indices. One might also view this as an interference effect, such that the location of the encoded location attracts spatial attention since it has not been fully deleted/removed from working memory. Again, for the impact of the encoded location to be exactly zero after the retrieval cue, requires zero interference or instantaneous decay of the bound location information. It would be helpful for the authors to expand their discussion to further explain how the results fit within a broader theoretical framework and how it fits with empirical data on how quickly an irrelevant feature of an object can be deleted from working memory.

      Thank you also for this point (that is related to the two points above). As we stated in our reply to comment 1 above, we agree that one possibility is that the past location is merely “sticky” and pulls the task-relevant future bias toward the past location. If so, our time courses suggest that such “pulling” occurs only until approximately 600 ms after cue onset, as the past bias is only transient. An alternative interpretation is that the past location may not be merely a residual irrelevant trace, but actually be useful and used by the brain.

      For example, the encoded (past) item locations provide a coordinate system in which to individuate/separate the two memory items. While the future locations also provide such a coordinate system, the brain may benefit from holding onto both coordinate systems at the same time, rendering our observation of joint selection in both frames. Indeed, in a recent VR experiment in which we had participants (rather than the items) rotate, we also found evidence for the joint use of two spatial frames, even if neither was technically required for the upcoming task (see Draschkow, Nobre, van Ede, Nature Human Behaviour, 2022). Though highly speculative at this stage, such reliance on multiple spatial frames may make our memories more robust to decay and/or interference. Moreover, while past location was never explicitly probed in our task, in daily life the past location may sometimes (unexpectedly) become relevant, hence it may be useful to hold onto it, just in case. Thus, considering the past location merely as an “irrelevant feature” (that takes time to delete) may not do sufficient justice to the potential roles of retaining past locations of dynamic visual objects held in working memory.

      As also stated in response to comment 1 above, we now added these relevant considerations to our Discussion:

      Page 5 (Discussion): “In our study, the past location of the memory items was technically irrelevant for the task and could thus, in principle, be dropped after encoding. One possibility is that participants remapped the two memory items to their future locations soon after encoding, and had started – but not finished – dropping the past location by the time the cue arrived. In such a scenario, the past signal is merely a residual trace of the memory items that serves no purpose but still pulls gaze. Alternatively, however, the past locations may be utilised by the brain to help individuate/separate the two memory items. Moreover, by storing items with regard to multiple spatial frames (cf. 37) – here with regard to both past and future visual locations – it is conceivable that memories may become more robust to decay and/or interference. Also, while in our task past locations were never probed, in everyday life it may be useful to remember where you last saw something before it disappeared behind an occluder. In future work, it will prove interesting to systematically vary the delay between encoding and cue to assess whether the reliance on the past location gradually dissipates with time (consistent with dropping an irrelevant feature), or whether the past trace remains preserved despite longer delays (consistent with preserving utility for working memory).”

      Reviewer 3, Comments:

      This study utilizes saccade metrics to explore what the authors term the "past and future" of working memory. The study features an original design: in each trial, two pairs of stimuli are presented, first a vertical pair and then a horizontal one. Between these two pairs comes the cue that points the participant to one target of the first pair and another of the second pair. The task is to compare the two cued targets. The design is novel and original but it can be split into two known tasks - the first is a classic working memory task (a post-cue informs participants which of two memorized items is the target), which the authors have used before; and the second is a classic spatial attention task (a pre-cue signal that attention should be oriented left or right), which was used by numerous other studies in the past. The combination of these two tasks in one design is novel and important, as it enables the examination of the dynamics and overlapping processes of these tasks, and this has a lot of merit. However, each task separately is not new. There are quite a few studies on working memory and microsaccades and many on spatial attention and microsaccades. I am concerned that the interpretation of "past vs. future" could mislead readers to think that this is a new field of research, when in fact it is the (nice) extension of an existing one. Since there are so many studies that examined pre-cues and post-cues relative to microsaccades, I expected the interpretation here to rely more heavily on the existing knowledge base in this field. I believe this would have provided a better context of these findings, which are not only on "past" vs. "future" but also on "working memory" vs. "spatial attention".

      Thank you for considering our findings novel and important, while at the same time reminding us of the parallels to prior tasks studying spatial attention in perception and working memory. We fully agree that our task likely engages both attention to the (past) memory item as well as spatial attention to the upcoming (future) test stimulus. At the same time, there is a critical difference in spatial attention for the future in our task compared with ample prior tasks engaging spatial cueing of attention for perception. In our task, the cue never directly cues the future location. Rather, it exclusively cues the relevant memory item. It is the memory item that is associated with the relevant future location, according to the future rule. This integration of the rule-based future location into the memory representation is distinct from classical spatial-attention tasks in which attention is cued directly to a specific location via, for example, a spatial cue such as an arrow.

      Thus, if we wish to think about our task as engaging cueing of spatial attention for perception, we have to at least also invoke the process of cueing the relevant location via the appropriate memory item. We feel it is more parsimonious to think of this as attending to both the past and future location of a dynamic visual object in working memory.

      If we return to our opening example, when we see a bird disappear behind a building, we can keep in working memory where we last saw it, while anticipating where it will re-appear to guide our external spatial attention. Here too, spatial attention is fully dependent on working-memory content (the bird itself) – mirroring the dynamic setting in our study. Thus, we believe our findings contribute a fresh perspective, while of course also extending established fields. We now contextualize our finding within the literature and clarify our unique contribution in our revised manuscript:

      Page 5 (Discussion): “Building on the above, at face value, our task may appear like a study that simply combines two established tasks: tasks using retro-cues to study attention in working memory (e.g.,2,31-33) and tasks using pre-cues to study orienting of spatial attention to an upcoming external stimulus (e.g., 31,32,34–36). A critical difference with common pre-cue studies, however, is that the cue in our task never directly informed the relevant future location. Rather, as also stressed above, the future location was a feature of the cued memory item (according to the future rule), and not of the cue itself. Note how this type of scenario may not be uncommon in everyday life, such as in our opening example of a bird flying behind a building. Here too, the future relevant location is determined by the bird – i.e. the memory content – itself.”

      Reviewer 2, Recommendations:

      It would be helpful to set up predictions based on existing working memory models. Otherwise, the claim that the joint coding of past/future is "not trivial" is simply asserted, rather than contradicting an existing model or prior empirical results. If the non-trivial aspect is simply the ability to demonstrate the joint coding empirically through a good experimental design, make it clear that this is the contribution. For example, it may be that prevailing models predict exactly this finding, but nobody has been able to demonstrate it cleanly, as the authors do here. So the non-triviality is not that the result contradicts working memory models, but rather relates to the methodological difficulty of revealing such an effect.

      Thank you for your recommendation. First, please see our point-by-point responses to the individual comments above, where we also state relevant changes that we have made to our article, and where we clarify what we meant by “non trivial”. As we currently also state in our introduction, our work took as a starting point the framework that working memory is inherently about the past while being for the future (cf. van Ede & Nobre, Annual Review of Psychology, 2023). By virtue of our unique task design, we were able to empirically demonstrate that visual contents in working memory are selected via both their past and their future-relevant locations – with past and future memory attributes being engaged together in time. With “not trivial” we merely intend to make clear that there are viable alternatives to the findings we observed. For example, past could have been replaced by the future, or it could have been that item selection (through its past location) was required before its future-relevant location could be considered (i.e. past-before-future, rather than joint selection as we reported). We outline these alternatives in the second paragraph of our Discussion:

      Page 5 (Discussion): “Our finding of joint utilisation of past and future memory attributes emerged from at least two alternative scenarios of how the brain may deal with dynamic everyday working memory demands in which memory content is encoded at one location but needed at another.

      First, [….]”

      Our work was not motivated by a particular theoretical debate and did not aim to challenge ongoing debates in the working-memory literature, such as: slot vs. resource, active vs. silent coding, decay vs. interference, and so on. To our knowledge, none of these debates makes specific claims about the retention and selection of past and future visual memory attributes – despite this being an important question for understanding working memory in dynamic everyday settings, as we hoped to make clear with our opening example.

      Reviewer 3, Recommendations:

      I recommend that the present findings be more clearly interpreted in the context of previous findings on working memory and attention. The task design includes two components - the first (post-cue) is a classic working memory task and the second (the pre-cue) is a classic spatial attention design. Both components were thoroughly studied in the past and this previous knowledge should be better integrated into the present conclusions. I specifically feel uncomfortable with the interpretation of past vs. future. I find this framework to be misleading because it reads like this paper is on a topic that is completely new and never studied before, when in fact this is a study on the interaction between working memory and spatial attention. I recommend the authors minimize this past-future framing or be more explicit in explaining how this new framework relates to the more common terminology in the field and make sure that the findings are not presented in a vacuum, as another contribution to the vibrant field that they are part of.

      Thank you for these recommendations. Please also see our point-by-point responses to the individual comments above. Here, we explained our logic behind using the terminology of past vs. future (in addition, see also our response to point 2 of reviewer 2). Here, we also stated relevant changes that we have made to our manuscript to explain how our findings complement – but are also distinct from – prior tasks that used pre-cues to direct spatial attention to an upcoming stimulus. As we explained above, in our task, the cue itself never contained information about the upcoming test location. Rather, the upcoming test location was a property of the memory item (given the future rule). Hence, we referred to this as a “future attribute” of the cued memory item, rather than as the “cued location” for external spatial attention. Still, we agree the future bias likely (also) reflects spatial allocation to the upcoming test array, and we explicitly acknowledge this in our discussion. For example:

      Page 5 (Discussion): “This signal may reflect either of two situations: the selection of a future-copy of the cued memory content or anticipatory attention to the anticipated location of its associated test-stimulus. Either way, by the nature of our experimental design, this future signal should be considered a content-specific memory attribute for two reasons. First, the two memory contents were always associated with opposite testing locations, hence the observed bias to the relevant future location must be attributed specifically to the cued memory content. Second, we cued which memory item would become tested based on its colour, but the to-be-tested location was dependent on the item’s encoding location, regardless of its colour. Hence, consideration of the item’s future-relevant location must have been mediated by selecting the memory item itself, as it could not have proceeded via cue colour directly.”

      Page 6 (Discussion): “Building on the above, at face value, our task may appear like a study that simply combines two established tasks: tasks using retro-cues to study attention in working memory (e.g.,2,31-33) and tasks using pre-cues to study orienting of spatial attention to an upcoming external stimulus (e.g., 31,32,34–36). A critical difference with common pre-cue studies, however, is that the cue in our task never directly informed the relevant future location. Rather, as also stressed above, the future location was a feature of the cued memory item (according to the future rule), and not of the cue itself. Note how this type of scenario may not be uncommon in everyday life, such as in our opening example of a bird flying behind a building. Here too, the future relevant location is determined by the bird – i.e. the memory content – itself.”

    2. eLife assessment

      This important study advances our understanding of how past and future information is jointly considered in visual working memory by studying gaze biases in a memory task that dissociates the locations during encoding and memory tests. The evidence supporting the conclusions is convincing, with state-of-the-art gaze analyses that build on a recent series of experiments introduced by the authors. This work will be of broad interest to vision scientists interested in the interplay of vision, eye movements, and memory.

    3. Reviewer #1 (Public Review):

      In this study, the authors offer a fresh perspective on how visual working memory operates. They delve into the link between anticipating future events and retaining previous visual information in memory. To achieve this, the authors build upon their recent series of experiments that investigated the interplay between gaze biases and visual working memory. In this study, they introduce an innovative twist to their fundamental task. Specifically, they disentangle the location where information is initially stored from the location where it will be tested in the future. Participants are tasked with learning a novel rule that dictates how the initial storage location relates to the eventual test location. The authors leverage participants' gaze patterns as an indicator of memory selection. Intriguingly, they observe that microsaccades are directed towards both the past encoding location and the anticipated future test location. This observation is noteworthy for several reasons. Firstly, participants' gaze is biased towards the past encoding location, even though that location lacks relevance to the memory test. Secondly, there's a simultaneous occurrence of an increased gaze bias towards both the past and future locations. To explore this temporal aspect further, the authors conduct a compelling analysis that reveals the joint consideration of past and future locations during memory maintenance. Notably, microsaccades biased towards the future test location also exhibit a bias towards the past encoding location. In summary, the authors present an innovative perspective on the adaptable nature of visual working memory. They illustrate how information relevant to the future is integrated with past information to guide behavior.

    4. Reviewer #2 (Public Review):

      Summary:

      The manuscript by Liu et al. reports a task that is designed to examine the extent to which "past" and "future" information is encoded in working memory that combines a retrocue with rules that indicate the location of an upcoming test probe. An analysis of microsaccades on a fine temporal scale shows the extent to which shifts of attention track the location of the encoded item (past) and the location of the future item (test probe). The location of the encoded grating and test probe were always on orthogonal axes (horizontal, vertical) so that biases in microsaccades could be used to track shifts of attention to one or the other axis (or mixtures of the two). The overall goal here was then to (1) create a methodology that could tease apart memory for the past and future, respectively, (2) to look at the time course of attention to past/future, and (3) to test the extent to which microsaccades might jointly encode past and future memoranda. Finally, some remarks are made about the plausibility of various accounts of working memory encoding/maintenance based on the examination of these time-courses.

      Strengths:

      This research has several notable strengths. It has a clear statement of its aims, is lucidly presented, and uses a clever experimental design that neatly orthogonalized "past" and "future" as operationalized by the authors. Figure 1b-d shows fairly clearly that saccade directions have an early peak (around 300ms) for the past and a "ramping" up of saccades moving in the forward direction. This seems to be a nice demonstration that the method can measure shifts of attention at a fine temporal resolution and differentiate past from future oriented saccades due to the orthogonal cue approach. The second analysis, shown in Figure 2, reveals a dependency in saccade direction such that saccades toward the probe future were more likely also to be toward the encoded location than away from the encoded direction. This suggests saccades are jointly biased by both locations "in memory". The "central contribution" (as the authors characterize it) is that "the brain simultaneously retains the copy of both past and future-relevant locations in working memory, and (re)activates each during mnemonic selection", and that: "... while it is not surprising that the future location is considered, it is far less trivial that both past and future attributes would be retained and (re)activated together. This is our central contribution." The authors provide a nuanced analysis that offers persuasive evidence that past and future representations are jointly maintained in memory.

    1. Author response:

      Factual error in the eLife assessment to be corrected:

      In the eLife assessment, "ribosomal protein H59" should be changed to "helix 59 of the 28S ribosomal RNA" to make this factually correct.

      Provisional author response

      We thank the reviewers for their thorough and thoughtful readings of the manuscript. Our responses to the four suggestions made in their public reviews are below.

      Reviewer #1 (Public Review):

      Major points:

      (1) The identification of RAMP4 is a pivotal discovery in this paper. The sophisticated AlphaFold prediction, de novo model building of RAMP4's RBD domain, and sequence analyses provide strong evidence supporting the inclusion of RAMP4 in the ribosome-translocon complex structure.

      However, it is crucial to ensure the presence of RAMP4 in the purified sample. Particularly, a validation step such as western blotting for RAMP4 in the purified samples would strengthen the assertion that the ribosome-translocon complex indeed contains RAMP4. This is especially important given the purification steps involving stringent membrane solubilization and affinity column pull-down.

      As suggested, we will revise the manuscript to include Western blots showing that RAMP4 is retained at secretory translocons (and not multipass translocons) after solubilisation, affinity purification, and recovery of ribosome-translocon complexes.

      (2) Despite the comprehensive analyses conducted by the authors, it is challenging to accept the assertion that the extra density observed in TRAP class 1 corresponds to calnexin. The additional density in TRAP class 1 appears to be less well-resolved, and the evidence for assigning it as calnexin is insufficient. The extra density there could be any protein that binds to TRAP. It is recommended that the authors examine the density on the ER lumen side. An investigation into whether calnexin's N-globular domain and P-domain are present in the ER lumen in TRAP class 1 would provide a clearer understanding.

      We agree that the Calnexin assignment is less confident than the other assignments in this manuscript, and that further support would be ideal. We have exhaustively searched our maps for any unexplained density connected with the putative Calnexin TMD, and have found none. This is consistent with Calnexin's lumenal domain being flexibly linked to its TMD, and thus not being resolved in a ribosome-aligned reconstruction.

      Our assignment of this TMD to Calnexin was based on existing biochemical data (referenced in the paper) favouring this as the best working hypothesis by far: Calnexin is TRAP’s only abundant co-purifying factor, and their interaction is sensitive to point mutations in the Calnexin TMD. Recognising that this is not conclusive, we will ensure that the text and figures consistently describe this assignment as provisional or putative.

      (3) In the section titled 'TRAP competes and cooperates with different translocon subunits,' the authors present a compelling explanation for why TRAP delta defects can lead to congenital disorders of glycosylation. To enhance this explanation, it would be valuable if the authors could provide additional analyses based on mutations mentioned in the references. Specifically, examining whether these mutations align with the TRAP delta-OSTA structure models would strengthen the link between TRAP delta defects and the observed congenital disorders of glycosylation.

      We agree that mapping disease-causing point mutants to the TRAP delta structure could be potentially informative. Unfortunately, the referenced TRAP delta disease mutants act by simply impairing TRAP delta expression, and thus admit no such fine-grained analyses. However, sequence conservation is our next best guide to mutant function. We note in the text that the contact site charges on TRAP delta and RPN2 are conserved, and that the closest-juxtaposed interaction pair (K117 on TRAPδ and D386 on RPN2) is also the most conserved.

      Reviewer #2 (Public Review):

      Strengths:

      The manuscript contains numerous novel structural analyses and their potential functional implications. While all findings are exciting, the highlight is the discovery of RAMP4/SERP1 near the Sec61 lateral gate. Overall, the strength is the thorough and extensive structural analysis of the different high-resolution RTC classes as well as the expert bioinformatic evolutionary analysis.

      Weaknesses:

      A minor downside of the manuscript is the sheer volume of analyses and mechanistic hypotheses, which makes it sometimes difficult to follow. The authors might consider offloading some analyses based on weaker evidence to the supplement to maximize impact.

      We agree that the manuscript is long, and we will seek ways to streamline it in revision while avoiding the undesirable side effect of making important findings undiscoverable via literature searches (an unfortunate consequence of many supplemental data). Indeed, we chose eLife for its flexibility regarding article length and suitability for extended and detailed analyses.

    2. Reviewer #1 (Public Review):

      The paper 'Structural Analysis of the Dynamic Ribosome-Translocon Complex,' authored by Lewis et al., meticulously explores various conformations and states of the ribosome-translocon complex. Employing advanced techniques such as cryoEM structural determination and AlphaFold modeling, the study delves into the dynamic nature of the ribosome-translocon complex. The findings from these analyses unveil crucial insights, significantly advancing our understanding of the co-translational translocation process in cellular mechanisms.

      To begin with, the authors employed a construct comprising the first two transmembrane domains of rhodopsin as a model for studying protein translocation. They conducted in vitro translation, followed by the purification of the ribosome-translocon complex, and determined its cryoEM structures. An in-depth analysis of their ribosome-translocon complex structure revealed that the nascent chain can pass through the lateral gate of translocon Sec61, akin to the behavior of a signal peptide. Additionally, Sec61 was found to interact with 28S rRNA helix 24 and the ribosomal protein uL24. In summary, their structural model aligns with the through-pore model of insertion, contradicting the sliding model.

      Secondly, the authors successfully identified RAMP4 in their ribosome-translocon complex structure. Notably, the transmembrane domain of RAMP4 mimics the binding of a signal peptide at the lateral gate of Sec61, albeit without unplugging. Intriguingly, RAMP4 is exclusively present in the non-multipass translocon ribosome-translocon complex, not in those containing multipass translocon. This observation suggests that co-translational translocation specifically occurs in the Sec61 channel that includes bound RAMP4. Additionally, the authors discovered an interaction between the C-tail of ribosomal protein uL22 and the translocon Sec61, providing valuable insights into the nascent chain's behavior.

      Moving on to the third point, the focused classification unveiled TRAP complex interactions with various components. The authors propose that the extra density observed in their novel ribosome-translocon complex can be attributed to calnexin, a major binder of TRAP according to previous studies. Furthermore, the new structure reveals a TRAP-OSTA interaction. This newly identified TRAP-OSTA interaction offers a potential explanation for why patients with TRAP delta defects exhibit congenital disorders of glycosylation.

      In conclusion, this paper presents a robust contribution to the field with its thorough structural and modeling analyses. The significance of the findings is evident, providing valuable insights into the intricate mechanisms of protein co-translational translocation. The well-crafted writing, meticulous analyses, and clear figures collectively contribute to the overall strength of the paper.

      Major points:

      (1) The identification of RAMP4 is a pivotal discovery in this paper. The sophisticated AlphaFold prediction, de novo model building of RAMP4's RBD domain, and sequence analyses provide strong evidence supporting the inclusion of RAMP4 in the ribosome-translocon complex structure.

      However, it is crucial to ensure the presence of RAMP4 in the purified sample. Particularly, a validation step such as western blotting for RAMP4 in the purified samples would strengthen the assertion that the ribosome-translocon complex indeed contains RAMP4. This is especially important given the purification steps involving stringent membrane solubilization and affinity column pull-down.

      (2) Despite the comprehensive analyses conducted by the authors, it is challenging to accept the assertion that the extra density observed in TRAP class 1 corresponds to calnexin. The additional density in TRAP class 1 appears to be less well-resolved, and the evidence for assigning it as calnexin is insufficient. The extra density there could be any protein that binds to TRAP. It is recommended that the authors examine the density on the ER lumen side. An investigation into whether calnexin's N-globular domain and P-domain are present in the ER lumen in TRAP class 1 would provide a clearer understanding.

      (3) In the section titled 'TRAP competes and cooperates with different translocon subunits,' the authors present a compelling explanation for why TRAP delta defects can lead to congenital disorders of glycosylation. To enhance this explanation, it would be valuable if the authors could provide additional analyses based on mutations mentioned in the references. Specifically, examining whether these mutations align with the TRAP delta-OSTA structure models would strengthen the link between TRAP delta defects and the observed congenital disorders of glycosylation.

    3. eLife assessment

      This fundamental study offers new structural insights into the form and functions of the ribosome-translocon complex. Through a combination of in vitro translation, cryoEM imaging, and comprehensive AlphaFold comparative modeling, the authors offer convincing support for the lateral gate model of co-translational ER protein biogenesis, including the location of RAMP4 near the Sec61 lateral gate and the plausible role of helix 59 of the 28S ribosomal RNA as a determinant of the positive-inside rule. While the reviewers identified minor limitations, such as the need to validate RAMP4 presence with orthogonal measures, these results will be broadly impactful.

    4. Reviewer #2 (Public Review):

      Summary:

      In the manuscript 'Structural analysis of the dynamic ribosome-translocon complex' Lewis and Hegde present a structural study of the ribosome-bound multipass translocon (MPT) based on re-analysis of cryo-EM single particle data of ribosome-MPTs processing the multipass transmembrane substrate RhoTM2 from a previous publication (Smalinskaité et al, Nature 2022) and AlphaFold2 multimer modeling. Detailed analysis of the laterally open Sec61 is obtained from PAT-less particles.

      The following major claims are made:

      - TMs can bind similarly to the Sec61 lateral gate as signal peptides.

      - Ribosomal H59 is in immediate proximity to basic residues of TMs and signal peptides, suggesting it may contribute to the positive-inside rule.

      - RAMP4/SERP1 binds to the Sec61 lateral gate and the ribosome near 28S rRNA's helices 47, 57, and 59 as well as eL19, eL22, and eL31.

      - uL22 C-terminal tail binds H24/47 blocking a potential escape route for nascent peptides to the cytosol.

      - TRAP and BOS compete for binding to Sec61 hinge.

      - Calnexin TM binds to TRAPg.

      - NOMO wedges between TRAP and MPT.

      Strengths:

      The manuscript contains numerous novel structural analyses and their potential functional implications. While all findings are exciting, the highlight is the discovery of RAMP4/SERP1 near the Sec61 lateral gate. Overall, the strength is the thorough and extensive structural analysis of the different high-resolution RTC classes as well as the expert bioinformatic evolutionary analysis.

      Weaknesses:

      A minor downside of the manuscript is the sheer volume of analyses and mechanistic hypotheses, which makes it sometimes difficult to follow. The authors might consider offloading some analyses based on weaker evidence to the supplement to maximize impact.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1:

      We are grateful for the overall positive feedback from the reviewer.

      We agree with the reviewer that our data showing cellular co-localization between PRC1 and BIN1 require further investigation in future studies. However, we are confident that, in its current form, our manuscript already presents multiple lines of evidence for the role of BIN1 in mitotic processes. We would like to emphasize that PRC1 is not the sole BIN1 partner that connects it to mitotic processes; it is only one of more than a dozen that we identified in our study. Furthermore, the mitotic connection of BIN1 is not entirely novel, as BIN1 levels fluctuate mildly during the cell cycle, similar to other proteins involved in the regulation of the cell cycle (Santos et al., 2015), and because DNM2 is also a well-accepted actor during mitosis (Thompson et al., 2002).

      The less marked co-localization between BIN1 and PRC1 compared to the strong co-localization between BIN1 and DNM2 can be a consequence of their weaker affinity and their partial binding. Yet, this does not necessarily imply that stronger interactions have more biological significance. For example, weaker affinities can be compensated by high local concentrations, yielding an even higher degree of complex formation than strongly binding partners that are separated within the cell. Furthermore, even the degree of complex formation cannot be used intuitively to estimate the biological significance of a complex because complexes can trigger very important biological processes even at very low abundances, e.g. by catalyzing enzymatic reactions. Deciding what is and what is not “biologically significant” among the identified interactions remains a question for the future, once we are able to survey complex biological processes in a holistic manner.
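
      The point about affinity versus local concentration can be illustrated with a minimal sketch. It assumes simple 1:1 equilibrium binding with the partner in large excess, so that the fraction of protein in complex is [partner] / (Kd + [partner]). The function name and all numbers below are hypothetical illustrations, not values measured in the study.

```python
def fraction_bound(kd_um: float, partner_conc_um: float) -> float:
    """Fraction of a protein held in complex at equilibrium.

    Assumes simple 1:1 binding with the partner in large excess,
    so free partner concentration is approximately its total.
    Both arguments are in micromolar.
    """
    if kd_um <= 0 or partner_conc_um < 0:
        raise ValueError("kd_um must be positive and partner_conc_um non-negative")
    return partner_conc_um / (kd_um + partner_conc_um)


# Hypothetical weak interaction (Kd = 10 uM) with a highly enriched local partner.
weak_but_colocalized = fraction_bound(kd_um=10.0, partner_conc_um=100.0)

# Hypothetical strong interaction (Kd = 0.1 uM) with a scarce, spatially separated partner.
strong_but_separated = fraction_bound(kd_um=0.1, partner_conc_um=0.01)

print(f"weak, co-localized:  {weak_but_colocalized:.2f}")
print(f"strong, separated:   {strong_but_separated:.2f}")
```

      Under these illustrative assumptions, the weaker interaction ends up with the larger fraction in complex, which is the qualitative argument made in the response above.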

      In the revised version, we implemented minor changes to further clarify the raised points.

      Reviewer #2:

      We thank the reviewer for the careful assessment and we are pleased to see the positive enthusiasm regarding our affinity interactomic strategy.

      The reviewer points out that affinities were only measured with a single technique, which is relatively unproven. While it is true that our work uses two techniques building on the same holdup concept, we rather believe that this approach is well-proven. The original holdup method was described almost 20 years ago and since then, it has been used in more than 10 publications for quantitative interactomics. Over the years, at least five distinct generations of the assay were developed, all building on the expertise of the preceding one. In the past, we extensively proved that the resulting affinities show excellent agreement with affinities measured with other methods, such as fluorescence polarization, isothermal titration calorimetry, or surface plasmon resonance (for example in Vincentelli et al. Nat. Meth. 2015; Gogl et al. 2020 Structure; Gogl et al. 2022 Nat.Com.). However, it is true that the most recent variation of this method family, called native holdup, is a fairly new approach published just a bit more than a year ago and this is only the third work that utilizes this method. Yet, in our original work describing the method, we demonstrated good agreement with the results of previous holdup experiments, as well as with orthogonal affinity measurements (Zambo et al. 2022).

      Importantly, the reviewer raises concerns regarding the number of replicates used in our study, as well as the reliability of our methodology. We are glad for such a comment as it allows us to explain our motives behind experimental design which is most often left out from scientific works to save space and keep focus on results. The reason why we use technical replicates instead of the typical biological replicates lies in the nature of the holdup assay. In a typical interactomic assay, such as immunoprecipitation, a lot of variables can perturb the outcome of the measurement, such as bait immobilization, or captured prey leakage during washing steps. The output of such an experiment is a list of statistically significant partners and to minimize these variabilities, biological replicates are used. In the case of a native holdup approach, a panel of an equal amount of resins, all saturated with different baits or controls, is mixed with an equal amount of cell extract, taken from a single tube, and after a brief incubation, the supernatant of this mixture is analyzed. The output of such an experiment is a list of relative concentrations of prey and to maximize its accuracy, we use technical replicates. Using an ideal analytical method, such as fluorescence, it is not necessary to use technical replicates to reach accurate results. For example, the general accuracy of a holdup experiment coupled with a robust analytical approach can be seen clearly in our fragmentomic holdup data shown in Figure 7C where mutant domains that do not have any impact on the interactome show extreme agreement in affinities. Unfortunately, mass spectrometry is less accurate as an analytical method, hence we use technical triplicates to compensate for this. Finally, in the case of BIN1, an independent nHU measurement was also performed using a less capable mass spectrometer. 
Not counting the 117 partners of BIN1 that were detected in only one of these proteomic measurements, 29 partners were identified as common significant partners in both of these measurements, showing nearly identical affinities with a mean standard deviation between measured pKapp values of 0.18, meaning that the obtained dissociation constants are within a <2.5-fold range with >95% probability. There were also 61 BIN1 partners that were detected in both proteomic measurements but were only identified as a significant interaction partner in one of these experiments. Yet many of them show binding in both assays, although they were found to be not significant in one of these assays. For example, CDC20 shows 66% depletion in one assay (significant binding) while it shows 54% depletion in the other (not significant binding), and CKAP2 shows 58% depletion in one assay (significant binding) while it shows 41% depletion in the other (not significant binding). We hope that these examples show that statistical significance in nHU experiments signifies how certain we are in a particular affinity measurement rather than the accuracy of the affinity measurement itself. While there are true discrepancies between some of the affinity measurements in these experiments, which could be clarified with more experimental replicates, the raw data presented in our work clearly demonstrate the strength and robustness of a fully quantitative interactomic assay.
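
      The link between a standard deviation of 0.18 in pKapp and the quoted "<2.5-fold" range can be checked with a short calculation. The sketch below is only an illustration under stated assumptions: it assumes that errors in pKapp are normally distributed, so that about 95% of values fall within 1.96 standard deviations of the mean, and that pKapp is a base-10 logarithmic quantity. The function name `fold_range` is hypothetical and does not come from the manuscript.

```python
def fold_range(sd_pk: float, z: float = 1.96) -> float:
    """Fold-deviation in Kd corresponding to z standard deviations in pK units.

    Assumes pK = -log10(Kd), so a shift of d pK units is a 10**d fold change.
    """
    return 10 ** (z * sd_pk)


# Mean standard deviation between the two nHU measurements, as quoted above.
sd_pkapp = 0.18

# Factor by which a dissociation constant may deviate from the mean
# at the ~95% level (assumption: normal errors in pKapp).
print(round(fold_range(sd_pkapp), 2))
```

      Under these assumptions the factor comes out slightly above 2.2, which is consistent with the stated bound of less than 2.5-fold.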

      In the revised version, we clarified the number of replicates in the text, in the figure legends, and included some of this discussion in the method section.

      The reviewer had some very useful comments regarding affinity differences between short fragments and full-length proteins. In his comment, he possibly made a typo, as we find that full-length proteins typically interact with higher, not weaker, affinities compared to short PxxP motif fragments in isolation. The reviewer also comments that we explain this difference with cooperativity. In a previous preprint version, which the reviewer may have seen, this was indeed the case, but since we realized that we did not have sufficient evidence supporting this model, we did not discuss it in detail in the last version submitted to eLife. To clarify this, we included more discussion about the observed differences in the affinities between fragments and full-length proteins, but since we have limited data to make solid conclusions, we do not go into details about underlying models.

      Instead of cooperativity, the reviewer suggests that the observed differences may originate from additional residues that were not included in our peptides. Indeed, many similar experiments fail because of suboptimal peptide library design. Our peptide library was constructed as 15-mer, xxxxxxPxxPxxxxx motifs and we do not see a strong contribution of residues at the far end of these peptides. Specificity logo reconstructions are expected to identify all key residues that participate in SH3 domain binding, and based on this, all key residues of the identified motifs can be included in shorter 10-mer, xxxPxxPxxx motifs. Therefore, it is unlikely that residues outside our peptide regions will greatly contribute to the site-specific interactions of SH3 domains. It is however possible that other sites, that are sequentially far away from the studied PxxP motifs, are also capable of binding to SH3 through a different surface, but in light of the small size of an isolated SH3 domain, we believe it is very unlikely. It is also possible that BIN1 could also interact with other types of SH3 binding motifs that were not included in our peptide library. We think a more likely explanation is some sort of cooperativity. Cooperativity, or rather synergism between different sites can be easily explained in typical situations, such as in the case of a bimolecular interaction that is mediated by two independent sites. In such an event, once one site is bound, the second binding event will likely also occur because of the high effective local concentration of the binding sites. However, cooperativity can also form in atypical conditions and a molecular explanation for these events is rather elusive. As BIN1 contains a single SH3 domain, its binding to targets containing more binding sites can be challenging to interpret. 
If these sites are part of a greater Pro-rich region, such as in the case of DNM2, it is possible that the entire region adopts a fuzzy, malleable, yet PPII-like helical conformation. Once the SH3 domain is recruited to this helical region, it can freely translocate within this region via lateral diffusion and it will pause on optimal PxxP motifs. As an alternative to this sliding mechanism, a diffusion-limited cooperative binding can also occur. If the two motifs are not part of the same Pro-rich region, but are relatively close in space, such as in the case of ITCH or PRC1, once a BIN1 molecule dissociates from one site, it has a higher chance to rebind to the second site due to higher local concentrations. Such an event can more likely occur if a transient, but relatively stable encounter complex exists between the two molecules, from which complex formation can occur at both sites (A+B↔AB*; AB*↔ABsite1; AB*↔ABsite2). However, this large effective local concentration in this encounter complex is only temporary because diffusion rapidly diminishes it, although weak electrostatic interactions can increase the lifetime of such encounter complexes. In contrast, the large effective local concentration in conventional multivalent binding is time-independent and only determined by the geometry of the complex. Finally, it may also occur that our empirical bait concentration estimation for immobilized biotinylated proteins is less accurate than the concentration estimation of peptide baits because we approximate this value based on peptide baits. For this technical reason, which was discussed in detail in the original paper describing the nHU approach, we are carefully using apparent affinities for nHU experiments. Nevertheless, even without accurate bait concentrations, our nHU experiment provides precise relative affinities and, thus, partner ranking. 
Either of the mechanisms underlying the interactions we study would be difficult to further explore experimentally, especially at the proteomic level.

    2. eLife assessment

      This work describes a novel affinity interactomics approach that allows investigators to identify networks of protein-protein interactions in cells. The important findings presented here describe the application of this technique to the SH3 domain of the membrane remodeling Bridging Integrator 1 (BIN1), the truncation of which leads to centronuclear myopathy. The authors present solid evidence that BIN1 SH3 engages with an unexpectedly high number of cellular proteins, many of which are linked to skeletal muscle disease, and evidence is presented to suggest that BIN1 may play a role in mitosis creating the potential for new avenues in drug development efforts. Some of the findings, however, remain rather preliminary, lack sufficient replicates and may require additional experiments to definitively support the conclusions.

    3. Reviewer #1 (Public Review):

      Original review:

      The authors report here interesting data on the interactions mediated by the SH3 domain of BIN1 that expand our knowledge on the role of the SH3 domain of BIN1 in terms of mediating specific interactions with a potentially high number of proteins and how variants in this region alter or prevent these protein-protein interactions. These data provide useful information that will certainly help to further dissect the networks of proteins that are altered in some human myopathies as well as the mechanisms that govern the correct physiological activity of muscle cells.

      The work is mostly based on improved biochemical techniques to measure protein-protein interaction and provide solid evidence that the SH3 domain of BIN1 can establish an unexpectedly high number of interactions with at least a hundred cellular proteins, among which the authors underline the presence of other proteins known to be causative of skeletal muscle diseases and not known to interact with BIN1. This represents an unexpected and interesting finding relevant to better define the network of interactions established among different proteins that, if altered, can lead to muscle disease. An interesting contribution is also the detailed identification of the specific sites, namely the Proline-Rich Motifs (PRMs) that in the interacting proteins mediate binding to the BIN1 SH3 domain. Less convincing, or too preliminary in my opinion, are the data supporting BIN1 co-localization with PRC1. Indeed, the affinity of PRC1 is significantly lower than that of DNM2, an established BIN1 interacting protein. Thus, this does not provide compelling evidence to support PRC1 as a significant interactor of BIN1. Similarly, the localization data appears somewhat preliminary to substantiate a role of BIN1 in mitotic processes. These findings may necessitate additional experimental work to be more convincing.

      Comments on revision:

      I acknowledge the significant changes made by the authors in the revised manuscript. However, I remain puzzled by the data concerning the interaction between BIN1 and PRC1. While I agree with the authors that even weak interactions among proteins can be significant, I am hesitant to accept a priori that the lack of clear evidence of colocalization between proteins can be justified solely by their low affinity.

      Moreover, the possibility that other mitotic proteins may be potential partners of BIN1 does not inherently support an interaction between BIN1 and PRC1. I suggest that the authors present the interaction with PRC1 as a potential event and emphasize that further studies are needed to definitively establish it.

    4. Reviewer #2 (Public Review):

      Original review:

      Summary:

      In this paper, Zambo and coworkers use a powerful technique, called native holdup, to measure the affinity of the SH3 domain of BIN1 for cellular partners. Using this assay, they combine data using cellular proteins and proline-containing fragments in these proteins to identify 97 distinct direct binding partners of BIN1. They also compare the binding interactome of the BIN1 SH3 domain to the interactome of several other SH3 domains, showing varying levels of promiscuity among SH3 domains. The authors then use pathway analysis of BIN1 binding partners to show that BIN1 may be involved in mitosis. Finally, the authors examine the impact of clinically relevant mutations of the BIN1 SH3 domain on the cellular interactome. The authors were able to compare the interactome of several different SH3 domains and provide novel insight into the cellular function of BIN1. Generally, the data supports the conclusions, although the reliance on one technique and the low number of replicates in each experiment is a weakness of the study.

      Strengths:

      The major strength of this paper is the use of holdup and native holdup assays to measure the affinity of SH3 domains to cellular partners. The use of both assays using cell-derived proteins and peptides derived from identified binding partners allows the authors to better identify direct binding partners. This assay has some complexity but does hold the possibility of being used to measure the affinity of the cellular interactome of other proteins and protein domains. Beyond the utility of the technique, this study also provides significant insight into the cellular function of BIN1. The authors have strong evidence that BIN1 might have an undiscovered function in cellular mitosis, which potentially highlights BIN1 as a drug target. Finally, the study provides outstanding data on the cellular binding properties and partners of seven distinct SH3 domains, showing surprising differences in the promiscuity of these proteins.

      Weaknesses:

      There are three major weaknesses of the study. First, the authors rely completely on a single technique to measure the affinity of the cellular interactome. The native holdup is a relatively new technique that is powerful yet relatively unproven. However, it appears to have the capacity to measure the relative affinity of proteins. Second, the authors appear to use a relatively small number of replicates for the holdup assays. There is no information in the legends about the number of replicates but the materials and methods suggest the native holdup data is from a single experimental replicate with multiple technical replicates. Finally, the authors' data using cellular proteins and fragments show that the affinity of the whole proteins is 5-20 fold lower than individual proline-containing fragments. The authors state that this difference suggests that there is cooperativity between different proline-rich sites of the binding partners of BIN1, yet BIN1 only has one SH3 domain. It is unclear what the molecular mechanism of the cooperative interaction would be exactly since there would be only one SH3 domain to bind the partner. An alternative interpretation would be that the BIN1 SH3 domain requires sequences outside of the short proline-rich regions for high-affinity interactions with cellular partners, a hypothesis that is supported by other studies.

      Comments on revision:

      I thank the authors for their thoughtful response. I have additional comments.

      I appreciate that this is not a techniques paper and that the authors have done more detailed work in a separate publication. It would be helpful to readers not familiar with this new method to more fully describe this technique in this manuscript.

      I also thank the authors for their description of why they performed only 1 biological replicate of the experiment. However, I still believe that multiple biological replicates will provide more rigorous and reproducible data. The data the authors provide actually argue for the inclusion of more biological replicates. They state they performed 2 separate nHU replicates using different mass spectrometers. It is unclear whether these data come from the same lysates and protein preparations, but according to the data, the two methods detected a total of 207 distinct binding partners. Only 29 of these were significant binders in both replicates and only 90 were detected binders in both replicates. 117 binding partners were found in only one replicate, suggesting significant differences between replicates. Different batches of SH3 domains can have different activities and different replicates of cell lysates can vary, even when made from the same cell line. Thus, there can still be significant differences between replicates in this method. I appreciate the difficulty of performing and analyzing multiple biological replicates, but it is the most rigorous way to identify potential cellular partners.
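
      The partner counts quoted in this comment and in the author response can be tallied in a few lines to check that they are mutually consistent. This is only a bookkeeping sketch: the numbers are those stated in the text, while the variable names and the helper `fraction` are hypothetical, and nothing here reanalyzes the underlying proteomic data.

```python
# Counts as quoted in the review and the author response.
detected_in_both = 90        # detected in both nHU measurements
significant_in_both = 29     # significant binders in both measurements
detected_in_one_only = 117   # detected in only one measurement

# Derived totals.
total_partners = detected_in_both + detected_in_one_only
significant_in_one_of_both = detected_in_both - significant_in_both


def fraction(part: int, whole: int) -> float:
    """Share of `whole` represented by `part`."""
    return part / whole


print(total_partners)               # 207
print(significant_in_one_of_both)   # 61
print(f"{fraction(significant_in_both, total_partners):.1%}")  # 14.0%
print(f"{fraction(detected_in_both, total_partners):.1%}")     # 43.5%
```

      The totals agree with the 207 partners cited by the reviewer and the 61 partners cited by the authors, so both sides are describing the same overlap, only with different emphasis.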

      I also thank the authors for including the mechanistic discussion about the differences between peptides and whole proteins. There is literature showing that regions outside of the short PxxP regions drive binding to SH3 domains, especially for the GRB2 family of adaptor proteins.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Recommendations For The Authors):

      The data is poorly dealt with, and the figures are shown poorly. For example, Figure 2A is not even shown totally.

      We apologize for any difficulties that the reviewer encountered while attempting to view the figures. We have confirmed that all figures, including all panels of Figure 2, display correctly on the HTML and PDF versions of the article hosted at bioRxiv. The HTML and PDF versions generated by eLife also appear to contain all figures and panels in their entirety.

      Reviewer #2 (Recommendations For The Authors):

      Please refer to the public review for possible revisions.

      We thank Reviewer #2 for the summary and thoughtful comments provided in the Public Review. We note the point of possible revision raised in the Public Review: “It can be informative to directly demonstrate DPYD promoter-enhancer interactions. However, the genetic variants support the integration of regulatory activities.” In Figure 4, we provide evidence for direct promoter-enhancer interaction through the use of 3C. We furthermore demonstrate that these interactions are dependent upon genotype at rs4294451, as stated by the reviewer. We have highlighted the promoter-enhancer interaction in the revised manuscript, lines 323-325. The role of genotype in this interaction is also specifically discussed in lines 378-381.

    2. eLife assessment

      This manuscript presents valuable findings on the identification of epigenetically mediated control for the recognition of dihydropyrimidine dehydrogenase (DPYD) gene expression that is linked with cancer treatment resistance using 5-fluorouracil. The evidence is compelling, supported by data from patient-derived specimens and direct assessment of 5-fluorouracil sensitivity, which provides confidence in the proposed mechanisms. The model is additionally supported by genome data from a population with high "compromised allele frequency". This work will interest those studying drug resistance in cancer therapy.

    3. Joint Public Review:

      Zhang et al. present compelling results that support the identification of epigenetically mediated control for the recognition of dihydropyrimidine dehydrogenase (DPYD) gene expression that is linked with cancer treatment resistance to 5-fluorouracil. The experimental approach was developed and pursued with in vitro and in vivo strategies. Combining molecular, cellular, and biochemical approaches, the authors identify a germline variant with compromised enhancer control. Several lines of evidence were presented that are consistent with increased CEBP recruitment to the DPYD regulatory domain with consequential modifications in promoter-enhancer interactions that are associated with compromised 5-fluorouracil resistance. Functional identification of promoter and enhancer elements was validated by CRISPRi and CRISPRa assays. ChIP and qPCR documented histone marks that can account for the control of DPYD gene expression. Consistency with data from patient-derived specimens and direct assessment of 5-fluorouracil sensitivity provides confidence in the proposed mechanisms. The model is additionally supported by genome data from a population with high "compromised allele frequency". It can be informative to directly demonstrate DPYD promoter-enhancer interactions. However, the genetic variants support the integration of regulatory activities.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Gap junction channels establish gated intercellular conduits that allow the diffusion of solutes between two cells. Hexameric connexin26 (Cx26) hemichannels are closed under basal conditions and open in response to CO2. In contrast, when forming a dodecameric gap junction, channels are open under basal conditions and close with increased CO2 levels. Previous experiments have implicated Cx26 residue K125 in the gating mechanism by CO2, which is thought to become carbamylated by CO2. Carbamylation is a labile post-translational modification that confers negative charge to the K125 side chain. How the introduction of a negative charge at K125 causes a change in gating is unclear, but it has been proposed that carbamylated K125 forms a salt bridge with the side chain at R104, causing a conformational change in the channel. It is also unclear how overall gating is controlled by changes in CO2, since there is significant variability between structures of gap-junction channels and the cytoplasmic domain is generally poorly resolved. Structures of WT Cx26 gap-junction channels determined in the presence of various concentrations of CO2 have suggested that the cytoplasmic N-terminus changes conformation depending on the concentration of the gas, occluding the pore when CO2 levels are high.

      In the present manuscript, Deborah H. Brotherton and collaborators use an intercellular dye-transfer assay to show that Cx26 gap-junction channels containing the K125E mutation, which mimics carbamylation caused by CO2, are constitutively closed even at CO2 concentrations where WT channels are open. Several cryo-EM structures of WT and mutant Cx26 gap junction channels were determined at various conditions and using classification procedures that extracted more than one structural class from some of the datasets. Together, the features on each of the different structures are generally consistent with previously obtained structures at different CO2 concentrations and support the mechanism that is proposed in the manuscript. The most populated class for K125E channels determined at high CO2 shows a pore that is constricted by the N-terminus, and a cytoplasmic region that was better resolved than in WT channels, suggesting increased stability. The K125E structure closely resembles one of the two major classes obtained for WT channels at high CO2. These findings support the hypothesis that the K125E mutation biases channels towards the closed state, while WT channels are in an equilibrium between open and closed states even in the presence of high CO2. Consistently, a structure of K125E obtained in the absence of CO2 appeared to also represent a closed state but at lower resolution, suggesting that CO2 has other effects on the channel beyond carbamylation of K125 that also contribute to stabilizing the closed state. Structures determined for K125R channels, which are constitutively open because arginine cannot be carbamylated, and would be predicted to represent open states, yielded apparently inconclusive results.

      A non-protein density was found to be trapped inside the pore in all structures obtained using both DDM and LMNG detergents, suggesting that the density represents a lipid rather than a detergent molecule. It is thought that the lipid could contribute to the process of gating, but this remains speculative. The cytoplasmic region in the tentatively closed structural class of the WT channel obtained using LMNG was better resolved. An additional portion of the cytoplasmic face could be resolved by focusing classification on a single subunit, which had a conformation that resembled the AlphaFold prediction. However, this single-subunit conformation was incompatible with a C6-symmetric arrangement. Together, the results suggest that the identified states of the channel represent open states and closed states resulting from interaction with CO2. Therefore, the observed conformational changes illuminate a possible structural mechanism for channel gating in response to CO2.

      Some of the discussion involving comparisons with structures of other gap junction channels is relatively hard to follow as currently written, especially for a general readership. Also, no additional functional experiments are carried out to test any of the hypotheses arising from the data. However, structures were determined in multiple conditions, with results that were consistent with the main hypothesis of the manuscript. No discussion is provided, even if speculative, to explain the difference in behavior between hemichannels and gap junction channels. Also, no attempt was made to measure the dimensions of the pore, which is relevant because of the importance of identifying if the structures indeed represent open or closed states of the channel.

      We have considerably revised the manuscript in an attempt to make it more tractable. We respond to the individual comments below.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Brotherton et al. describes a structural study of connexin-26 (Cx26) gap junction channel mutant K125E, which is designed to mimic the CO2-inhibited form of the channel. In the wild-type Cx26, exposure to CO2 is presumed to close the channel through carbamylation of the residue K125. The authors mutated K125 to a negatively charged residue to mimic this effect, and they observed by cryo-EM analysis of the mutated channel that the pore of the channel is constricted. The authors were able to observe conformations of the channel with resolved density for the cytoplasmic loop (in which K125 is located). Based on the observed conformations and on the position of the N-terminal helix, which is involved in channel gating and in controlling the size of the pore, the authors propose the mechanisms of Cx26 regulation.

      Strengths:

      This is a very interesting and timely study, and the observations provide a lot of new information on connexin channel regulation. The authors use state-of-the-art cryo-EM analysis and 3D classification approaches to tease out the conformations of the channel that can be interpreted as "inhibited", with important implications for our understanding of how the conformations of the connexin channels are controlled.

      Weaknesses:

      My fundamental question about the premise of this study is: to what extent can K125 carbamylation be recapitulated by a simple K125E mutation? Lysine has a large side chain, and its carbamylation would make it even slightly larger. While the authors make a compelling case for E125-induced conformational changes focusing primarily on the negative charge, I wonder whether they considered the extent to which their observation with this mutant may translate to the carbamylated lysine in the wild-type Cx26, considering not only the charge but also the size of the modified side-chain.

      This is an important point. We agree that the difference in size will have a different effect on the structure. For kinases, aspartate or glutamate are often used as mimics of phosphorylated serine or threonine and these will have the same issues. The fact that we cannot resolve the relevant side-chains in the density may be indicative that the mutation doesn’t give the whole story. It may be able to shift the equilibrium towards the closed conformation, but not stably trap the molecule in that conformation. We include a comment to this effect in the revised manuscript.

      Reviewer #3 (Public Review):

      Summary:

      The mechanism underlying the well-documented CO2-regulated activity of connexin 26 (Cx26) remains poorly understood. This is largely due to the labile nature of CO2-mediated carbamylation, making it challenging to visualize the effects of this reversible posttranslational modification. This paper by Brotherton et al. aims to address this gap by providing structural insights through cryo-EM structures of a carbamylation-mimetic mutant of the gap junction protein.

      Strengths:

      The combination of the mutation, elevated PCO2, and the use of LMNG detergent resulted in high-resolution maps that revealed, for the first time, the structure of the cytoplasmic loop between transmembrane helix (TM) 2 and 3.

      Weaknesses:

      The presented maps merely reinforce their previous findings, wherein wildtype Cx26 favored a closed conformation in the presence of high PCO2. While the structure of the TM2-TM3 loop may suggest a mechanism for stabilizing the closed conformation, no experimental data was provided to support this mechanism. Additionally, the cryo-EM maps were not effectively presented, making it difficult for readers to grasp the message.

      We have extensively revised the manuscript so that the novelty of this study is more apparent. There are three major points:

      (1) The carbamylation mimetic pushes the conformation towards the closed conformation. Previously we just showed that CO2 pushes the conformation towards this conformation. Though we could show this was not due to pH, and could speculate this was due to carbamylation as suggested by previous mutagenesis studies, our data did not provide any mechanism whereby Lys125 was involved.

      (2) In going from the open to closed conformations, not only is a conformational change in TM2 involved, as we saw previously, but also a conformational change in TM1, the linker to the N-terminus and the cytoplasmic loop. Thus there is a clear connection between Lys125 and the conformation of the pore-closing N-terminus.

      (3) We observe for the first time in any connexin structure, density for the cytoplasmic loop. Since this loop is important in regulation, knowing how it might influence the positions of the transmembrane helices is important information if we are to understand how connexins can be regulated.

      Reviewing Editor:

      The reviewers have agreed on a list of suggested revisions that would improve the eLife assessment if implemented, which are as follows:

      (1) For completeness, Figure 1 could be supplied with an example of what the experiment would look like in the presence of CO2 - for the wild-type and for the K125E mutant. Presumably for the wild-type this has been done previously in exactly this assay format, but this control would be an important part of characterization for the mutant. Page 4, lines 105-106; "unsurprisingly, Cx26K125E gap junctions remain closed at a PCO2 of 55 mmHg." The data should be presented in the manuscript.

      We have now included the data with a PCO2 of 55 mmHg. This is now Figure 4 in our revised manuscript.

      (2) Would AlphaFold predictions show any interpretable differences in the E125 mutant, compared to the K125 (the wild-type)?

      We tried this in response to the reviewer’s suggestion. We did not see any interpretable differences. In general AlphaFold is not recognised as giving meaningful information around point mutations.

      (3) The K125R mutant appears to be a more effective control for extracting significant features from the K125E maps. Given that the use of a buffer containing high PCO2 is essential for obtaining high-resolution maps, wildtype Cx26 is unsuitable as an appropriate control. The K125R map, obtained at a high resolution (2.1Å), supports its suitability as a robust control.

      Though we are unsure what the referee is referring to here, we have rewritten this section and compare against the K125R map (figure 5a) as well as that derived from the wild-type protein. The important point is that the K125E mutant causes a structural change that is consistent with the closure of the gap junctions that we observe in the dye-transfer assays.

      (4) Likewise, the rationale for using wildtype Cx26 maps obtained in DDM is unclear. Wildtype Cx26 seems to yield much better cryo-EM maps in LMNG. We suggest focusing the manuscript on the higher-quality maps, and providing supporting information from the DDM maps to discuss consistency between observations and the likely possibility that the nonprotein density in the pore is lipid and not detergent.

      The rationale for comparing the mutants against the wt Cx26 maps obtained in DDM was because the mutants were also solubilised in DDM. However, taking the lead from the referees’ comments, we have now rewritten the manuscript so that we first focus on the data we obtain from protein solubilised in LMNG. We feel this makes our message much clearer.

      (5) In general, the rationale for utilizing cryo-EM maps with the entire selected particles is unclear. Although the overall resolutions may slightly improve in this approach, the regions of interest, such as the N-terminus and the cytoplasmic loop, appear to be better ordered after further classifications. The paper would be more comprehensible if it focuses solely on the classes representing the pore-constricting N-terminus (PCN) and the pore-open flexible N-terminus (POFN) conformations. Also, the nomenclatures used in the manuscript, such as "WT90-Class1", "K125E90-1", "LMNG90-class1", "LMNG90-mon-pcn" are confusing.

      LMNG90s are also wildtype; K125E-90-1 is in Class1 for this mutant and is similar to WT90Class2, which represents the PCN conformation. More consistent and intuitive nomenclatures would be helpful.

      We agree with the referees’ comments. This should now be clearer with our rewritten manuscript where we have simplified this considerably. We now call the conformations NConst (N-terminus defined and constricting the pore) and NFlex (N-terminus not visible) and keep this consistent throughout.

      (6) A potential salt bridge between the carbamylated K125 and R104 is proposed to account for the prevalence of Class-1 (i.e., PCN) in the majority of cryo-EM particles. However, the side chain densities are not well-defined, suggesting that such an interaction may not be strong enough to trap Cx26 in a closed conformation. Furthermore, the absence of experimental data to support this mechanism makes it unclear how likely this mechanism may be. Combining simple mutagenesis, such as R104E, with a dye transfer assay could offer support for this mechanism. Are there any published experimental results that could help address this question without the need for additional experimental work? Alternatively, as acknowledged in the discussion, this mechanism may be deemed as an "over-simplification." What is an alternative mechanism?

      R104 has been mutated to alanine in gap junctions and tested in a dye transfer assay, as now mentioned in the text (Nijar et al, J Physiol 2021), supporting this role. In hemichannels R104 has been mutated to both alanine and glutamate and tested through dye loading assays (Meigh et al, eLife 2013). Also in hemichannels R104 and K125 have been mutated to cysteines, allowing them to be cross-linked through a disulphide bond. This mutant responds to a change in redox potential in a way similar to that in which the wild-type protein responds to CO2 (Meigh et al, Open Biol 2015). Therefore, there is no doubt that the residues are important for the mechanism, and the salt-bridge interaction seems a plausible mechanism to reconcile the mutagenesis data. However, we cannot be sure that there are not other interactions involved that are necessary for closure. This information has now been included in the text.

      (7) The cryo-EM maps presented in the manuscript propose that gap junctions are constitutively open under normal PCO2 as the flexible N-terminus clears the solute permeation pathway in the middle of the channel. However, hemichannels appear to be closed under normal PCO2. It is puzzling how gap junctions can open when hemichannels are closed under normal PCO2 conditions. If this question has been addressed in previous studies, the underlying mechanism should be explicitly described in the introduction. If it remains an open question, differences in the opening mechanisms between hemichannels and gap junctions should be investigated.

      We suspect this is due to the difference in flexibility of gap junctions relative to hemichannels. However, a discussion of this is beyond this paper and would be complete speculation based on hemichannel structures of other connexins, performed in different buffering systems. There are no high resolution structures of Cx26 hemichannels.

      (8) A mystery density likely representing a lipid is abruptly introduced, but the significance of this discovery is unclear. It is hard to place the lipid on Figure S6 in the wider context of everything else that is discussed in the text. It would be helpful for readers if a figure were provided to show where the density is located in relation to all the other regions that are extensively discussed in the text.

      In the revised text this section has been completely rewritten. We have now included a more informative view in a new figure (Figure 1 – figure supplement 3).

      (9) Including and displaying even tentative pore-diameter measurements for the different states - this would be helpful for readers and provide a more direct visual cue as to the difference between open and closed states.

      We have purposely avoided giving precise measurements of the pore diameter, since this depends on how we model the N-terminus. The first three residues are difficult to model into the density without causing steric clashes with the neighbouring subunits.

      (10) Given that no additional experiments for channel function were carried out, it would be useful to provide a more detailed discussion of additional mutagenesis results from the literature that are related to the experimental results presented.

      We have amplified this in the discussion (see answer to point 6).

      The reviewers also agreed that improvements in the presentation of the data would strengthen the manuscript. Here is a summary list of suggestions by reviewers aimed at helping improve how the data is presented:

      (1) Why is the pipette bright green in the top image, but rather weakly green in the bottom image in Figure 1 - is this the case for all images?

      (Now figure 4) This depends on whether the pipette was in the focal plane of view or not. The important point of these images is the difference in intensity of the donor vs the recipient cell. The graphs in figure 4c illustrate clearly the difference between the wild-type and the mutant gap junctions.

      (2) In figures 2-5, labels would help a lot in understanding what is shown - while the legends do provide the information on what is presented, it would help the reader to see the models/maps with labels directly in the panel. For example, Figure 2a/b - just indicating "WT90 Cx26" in pink and "K125E90" in blue directly in the panel would reduce the work for the reader.

      We have extensively modified the labels in the figures to address this issue.

      (3) Figure 4 - magenta and pink are fairly close, and to avoid confusion it might be useful to use a different color selection. This is especially true when structures are overlaid, as in this figure - the presentation becomes rather complicated, so the less confusion the color code can introduce, the better.

      (Now Figure 2) We have now changed pink to blue.

      (4) Figure 5 - a remarkably under-labelled figure.

      Now added labels.

      (5) Figure 6 - it would be interesting to add a comparison to Cx32 here as well for completeness, since the structure has been published in the meantime.

      Cx32 has now been included.

      (6) Figure 7 - please add equivalent labels on both sides of the model, left and right. Add the connecting lines for all of the tubes TM helices - this will help trace the structural elements shown. The legend does not quite explain the colors.

      We have modified the figure as suggested and explained the colours in the legend.

      (8) Fig.1 legend; Unclear what mCherry fluorescence represents. State that Cx26 was expressed as a translational fusion with mCherry.

      Now figure 4. We have now written “Montages each showing bright field DIC image of HeLa cells with mCherry fluorescence corresponding to the Cx26K125E-mCherry fusion superimposed (leftmost image) and the permeation of NBDG from the recorded cell to coupled cells.”

      (9) Fig. 3 b); Show R104 in the figure. Also E129-R98/R99 interaction is hard to acknowledge from the figure. It seems that the side chain density of E129 is not strong enough to support the modeled orientation.

      This is now Figure 1c. While the density in this region is sufficient to be confident of the main chain, we agree that the side chain density for the E129-R98/R99 interaction is not sufficiently clear to draw attention to and have removed the associated comment from the figure legend. The density is focussed on the linker between TM1 and the N-terminus and the KVRIEG motif. We prefer to omit R104, in order to keep the focus on this region. As described in the manuscript, the density for the R104 side chain is poor.

      (10) Fig. 3 c); Label the N-terminus and KVRIEG motif in the figure.

      Now Figure 1b. We have labelled the N-terminus. The KVRIEG motif is not visible in this map.

      (11) Page 9, lines 246-248; Restate, "We note, however, density near to Lys125, between Ser19 in the TM1-N-term linker, Tyr212 of TM4 and Tyr97 on TM3 of the neighbouring subunit, which we have been unable to explain with our modelling."

      We have reworded this.

      (12) Page 14, line 399; Patch clamp recording is not included in the manuscript.

      Patch clamp recordings were used to introduce dye into the donor cell.

      (13) On the same Figure 2, clashes are mentioned but these are hard to appreciate in any of the figures shown. Perhaps would be useful to include an inset showing this.

      We have modified Figure 2b slightly and added an explanation to highlight the clash. It is slightly confusing because the residues involved belong to neighbouring subunits.

      (14) The discussion related to Figure 6 is very hard to follow for readers who are not familiar with the context of abbreviations included on the figure labels. This figure could be improved to allow a general readership to identify more clearly each of the features and structural differences that are discussed in the text.

      We have extensively changed the text and updated the labels on the figure to make it much easier for the reader to follow.

      Below, you can find the individual reviews by each of the three reviewers.

      Reviewer #1 (Recommendations For The Authors):

      (1) In Figure 2d-e, the text discusses differences between K125E 90-1 and WT 90-class2 (7QEW), yet the figure compares K125E with 7QEQ. I suggest including a figure panel with a comparison between the two structures discussed in the manuscript text.

      This has been changed in the revised manuscript.

      Other comments have been addressed above.

    2. eLife assessment

      This study presents valuable new structures of a carbamylation-mimetic K125E mutant of the Cx26 gap junction channel, uncovering the cytoplasmic loop structure and information about the closed state of the channel. The cryo-EM maps are of high quality and serve as strong foundations for dissecting the gating mechanism by CO2, providing convincing evidence in support of a mechanism where CO2-mediated carbamylation of Lys125 shifts the conformational equilibrium towards a state where the N-terminus occludes the pore of the channel. This information will be of interest to biochemists, cell biologists and biophysicists interested in the function of gap-junction channels in health and disease.

    3. Reviewer #1 (Public Review):

      Gap junction channels establish gated intercellular conduits that allow the diffusion of solutes between two cells. Hexameric connexin26 (Cx26) hemichannels are closed under basal conditions and open in response to CO2. In contrast, when forming a dodecameric gap-junction, channels are open under basal conditions and close with increased CO2 levels. Previous experiments have implicated Cx26 residue K125 in the gating mechanism by CO2, which is thought to become carbamylated by CO2. Carbamylation is a labile post-translational modification that confers negative charge to the K125 side chain. How the introduction of a negative charge at K125 causes a change in gating is unclear, but it has been proposed that carbamylated K125 forms a salt bridge with the side chain at R104, causing a conformational change in the channel. It is also unclear how overall gating is controlled by changes in CO2, since there is significant variability between structures of gap-junction channels and the cytoplasmic domain is generally poorly resolved. Structures of WT Cx26 gap-junction channels determined in the presence of various concentrations of CO2 have suggested that the cytoplasmatic N-terminus changes conformation depending on the concentration of the gas, occluding the pore when CO2 levels are high.

      In the present manuscript, Deborah H. Brotherton and collaborators use an intercellular dye-transfer assay to show that Cx26 gap-junction channels containing the K125E mutation, which mimics carbamylation caused by CO2, are constitutively closed even at CO2 concentrations where WT channels are open. Several cryo-EM structures of WT and mutant Cx26 gap junction channels were determined at various conditions and using classification procedures that extracted more than one structural class from some of the datasets. Together, the features on each of the different structures are generally consistent with previously obtained structures at different CO2 concentrations and support the mechanism that is proposed in the manuscript. The most populated class for K125E channels determined at high CO2 shows a pore that is constricted by the N-terminus, and a cytoplasmic region that was better resolved than in WT channels, suggesting increased stability. The K125E structure closely resembles one of the two major classes obtained for WT channels at high CO2. These findings support the hypothesis that the K125E mutation biases channels towards the closed state, while WT channels are in an equilibrium between open and closed states even in the presence of high CO2. Consistently, a structure of K125E obtained in the absence of CO2 appeared to also represent a closed state but at a lower resolution, suggesting that CO2 has other effects on the channel beyond carbamylation of K125 that also contribute to stabilizing the closed state. Structures determined for K125R channels, which are constitutively open because arginine cannot be carbamylated, and would be predicted to represent open states, yielded apparently inconclusive results.

      A non-protein density was found to be trapped inside the pore in all structures obtained using both DDM and LMNG detergents, suggesting that the density represents a lipid rather than a detergent molecule. It is thought that the lipid could contribute to the process of gating, but this remains speculative. The cytoplasmic region in the tentatively closed structural class of the WT channel obtained using LMNG was better resolved. An additional portion of the cytoplasmic face could be resolved by focusing classification on a single subunit, which had a conformation that resembled the AlphaFold prediction. However, this single-subunit conformation was incompatible with a C6-symmetric arrangement. Together, the results suggest that the identified states of the channel represent open states and closed states resulting from interaction with CO2. Therefore, the observed conformational changes illuminate a possible structural mechanism for channel gating in response to CO2.

    4. Reviewer #2 (Public Review):

      Summary:

      The manuscript by Brotherton et al. describes a structural study of connexin-26 (Cx26) gap junction channel mutant K125E, which is designed to mimic the CO2-inhibited form of the channel. In the wild-type Cx26, exposure to CO2 is presumed to close the channel through carbamylation of the residue K125. The authors mutated K125 to a negatively charged residue to mimic this effect and observed by cryo-EM analysis of the mutated channel that the pore of the channel is constricted. The authors were able to observe conformations of the channel with resolved density for the cytoplasmic loop (in which K125 is located). Based on the observed conformations and on the position of the N-terminal helix, which is involved in channel gating and in controlling the size of the pore, the authors propose the mechanisms of Cx26 regulation.

      Strengths:

      This is a very interesting and timely study, and the observations provide a lot of new information on connexin channel regulation. The authors use state-of-the-art cryo-EM analysis and 3D classification approaches to tease out the conformations of the channel that can be interpreted as "inhibited", with important implications for our understanding of how the conformations of the connexin channels are controlled.

      Weaknesses:

      The revised version of the manuscript is improved, and the authors have addressed the review comments/criticisms in a satisfactory manner.

    5. Reviewer #3 (Public Review):

      Summary:

      The mechanism underlying the well-documented CO2-regulated activity of connexin 26 (Cx26) remains poorly understood. This is largely due to the labile nature of CO2-mediated carbamylation, making it challenging to visualize the effects of this reversible posttranslational modification. This paper by Brotherton et al. aims to address this gap by providing structural insights through cryo-EM structures of a carbamylation-mimetic mutant of the gap junction protein.

      Strength:

      The combination of the mutation, elevated PCO2, and the use of LMNG detergent resulted in high-resolution maps that revealed, for the first time, the structure of the cytoplasmic loop between transmembrane helix (TM) 2 and 3.

      Weaknesses:

      While the structure of the TM2-TM3 loop may suggest a mechanism for stabilizing the closed conformation, the EM density is not strong enough to support direct interaction with carbamylated or mutated K125.

      Overall, the cryo-EM structures presented in this study support their proposed mechanism in which carbamylation at K125 promotes Cx26 gap junction closure. Through careful control of the pH and PCO2 for each cryo-EM sample, the current study substantiated that the more closed conformation observed in high PCO2 is independent of pH but likely triggered by carbamylation. This was unclear from their prior cryo-EM map of wildtype Cx26 at high PCO2.

      While the new structures successfully visualize the TM2-TM3 loop, which likely plays significant roles in CO2-regulated Cx26 activity, further studies are necessary to understand the underlying mechanism. For instance, the current study lacks explanation regarding what propels the movement of the N-terminal helix, how carbamylated K125 interacts with the TM2-TM3 loop, the importance of the lipids visualized in the map, or the reason why gap junctions are constitutively open while hemichannels are closed under normal PCO2 levels.

    1. Reviewer #3 (Public Review):

      The authors delved into an important aspect of abortifacient diseases of livestock in Tanzania. The thoughts of the authors on the topic and its significance are implied, and the methodological approach needs further clarity. The number of wards in the study area, the statistical selection of wards, the type of questionnaire (i.e., open or closed-ended), and the statistical analyses of outcomes were not clearly elucidated in the manuscript. Fifteen wards were mentioned in the text but 13 were used; what were the exclusion criteria? Observations were from pastoral, agropastoral, and smallholder agroecological farmers. No sample numbers or questionnaires were attributed to the above farming systems to correlate findings with management systems. The impacts of the research investigation output are not clearly visible as to warrant intervention methods. What were the identified pathogens from laboratory investigation, particularly with the use of culture and PCR, and were any zoonotic pathogens encountered? The public health importance of any of the abortifacient agents was not highlighted.

      In conclusion, based on the intent of the authors and the content of this research, and the weight of the research topic, there are obvious weaknesses in the critical data analysis to demonstrate cause, effect, and impact.

    2. Reviewer #2 (Public Review):

      The paper "The Value of Livestock Abortion Surveillance in Tanzania: Identifying Disease Priorities and Informing Interventions" provides a comprehensive analysis of the importance of livestock abortion surveillance in Tanzania. The authors aim to highlight the significance of this surveillance system in identifying disease priorities and guiding interventions to mitigate the impact of livestock abortions on both animal and human health.

      Summary:

      The paper begins by discussing the context of livestock farming in Tanzania and the significant economic and social impact of livestock abortions. The authors then present a detailed overview of the livestock abortion surveillance system in Tanzania, including its objectives, methods, and data collection process. They analyze the data collected from this surveillance system over a specific period to identify the major causes of livestock abortions and assess their public health implications.

      Evaluation:

      Overall, this paper provides valuable insights into the importance of livestock abortion surveillance as a tool for disease prioritization and intervention planning in Tanzania. The authors effectively demonstrate the utility of this surveillance system in identifying emerging diseases, monitoring disease trends, and informing evidence-based interventions to control and prevent livestock abortions.

      Strengths:

      (1) Clear Objective: The paper clearly articulates its objective of highlighting the value of livestock abortion surveillance in Tanzania.

      (2) Comprehensive Analysis: The authors provide a thorough analysis of the surveillance system, including its methodology, data collection process, and findings as seen in the supplementary files.

      (3) Practical Implications: The paper discusses the practical implications of the surveillance system for disease control and public health interventions in Tanzania.

      (4) Well-Structured: The paper is well-organized, with clear sections and subheadings that facilitate understanding and navigation.

      Suggestions for Improvement:

      (1) Data Presentation: While the analysis is comprehensive, the presentation of data could be enhanced with the use of more visual aids such as tables, graphs, or charts to illustrate key findings.

      (2) Discussion Section: The paper could benefit from a more in-depth discussion of the implications of the findings for disease control strategies and policy formulation in Tanzania.

      (3) Future Directions: Including recommendations for future research or areas for further investigation would add depth to the paper.

      Summary:

      This paper contains thorough analysis and valuable insights. Overall, it makes a significant contribution to the literature on livestock abortion surveillance and its implications for disease control in Tanzania.

    3. Reviewer #1 (Public Review):

      Summary:

      The paper examined livestock abortion, as it is an important disease syndrome that affects productivity and livestock economies. If livestock abortion remains unexamined it poses risks to public health.

      Several pathogens are associated with livestock abortions across Africa. However, livestock disease surveillance data rarely include information from abortion events, little is known about the aetiology and impacts of livestock abortions, and data are not available to inform prioritisation of disease interventions. Therefore the current study seeks to examine the issue in detail and proposes some solutions.

      The study took place in 15 wards in northern Tanzania spanning pastoral, agropastoral, and smallholder agro-ecological systems. The key objective is to investigate the causes and impacts of livestock abortion.

      The data collection system was set up such that farmers reported abortion cases to the livestock field officers of the Ministry of Livestock and Fisheries.

      The reports were made to the investigation teams. The team only included those abortion events that the livestock field officers could attend to within 72 hours of the event occurring.

      Also, a field investigation was carried out to collect diagnostic samples from aborted materials and aborting dams. In addition, questionnaires were administered to collect data on herd/flock management. Laboratory diagnostic tests were carried out for a range of abortigenic pathogens.

      Over the period of the study, 215 abortion events in cattle (n=71), sheep (n=44), and goats (n=100) were investigated. Investigated cases varied widely across wards. The aetiological attribution, achieved for 19.5% of cases through PCR-based diagnostics, was significantly affected by delays in the field investigation.

      The result also revealed that vaginal swabs from aborting dams provided a practical and sensitive source of diagnostic material for pathogen detection.

      Livestock abortion surveillance can generate valuable information on causes of zoonotic disease outbreaks and livestock reproductive losses, and can identify important pathogens that are not easily captured through other forms of livestock disease surveillance. The study demonstrated the feasibility of establishing an effective reporting and investigation system that could be implemented across a range of settings, including remote rural areas.

      Strengths:

      The paper combines both science and socio-economic methodology to achieve the aim of the study. The methodology was well presented and the sequence was great. The authors explain where and how the data was collected. Figure 2 was used to describe the study area which was excellently done. The section on the investigation of cases was well written. The sample analysis was also well-written. The authors devoted a section to summarizing the investigated cases and description of the livestock study population. The logit model was well-presented.

    4. eLife assessment

      This important study reports the use of a surveillance approach in identifying emerging diseases, monitoring disease trends, and informing evidence-based interventions in the control and prevention of livestock abortions, as it relates to their public health implications. The data support the convincing finding that abortion incidence is higher during the dry season, and occurs more in cross-bred and exotic livestock breeds. Aetiological and epidemiological data can be generated through established protocols for sample collection and laboratory diagnosis. These findings are of potential interest to the fields of veterinary medicine, public health, and epidemiology.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      The reviewers' thoughtful comments have helped us make the manuscript both more comprehensive and clearer. Thank you for your time and effort. We know that this is a long and technical paper. In our responses we refer to three documents:

      • Original: the first original submission

      • Revision: the revised document (02 MillardFranklinHerzog2023 v2.pdf)

      • Difference: a document that shows the changes made to text (but not figures or tables) from the original to revision (03 MillardFranklinHerzog2023 diff.pdf).

      Reviewer #1 (Recommendations For The Authors):

      (1) In general, the paper is well written and addresses important questions of muscle mechanics and muscle modeling. In the current version, the model limitations are briefly summarized in the abstract. However, the discussion needs a more complete description of limitations as well as a discussion of types of data (in vivo, ex vivo, single fiber, whole muscle, MTU, etc.) that can be modeled using this approach.

      Please see the response to comment 23 for more details of the limitations that have been added to the revised document.

      (2) The choice of a model with several tendon parameters for simulating single muscle fiber experiments is not well justified.

      A rigid-tendon model with a slack length of zero was, in fact, used for these simulations for both the VEXAT and Hill models. In case this is still not clear: a rigid-tendon model of zero length is equivalent to no tendon at all. The text that first mentions the tendon model has now been modified to make it clearer that the parameters of the model were set to be consistent with no tendon at all:

      Please see the following text:

      Original:

      • page 17, column 1, line 28 ”... rigid tendon of zero length,”

      • page 17, column 1, line 51 ”... rigid tendon of zero length.”

      Revision:

      • page 19, column 1, line 19 ”... we used a rigid-tendon of zero length (equivalent to ignoring the tendon)”

      • page 19, column 1, line 38 ”... coupled with a rigid-tendon of zero-length.”

      Difference:

      • page 21, column 1, line 19 ”... we used a rigid-tendon... ”

      • page 21, column 1, line 45 ”... rigid-tendon of zero length ...”

      (3) A table that clarifies how all model parameters were estimated needs to be included in the main part of the manuscript.

      Two tables have been added to the manuscript that detail the parameters of the elastic-tendon cat soleus model (in the main body of the text) and the rabbit psoas fibril model (in an appendix). Each table includes:

      • A plain language parameter name

      • The mathematical symbol for the parameter

      • The value and unit of the parameter

      • A coded reference to the data source that indicates both the experimental animal and how the data was used to evaluate the parameter.

      Please see the following text:

      Revision:

      • page 11

      • page 42

      Difference:

      • page 11

      • page 46

      (4) The supplemental information is not properly referenced in the main text. There are a number of smaller issues that also need to be addressed.

      Thank you for your attention to detail. The following problems related to Appendix referencing have been fixed:

      • Appendices are now parenthetically referenced at the end of a sentence. However, a few references to figures (that are contained within an Appendix) still appear in the body of the sentence since moving these figure references makes the text difficult to understand.

      • All Appendices are now referenced in the main body of the text.

      (5) Abstract, line 6: While it is commonly assumed that the short range stiffness of muscle is due to cross bridges, Rack & Westbury (1974) noted that it occurs over a distance of 25-35 nm, and that many cross-bridges must be stretched even farther than this distance (their p. 348 middle). It seems unlikely that cross-bridges alone can actually account for the short-range stiffness.

      There are three parts to our response to this comment:

      (a) Rack & Westbury’s definition of short-range-stiffness and unrealistic cross-bridge stretches

      (b) Rack & Westbury’s definition of short-range-stiffness vs. linear-time-invariant system theory

      (c) Updates to the paper

      a. Rack & Westbury’s definition of short-range-stiffness and unrealistic cross-bridge stretches.

      As you note, on page 348, Rack and Westbury write that ”If the short range stiffness is to be explained in terms of extension of cross-bridges, then many of them must be extended further than the 25-35 nm mentioned above.” Having re-read the paper, it’s not clear how these three factors are being treated in the 25−35 nm estimate:

      • the elasticity of the tendon and aponeurosis,

      • the elasticity of actin and myosin filaments,

      • and the cycling rate of the cross-bridges.

      Obviously the elasticity of the tendon, aponeurosis, actin, and myosin filaments will reduce the estimated amount of crossbridge strain during Rack and Westbury’s experiments. A potentially larger factor is the cycling rate of each cross-bridge. If each crossbridge cycles faster than 11 Hz (the maximum frequency Rack and Westbury used), then no single crossbridge would stretch by 25-35 nm. So why didn’t Rack and Westbury consider the cycling rate of crossbridges?

      Rack and Westbury reasoned that a perfectly elastic work loop would necessarily mean that all crossbridges stayed attached: as soon as a crossbridge cycles it would release its stored elastic energy and the work loop would no longer be elastic. Since Rack and Westbury measured some nearly perfect elastic work loops (the smallest loops in Figs. 2, 3, and 4), I guess they assumed crossbridges remained attached during the 25-35 nm crossbridge stretch estimate. However, even Rack and Westbury note that none of the work loops they measured were perfectly elastic and so there is room to entertain the idea that crossbridges are cycling.

      Fortunately, for this discussion, crossbridge cycling rates have been measured.

      In-vitro measurements by Uyeda et al. show that crossbridges are cycling at 30 Hz when moving at 0.5-1.2 lengths/s. At this rate, there would be enough time for a single crossbridge to cycle nearly 2.72 times for every cycle of the 11 Hz sinusoidal perturbations, reducing its expected strain from 25-35 nm down to 9.2−12.9 nm. This effect becomes even more pronounced if crossbridge cycling rate is used to explain the difference in sliding velocity between Uyeda et al.’s in-vitro data (0.5-1.2 lengths/s) and the maximum contraction velocity of an in-situ cat soleus (4.65 lengths/s, Scott et al.).
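The arithmetic behind this estimate can be checked with a short sketch. It assumes, as the paragraph above implicitly does, that the stretch is shared evenly among the cycles a crossbridge completes during one perturbation cycle. The variable names are illustrative and do not come from the manuscript or its code repository.

```python
# Values quoted in the response above.
cycling_rate_hz = 30.0       # crossbridge cycling rate (Uyeda et al.)
perturbation_hz = 11.0       # highest frequency used by Rack and Westbury
stretch_nm = (25.0, 35.0)    # Rack and Westbury's crossbridge stretch estimate

# Number of crossbridge cycles completed during one perturbation cycle.
cycles_per_perturbation = cycling_rate_hz / perturbation_hz

# Assumption: the stretch is divided evenly among those cycles.
reduced_nm = tuple(s / cycles_per_perturbation for s in stretch_nm)

print(f"cycles per perturbation: {cycles_per_perturbation:.2f}")
print(f"reduced stretch: {reduced_nm[0]:.1f} to {reduced_nm[1]:.1f} nm")
```

The result agrees with the range quoted above to within rounding of the cycle ratio.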

      b. Rack & Westbury’s definition of short-range-stiffness vs. linear-time-invariant system theory

      Rack and Westbury defined short-range-stiffness to describe a specific kind of force response of the muscle to cyclical length changes:

      • muscle force is linear with length change,

      • and independent of velocity.

      Rack and Westbury’s definition therefore fails when viscous forces become noticeable, because viscous forces are velocity dependent.

      On line 6 of the abstract the term ‘short-range-stiffness’ is not used because Rack and Westbury’s definition is too narrow for our purposes. Instead we are using the more general approach of approximating muscle as a linear-time-invariant (LTI) system, where it is assumed that

      • the response of the system is linear

      • and time invariant.

      To unpack that a little, a muscle is considered in the ‘short-range’ in our work if it meets the criteria of a linear time-invariant (LTI) system:

      • the force response of muscle can be accurately described as a linear function of its length and velocity (its state)

      • and its response is not a function of time (which means constant stimulation, and no fatigue).

      In contrast to Rack and Westbury’s definition, the ‘short-range’ in linear systems theory is general enough to accommodate both elastic and viscous forces. In physical terms, ‘small’ for an LTI approximation of muscle covers a larger range than the ‘short-range’ defined by Rack and Westbury: an LTI system can include velocity dependence, while short-range-stiffness ends when velocity dependence begins.
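The distinction can be illustrated with a linear spring-damper, which is the simplest LTI element that carries both elastic and viscous force. This is a minimal sketch with hypothetical, unitless parameters; it is not the VEXAT model or any code from the manuscript.

```python
import math

def spring_damper_force(k, beta, dl, v):
    """Force change of a linear spring-damper: k*dl + beta*v."""
    return k * dl + beta * v

def response(k, beta, amp, freq_hz, t):
    """Force response at time t to a sinusoidal length change of amplitude amp."""
    w = 2.0 * math.pi * freq_hz
    dl = amp * math.sin(w * t)       # length change
    v = amp * w * math.cos(w * t)    # velocity of the length change
    return spring_damper_force(k, beta, dl, v)

# Hypothetical parameters chosen only for illustration.
k, beta = 2.0, 0.05
t = 0.013

# Linearity holds with damping present: doubling the input doubles the output.
f1 = response(k, beta, 1.0, 10.0, t)
f2 = response(k, beta, 2.0, 10.0, t)
is_linear = math.isclose(f2, 2.0 * f1)

# At t = 0 the length change is zero and the velocity is at its peak, so a
# pure spring produces no force while the spring-damper does.
f_elastic_only = response(k, 0.0, 1.0, 10.0, 0.0)
f_with_damping = response(k, beta, 1.0, 10.0, 0.0)
```

With `beta = 0` the element is velocity independent, which matches the two properties listed for short-range-stiffness. With `beta > 0` the response depends on velocity yet remains linear, so it still qualifies as LTI.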

      c. Updates to the paper

      To make the differences between Rack and Westbury’s ‘short-range-stiffness’ and LTI system theory clearer:

      • We have removed all occurrences of ‘short-range’ that were associated with Kirsch et al. and have replaced this phrase with ‘small’.

      • On the first mention of Kirsch’s work we have made the wording more specific.

      Revision:

      • page 1, column 1, lines 4,5

      • page 1, column 2, lines 14-21 ”Under constant activation ...”

      Difference: page 1, column 2, line 19-26

      • page 1, column 1, lines 4,5

      • page 1, column 2, lines 20-27 ”Under constant activation ...”

      • A footnote has been added to contrast the definition of ‘small’ in the context of a linear time-invariant system to ‘short-range’ in the context of Rack and Westbury’s definition of short-range-stiffness.

      Revision: page 1, column 2, bottom

      Difference: page 1, column 2, bottom

      • In addition, we have added a brief overview of LTI system theory to make the analysis and results more easily understood:

      Revision: Figure 4 paragraph beginning on page 10, column 2, line 15 ”As long as ...”

      Difference: Figure 4 paragraph beginning on page 12, column 1, line 46 ”As long as ...”

      (6) Page 3, lines 6-8: It also seems unlikely that 25% of cross-bridges are attached at one time (Howard, 1997) even for supramaximal isometric stimulation. The number should be less than 20%. What would the ratio of load path stiffness be for low force movements such as changing the direction of a frictionless manipulandum or slow walking? The range of relative stiffnesses is of more interest than the upper limit.

      We have made the following updates to address this comment:

      • A 20% duty cycle now defines the upper bound stiffness of the actin-myosin load path.

      • We have also evaluated the lower bound actin-myosin stiffness when a single crossbridge is attached.

      • The stiffness of titin from Kellermayer et al. has been digitized at a length of 2 µm and 4 µm to more accurately capture the length dependence of titin’s stiffness.

      • We have added a new figure (Figure 14) to make it easier to compare the range of actin-myosin stiffness to titin-actin stiffness.

      • The text in the main body of the paper and the Appendix has been updated.

      • The script ’main ActinMyosinAndTitinStiffness.m’ used to perform the calculations and generate the figure is now a part of the code repository.

      Please see the following text:

      Revision

      • The paragraph beginning at page 2, column 2, line 45 ”The addition of a titin element ...”

      • Appendix A

      • Figure 14 (in Appendix A)

      Difference

      • The paragraph beginning at page 3, column 1, line 6: ”The addition of a titin element ...”

      • Appendix A

      • Figure 14 (in Appendix A)

      (7) Page 5, line 12: A word seems to be missing here, ”...together to further...”.

      Thank you for your attention to detail. The sentence has been corrected.

      Please see the following text:

      • Revision: page 4, column 2, line 40 ”... into a single ...”

      • Difference: page 5, column 1, line 18

      (8) Page 5, line 24-27: These ”theories” are not mutually exclusive, and it is misleading to suggest they are. There is evidence for binding of titin to actin at multiple locations and there is no reason why evidence supporting one binding location must detract from the evidence supporting other binding locations.

      The text has been modified to make it clear to readers that the different titin-actin binding locations are not mutually exclusive. Please see the following text:

      • Revision: page 5, column 1, lines 17-19, the sentence beginning ”As previously mentioned, ...”

      • Difference: page 5, column 1, lines 41-44

      (9) Page 5, lines 48-51: Should cite Kellermayer and Granzier (1996) not Kellermayer et al. (1997).

      The reference to ‘Kellermayer et al.’ has been changed to ‘Kellermayer and Granzier’. The comment that the year of the reference should be changed from (1997) to (1996) is confusing: the 1996 paper is being referenced.

      For further details please see:

      • Revision: page 5, column 1, 39-40

      • Difference: page 5, column 2, line 19-22

      (10) Also, Dutta et al. (2018) should be cited as further showing that N2A titin by itself slows actin motility on myosin.

      Thank you for the suggestion. The sentence has been modified to include Dutta et al.:

      For further details please see:

      • Revision: page 5, column 1, 40

      • Difference: page 5, column 2, line 19-22

      (11) Figure 2 legend and elsewhere: it is odd to say that experiments used ”a cat soleus” when more than one cat soleus was used. Change to ”cat soleus”. See also page 15, line 15.

      Thank you for your attention to detail. All occurrences of ‘a cat soleus’ have been changed, with some sentence revision, to ‘cat soleus’.

      (12) Page 6, line 10: It is not clear why an MTU was used to simulate single muscle fiber experiments. What is the justification for choosing this particular model? Also, the choice of model might explain why the version with stiff tendon performs better than the version with an elastic tendon, but this is never mentioned. Why not use a muscle model with no tendon (e.g., Wakeling et al., 2021 J. Biomech.)?

      Please see the response to comment 2.

      (13) Millard et al.’s activation dynamics model also fails to capture the length-dependence of activation dynamics (Shue and Crago, 1998; Sandercock and Heckman, 1997), which should be noted in the discussion along with other limitations.

      An additional limitations paragraph is in the revised manuscript that addresses this comment specifically. However, we have used Stephenson and Wendt as a reference for the shift in peak isometric force that comes with submaximal activation. In addition, we also reference Chow and Darling for the property that the maximum shortening velocity is reduced with submaximal activations.

      • Revision: page 22, column 1, line 41 ”Finally, the VEXAT model ...”

      • Difference: page 24, column 2, line 12 ”Finally, the VEXAT model ...”

      In addition, please see the response to comment 23.

      (14) Page 6, line 22: ”An underbar...”.

      Thank you for your attention to detail, this correction has been made.

      (15) Page 7, lines 27-32: This and other issues should be described in the Discussion under a heading of model limitations.

      Please see the response to comment 23.

      (16) Page 7, lines 43-44: Numerous papers from the last author’s laboratory contradict the claim that there is no force enhancement on the ascending limb by demonstrating that force enhancement does occur on the ascending limb (see e.g., Leonard & Herzog 2002, Peterson et al., 2004 and several papers from the Rassier laboratory).

      Thank you for your attention to detail. This statement is in error and has been removed. To improve this section of the paper, a paragraph has been added to briefly mention the experimental observations of residual force enhancement before proceeding to explain how this phenomenon is represented by the model.

      Please see the following text:

      Revision:

      • the paragraph starting on page 7, column 2, line 43 ”When active muscle is lengthened, ...”

      • and the following paragraph starting on page 8, column 1, line 3 “To develop RFE, ”

      Difference:

      • the paragraph starting on page 8, column 2, line 15

      • and the following paragraph starting on page 9, column 1, line 6

      (17) Figure 3 legend and elsewhere: The authors use Prado et al. (2005) to determine several titin parameters, however the simulations seem to focus on cat soleus, but Prado et al.’s paper is on rabbits. More clarity is needed about which specific results from which species and muscles were used to parameterize the model.

      The new parameter table includes coded entries to indicate the literature source for experimental data, the animal it came from, and how the data was used. For example, the ‘ECM fraction’ has a source of ‘R[57]’ to show that the data came from rabbits from reference 57. For further details, please see the response to comment #3.

      Please see the following text:

      • Revision: page 11, column 2, table section H: ‘ECM fraction’.

      • Difference: page 11, column 2, table section H: ‘ECM fraction’.

      To address this comment in a little more detail, we have had to use Prado et al. (2005) to give us estimates for only one parameter: P, the fraction of the passive force-length relation that is due to titin. Prado et al.’s measurements relating to P are unique to our knowledge: these are the only measurements we have to estimate P in any muscle, cat soleus or otherwise. Here we use the average of the values for P across the 5 muscles measured by Prado et al. as a plausible default value for all of our simulations.

      (18) Figure 4 seems unnecessary.

      Figure 4 has been removed.

      (19) Page 10, lines 17-18: provide the abbreviation (VAF) here with the definition (variance accounted for).

      Thank you for your attention to detail. The abbreviation has been added.

      Please see these parts of the manuscripts for details:

      • Revision: page 12, column 2, line 13

      • Difference: page 13, column 2, line 32
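For readers unfamiliar with VAF, the sketch below uses one common definition: one minus the ratio of the residual variance to the variance of the reference signal. This is an assumption made for illustration and may differ in detail from the definition used in the manuscript; the function name is hypothetical.

```python
def vaf(reference, model):
    """Variance accounted for, as a fraction: 1 - var(residual) / var(reference)."""
    n = len(reference)
    residual = [r - m for r, m in zip(reference, model)]
    mean_ref = sum(reference) / n
    mean_res = sum(residual) / n
    var_ref = sum((r - mean_ref) ** 2 for r in reference)
    var_res = sum((e - mean_res) ** 2 for e in residual)
    return 1.0 - var_res / var_ref

reference = [1.0, 2.0, 3.0, 4.0]
vaf_perfect = vaf(reference, reference)     # model reproduces the reference
vaf_flat = vaf(reference, [0.0] * 4)        # model explains none of the variance
```

Multiplying the returned fraction by 100 gives the percentage form in which VAF is often reported.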

      (20) Page 11, lines 2-3: Here and elsewhere, it is clear that some model parameters have been optimized to fit the model. The main paper should include a table that lists all model parameters and how they were chosen or optimized, including but not limited to the information in Table 1 of the supplemental information section.

      See response to comment 3.

      (21) Page 17, lines 45 -49: Again, a substantial number of ad hoc adjustments to the model appear to be required. These should be described in the Discussion under limitations, and accounted for in the parameters table. See also legends to Fig. 12 and 13, page 19, lines 23-26.

      Please see the response to comment #3: a coded entry now appears to indicate the data source, the animal used in the experiment, and the method used to process the data. This includes entries for parameters which were estimated (‘E’) so that the model produced acceptable results in the simulations presented. In addition, the new discussion paragraph includes a number of sentences that use the adjustment to the active-titin-damping coefficient as an opening to discuss the limitations of the VEXAT model’s titin-actin bond model and the circumstances under which the model’s parameters would need to be adjusted.

      Please see responses to comments 3 and 23 for additional details. In addition, please see the specific discussion text mentioning the change to $\beta_o^{PEVK}$:

      • Revision: page 22, column 1, line 30 ”In Sec. 3.3 we had ...”

      • Difference: page 24, column 1, line 49

      (22) Page 20, lines 50-11: It should be noted here that Tahir et al.’s (2018) model has both series and parallel elastic elements, provided by superposition of rotation (series) and translation (parallel) of a pulley.

      While it is true that Tahir et al.’s (2018) model has series and parallel elements, as do the other models mentioned, these models do not have the correct structure to yield a gain and phase response that mimics biological muscle. The text that I originally wrote attempted to explain this without going into the details. As you note, this explanation leaves something to be desired. The original text commenting on the models of Forcinito et al., Tahir et al., Haeufle et al., and Günther et al. has been updated to be more specific. Please see the following parts of the manuscripts for details:

      • Revision: page 22, column 2, line 20, the paragraph beginning ”The models of Forcinito ...”

      • Difference: page 24, column 2, line 44

      (23) Discussion: This section should include a description of model limitations, including the relatively large number of ad hoc modifications and how many parameters must be found by optimization in practice. The authors should discuss what types of data are most compatible for use with the model (ex vivo, in vivo, single fiber, whole muscle, MTU), requirements for applying the model to different types of data, and impediments to using the model on different types of data.

      An additional limitations paragraph has been added to the discussion.

      Please see the following text:

      • Revision: the paragraph beginning on page 22, column 1, line 11 ”Both the viscoelastic ...”

      • Difference: the paragraph beginning on page 24, column 1, line 27.

      Reviewer #2 (Recommendations For The Authors):

      (1) If it is possible to compare the output of this model to other more contemporary models which incorporate titin but are also simple enough to implement in whole-body simulation (such as the winding filament model), this would seem to greatly strengthen the paper.

      That’s an excellent idea, though beyond the scope of this already lengthy paper. Even though the Hill model we evaluated is a bit old it is widely used, and so, many readers will be interested in seeing the benchmark results. As benchmarking work is both difficult to fund and undertake, we do hope that others will evaluate their own models using the code and data we have provided.

      (2) I’m a little unclear on the basis for the transition between short- and midrange length changes, both in reality and in the model. And also about the range of strains that qualify as ”short”. It seems like there is potential for short range stiffness, although I would have thought more in the range of 1-2% strains than >3%, to be due to currently attached crossbridges. There is clear evidence that active titin is responsible for the low stiffness at very large strains that exceed actin-myosin overlap. But I am not clear on how a transitional stiffness on the descending limb of the force-length relationship is implemented in the model, and what aspect of physiology this is replicating. It may be helpful to clarify this further and indicate where in the model this stiffness arises.

      This question has several parts to it which I will paraphrase here:

      A. Short-range stiffness acts over smaller strains than 3.8%. How is short-range defined?

      B. Where is the transition made between short-range and mid-range force response, both in reality and in the model? Also, how does this change on the descending limb?

      C. What components in the model contribute to the stiffness of the CE?

      A. Short-range stiffness acts over smaller strains than 3.8%. How is short-range defined?

      The response to Reviewer 1’s comment # 5 directly addresses this question.

      B. Where is the transition made between short-range and mid-range force response, both in reality and in the model? Also, how does this change on the descending limb?

      We are going to rephrase the question because of changes in terminology that we have made in response to Reviewer 1’s comment #5.

      (i) What is the basis for the transition between the muscle behaving like an LTI system? Both in reality, and in the model. (ii) What happens outside the LTI range? (iii) Also how does this change on the descending limb?

      We will address this question one part at a time:

      (i) What is the basis for the transition between the muscle behaving like an LTI system? Both in reality, and in the model.

      A system’s response can be approximated as a linear-time-invariant (LTI) system as long as it is time-invariant, and its output can be expressed as a linear function of its input. In the context of Kirsch et al.’s experiment, the ‘system’ is the muscle, the ‘input’ is the time series of length data, and the ‘output’ is the time series of force data. Due to the requirement for time-invariance, two experimental conditions must be met to approximate muscle as an LTI system:

      • the nominal length of the muscle stays constant over long periods of time,

      • and the nominal activation of the muscle stays constant.

      These conditions were met by default in Kirsch et al.’s experiment, and also in our simulations of this experiment. The one remaining condition to assess is whether or not the muscle’s response is linear.

      To evaluate whether the muscle’s force is a linear function of the length change, Kirsch et al. evaluated $(C_{xy})^2$, the coherence squared between the length and force time-series data. Even though the mathematical underpinnings of $(C_{xy})^2$ are complicated, the interpretation of $(C_{xy})^2$ is simple: muscle can be accurately approximated as a linear system if $(C_{xy})^2$ is close to 1, but the accuracy of this approximation becomes poor as $(C_{xy})^2$ approaches 0. Kirsch et al. used $(C_{xy})^2$ to identify a bandwidth in which the response of the muscle to the 1−3.8% $\ell_o^M$ length changes was sufficiently linear for analysis: a lower bound of 4 Hz was identified using $(C_{xy})^2$ and the bandwidth of the input signal (15 Hz, 35 Hz, or 90 Hz) set the upper bound. In Fig. 3 of Kirsch et al. the $(C_{xy})^2$ at 4 Hz has a value of at least 0.67 for the 15 Hz and 90 Hz signals. To minimize error in our analysis and yet be consistent with Kirsch et al., we analyze the bandwidth common to both $(C_{xy})^2 \geq 0.67$ and Kirsch et al.’s defined range. Though the bandwidth defined by the criterion $(C_{xy})^2 \geq 0.67$ is usually larger than the one defined by Kirsch et al., there are some exceptions where the lower frequency bound of the models is higher than 4 Hz (now reported in Tables 4D and 5D).
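The bandwidth selection described above can be sketched in a few lines. This is an illustrative, standard-library-only estimate of coherence squared that averages spectra over non-overlapping segments; it is not the method of Appendix D or the repository's example script. The sample rate, segment length, signals, and function names are all hypothetical, and only the 0.67 threshold and the 4 Hz and 90 Hz bounds are taken from the text.

```python
import cmath
import math
import random

def dft(x):
    """DFT of a real sequence, returned for non-negative frequencies only."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]

def coherence_squared(x, y, seg_len):
    """Coherence squared estimated by averaging spectra over segments."""
    n_freq = seg_len // 2 + 1
    sxx, syy, sxy = [0.0] * n_freq, [0.0] * n_freq, [0j] * n_freq
    for start in range(0, len(x) - seg_len + 1, seg_len):
        xs = dft(x[start:start + seg_len])
        ys = dft(y[start:start + seg_len])
        for k in range(n_freq):
            sxx[k] += abs(xs[k]) ** 2                # input auto-spectrum
            syy[k] += abs(ys[k]) ** 2                # output auto-spectrum
            sxy[k] += xs[k].conjugate() * ys[k]      # cross-spectrum
    return [abs(sxy[k]) ** 2 / (sxx[k] * syy[k]) for k in range(n_freq)]

def linear_band(freqs, coh, f_low, f_high, threshold=0.67):
    """Frequencies in [f_low, f_high] where coherence squared >= threshold."""
    return [f for f, c in zip(freqs, coh)
            if f_low <= f <= f_high and c >= threshold]

random.seed(0)
fs, seg_len = 256.0, 64    # hypothetical sample rate (Hz) and segment length
x = [random.gauss(0.0, 1.0) for _ in range(16 * seg_len)]  # stand-in length signal
y_linear = [3.0 * v for v in x]                              # linear "force"
y_unrelated = [random.gauss(0.0, 1.0) for _ in x]            # unrelated "force"

freqs = [k * fs / seg_len for k in range(seg_len // 2 + 1)]
coh_linear = coherence_squared(x, y_linear, seg_len)
coh_unrelated = coherence_squared(x, y_unrelated, seg_len)

# Keep only the band common to the coherence criterion and a 4-90 Hz range.
band_linear = linear_band(freqs, coh_linear, 4.0, 90.0)
band_unrelated = linear_band(freqs, coh_unrelated, 4.0, 90.0)
```

Averaging over several segments matters here: with a single segment the estimate equals 1 at every frequency regardless of the signals, so it would carry no information about linearity.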

      (ii) What happens outside the LTI range?

      When a muscle’s output cannot be considered the output of an LTI system, it means either that its length or activation is time-varying, or that the relationship between length and force is no longer linear. In short, the muscle is behaving as one would normally expect: in a time-varying and nonlinear manner. The wonderful part of Kirsch et al.’s work is that they found a surprisingly large region in the frequency domain where muscle behaves linearly and can be analyzed using the powerful tools of linear systems and signals.

      (iii) Also how does this change on the descending limb?

      Since the nominal length of Kirsch et al.’s experiments is $\ell_o^M$, it is not clear how the results of the perturbation experiments will change if the nominal length is moved firmly to the descending limb. However, we can see how the stiffness and damping values will change by examining Figures 9C and 9D, which show the calculated stiffness and damping of the VEXAT and Hill models as $\ell^M$ is lengthened from $\ell_o^M$ down the descending limb: the stiffness and damping of the VEXAT model does not change much, while the Hill model’s stiffness changes sign and the damping coefficient changes a lot. What cannot be seen from Figures 9C and 9D is how the bandwidth over which the models are considered linear changes.
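One way to see why a Hill-type model's stiffness can change sign is to look at the slope of the active force-length curve alone. The sketch below assumes a hypothetical bell-shaped curve with its peak at the optimal length and ignores the passive element and the force-velocity relation, so it is not the Hill model stiffness expression used in the manuscript.

```python
import math

def active_force_length(l_norm, width=0.45):
    """Hypothetical bell-shaped active force-length curve, peak at l_norm = 1."""
    return math.exp(-((l_norm - 1.0) / width) ** 2)

def active_stiffness(l_norm, activation=1.0, h=1e-6):
    """Slope of activation * force-length curve, by central difference."""
    df = active_force_length(l_norm + h) - active_force_length(l_norm - h)
    return activation * df / (2.0 * h)

k_ascending = active_stiffness(0.9)    # shorter than optimal: positive slope
k_optimal = active_stiffness(1.0)      # at the peak: slope is close to zero
k_descending = active_stiffness(1.1)   # longer than optimal: negative slope
```

Under these assumptions the slope is positive on the ascending limb, vanishes at the optimal length, and becomes negative on the descending limb.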

      We have made a number of updates to the text to more clearly communicate these details of our response to part (i):

      • Text has been edited so that it is clear that the terms ’short-range stiffness’ and ’small’ from Rack and Westbury’s work are not confused with ’stiffness’ and ’small’ from the LTI system analysis. Please see our response to comment # 5 for details.

      • We have added text to the main body of the paper to explain how the coherence squared metric was used to select a bandwidth in which the response of the system is approximately linear:

      – Revision: the paragraph that starts on page 11, column 1, line 3 ”Kirsch et al. used system identification ...”

      – Difference: page 13, column 2, line 1

      – Coherence is defined in Appendix D

      – Coherence is now also included in the example script ‘main SystemIdentificationExample.m’

      • The bandwidth over which model output can be considered linear (coherence squared > 0.67) has been added to Tables 4 and 5

      – Revision: see Table 4D, and Table 5D in Appendix E

      – Difference: see Table 4D, and Table 5D in Appendix E

      • Figures 6 and 16 are now annotated where the plotted signal does not meet the linearity requirement of coherence squared > 0.67.

      C. What components in the model contribute to the stiffness of the CE?

      There are three components that contribute to the stiffness of the CE which are pictured in Figure 1, appear in Eqn. 15, and are listed explicitly in Eqn. 76:

      (a) The XE, as represented by the $a f^L(\tilde{\ell}^S + \tilde{L}^M)\,\tilde{k}_o^X$ term in Eqn. 15.

      (b) The elasticity of the distal segment of titin, $f^2(\tilde{\ell}^2)$. Only $f^2(\tilde{\ell}^2)$ appears in Eqn. 15 because $\tilde{\ell}^1$ is a model state.

      (c) The extracellular matrix, as represented by the $f^{ECM}(\tilde{\ell}^{ECM})$ term.

      There is also a compressive element $f^{KE}$, but it plays no role in the simulations presented in this work because it only begins to produce force at extremely short CE lengths ($\tilde{\ell}^M < 0.1\,\ell_o^M$).

      We have made the following changes to make these components clearer:

      Figure 1A has been updated:

      – The symbols for a spring and a damper are now defined in Figure 1A

      – The ECM now has a spring symbol. Now all springs and dampers have the correct symbol in Figure 1A.

      – The caption now explicitly lists the rigid, viscoelastic, and elastic elements in the model

      The equations for the VEXAT’s CE stiffness and damping are now compared and contrasted to the Hill model’s stiffness and damping in Sec. 3.1.

      – Revision: starting at page 14, column 2, line 1: Eqn. 28 and Eqn. 29 and surrounding text

      – Difference: page 17, column 1, line 22

      (3) This model appears to be an amalgamation of a phenomenological (force-length and force-velocity relationships) and a mechanistic (crossbridge and titin stiffness and damping) model. While this may improve predictions, and so potentially be useful, it also seems like it limits the interpretation of physiological underpinnings of any findings. It may be helpful to explore in greater detail the implications of this approach.

      We have added a limitations paragraph to the discussion which addresses this comment and can be found in:

      • Revision: the paragraph beginning on page 22, column 1, line 11 ”Both the viscoelastic ...”

      • Difference: the paragraph beginning on page 24, column 1, line 27

      (4) As a biologist, I found the interpretation of phase and gain a little difficult and it may help the reader to show in greater detail the time series data and model predictions to highlight conditions under which the models do not accurately capture the magnitude and timing of force production.

      It is important that the ideas of phase and gain are understood, especially because little information can be gleaned from the time series data directly. There is some time series data in the paper already that compares each model’s response to its spring-damper of best fit: plots of the force response of each model and its spring damper of best fit can be found in Figures 6A, 6D, 6G, 6J, 16A, 16D, 16G, and 16J in the revised manuscript. While it is clear that models with a higher VAF more closely match the spring-damper of best fit, there is not much more that can be taken from time series data: the systematic differences, particularly in phase, are just not visually apparent in the time-domain but are clear in gain and phase plots in the frequency-domain.

      To make the meaning of phase and gain plots clearer, Figure 4 (Figure 5 in the first submission) has been completely re-made and includes plots that illustrate the entire process of going from two length and force time-domain signals to gain and phase plots in the frequency-domain. Included in this figure is a visual representation of transforming a signal from the time to the frequency domain (Fig. 4B and 4C), and also an illustration of the terms gain and phase (Fig. 4D). In addition, a small example file ’main SystemIdentificationExample.m’ has been added to the MATLAB code repository in the elife2023 branch to accompany Appendix D, which goes through the mathematics used to transform input and output time domain signals into gain and phase plots of the input-output relation. Small updates have been made to Figures 6 and 16 in the revised paper (Figures 7 and 18 in the first submission) to make the time domain signals from the spring-damper of best fit and the model output clearer. Finally, I have re-calculated the gain and phase profiles using a more advanced numerical method that trades off some resolution in frequency for more accuracy in the magnitude. This has allowed me to make Figures 6 and 16 easier to follow because the gain and phase responses are now lines rather than a scattering of points. We hope that these additions make the interpretation of gain and phase clearer.

      Please see

      Revision:

      – Figure 4 and caption on page 12

      – The opening 2 paragraphs of Sec 3.1 starting on page 10, column 2, line 4 ”In Kirsch et al.’s ...”

      – Figure 6 & 16: spring damper and model annotation added, plotted the gain and phase as lines

      – Appendix D: Updated to include coherence and the more advanced method used to evaluate the system transfer function, gain, and phase.

      Difference:

      – Figure 4 and caption on page 12

      – The opening 2 paragraphs of Sec 3.1 starting on page 12, column 1, line 34 and ending on page 13, column 2, line 29

      – Figure 6 & 16: spring damper and model annotation added

      – Appendix D
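To illustrate the terms gain and phase discussed in this response, the sketch below passes a sinusoidal length change through a linear spring-damper and recovers the gain and phase of the force relative to the length at that frequency. It is a minimal single-frequency example with hypothetical parameters and function names, not the system identification method of Appendix D.

```python
import cmath
import math

def gain_and_phase(length, force, freq_hz, dt):
    """Gain and phase (degrees) of force relative to length at one frequency."""
    w = 2.0 * math.pi * freq_hz
    # Fourier coefficients of both signals at the test frequency.
    x_f = sum(v * cmath.exp(-1j * w * i * dt) for i, v in enumerate(length))
    y_f = sum(v * cmath.exp(-1j * w * i * dt) for i, v in enumerate(force))
    h = y_f / x_f
    return abs(h), math.degrees(cmath.phase(h))

# Hypothetical, unitless spring-damper parameters.
k, beta = 1.0, 0.01
freq_hz, n = 10.0, 1000
dt = 1.0 / (freq_hz * n)           # n samples span exactly one period
w = 2.0 * math.pi * freq_hz

length = [math.sin(w * i * dt) for i in range(n)]
velocity = [w * math.cos(w * i * dt) for i in range(n)]
force = [k * l + beta * v for l, v in zip(length, velocity)]

gain, phase_deg = gain_and_phase(length, force, freq_hz, dt)

# Analytical values for a spring-damper: the response is k + j*w*beta.
expected_gain = math.hypot(k, beta * w)
expected_phase_deg = math.degrees(math.atan2(beta * w, k))
```

The gain is the ratio of force amplitude to length amplitude, and the phase is how far the force leads the length change. A pure spring gives zero phase, and added damping pushes the phase toward 90 degrees as frequency rises.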

      (5) The actin-myosin and actin-titin load pathways are depicted as distinct in the model. However, given titin’s position in the center of myosin and the crossbridge connections between actin and myosin, this would seem to be an oversimplification. It seems worth considering whether the separation of these pathways is justified if it has any effect on the conclusions or interpretation.

      We have reworked one of the discussion paragraphs to focus on how our simulations would be affected by two mechanisms (Nishikawa et al.’s winding filament theory and DuVall et al.’s titin entanglement hypothesis) that make it possible for crossbridges to do mechanical work on titin.

      • Revision: the paragraph beginning on page 21, column 2, line 42 “The active titin model ...”

      • Difference: the paragraph beginning on page 23, column 2, line 48

      References

      Nishikawa KC, Monroy JA, Uyeno TE, Yeo SH, Pai DK, Lindstedt SL. Is titin a ‘winding filament’? A new twist on muscle contraction. Proceedings of the Royal Society B: Biological Sciences. 2012 Mar 7;279(1730):981-90.

      DuVall M, Jinha A, Schappacher-Tilp G, Leonard T, Herzog W. I-Band Titin Interaction with Myosin in the Muscle Sarcomere during Eccentric Contraction: The Titin Entanglement Hypothesis. Biophysical Journal. 2016 Feb 16;110(3):302a.

    2. eLife assessment

      This is a valuable study that develops a new model of the way muscle responds to perturbations, synthesizing models of how it responds to small and large perturbations, both of which are used to predict how muscles function for stability but also how they can be injured, and which tend to be predicted poorly by classic Hill-type models. The evidence presented to support the model is solid, since it outperforms Hill-type models in a variety of conditions. Although the combination of phenomenological and mechanistic aspects of the model may sometimes make it challenging to interpret the output, the work will be of interest to those developing realistic models of the stability and control of movement in humans or other animals.

    3. Reviewer #1 (Public Review):

      Muscle models are important tools in the fields of biomechanics and physiology. Muscle models serve a wide variety of functions, including validating existing theories, testing new hypotheses, and predicting forces produced by humans and animals in health and disease. This paper attempts to provide an alternative to Hill-type muscle models that includes contributions of titin to force enhancement over multiple time scales. Due to the significant limitations of Hill-type models, alternative models are needed and therefore the work is important and timely.

      The effort to include a role for titin in muscle models is a major strength of the methods and results. The results clearly demonstrate the weaknesses of Hill models and the advantages of incorporating titin into theoretical treatments of muscle mechanics. Another strength is to address muscle mechanics over a large range of time scales.

      The authors succeed in demonstrating the need to incorporate titin in muscle models, and further show that the model accurately predicts in situ force of cat soleus (Kirsch et al. 1994; Herzog & Leonard, 2002) and rabbit psoas myofibrils (Leonard et al. 2010). However, it remains unclear whether the model will be practical for use with data from different muscles or preparations. Several ad hoc modifications were described in the paper, and the degree to which the model requires parameter optimization for different muscles, preparations and experiment types remains unclear.

      I think the authors should state how many parameters require fitting to the data vs the total number of model parameters. It would also be interesting for the authors to discuss challenges associated with modeling ex vivo and in vivo data sets, due to differences in means of stimulation vs. model inputs.

    4. Reviewer #2 (Public Review):

      This model of skeletal muscle includes springs and dampers which aim to capture the effect of crossbridge and titin stiffness during the stretch of active muscle. While both crossbridge and titin stiffness have previously been incorporated, in some form, into models, this model is the first to simultaneously include both. The authors suggest that this will allow for the prediction of muscle force in response to short-, mid- and long-range stretches. All these types of stretch are likely to be experienced by muscle during in vivo perturbations, and are known to elicit different muscle responses. Hence, it is valuable to have a single model which can predict muscle force under all these physiologically relevant conditions. In addition, this model dramatically simplifies sarcomere structure to enable this muscle model to be used in multi-muscle simulations of whole-body movement.

      In order to test this model, its force predictions are compared to 3 sets of experimental data which focus on short-, mid- and long-range perturbations, and to the predictions of a Hill-type muscle model. The choice of data sets is excellent and provides a robust test of the model's ability to predict forces over a range of length perturbations. However, I find the comparison to a Hill-type muscle model to be somewhat limiting. It is well established that Hill-type models do not have any mechanism by which they can predict the effect of active muscle stretch. Hence, that the model proposed here represents an improvement over such a model is not a surprise. Many other models, some of which are also simple enough to be incorporated into whole-body simulations, have incorporated mechanistic elements which allow for the prediction of force responses to muscle stretch. And it is not clear from the results presented here that this model would outperform such models.

      The paper begins by outlining the phenomenological vs mechanistic approaches taken to muscle modelling, historically. It appears, although it is not directly specified, that this model combines these approaches. A somewhat mechanistic model of the response of the crossbridges and titin to active stretch is combined with a phenomenological implementation of force-length and force-velocity relationships. This combination of approaches may be useful in improving the accuracy of predictions of muscle models and whole-body simulations, which is certainly a worthy goal. However, it also may limit the insight that can be gained. For example, it does not seem that this model could reflect any effect of active titin properties on muscle shortening. In addition, it is not clear to me, either physiologically or in the model, what drives the shift from the high stiffness in short-range perturbations to the somewhat lower stiffness in mid-range perturbations.

    1. eLife assessment

      This study is of potential interest to readers in human genetics and quantitative genetics, as it presents a new method for homozygosity mapping in population-scale datasets, based on an innovative computational algorithm that efficiently identifies runs-of-homozygosity (ROH) segments shared by many individuals. Although the method is innovative and has the potential to be broadly useful, its power and limitations have not yet been adequately evaluated. The application of this new method to the UK Biobank dataset identifies several interesting associations, but it remains currently unclear under what conditions the new approach can provide additional power over existing genome-wide association study methods.

    2. Reviewer #1 (Public Review):

      In this manuscript, Naseri et al. present a new strategy for identifying human genetic variants with recessive effects on disease risk by the genome-wide association of phenotype with long runs-of-homozygosity (ROH). The key step of this approach is the identification of long ROH segments shared by many individuals (termed "shared ROH diplotype clusters" by the authors), which is computationally intensive for large-scale genomic data. The authors circumvented this challenge by converting the original diploid genotype data to (pseudo-)haplotype data and modifying the existing positional Burrows-Wheeler transform (PBWT) algorithms to enable an efficient search for haplotype blocks shared by many individuals. With this method, the authors identified over 1.8 million ROH diplotype clusters (each shared by at least 100 individuals) and 61 significant associations with various non-cancer diseases in the UK Biobank dataset.
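
      To make the search for shared haplotype blocks concrete, the sketch below implements the standard PBWT update of the positional prefix array and divergence array, then reads off groups of haplotypes that match over at least a minimum number of trailing sites. It is a simplified illustration of the underlying data structure, not the authors' modified algorithm; the function names, the toy panel and the thresholds are hypothetical.

```python
def pbwt_arrays(haps):
    """Run the standard PBWT update over all sites of a binary haplotype panel.

    Returns (a, d): a is the haplotype order sorted by reversed prefix, and
    d[i] is the first site of the longest match between a[i] and a[i-1]
    that ends at the last site.
    """
    n_haps, n_sites = len(haps), len(haps[0])
    a = list(range(n_haps))
    d = [0] * n_haps
    for k in range(n_sites):
        a0, a1, d0, d1 = [], [], [], []
        p = q = k + 1
        for idx, start in zip(a, d):
            p = max(p, start)
            q = max(q, start)
            if haps[idx][k] == 0:
                a0.append(idx)
                d0.append(p)
                p = 0
            else:
                a1.append(idx)
                d1.append(q)
                q = 0
        a, d = a0 + a1, d0 + d1
    return a, d


def blocks_at_end(a, d, n_sites, min_len, min_size):
    """Group adjacent haplotypes that share at least min_len trailing sites."""
    blocks, current = [], [a[0]]
    for i in range(1, len(a)):
        if n_sites - d[i] >= min_len:
            current.append(a[i])
        else:
            if len(current) >= min_size:
                blocks.append(sorted(current))
            current = [a[i]]
    if len(current) >= min_size:
        blocks.append(sorted(current))
    return blocks


# Toy panel: 4 pseudo-haplotypes x 4 sites (hypothetical data)
panel = [
    [0, 1, 0, 1],
    [0, 1, 0, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
a, d = pbwt_arrays(panel)
blocks = blocks_at_end(a, d, n_sites=4, min_len=3, min_size=2)
print(blocks)  # [[0, 1, 2]]
```

      Because haplotypes that match over the most recent sites end up adjacent in the sorted order, each site is processed in time proportional to the number of haplotypes, which is what makes this family of algorithms attractive for biobank-scale panels.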

      Overall, the study is well-motivated, highly innovative, and potentially impactful. Previous biobank-based studies of recessive genetic effects primarily focused on genome-wide aggregated ROH content, but this metric is a poor proxy for homozygosity of the recessive alleles at causal loci. Therefore, searching for the association between phenotype and specific variants in the homozygous state is a key next step towards discovering and understanding disease genes/alleles with recessive effects. That said, I have some concerns regarding the power and error rate of the methods, for both identification of ROH diplotype clusters and subsequent association mapping. In addition, some of the newly identified associations need further validation and careful consideration of potential artifacts (such as cryptic relatedness and environment sharing).

      (1) Identification of ROH diplotype clusters.

      The practice of randomly assigning heterozygous sites to a homozygous state is expected to introduce errors, leading to both false positives and false negatives. An advantage that the authors claim for this practice is to reduce false negatives due to occasional mismatch (possibly due to genotyping error, or mutation), but it's unclear how much the false positive rate is reduced compared to traditional ROH detection algorithms. The authors also justified the "random allele drawing" practice by arguing that "the rate of false positives should be low" for long ROH segments, which is likely true but is not backed up with quantitative analysis. As a result, it is unclear whether the trade-off between reducing FNs and introducing FPs makes the practice worthwhile (compared to calling ROHs in each individual with a standard approach first followed by scanning for shared diplotypes across individuals using BWT). I would like to see a combination of back-of-envelope calculation, simulation (with genotyping errors), and analysis of empirical data that characterize the performance of the proposed method.

      In particular, I find the high number of ROH clusters in MHC alarming, and I am not convinced that this can be fully explained by a high density of SNPs and low recombination rate in this region. The authors may provide further support for their hypothesis by examining the genome-wide relationship between ROH cluster abundance and local recombination rate (or mutation rate).
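
      One form the requested back-of-envelope calculation could take is sketched below. It assumes that each heterozygous site is resolved independently by a fair coin flip, which is an assumption about the "random allele drawing" step rather than a documented detail of the method. Under that assumption, an individual with h heterozygous sites inside a window reproduces one specific haplotype with probability 0.5^h. All names and numbers are hypothetical.

```python
import random


def to_pseudo_haplotype(genotypes, rng):
    """Map alternate-allele counts (0, 1, 2) to a single pseudo-haplotype.

    Homozygous sites keep their allele (0 -> 0, 2 -> 1); heterozygous
    sites (1) are assigned 0 or 1 at random.
    """
    return [rng.randint(0, 1) if g == 1 else g // 2 for g in genotypes]


def chance_match_probability(n_het_sites):
    """Probability that random drawing at n_het_sites reproduces one target."""
    return 0.5 ** n_het_sites


# Simulation check with 3 heterozygous sites in an otherwise homozygous window
rng = random.Random(0)
genotypes = [2, 1, 0, 1, 2, 1]
target = [1, 1, 0, 1, 1, 1]   # one specific haplotype consistent with the genotypes
trials = 20000
hits = sum(to_pseudo_haplotype(genotypes, rng) == target for _ in range(trials))
empirical = hits / trials
print(empirical, chance_match_probability(3))
```

      The rapid decay with h gives some intuition for the claim that false positives should be rare for long segments, but it says nothing about genotyping error or about how many heterozygous sites a non-ROH window typically contains, so it complements rather than replaces the simulation and empirical analysis requested above.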

      (2) Power of ROH association. Given that the authors focused on long segments only (which is a limitation of the current method), I am concerned about the power of the association mapping strategy, because only a small fraction of causal alleles are expected to be present in long, homozygous haplotypes shared by many individuals. It would be useful to perform a power analysis to estimate what fraction of true causal variants with a given effect size can be detected with the current method. To demonstrate the general utility of this method, the authors also need to characterize the condition(s) under which this method could pick up association signals missed by standard GWAS with recessive effects considered. I suspect some variants with truly additive effects can also be picked up by the ROH association, which should be discussed in the manuscript to guide the interpretation of results.

      (3) False positives of ROH association. GWAS is notoriously prone to confounding by population and environmental stratification. Including leading principal components in association testing alleviates this issue but is not sufficient to remove the effects of recent demographic structure and local environment (Zaidi and Mathieson 2020 eLife). Similar confounding likely applies to homozygosity mapping and should be carefully considered. For example, it is possible that individuals who share a lot of ROH diplotypes tend to be remotely related and live near each other, thus sharing similar environments. Such scenarios need to be excluded to further support the association signals.

      (4) Validation of significant associations. It is reassuring that some of the top associations are indirectly corroborated by significant GWAS associations between the same disease and individual SNPs present in the ROH region (Tables 1 and 2). However, more sanity checks should be done to confirm consistency in direction of effect size (e.g., risk alleles at individual SNPs should be commonly present in risk-increasing ROH segment, and vice versa) and the presence of dominance effect.

    3. Reviewer #2 (Public Review):

      The authors have proposed a computational algorithm to identify runs of homozygosity (ROH) segments in a generally outbred population and then study the association of ROH with self-reported disorders in the UK biobank. The algorithm certainly identifies such segments. However, more work is needed to justify the importance of ROH.

    4. Reviewer #3 (Public Review):

      A classic method to detect recessive disease variants is homozygosity mapping, where affected individuals in a pedigree are scanned for the presence of runs of homozygosity (ROH) intersecting in a given region. The method could in theory be extended to biobanks with large samples of unrelated individuals; however, no efficient method was available (to the best of my knowledge) for detecting overlapping clusters of ROH in such large samples. In this paper, the authors developed such a method based on the PBWT data structure. They applied the method to the UK biobank, finding a number of associations, some of them not discovered in single SNP associations.

      Major strengths:

      • The method is innovative and algorithmically elegant and interesting. It achieves its purpose of efficiently and accurately detecting ROH clusters overlapping in a given region. It is therefore a major methodological advance.

      • The method could be very useful for many other researchers interested in detecting recessive variants associated with any phenotype.

      • The statistical analysis of the UK biobank data is solid and the results that were highlighted are interesting and supported by the data.

      Major weaknesses:

      • The positions and IDs of the ROH clusters in the UK biobank are not available for other researchers. This means that other researchers will not be able to follow up on the results of the present paper.

      • The vast majority of the discoveries were in regions already known to be associated with their respective phenotypes based on standard GWAS.

      • The running time seems rather long (at least for the UK biobank), and therefore it will be difficult for other researchers to extensively experiment with the method in very large datasets. That being said, the method has a linear running time, so it is already faster than a naïve algorithm.

    5. Author response:

      Reviewer #1 (Public Review):

      In this manuscript, Naseri et al. present a new strategy for identifying human genetic variants with recessive effects on disease risk by the genome-wide association of phenotype with long runs-of-homozygosity (ROH). The key step of this approach is the identification of long ROH segments shared by many individuals (termed "shared ROH diplotype clusters" by the authors), which is computationally intensive for large-scale genomic data. The authors circumvented this challenge by converting the original diploid genotype data to (pseudo-)haplotype data and modifying the existing positional Burrows-Wheeler transform (PBWT) algorithms to enable an efficient search for haplotype blocks shared by many individuals. With this method, the authors identified over 1.8 million ROH diplotype clusters (each shared by at least 100 individuals) and 61 significant associations with various non-cancer diseases in the UK Biobank dataset.

      Overall, the study is well-motivated, highly innovative, and potentially impactful. Previous biobank-based studies of recessive genetic effects primarily focused on genome-wide aggregated ROH content, but this metric is a poor proxy for homozygosity of the recessive alleles at causal loci. Therefore, searching for the association between phenotype and specific variants in the homozygous state is a key next step towards discovering and understanding disease genes/alleles with recessive effects. That said, I have some concerns regarding the power and error rate of the methods, for both identification of ROH diplotype clusters and subsequent association mapping. In addition, some of the newly identified associations need further validation and careful consideration of potential artifacts (such as cryptic relatedness and environment sharing).

      1) Identification of ROH diplotype clusters.

      The practice of randomly assigning heterozygous sites to a homozygous state is expected to introduce errors, leading to both false positives and false negatives. An advantage that the authors claim for this practice is to reduce false negatives due to occasional mismatch (possibly due to genotyping error, or mutation), but it's unclear how much the false positive rate is reduced compared to traditional ROH detection algorithms. The authors also justified the "random allele drawing" practice by arguing that "the rate of false positives should be low" for long ROH segments, which is likely true but is not backed up with quantitative analysis. As a result, it is unclear whether the trade-off between reducing FNs and introducing FPs makes the practice worthwhile (compared to calling ROHs in each individual with a standard approach first followed by scanning for shared diplotypes across individuals using BWT). I would like to see a combination of back-of-envelope calculation, simulation (with genotyping errors), and analysis of empirical data that characterize the performance of the proposed method.

      In particular, I find the high number of ROH clusters in MHC alarming, and I am not convinced that this can be fully explained by a high density of SNPs and low recombination rate in this region. The authors may provide further support for their hypothesis by examining the genome-wide relationship between ROH cluster abundance and local recombination rate (or mutation rate).

      Thanks for this insightful comment. Through additional experiments, we confirmed that the excessive number of ROH clusters in the MHC region is due to the higher density of markers per centimorgan. As discussed above at Essential Revision 2, we took this opportunity to modify our code to search for clusters with the minimum length in terms of cM instead of sites. We have also provided the genetic distance for reported clusters in the MHC region with significant association (genetic length (cM) column in Tables 1 and 2). We include the following in the main text:

      “We searched for ROH clusters using a minimum target length of 0.1 cM (Figure 3–figure supplement 1). As shown in the figure, there is no excessive number of ROH clusters in chromosome 6 as was spotted using a minimum number of variant sites.”

      Methods section, ROH algorithm subsection:

      “We implemented ROH-DICE to allow direct use of genetic distances in addition to variant sites for L. The program can take minimum target length L directly in cM and detect all ROH clusters greater than or equal to the target length in cM. The program holds a genetic mapping table for all the available sites, and cPBWT was modified to work directly with the genetic length instead of the number of sites.”
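
      The effect of switching the minimum length from a site count to a genetic distance can be illustrated with a small per-individual run finder. This is a sketch under stated assumptions: it scans one individual's homozygous/heterozygous calls rather than using cPBWT, and the function name, positions and thresholds are hypothetical.

```python
def roh_runs(is_hom, cm_pos, min_cm=None, min_sites=None):
    """Return (start, end) index pairs of homozygous runs passing the thresholds.

    is_hom    -- per-site booleans, True where the genotype is homozygous
    cm_pos    -- genetic position of each site in centimorgans (same length)
    min_cm    -- minimum genetic length of a run, if given
    min_sites -- minimum number of sites in a run, if given
    """
    runs, start = [], None
    # A trailing False closes a run that reaches the last site.
    for i, hom in enumerate(list(is_hom) + [False]):
        if hom and start is None:
            start = i
        elif not hom and start is not None:
            end = i - 1
            long_enough_sites = min_sites is None or (end - start + 1) >= min_sites
            long_enough_cm = min_cm is None or (cm_pos[end] - cm_pos[start]) >= min_cm
            if long_enough_sites and long_enough_cm:
                runs.append((start, end))
            start = None
    return runs


# Six densely packed sites (0.02 cM in total), one heterozygous site,
# then four sparse sites spanning 0.15 cM.
cm_pos = [0.000, 0.004, 0.008, 0.012, 0.016, 0.020, 0.50, 0.55, 0.60, 0.65, 0.70]
is_hom = [True] * 6 + [False] + [True] * 4

by_sites = roh_runs(is_hom, cm_pos, min_sites=5)
by_cm = roh_runs(is_hom, cm_pos, min_cm=0.1)
print(by_sites, by_cm)  # [(0, 5)] [(7, 10)]
```

      With a site-count threshold the densely genotyped run passes even though it is genetically very short, which mirrors the excess of clusters reported for the marker-dense MHC region; with the cM threshold only the genetically longer run is kept.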

      2) Power of ROH association. Given that the authors focused on long segments only (which is a limitation of the current method), I am concerned about the power of the association mapping strategy, because only a small fraction of causal alleles are expected to be present in long, homozygous haplotypes shared by many individuals. It would be useful to perform a power analysis to estimate what fraction of true causal variants with a given effect size can be detected with the current method. To demonstrate the general utility of this method, the authors also need to characterize the condition(s) under which this method could pick up association signals missed by standard GWAS with recessive effects considered. I suspect some variants with truly additive effects can also be picked up by the ROH association, which should be discussed in the manuscript to guide the interpretation of results.

      We added a new experiment in the Results section “Evaluation of ROH clusters in simulated data” under Power of ROH-DICE in association studies. We compared the power of the ROH cluster with additive, recessive, and dominant models. Our simulation shows that using ROH clusters outperforms standard GWAS when a phenotype is associated with a set of consecutive homozygous sites. We added the following text:

      “...We calculated the p-values for both ROH clusters and all variant sites. We used a p-value cut-off of 0.05 divided by the number of tests for each phenotype to determine whether the calculated p-value was smaller than the threshold, indicating an association. For GWAS, only one variant site within the ROH cluster, contributing to the phenotype, was required. We tested for all additive, dominant, and recessive effects (Figure 1–figure supplement 3). The figure demonstrates that ROH-DICE outperforms GWAS when a phenotype is associated with a set of consecutive homozygous sites. The maximum effect size of 0.3 resulted in ROH clusters achieving a power of 100%, whereas the additive model only achieved 11%, and the dominant and recessive models achieved 52% and 70%, respectively. The GWAS with recessive effect yields the best results among other GWAS tests, however, its power is still lower than using ROH clusters.”
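
      The decision rule quoted above, a p-value compared against 0.05 divided by the number of tests, can be sketched with a 2x2 chi-square test of cluster carriers against phenotype. The counts, the number of tests and the function names are hypothetical, and the test is the plain Pearson statistic without continuity correction, which may differ from what the authors ran.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 degree of freedom) for the table [[a, b], [c, d]].

    Returns (statistic, p_value). For one degree of freedom the upper tail
    probability equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, math.erfc(math.sqrt(stat / 2))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance cut-off under a Bonferroni correction."""
    return alpha / n_tests


# Hypothetical counts: rows are cluster carriers / non-carriers,
# columns are cases / controls.
stat, p_value = chi_square_2x2(30, 70, 1000, 8900)
threshold = bonferroni_threshold(0.05, n_tests=10_000)
significant = p_value < threshold
print(stat, p_value, threshold, significant)
```

      Power in a simulation is then the fraction of replicates in which `significant` is true for a cluster (or, for GWAS, for at least one variant site within the cluster) that truly contributes to the phenotype.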

      3) False positives of ROH association. GWAS is notoriously prone to confounding by population and environmental stratification. Including leading principal components in association testing alleviates this issue but is not sufficient to remove the effects of recent demographic structure and local environment (Zaidi and Mathieson 2020 eLife). Similar confounding likely applies to homozygosity mapping and should be carefully considered. For example, it is possible that individuals who share a lot of ROH diplotypes tend to be remotely related and live near each other, thus sharing similar environments. Such scenarios need to be excluded to further support the association signals.

      We acknowledge that there could be confounding factors that may affect the association results. To address this, we utilized principal component (PC) values and additional covariates while using PHESANT after our initial Chi-square tests. We also included your comments in our Discussion section:

      “We used age, gender, and genetic principal components as confounding variables in the association analysis. Genetic principal components can reduce the confounding effect brought on by population structure but it may be insufficient to completely eliminate the effects of recent demographic structure and the local environment [45]. For example, individuals sharing excessive ROH diplotypes may share similar environments since they are closely related and reside close to one another. Since we did not rule out related individuals, some of the reported GWAS signals may not be attributable to ROH.”

      4) Validation of significant associations. It is reassuring that some of the top associations are indirectly corroborated by significant GWAS associations between the same disease and individual SNPs present in the ROH region (Tables 1 and 2). However, more sanity checks should be done to confirm consistency in direction of effect size (e.g., risk alleles at individual SNPs should be commonly present in risk-increasing ROH segment, and vice versa) and the presence of dominance effect.

      The beta values for effect size are now included in all reported tables. All beta values for ROH-DICE are positive, indicating that carriers of these ROH diplotypes may be at increased risk of certain non-cancerous diseases. Moreover, we conducted the suggested sanity check to confirm the consistency of the direction of risk-inducing ROH diplotypes and risk alleles.

      We also computed D’ as a measure of linkage between the reported GWAS results and ROH clusters. We found that most of the GWAS results and ROH clusters are strongly correlated. However, in a few cases, D' is small or close to zero. In such cases, the reported p-value from GWAS was also insignificant, while the ROH cluster indicated a significant association. We included these points in the Results section.
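
      For readers unfamiliar with D’, the sketch below computes the standard normalized linkage disequilibrium coefficient from haplotype frequencies. Treating "carries the ROH cluster's consensus haplotype" as one of the two loci is an assumption made for illustration; the response does not spell out how D’ was computed, and the frequencies are hypothetical.

```python
def d_prime(p_ab, p_a, p_b):
    """Return |D'| for two biallelic loci.

    p_ab -- frequency of haplotypes carrying both allele A and allele B
    p_a  -- frequency of allele A (e.g. the GWAS risk allele)
    p_b  -- frequency of allele B (e.g. the cluster's consensus haplotype)
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return 0.0 if d_max == 0 else abs(d) / d_max


# Every cluster haplotype carries the risk allele: complete disequilibrium.
complete = d_prime(p_ab=0.02, p_a=0.10, p_b=0.02)
# Cluster membership independent of the risk allele: no disequilibrium.
independent = d_prime(p_ab=0.10 * 0.02, p_a=0.10, p_b=0.02)
print(complete, independent)
```

      A value near 1 means the cluster and the GWAS variant tag largely the same haplotypes, whereas a value near 0 corresponds to the cases described above in which the cluster carries a signal that the single-SNP test does not.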

      Reviewer #3 (Public Review):

      A classic method to detect recessive disease variants is homozygosity mapping, where affected individuals in a pedigree are scanned for the presence of runs of homozygosity (ROH) intersecting in a given region. The method could in theory be extended to biobanks with large samples of unrelated individuals; however, no efficient method was available (to the best of my knowledge) for detecting overlapping clusters of ROH in such large samples. In this paper, the authors developed such a method based on the PBWT data structure. They applied the method to the UK biobank, finding a number of associations, some of them not discovered in single SNP associations.

      Major strengths:

      •           The method is innovative and algorithmically elegant and interesting. It achieves its purpose of efficiently and accurately detecting ROH clusters overlapping in a given region. It is therefore a major methodological advance.

      •           The method could be very useful for many other researchers interested in detecting recessive variants associated with any phenotype.

      •           The statistical analysis of the UK biobank data is solid and the results that were highlighted are interesting and supported by the data.

      Major weaknesses:

      •           The positions and IDs of the ROH clusters in the UK biobank are not available for other researchers. This means that other researchers will not be able to follow up on the results of the present paper.

      We included the SNP IDs, positions, and consensus alleles for all reported loci in the main tables. Moreover, additional information including beta and D’ values were added. The current information should allow researchers to follow up on the results. Supplementary File 2 contains beta, D’ values for all reported clusters.

      Supplementary File 3 contains the SNP IDs and consensus alleles for all reported clusters in Tables 1 and 2. The consensus allele denotes the allele with the highest occurrence in the reported clusters.

      •           The vast majority of the discoveries were in regions already known to be associated with their respective phenotypes based on standard GWAS.

      We agree that a majority of the ROH regions are indeed consistent with GWAS. However, some regions were missed by standard GWAS (e.g. chr6:25969631-26108168, hemochromatosis). Our message is that our method is a complementary approach to standard GWAS and will not replace standard GWAS analysis. See our response to Reviewer #2 Point Six.

      •           The running time seems rather long (at least for the UK biobank), and therefore it will be difficult for other researchers to extensively experiment with the method in very large datasets. That being said, the method has a linear running time, so it is already faster than a naïve algorithm.

      Thank you for your input. The algorithm used to locate matching blocks is efficient, and the run time originally reported was the total CPU hours it consumed. Since it consumes very little memory and resources, it can be executed simultaneously for all chromosomes. We also noticed that a significant amount of time was being spent parsing the input file and slightly modified our script to improve the parsing. We also re-ran it for all chromosomes in parallel and reported the elapsed time, which was only 18 hours and 54 minutes.

      “This was achieved by running the ROH-DICE program, with a wall clock time of 18 hours and 54 minutes where the program was executed for all chromosomes in parallel (total CPU hours of ~ 242.5 hours). The maximum residence size for each chromosome was approximately 180 MB.”

    1. Author response:

      Reviewer #1 (Public Review):

      The authors investigated the role of OBOX4 in the zygotic genome activation (ZGA) in mice. Obox4 genes form an array of duplicated genes; they were identified as a candidate ZGA factor based on expression patterns during early development. The role of OBOX4 was subsequently studied in embryonic stem cells and early embryos. It was found that transcriptional activation mediated by OBOX4 has similar features to that of DUX, which was previously identified as a zygotic transcription factor involved in ZGA and a major activator of the zygotic expression program. It was, however, unexpected that Dux knock-out did not impair embryonic development. The work by Guo et al. provides several lines of evidence that OBOX4-mediated activation of gene expression considerably overlaps with that of DUX and this redundancy might explain the lack of an early developmental phenotype in Dux mutants. Consistent with this model, double mutants of Obox4 and Dux show impaired development. Given the difficulties with investigating details of the genetic model in double mutants at the preimplantation embryo stage, the authors not only crossed genetic mutants, but also used (1) nuclear transfer of mutated nuclei of ESCs, which could be characterized on their own in separate experiments, and (2) antisense oligonucleotide (ASO) microinjection, which included a rescue control demonstrating that reintroducing OBOX4 is sufficient to rescue the phenotype caused by blocking both Dux and Obox4.

      This work is important for the field because it reveals functional redundancy and plasticity of the zygotic genome activation in mammals, where the mouse model stands as a remarkable example of genome activation, which massively integrated long terminal repeat (LTR)-derived enhancers from retrotransposons and now two of the key activating zygotic factors appear to be encoded by tandemly duplicated clusters of different phylogenetic age. Identification of OBOX4 as a second factor partially redundant with DUX now allows us to decipher what constitutes the essential part of the ZGA program.

      We are grateful for the reviewer’s appreciation of our work, particularly the technical difficulty of knocking out two multicopy genes and the value of the rescue experiment.

      Reviewer #2 (Public Review):

      In this study, Guo et al. screened a few homeobox transcription factors and identified that Obox4 can induce the 2-cell like state in mouse embryonic stem cells (mESCs) (Fig. 1 and 2). The authors also compared in detail how Obox4 and Dux activate 2C repeats and genes in mESCs (Fig. 3). Compared to Dux, Obox4 activates fewer 2C genes (Fig. 2). In addition, although both Obox4 and Dux bind to MERVL elements, Obox4 additionally binds to ERVK (Fig. 3). The authors then used three different approaches (i.e., SCNT-mediated KO, ASO-mediated KD, and genetic KO) to study how Obox4 and Dux regulate zygotic genome activation in embryos. Although there are some inconsistencies among different approaches, the authors were able to show that loss of both Obox4 and Dux causes more severe consequences than loss of either protein alone in embryonic development and zygotic genome activation (Fig. 4 and 5).

      Overall, this is a comprehensive study that addresses an important question that puzzles the community. However, some comparisons to the recent work by Ji et al. (PMID: 37459895) are highly recommended. Ji et al. knocked out the entire Obox cluster (including Obox4) in mice and found that Obox cluster KO causes 2-4 cell arrest without affecting Dux. That said, Obox proteins seem more critical than Dux in regulating ZGA, and Obox cluster KO cannot be compensated by Dux. Ji et al. also reported that maternal (Obox1, 2, 5, 7) and zygotic (Obox3, 4) Obox proteins redundantly regulate embryogenesis because loss of either is compatible with development. Consistent with Ji's work, Obox4 KO embryos generated in this study can develop to adulthood and are fertile. Since these two studies are highly relevant, some comparisons of Obox4 KO and Obox4/Dux DKO with the previous Obox cluster KO will greatly benefit the community.

      We thank the reviewer for appreciating the value of our study. We are aware of the high-standard work by Ji et al. and have included a comparison between our data and theirs in the revised manuscript. Despite repeated attempts, various crossing strategies failed to produce Obox4KO/DuxKO mating pairs that could be used to produce the large number of Obox4KO/DuxKO embryos required for in-depth transcriptome analysis. Based on the quality of the RNA-seq, we decided to perform comparative analysis using our ASO KD data and showed that Obox4 has distinct regulatory targets from those of other Obox family members, which is consistent with the phylogenetic distance within the family.

    2. eLife assessment

      This study presents an important finding that Obox4 and Dux act redundantly in regulating zygotic genome activation in mice. The evidence supporting the claims of the authors is solid. The work will be of interest to researchers interested in early embryo development and epigenetic reprogramming.

    3. Reviewer #1 (Public Review):

      The authors investigated the role of OBOX4 in zygotic genome activation (ZGA) in mice. Obox4 genes form an array of duplicated genes and were identified as a candidate ZGA factor based on expression patterns during early development. The role of OBOX4 was subsequently studied in embryonic stem cells and early embryos. It was found that transcriptional activation mediated by OBOX4 has similar features to that of DUX, which was previously identified as a zygotic transcription factor involved in ZGA and a major activator of the zygotic expression program. It was, however, unexpected that Dux knock-out did not impair embryonic development. The work by Guo et al. provides several lines of evidence that OBOX4-mediated activation of gene expression considerably overlaps with that of DUX and this redundancy might explain the loss of early developmental phenotype in Dux mutants. Consistent with this model, double mutants of Obox4 and Dux show impaired development. Given the difficulties with investigating details of the genetic model in double mutants at the preimplantation embryo stage, the authors not only crossed genetic mutants, but also used (1) nuclear transfer of mutated nuclei of ESCs, which could be characterized on their own in separate experiments, and (2) antisense oligonucleotide (ASO) microinjection, which included a rescue control demonstrating that reintroducing OBOX4 is sufficient to rescue the phenotype caused by blocking both Dux and Obox4.

      This work is important for the field because it reveals functional redundancy and plasticity of the zygotic genome activation in mammals, where the mouse model stands as a remarkable example of genome activation, which massively integrated long terminal repeat (LTR)-derived enhancers from retrotransposons and now two of the key activating zygotic factors appear to be encoded by tandemly duplicated clusters of different phylogenetic age. Identification of OBOX4 as a second factor partially redundant with DUX now allows us to decipher what constitutes the essential part of the ZGA program.

    4. Reviewer #2 (Public Review):

      In this study, Guo et al. screened a few homeobox transcription factors and identified that Obox4 can induce the 2-cell-like state in mouse embryonic stem cells (mESCs) (Fig. 1 and 2). The authors also compared in detail how Obox4 and Dux activate 2C repeats and genes in mESCs (Fig. 3). Compared to Dux, Obox4 activates fewer 2C genes (Fig. 2). In addition, although both Obox4 and Dux bind to MERVL elements, Obox4 additionally binds to ERVK (Fig. 3). The authors then used three different approaches (i.e., SCNT-mediated KO, ASO-mediated KD, and genetic KO) to study how Obox4 and Dux regulate zygotic genome activation in embryos. Although there are some inconsistencies among different approaches, the authors were able to show that loss of both Obox4 and Dux causes more severe consequences than loss of either protein alone in embryonic development and zygotic genome activation (Fig. 4 and 5).

      Overall, this is a comprehensive study that addresses an important question that puzzles the community. However, some comparisons to the recent work by Ji et al. (PMID: 37459895) are highly recommended. Ji et al. knocked out the entire Obox cluster (including Obox4) in mice and found that Obox cluster KO causes 2-4 cell arrest without affecting Dux. That said, Obox proteins seem more critical than Dux in regulating ZGA, and Obox cluster KO cannot be compensated by Dux. Ji et al. also reported that maternal (Obox1, 2, 5, 7) and zygotic (Obox3, 4) Obox proteins redundantly regulate embryogenesis because loss of either is compatible with development. Consistent with Ji's work, Obox4 KO embryos generated in this study can develop to adulthood and are fertile. Since these two studies are highly relevant, some comparisons of Obox4 KO and Obox4/Dux DKO with the previous Obox cluster KO will greatly benefit the community.

    1. Author response:

      A general comment was that this study left several key questions unanswered, in particular the causal mechanism for the reported ribosomal distributions. We have been interested in the evolution of asymmetric bacterial growth and aging for many years. However, a motivational difference is that we are more interested in the evolutionary process, and evolution by natural selection works on the phenotype. Thus, we wanted to start with the phenotype closest to fitness, appropriately defined for the conditions, and work downwards. We examined first the asymmetry of elongation rates in single cells, then gene products, and now ribosomes. As we have pointed out, our demonstration of ribosomal asymmetry shows that the phenomenon was not peculiar and unique to the gene products we examined. Rather, the asymmetry is acting higher up in the metabolic network and likely affecting all genes. We find such conceptual guidance to be important. In an ideal world, of course, we would have liked to work out the causal mechanisms in one swoop. In a less than ideal situation, it is a subjective decision as to where to stop. We believe that the publication of this manuscript is more than appropriate at this juncture. We work at the interface of evolutionary theory and microbiology. Our results could appeal to both fields. If we attract new researchers, progress could be accelerated. Could the delay caused by publishing only completed stories slow the rate of discovery? These questions are likely as old as science (e.g., https://telliamedrevisited.wordpress.com/2021/01/28/how-not-to-write-a-response-to-reviewers/).

      We present below our response to specific comments by reviewers. We have not added a new discussion of papers suggested by Reviewer #1 because we feel that the speculations would have been too unfocused. We were already criticized for speculation in the Discussion about a link between aggregate size and ribosomal density.

      Response to major comments by Reviewer #1.

      (a) Fig. 1 only shows 2 divisions (rather than 3 as per Rev1) to avoid an overly elaborate figure. We have added text to the figure legend that the old and new poles and daughters in the subsequent 3, 4, 5, 6, and 7 generations can be determined by following the same notations and tracking we presented for generations 1 and 2 in Fig. 1. For example, if we know the old and new poles of any of the four daughters after 2 divisions (as in Fig. 1), and allow that daughter to elongate, become a mother, and divide to produce 2 “grand-daughters”, the polarity of the grand-daughters can also be determined.

      (b) Because division times were normalized and analyzed as quartiles, the raw values were never used. Rather than annotating unused values, we have provided the mean division times in the Material and Methods section on normalization to provide representative values.

      (c) We did not quantify the changes over generations in our study for three reasons. First, the sample sizes for the first generations (cohorts of 1, 2, 4, and 8 cells) are statistically small. Second, and most importantly, cells on an agar pad on a microscope slide, despite being inoculated as fresh exponentially growing cells, experience a growth lag, as do all cells transferred to a new physiological condition. Thus, to be safe, we do not collect data from cohorts 1, 2, 4, and 8 to ensure that our cells are as physiologically uniform as possible. Lastly, as we noted in the Material and Methods, cells also slow down after 7 generations (128 cells). Thus, we have collected ribosome and length measurements primarily from cohorts 16, 32, 64, and 128. Measurable cells from the 128 cohort are actually rare because a colony with that many cells often starts to form double layers, which are not measurable. Most of our measurements came from the 16, 32, and 64 cohorts, in which case a time series would not be meaningful. Some of these details were not included in our manuscript but have been added to the Material and Methods (Microscopy and time-lapse movies). For these reasons we have not added a time series as requested by the reviewer.

      (d) We have added the additional figure as requested, but as a supplement rather than in the main article (Supplemental Materials Fig. S1). This figure shows the normalized density of ribosomes along the normalized length of old and new daughters. The density is plotted as a continuous variable rather than as quartiles. This figure was included in the original manuscript, but readers recommended that it be removed because all the analyzed data had been processed as quartiles. Readers felt misled and confused.
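
      The pole-tracking rule described in (a) can be sketched in a few lines of code. This is only an illustrative sketch, not the authors' analysis code: the `Cell` class, the `divide` and `cohort` helpers, the "O"/"N" lineage labels, and the starting pole age of 1 for the founder cell are all assumptions made here for illustration. It relies on the standard rule for rod-shaped bacteria that each division creates one new pole per daughter, so the old daughter inherits the mother's old pole and the new daughter inherits the mother's new pole.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Cell:
    lineage: str       # division history: "O" = old daughter, "N" = new daughter
    old_pole_age: int  # number of divisions the cell's older pole has existed for


def divide(mother: Cell) -> Tuple[Cell, Cell]:
    """Split a mother into (old daughter, new daughter)."""
    # The old daughter keeps the mother's old pole, now one division older.
    old = Cell(mother.lineage + "O", mother.old_pole_age + 1)
    # The new daughter keeps the mother's new pole, which formed one division ago.
    new = Cell(mother.lineage + "N", 1)
    return old, new


def cohort(founder: Cell, generations: int) -> List[Cell]:
    """Return all cells present after the given number of synchronous divisions."""
    cells = [founder]
    for _ in range(generations):
        cells = [daughter for cell in cells for daughter in divide(cell)]
    return cells


if __name__ == "__main__":
    founder = Cell(lineage="", old_pole_age=1)  # assumed starting pole age
    for cell in cohort(founder, 2):
        print(cell.lineage, cell.old_pole_age)
```

      Applying `divide` repeatedly to any daughter gives the polarity of later generations in the same way, and seven rounds of division from one founder give the 128-cell cohort mentioned in (c).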

    1. Author response:

      The following is the authors’ response to the previous reviews.

      We greatly appreciate the comments from the editor and the reviewers, based on which we have made the revisions. We have responded to all the questions and summarized the revisions below. The changes are also highlighted in the manuscript.

      Additionally, we’ve noticed a few typos in the manuscript presented on the eLife website, which were not there in our originally submitted file.

      (1) In both the “Full text” presented on the eLife website and the pdf file generated after clicking “Download”: the last FC1000 in the second paragraph of the “Extensive induction curves fitting of TetR mutants” section should be FC1000WT.

      (2) In the pdf file generated after clicking “Download”: the brackets are all incorrectly formatted in the captions of Figure 4 and Figure 3—figure supplement 6.

      eLife assessment

      The fundamental study presents a two-domain thermodynamic model for TetR which accurately predicts in vivo phenotype changes brought about as a result of various mutations. The evidence provided is solid and features the first innovative observations with a computational model that captures the structural behavior, much more than the current single-domain models.

      We appreciate the supportive comments by the editor and reviewers.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors’ earlier deep mutational scanning work observed that allosteric mutations in TetR (the tetracycline repressor) and its homologous transcriptional factors are distributed across the structure instead of along the presumed allosteric pathways as commonly expected. In addition, the loss of allosteric communication caused by those mutations was rescued by additional distributed mutations. Now the authors develop a two-domain thermodynamic model for TetR that explains these compelling data. The model is consistent with the in vivo phenotypes of the mutants with changes in parameters, which permits quantification. Taken together, their work connects intra- and inter-domain allosteric regulation that correlates with structural features. This leads the authors to suggest broader applicability to other multidomain allosteric proteins. Here the authors follow their first innovative observations with a computational model that captures the structural behavior, aiming to make it broadly applicable to multidomain proteins. Altogether, an innovative and potentially useful contribution.

      We thank the reviewer for the supportive comments.

      Weaknesses:

      None that I see, except that I hope that in the future, if possible, the authors would follow with additional proteins to further substantiate the model and show its broad applicability. I realize however the extensive work that this would entail.

      We thank the reviewer for the supportive comments and the suggestion to extend the model to other proteins, which we indeed plan to pursue in future studies.

      Reviewer #2 (Public Review):

      Summary:

      This combined experimental-theoretical paper introduces a novel two-domain statistical thermodynamic model (primarily Equation 1) to study allostery in generic systems but focusing here on the tetracycline repressor (TetR) family of transcription factors. This model, building on a function-centric approach, accurately captures induction data, maps mutants with precision, and reveals insights into epistasis between mutations.

      Strengths:

      The study contributes innovative modeling, successful data fitting, and valuable insights into the interconnectivity of allosteric networks, establishing a flexible and detailed framework for investigating TetR allostery. The manuscript is generally well-structured and communicates key findings effectively.

      We thank the reviewer for the supportive comments.

      Weaknesses:

      The only minor weakness I found was that I still don’t have a good sense of (a) the intuition behind and (b) the mathematical derivation of Equation 1, which is so central to the work. I would recommend that the authors provide this early on in the main text.

      We thank the reviewer for the suggestion. The full mathematical derivation of Equation 1 is given in the first section of the supplementary file. Given the length of the derivation, we think it’s better to keep it in the supplementary file rather than the main text. In the main text, the first subsection (overview of the two-domain thermodynamic model of allostery) of the Results section and the paragraph right before Equation 1 are meant for providing intuitive understandings of the two-domain model and the derivation of Equation 1, respectively.

      We would also like to point the reviewer to Figure 2-figure supplement 2 and Equations (12) to (18) in the supplementary file for an alternative derivation. They show that the equilibria among all molecular species containing the operator are dictated by the binding free energies, the ligand concentration, and the allosteric parameters. The probability of an unbound operator (proportional to the probability that the promoter is bound by an RNA polymerase, or the gene expression level) can thus be calculated using Equation (12), which then leads to main text Equation 1 following the derivation given there.

      Additionally, we’ve added a paragraph to the main text (line 248-260) to aid an intuitive understanding of Equation 1.

      “The distinctive roles of the three biophysical parameters on the induction curve as stipulated in Equation 1 can be understood in an intuitive manner as well. First, the value of εD controls the intrinsic strength of binding of TetR to the operator, or the intrinsic difficulty for ligand to induce their separation. Therefore, it controls how tightly the downstream gene is regulated by TetR without ligands (reflected in leakiness) and affects the performance limit of ligands (reflected in saturation). Second, the value of εL controls how favorable ligand binding is in free energy. When εL increases, the binding of ligand at low concentrations becomes unfavorable, so the ligands cannot effectively bind to TetR to induce its separation from the operator. Therefore, the fold-change as a function of ligand concentration only starts to noticeably increase at higher ligand concentrations, resulting in larger EC50. Third, as discussed above, γ controls the level of anti-cooperativity between the ligand and operator binding of TetR, which is the basis of its allosteric regulation. In other words, γ controls how strongly ligand binding is incompatible with operator binding for TetR, hence it controls the performance limit of ligand (reflected in saturation).”

      We hope that the reviewer will find this explanation helpful.

      Reviewer #3 (Public Review):

      Summary:

      Allosteric regulations are complicated in multi-domain proteins and many large-scale mutational data cannot be explained by current theoretical models, especially for those that are neither in the functional/allosteric sites nor on the allosteric pathways. This work provides a statistical thermodynamic model for a two-domain protein, in which one domain contains an effector binding site and the other domain contains a functional site. The authors build the model to explain the mutational experimental data of TetR, a transcriptional repressor protein that contains a ligand-binding and a DNA-binding domain. They incorporate three basic parameters, the energy change of the ligand and DNA binding domains before and after binding, and the coupling between the two domains to explain the free energy landscape of TetR’s conformational and binding states. They go further to quantitatively explain the in vivo expression level of the TetR-regulated gene by fitting into the induction curves of TetR mutants. The effects of most of the mutants studied could be well explained by the model. This approach can be extended to understand the allosteric regulation of other two-domain proteins, especially to explain the effects of widespread mutants not on the allosteric pathways.

      Strengths:

      The effects of mutations that are neither in the functional or allosteric sites nor in the allosteric pathways are difficult to explain and quantify. This work develops a statistical thermodynamic model to explain these complicated effects. For simple two-domain proteins, the model is quite clean and theoretically solid. For the real TetR protein that forms a dimeric structure containing two chains with each of them composed of two domains, the model can explain many of the experimental observations. The model separates intra- and inter-domain influences, which provides a novel angle to analyse allosteric effects in multi-domain proteins.

      We thank the reviewer for the supportive comments.

      Weaknesses:

      As mentioned above, the TetR protein is not a simple two-domain protein, but forms a dimeric structure in which the DNA binding domain in each chain forms contacts with the ligand-binding domain in the other chain. In addition, the two ligand-binding domains have strong interactions. Without considering these interactions, especially those mutants that are on these interfaces, the model may be oversimplified for TetR.

      We thank the reviewer for this valid concern and acknowledge that TetR is a homodimer. However, we’ve deliberately chosen to simplify this complexity in our model for the following reasons.

      (1) In this work, we aim to build a minimalist model for two-domain allostery with only the most essential parameters for capturing experimental data. The simplicity of the model helps promote its mechanistic clarity and potential transferability to other allosteric systems.

      (2) Fewer parameters are needed in a simpler model. Our two-domain model currently uses only three biophysical parameters, which are all demonstrated to have distinct influences on the induction curve (see the main text section “System-level ramifications of the two-domain model”). This enables the inference of parameters with high precision for the mutants, and the quantification of the most essential mechanistic effects of their mutations, provided that the model is shown to accurately recapitulate the comprehensive dataset. Thus, we found it was unnecessary to add another parameter for explicitly describing inter-chain coupling, which would likely incur uncertainty in the inference of parameters due to the redundancy of their effects on induction data, and prevent the model from making faithful predictions.

      (3) From a more biological point of view, TetR is an obligate dimer, meaning that the two chains must synchronize for function, supporting the two-domain simplification of TetR for binding concerns.

      Additionally, as shown in the subsection “Inclusion of single-ligand-bound state of repressor” of section 1 of the supplementary file, incorporating the dimeric nature of TetR in our model by allowing partial ligand binding does not change the functional form of main text equation 1 in any practical sense. Therefore, considering all the factors stated above, we think that increasing the complexity of the two-domain model will only be necessary if additional data emerge to suggest the limitation of our model.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      This is an excellent work. I have only one suggestion for the authors. Interestingly, the authors also note that the epistatic interactions that they obtain are consistent with the structural features of the protein, which is not surprising. Within this framework, have the authors considered rescue mutations? Please see for example PMID: 18195360 and PMID: 15683227. If I understand right, this might further extend the applicability of their model. If so, the authors may want to add a comment to that effect.

      We thank the reviewer for the supportive comments and for pointing us to the useful references. We have added some comments to the main text regarding this point in line 332-336: “The diverse mechanistic origins of the rescuing mutations revealed here provide a rational basis for the broad distributions of such mutations. Integrating such thermodynamic analysis with structural and dynamic assessment of allosteric proteins for efficient and quantitative rescuing mutation design could present an interesting avenue for future research, particularly in the context of biomedical applications (PMID: 18195360, PMID: 15683227).”

      Reviewer #3 (Recommendations For The Authors):

      The authors should try to build a more realistic dimeric model for TetR to see if it could better explain experimental data. If it were too complicated for a revision, more discussions on the weakness of the current model should be given.

      We thank the reviewer for this valid concern and for the suggestion. The reasons for refraining from increasing the complexity of the model are fully discussed in our response to the reviewer’s public review given above. Primarily, we think that the value of a simple physical model (e.g., the paradigm Ising model in statistical physics and the classic MWC model) is two-fold. First, its mechanistic clarity and potential transferability make it a useful conceptual framework for understanding complex systems and establishing universal rules by comparing seemingly unrelated phenomena. Second, it provides useful insights and design principles of specific systems if it can quantitatively capture the corresponding experimental data. Thus, given the current experimental data set, we believe it is justified to keep the two-domain model in its current form, while additional experimental data could necessitate a more complex model for TetR allostery in the future. Relevant discussions are added to the main text (line 443-446) and section 8 of the supplementary file.

      “It’s noted that the homodimeric nature of TetR is ignored in the current two-domain model to minimize the number of parameters, and additional experimental data could necessitate a more complex model for TetR allostery in the future (see supplementary file section 8 for more discussions).”

      Minor issues:

      (1) There is an error in Figure 3A, the 13th and 14th subgraphs are the same and should be corrected.

      We thank the reviewer for capturing this error, which has been corrected in the revised manuscript.

      (2) The criteria for the selection of mutants for analysis should be clearly given. Apart from deleting mutants that are in direct contact with the ligand or DNA, how many mutants are left, and how far are they from the two sites? In line 257, what are the criteria for selecting these 15 mutants? Similarly, in line 332, what are the criteria for selecting these 8 mutants?

      We thank the reviewer for this comment. The data selection criteria are now added in section 7 of the supplementary file. The distances to the DNA operator and ligand of the 21 residues under mutational study are now added in Table 1 (Figure 3-figure supplement 9). The added materials are referenced in the main text where relevant.

      “7. Mutation selection for two-domain model analysis

      In this work, there are 24 mutants studied in total including the WT, and they contain mutations at 21 WT residues. We did not perform model parameter inference for the mutant G102D because of its flat induction curve (see the second subsection of section 2 and main text Figure 2—figure Supplement 3). Therefore, there are 23 mutants analyzed in main text Figure 5.

      Measuring the induction curve of a mutant involves a significant amount of experimental effort and is therefore hard to extend to a large number of mutants. Nonetheless, we aim to compose a set of comprehensive induction data here for validating our two-domain model for TetR allostery. To this end, we picked 15 individual mutants in the first round of induction curve measurements, which contain mutations spanning different regions in the sequence and structure of TetR (main text Figure 3—figure Supplement 1). Such broad distribution of mutations across LBD, DBD and the domain interface could potentially lead to diverse induction curve shapes and mutant phenotypes for validating the two-domain model. Indeed, as discussed in the main text section "Extensive induction curves fitting of TetR mutants", the diverse effects on the induction curve from mutations perturbing different allosteric parameters, as predicted by the model, are successfully observed in these 15 experimental induction curves. Additionally, 5 of the 15 mutants contain a dead-rescue mutation pair, which helps us validate the model prediction that a dead mutation could be rescued by rescuing mutations that perturb the allosteric parameters in various ways.

      Eight mutation combinations were chosen for the second round of induction curve measurement for studying epistasis, where we paired up C203V and Y132A with mutations from different regions of the TetR structure. This choice is largely based on two considerations. 1. As both C203V and Y132A greatly enhance the allosteric response of TetR, we want to probe why they cannot rescue a range of dead mutations as observed previously (PMID: 32999067). 2. C203V and Y132A are the only two mutants that show enhanced allosteric response in the first round of analysis. Combining mutations that are detrimental to allostery in a combined mutant could potentially lead to a nearly flat induction curve, which is less useful for inference (see the second subsection of section 2).”

      Since the number of hotspots identified by DMS is not very large, why not analyze them all?

      We thank the reviewer for this comment. There are 41 hotspot residues in TetR (PMID: 36226916), which have 41*19=779 possible single mutations. It’s unfeasible to perform induction curve measurements for all of these 779 mutants in our current experiment. However, we agree that it would be helpful if we can obtain such a dataset in an efficient way.

      In line 257, there are 15 mutants mentioned, while in Figure 5, there are 23 mutants mentioned, in Figure 3-figure supplement 1, there are 21 mutants mentioned, and in line 226 of the supplementary file, there are 24 mutants mentioned, which is very confusing. Therefore, the data selection criteria used in this article should be given.

      We thank the reviewer for this comment. The data selection criteria are now given in section 7 of the supplementary file, which should clarify this confusion.

      (3) In Figure 4 of the Exploring epistasis between mutations section, the 6 weights of the additive models corresponding to each mutation combination are different. On one hand, it seems that there are no universal laws in these experimental data. On the other hand, unique parameters of a single mutation combination were not validated in other mutation combinations, which somewhat weakened the conclusions about the potential physical significance of these additive weights.

      We thank the reviewer for this comment. We admit that a quantitative universal law for tuning the 6 weights of the additive model does not manifest in our data, which indicates the mutation-specific nature of epistatic interactions in TetR as hinted in the different rescuing mutation distributions of different dead mutations (PMCID: PMC7568325). However, clear common trends in the weight tuning of combined mutants that contain common mutations do emerge, which comply with the structural features of the protein and provide explanations as to why C203V and Y132A don’t rescue a range of dead mutations (main text section “Exploring epistasis between mutations”). Additionally, the lack of a quantitative universal rule for tuning the 6 weights in our simple model doesn’t exclude the possibility that a universal law for epistasis in TetR exists in another functional form, a point that could be explored in the future with more extensive joint experimental and computational investigations.

      In Eq. (27) of the supplementary file, the prior distribution of inter-domain coupling γ is given as a Gaussian distribution centered at 5 kBT. Since the absolute value of γ is important, can the authors explain why the prior distribution of γ is set to this value and what happens if other values are used?

      We thank the reviewer for the question. As explained in the corresponding discussions of Eq. (27) in the supplementary file, the prior of γ is chosen to serve as a soft constraint on its possible values based on the consideration that 1. inter-domain energetics for a TetR-like protein should be on the order of a few kBT; and 2. the prior distribution should reflect the experimental observation in the literature that γ has a small probability of adopting negative values upon mutations. Given our thorough validation of the statistical model and computational algorithm (see section 3 of the supplementary file), and the high precision in the parameter fitting results using experimental data (Figure 3 and Figure 4-figure supplement 2), we conclude that 1. the physical range of parameters encoded in their chosen prior distributions agrees well with the value reflected in the experimental data; 2. the inference results are predominantly informed by the data. Thus, changing the mean of the prior distribution of γ should not affect the inference results significantly given that it remains in the physical range.

      This point is explicitly shown in the added Table 2 (Figure 3-figure supplement 10), where we compare the current Bayesian inference results with those obtained after increasing the standard deviation of the Gaussian prior of γ from 2.5 to 5 kBT. As shown in the table, most inference results stay virtually unchanged with this less informative prior, which confirms that they are predominantly informed by the data. The only exceptions are the slight increases of the inferred γ values for C203V, C203V-Y132A and C203V-G102D-L146A, reflecting the intrinsic difficulty of precise inference of large γ values with our model, as is already discussed in the second subsection of section 3 of the supplementary file. However, such observations comply with the common trend of epistatic interactions involving C203V presented in the main text and don’t compromise the ability of our model to accurately capture the induction curves of mutants. Relevant discussions are now added to the second subsection of section 3 of the supplementary file (line 368-385).

      “In our experimental dataset, such inference difficulty is only observed in the case of C203V, Y132A-C203V and C203V-G102D-L146A due to their large γ and γ + εL values (see main text Figure 3, Figure 3—figure Supplement 10 and Figure 4). As shown in main text Figure 3—figure Supplement 10, the inference results for the other 20 mutants stay highly precise and virtually unchanged after increasing the standard deviation of the Gaussian prior of γ (gstdγ ) from 2.5 to 5 kBT. This demonstrates that the inference results for these mutants are strongly informed by the induction data and there is no difficulty in the precise inference of the parameter values. On the other hand, the inferred γ values (especially the upper bound of the 95% credible region) for C203V, Y132A-C203V and C203V-G102D-L146A increased with gstdγ . This is because the induction curves in these cases are not sensitive to the value of γ given that it’s large enough as discussed above. Hence, when unphysically large γ values are permitted by the prior distribution, they could enter the posterior distribution as well. Such difficulty in the precise inference of γ values for these three mutants however, doesn’t compromise the ability of our model in accurately capturing the comprehensive set of induction data (see part iv below). Additionally, the increase of the inferred γ value of C203V at the use of larger gstdγ complies with the results presented in main text Figure 4, which show that the effect of C203V on γ tends to be compromised when combined with mutations closer to the domain interface."
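      The prior-sensitivity argument above can be illustrated with a toy calculation. The sketch below is not the authors' model: it assumes a conjugate Gaussian prior and a Gaussian summary of the data, and every number in it (prior mean, data estimates, standard errors) is hypothetical. It only shows the general pattern that doubling the prior standard deviation barely moves a well-constrained parameter but shifts and widens a poorly constrained one.

```python
import math

def gaussian_posterior(prior_mean, prior_sd, data_mean, data_se):
    """Posterior mean and sd for a Gaussian prior and a Gaussian data summary."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = 1.0 / data_se ** 2
    post_prec = prior_prec + data_prec
    post_mean = (prior_mean * prior_prec + data_mean * data_prec) / post_prec
    return post_mean, math.sqrt(1.0 / post_prec)

# Hypothetical values in units of kBT; only the prior sd (2.5 vs 5) mirrors the text.
PRIOR_MEAN = 5.0

# Well-constrained parameter: the data summary is narrow (small standard error).
tight_narrow_prior = gaussian_posterior(PRIOR_MEAN, 2.5, data_mean=3.0, data_se=0.2)
tight_wide_prior = gaussian_posterior(PRIOR_MEAN, 5.0, data_mean=3.0, data_se=0.2)

# Poorly constrained parameter: the data summary is broad (large standard error).
loose_narrow_prior = gaussian_posterior(PRIOR_MEAN, 2.5, data_mean=9.0, data_se=4.0)
loose_wide_prior = gaussian_posterior(PRIOR_MEAN, 5.0, data_mean=9.0, data_se=4.0)

# Shift of the posterior mean when the prior sd is doubled.
shift_tight = abs(tight_wide_prior[0] - tight_narrow_prior[0])
shift_loose = abs(loose_wide_prior[0] - loose_narrow_prior[0])
```

      In this toy setting the posterior for the well-constrained case is essentially set by the data, while the poorly constrained case follows the prior, which is the qualitative behaviour described for the three C203V-containing mutants.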

    2. Reviewer #2 (Public Review):

      Summary:

      This combined experimental-theoretical paper introduces a novel two-domain statistical thermodynamic model (primarily Equation 1) to study allostery in generic systems but focusing here on the tetracycline repressor (TetR) family of transcription factors. This model, building on a function-centric approach, accurately captures induction data, maps mutants with precision, and reveals insights into epistasis between mutations.

      Strengths:

      The study contributes innovative modeling, successful data fitting, and valuable insights into the interconnectivity of allosteric networks, establishing a flexible and detailed framework for investigating TetR allostery. The manuscript is generally well-structured and communicates key findings effectively.

      Comments on revised version:

      I am happy with the changes made by the authors.

    3. eLife assessment

      The study presents valuable findings where a two-domain thermodynamic model for TetR accurately predicts in vivo phenotype changes brought about as a result of various mutations. The evidence provided is compelling and features the first innovative observations with a computational model that captures the structural behavior, much more than the current single-domain models.

    4. Reviewer #1 (Public Review):

      Summary:

      The authors' earlier deep mutational scanning work observed that allosteric mutations in TetR (the tetracycline repressor) and its homologous transcriptional factors are distributed across the structure instead of along the presumed allosteric pathways as commonly expected. In addition, the loss of allosteric communication promoted by those mutations was rescued by additional distributed mutations. Now the authors develop a two-domain thermodynamic model for TetR that explains these compelling data. The model is consistent with the in vivo phenotypes of the mutants with changes in parameters, which permits quantification. Taken together, their work connects intra- and inter-domain allosteric regulation that correlates with structural features. This leads the authors to suggest broader applicability to other multidomain allosteric proteins.

      Here the authors follow their first innovative observations with a computational model that captures the structural behavior, aiming to make it broadly applicable to multidomain proteins. Altogether, an innovative and potentially useful contribution.

      Weaknesses:

      None that I see, except that I hope that in the future, if possible, the authors would follow with additional proteins to further substantiate the model and show its broad applicability. I realize however the extensive work that this would entail.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      The study provides potentially fundamental insight into the function and evolution of daily rhythms. The authors investigate the function of the putative core circadian clock gene Clock in the cnidarian Nematostella vectensis. While in parts still incomplete, the evidence suggests that, in contrast to mice and fruit flies, Clock in this species is important for daily rhythms under constant conditions, but not under a rhythmic light/dark cycle, suggesting that the major role of the circadian oscillator in this species could be a stabilizing function under non-rhythmic environmental conditions.

      Public Reviews:

      Reviewer #1 (Public Review):

      In this nice study, the authors set out to investigate the role of the canonical circadian gene Clock in the rhythmic biology of the basal metazoan Nematostella vectensis, a sea anemone, which might illuminate the evolution of the Clock gene functionality. To achieve their aims the team generated a Clock knockout mutant line (Clock-/- ) by CRISPR/Cas9 gene deletion and subsequent crossing. They then compared wild-type (WT) with Clock-/- animals for locomotor activity and transcriptomic changes over time in constant darkness (DD) and under light/dark cycles to establish these phenotypes under circadian control and those driven by light cycles. In addition, they used Hybridization Chain Reaction-In situ Hybridization (HCR-ISH) to demonstrate the spatial expression of Clock and a putative circadian clock-controlled gene Myh7 in whole-mounted juvenile anemones.

      The authors demonstrate that under LD both WT and Clock-/- animals were behaviourally rhythmic but under DD the mutants lost this rhythmicity, indicating that Clock is necessary for endogenous rhythms in activity. With altered LD regimes (LD6:6) they show also that Clock is light-dependent. RNAseq comparisons of rhythmic gene expression in WT and Clock-/- animals suggest that Clock KO has a profound effect on the rhythmic genome, with very little overlap in rhythmic transcripts between the two phenotypes; of the rhythmic genes in both LD and DD in WT animals (220, termed clock-controlled genes, CCGs) 85% were not rhythmic in Clock-/- animals in either light condition. In silico gene ontology (GO) analysis of CCGs reflected processes associated with circadian control. Correspondingly, those genes rhythmic in KO animals under DD (here termed neoCCGs) were not rhythmic in WT, lacked upstream E-box motifs associated with circadian regulation, and did not display any GO enrichment terms. 'Core' circadian genes (as identified in previous literature) in WT and Clock-/- animals were only rhythmic under entrainment (LD) conditions whilst Clock-/- displayed altered expression profiles under LD compared to WT. Comparing CCGs with previous studies of cycling genes in Nematostella, the authors selected a gene from 16 rhythmic transcripts. One of these, Myh7, was detectable by both RNAseq and HCR-ISH and considered a marker of the circadian clock by the authors.

      The authors claim that the study reveals insights into the evolutionary origin of circadian timing; Clock is conserved across distant groups of organisms, having a function as a positive regulator of the transcriptional translational feedback loop at the heart of daily timing, but is not a central element of the core feedback loop circadian system in this basal species. Their behavioural and transcriptomic data largely support the claims that Clock is necessary for endogenous daily activity but that the putative molecular circadian system is not self-sustained under constant darkness (this was known already for WT animals)- rather it is responsive to light cycles with altered dynamics in Clock-/- specimens in some core genes under LD. In the main, I think the authors achieved their aims and the manuscript is a solid piece of important work. The Clock-/- animal is a useful resource for examining time-keeping in a basal metazoan.

      The work described builds on other transcriptomic-based works on cnidaria, including Nematostella, and does probe into the molecular underpinnings with a loss-of-function in a gene known to be core in other circadian systems. The field of chronobiology will benefit from the evolutionary aspect of this work and the fact that it highlights the necessity to study a range of non-model species to get a fuller picture of timing systems to better appreciate the development and diversity of clocks.

      Strengths:

      The generation of a line of Clock mutant Nematostella is a very useful tool for the chronobiological community and coupled with a growing suite of tools in this species will be an asset. The experiments seem mostly well conceived and executed (NB see 'weaknesses'). The problem tackled is an interesting one and should be an important contribution to the field.

      Weaknesses:

      I think the claims about shedding light on the evolutionary origin of circadian time maintenance are a little bold. I agree that the data do point to an alternative role for Clock in this animal in light responsiveness, but this doesn't illuminate the evolution of time-keeping more broadly in my view. In addition, these are transcriptomic data and so should be caveated- they only demonstrate the expression of genes and not physiology beyond that. The time-course analysis is weakened by its low resolution, particularly for the RAIN algorithm when 4-hour intervals constrain the analysis. I accept that only 24h rhythms were selected in the analysis from this but, it might be that detail was lost - I think a preferred option would be 2 or 3-hour resolution or 2 full 24h cycles of analysis.

      The authors discount the possibility of the observed 12h rhythmicity in Clock-/- animals by exposing them to LD6:6 cycles before free-running them in DD. I suggest that LD cycles are not a particularly robust way to entrain tidal animals as far as we know. Recent papers show inundation/mechanical agitation are more reliable cues (Kwiatkowski ER, et al. Curr Biol. 2023, 2;33(10):1867-1882.e5. doi: 10.1016/j.cub.2023.03.015; Zhang L., et al Curr Biol. 2013, 23;19, 1863-1873 doi.org/10.1016/j.cub.2013.08.038.) and might be more effective in revealing endogenous 12h rhythms in the absence of 24h cues.

      Response: We removed the suggestion that we used 6:6h LD to perform tidal entrainment. We generated this ultradian light condition to address the 24h rhythmicity observed in the NvClk1-/- in 12:12h LD.

      Reviewer #2 (Public Review):

      This manuscript addresses an important question: what is the role of the gene Clock in the control of circadian rhythms in a very primitive group of animals: Cnidaria. Clock has been found to be essential for circadian rhythms in several animals, but its function outside of Bilaterian animals is unknown. The authors successfully generated a severe loss-of-function mutant in Nematostella. This is an important achievement that should help in understanding the early evolution of circadian clocks. Unfortunately, this study currently suffers from several important weaknesses. In particular, the authors do not present their work in a clear fashion, neither for a general audience nor for more expert readers, and there is a lack of attention to detail. There are also important methodological issues that weaken the study, and I have questions about the robustness of the data and their analysis. I am hoping that the authors will be able to address my concerns, as this work should prove important for the chronobiology field and beyond. I have highlighted below the most important issues, but the manuscript needs editing throughout to be accessible to a broad audience, and referencing could be improved.

      Major issues:

      (1) Why do the authors make the claim in the abstract that CLOCK function is conserved with other animals when their data suggest that it is not essential for circadian rhythms? dCLK is strictly required in Drosophila for circadian rhythms. In mammals, there are two paralogs, CLOCK and NPAS2, but without them, there are no circadian rhythms either. Note also that the recent claim of BMAL1-independent rhythms in mammals by Ray et al., quoted in the discussion to support the idea that rhythms can be observed in the absence of the positive elements of the circadian core clock, had to be corrected substantially, and its main conclusions have been disputed by both Abruzzi et al. and Ness-Cohn et al. This should be mentioned.

      Response: According to our behavioral and transcriptomic data, CLOCK function is conserved in the constant light condition. In the LD context, rhythmicity is probably maintained by the light-response pathway in Nematostella. We modified our rhythmic transcriptomic analysis, considered the contested results of Ray et al., and discussed them in the revised manuscript.

      (2) The discussion of CIPC on line 222 is hard to follow as well. How does mRNA rhythm inform the function of CIPC, and why would it function as a "dampening factor"? Given that it is "the only core clock member included in the Clock-dependent CCGs," (220) more discussion seems warranted. Discussing work done on this protein in mammals and flies might provide more insight.

      Response: The initial sentence was unclear. Furthermore, since we restricted our rhythmic analysis to genes only found rhythmic with a p<0.01 with RAIN combined with JTK, NvCipc was no longer defined as rhythmic in free running.

      (3) The behavioral arrhythmicity seen with their Clock mutation is really interesting. However, what is shown is only an averaged behavior trace and a single periodogram for the entire population. This leaves open the possibility that individual animals are poorly synchronized with each other, rather than arrhythmic. I also note that in DD there seem to be some residual rhythms, though they do not reach significance. Thus, it is also possible that at least some individual animals retain weak rhythms. The authors should analyze behavioral rhythms in individual animals to determine whether behavioral rhythmicity is really lost. This is important for the solidity of their main conclusions.

      Response: Fig. 1 has been modified. We have separated the data for WT and NvClk1-/- animals to provide clarity on the average behavior pattern for each genotype. While the LSP analysis on the population average informs us about the synchronization of the population, it is true that it does not provide insight into individual rhythmicity. To address this, we analyzed individuals in all conditions using the Discorhythm website (Carlucci et al., 2019).

      In the revised figure, we have included a comparison plot of the acrophase of 24-hour rhythmic animals between genotypes using Cosinor analysis, which is most suitable for acrophase detection. This plot indicates the number of animals detected as significantly rhythmic, providing direct visual input to the reader regarding individual rhythmicity. Additionally, we have added Table 1, which contains the Cosinor period analysis (24 and 12 hours) of individuals for all genotypes and conditions, further enhancing the clarity of our findings.
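      For readers unfamiliar with Cosinor analysis, the following is a minimal sketch of a single-component fit at a fixed period. It is not the Discorhythm implementation and omits the significance test; the function name and the synthetic activity trace are assumptions made for illustration. It shows why a period has to be specified in advance: the cosine model becomes linear only once the period is fixed.

```python
import numpy as np

def cosinor_fit(t_hours, y, period=24.0):
    """Least-squares fit of y = mesor + amplitude * cos(omega * (t - peak_time)).

    Returns (mesor, amplitude, peak_time), with peak_time in hours within [0, period).
    """
    t = np.asarray(t_hours, dtype=float)
    y = np.asarray(y, dtype=float)
    omega = 2.0 * np.pi / period
    # With the period fixed, the model is linear in (mesor, beta, gamma):
    # y = mesor + beta * cos(omega * t) + gamma * sin(omega * t)
    design = np.column_stack([np.ones_like(t), np.cos(omega * t), np.sin(omega * t)])
    (mesor, beta, gamma), *_ = np.linalg.lstsq(design, y, rcond=None)
    amplitude = np.hypot(beta, gamma)
    # beta = A * cos(phi), gamma = A * sin(phi), so phi = atan2(gamma, beta)
    peak_time = (np.arctan2(gamma, beta) / omega) % period
    return float(mesor), float(amplitude), float(peak_time)

# Synthetic (hypothetical) trace: hourly samples over 72 h, peaking 6 h into each cycle.
t = np.arange(0.0, 72.0, 1.0)
activity = 10.0 + 3.0 * np.cos(2.0 * np.pi * (t - 6.0) / 24.0)
mesor, amplitude, peak_time = cosinor_fit(t, activity, period=24.0)
```

      Running the same fit with period=12.0 would test for a 12-hour component instead, which is the sense in which the analysis reports rhythmicity for a given period rather than estimating each animal's own period.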

      (4) There is no mention in the results section of the behavior of heterozygotes. Based on supplement figure 2A, there is a clear reduction in amplitude in the heterozygous animals. Perhaps this might be because there is only half a dose of Clock, but perhaps this could be because of a dominant-negative activity of the truncated protein. There is no direct functional evidence to support the claim that the mutant allele is nonfunctional, so it is important to discuss carefully studies in other species that would support this claim, and the heterozygous behavior since it raises the possibility that the mutant allele acts as a dominant negative.

      Response: Extended Data Fig. 1 has been modified. We show the normalized locomotion of the NvClk1+/- population over time in DD, a comparison of individual normalized behavior amplitudes, the LSP of the population average, and the individual acrophases of 24h-rhythmic individuals only. Indeed, we cannot discriminate a dominant-negative from a non-functional allele.

      (5) I do not understand what the bar graphs in Figure 2E and 3B represent - what does the y-axis label refer to?

      Response: Not relevant to the revised manuscript.

      (6a) I note that RAIN was used, with a p<0.05 cut-off. I believe RAIN is quite generous in calling genes rhythmic, and the p-value cut-off is also quite high. What happens if the stringency is increased, for example with a p<0.01.

      Response: We acknowledge your concern regarding the stringency of our statistical analysis. To address this, we opted to combine both RAIN and JTK methods and applied a more stringent p-value cut-off of p<0.01.
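      The selection rule described in this response (a gene counts as rhythmic only if both methods give p<0.01) amounts to intersecting two thresholded lists. The sketch below assumes the p-values from each method have already been exported into dictionaries keyed by gene; the gene names and values are hypothetical and the function name is ours, not part of RAIN or JTK.

```python
def rhythmic_in_both(rain_p, jtk_p, alpha=0.01):
    """Return the sorted genes whose p-value is below alpha in both result sets."""
    shared = rain_p.keys() & jtk_p.keys()  # genes tested by both methods
    return sorted(g for g in shared if rain_p[g] < alpha and jtk_p[g] < alpha)

# Hypothetical p-values for four genes.
rain_p = {"geneA": 0.002, "geneB": 0.030, "geneC": 0.008, "geneD": 0.0005}
jtk_p = {"geneA": 0.004, "geneB": 0.001, "geneC": 0.020, "geneD": 0.009}

selected = rhythmic_in_both(rain_p, jtk_p)
```

      Requiring agreement between two algorithms is stricter than either threshold alone, but it is not a formal multiple-testing correction.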

      (6b) It would be worth choosing a few genes called rhythmic in different conditions (mutant or wild-type. LD or DD), and using qPCR to validate the RNAseq results. For example, in Figure 3D, Myh7 RNAseq data are shown, and they do not look convincing. I am surprised this would be called a circadian rhythm. In wild-type, the curve seems arrhythmic to me, with three peaks, and a rather large difference between the first and second ZT0 time point. In the Clock mutants, rhythms seem to have a 12hr period, so they should not be called rhythmic according to the material and methods, which says that only ca 24hr period mRNA rhythms were considered rhythmic. Also, the result section does not say anything about Myh7 rhythms. What do they tell us? Why were they presented at all?

      Response: Regarding the suggestion for independent verification of our RNAseq results, we agree that such validation would enhance the robustness of our findings. To address this, we chose to overlap our identified rhythmic genes under WT LD conditions with those from another transcriptomic study that shared similarities in experimental design. Notably, the majority of overlapping rhythmic genes between the studies are candidate pacemaker genes. We believe that this replication of biologically significant rhythmic genes strengthens the validity and reliability of our results (see Extended Data Fig. 2).

      Furthermore, we have decided to remove the NvMhc-st (mistakenly named Myh7, only rhythmic in WT DD in the new analysis) as it does not contribute substantively to the revised version of the manuscript.

      (7) The authors should explain better why only the genes that are both rhythmic in LD and DD are considered to be clock-controlled genes (CCGs). In theory, any gene rhythmic in DD could be a CCG. However, Leach and Reitzel actually found that most genes in DD1 do not cycle the next day (DD2)? This suggests that most "rhythmic" genes might show a transient change in expression due to prolonged obscurity and/or the stress induced by the absence of a light-dark cycle, rather than being clock controlled. Is this why the authors saw genes rhythmic under both LD and DD as actual CCGs? I would suggest verifying that in DD the phase of the oscillation for each CCG is similar to that in LD. If a gene is just responding to obscurity, it might show an elevated expression at the end of the dark period of LD, and then a high level in the first hours of DD. Such an expression pattern would be very unlikely to be controlled by the circadian clock.

      Response: As we modified our transcriptomic analysis, we no longer analyze LD+DD rhythmic genes, but any genes rhythmic (RAIN and JTK p<0.01) in each condition. As such, we end up with four lists of genes, one for each experimental condition.

      (8) Since there are still rhythms in LD in Clock mutants, I wonder whether there is a paralog that could be taking Clock's place, similar to NPAS2 in mammals.

      Response: See response to (1). The only NPAS2 ortholog identified in Nematostella, NPAS3, showed marginal significance (p=0.013) with RAIN in LD WT, suggesting a regulation similar to that of the candidate pacemaker genes. As such, we included it in our list of candidate pacemaker genes.

      (9) I do not follow the point the authors try to make in lines 268-272. The absence of anticipatory behavior in Drosophila Clk mutants results from disruption of the circadian molecular clock, due to the loss of Clk's circadian function. Which light-dependent function of Clock are the authors referring to, then? Also, following this, it should be kept in mind that clock mutant mice have a weakened oscillator. The effect on entrainment is secondary to the weakening of the oscillator, rather than a direct effect on the light input pathway (weaker oscillators have increased response to environmental inputs). The authors thus need to more clearly explain why they think there is a conservation of circadian and photic clock function.

      Response: Following the changes in our statistical analysis, we reframed the discussion and now directly address the circadian and the photic clock function (we call the latter the light-response pathway in the manuscript).

      Recommendations for the authors:

      We suggest the following improvements:

      (1) Please undertake a serious effort to make this work more accessible to non-marine chronobiologists. This includes better explanations, and schemes of the animal when images of staining are shown (e.g. Fig.1b) which include the labeling of relevant morphological structures mentioned in the text (like "tentacle endodermis and mesenteries" (line 132)). Similar issues for mentioned life cycle stages like "late planula stage" (line 133), "bisected physa" (line 149).

      Response: In Fig. 1b, we outlined the animal's shape and added two arrows to locate the tentacle endodermis and mesenteries. We replaced the term "late planula stage" with "larvae", and we rephrased "bisected physa" as "tissue sampling".

      Please attend to details. This includes:

      • Wrong referrals to figures (currently line 151 refers to EDF2- but should be EDF 1 instead, there is a Fig.3f mentioned in the text, but there is no such Fig.).

      Response: Fixed

      • Mentioning of ZTs when the HCR stainings were performed.

      Response: Fixed

      • Fig.1 a shows a rather incomplete and thus potentially confusing phylogenetic tree. Vertebrates have at least two Clk orthologs (NPAS2 and CLK), please include both, use an outgroup, and root the tree.

      Response: Identifying NPAS2 and CLK orthologs in all species added more confusion to the conclusion. However, we followed the suggestion to add an outgroup, using a CLK ortholog sequence identified in the sponge Amphimedon queenslandica, and to root the tree. Thank you for the suggestion.

      • What do the y-axis labels in Figure 2E and 3B refer to exactly? Y-axis label annotations in Fig.3a,d are entirely missing- what do the numbers refer to?

      Response: not relevant in the revised manuscript

      • Fig.2D- is the Go term enrichment referring to LD or DD?

      Response: To DD. We made this clear in Figure 5.

      • Wording: "Clock regulates genetic pathways." What is meant by "genetic pathways"? There are no "non-genetic pathways". Could one simply say: "Clock regulates a variety of transcripts".

      Response: We modified our threshold to use only p.adj<0.01, which reduced the GO term numbers. We removed “genetic pathways” and now address the specific pathways: cell-cycle and neuronal.

      The use of the term "epistatic" is confusing (line 219), i.e. that light is epistatic to Clock. In genetics, epistasis is defined as the effect of gene interactions on phenotypes. To a geneticist, this implies that there is a second gene impacting on the phenotype of the Clock mutants. Please re-word.

      Response: “light is epistatic on Clock” has been re-phrased.

      The provided Supplementary tables are not well annotated. Several of them need guess-work about what is shown. For instance, for Supplementary Table 1, the Ns are unclear, which in total can go up to almost 200 per condition-genotype, but only about 30 animals for each were tested. Thus, where do the high totals in the LSP table come from? What do the numbers of each periodicity mean? Initially one might assume it was the number of animals that showed a periodogram peak at a given periodicity, but it seems that cannot be. Maybe it counted any period bin over statistical significance? Please clarify with better descriptions and labels.

      Response: Supplementary tables are now clearly annotated on their first tabs. Regarding Fig.1, we already addressed this point in the public review.

      Albeit not essential, it would be more reader-friendly to also add a summary table with average period and SD, power and SD, and percentage rhythmicity to the main figure.

      Response: Table 1 has been added: it contains the individual counts of rhythmic animals (24h and 12h) from Cosinor analysis. However, Discorhythm requires a specific period to be set. Thus, we can only provide the count of animals significant for a given period value, not an estimate of each animal's own period.

      (2) Some of the terminology is quite confusing, in particular the double meaning of the word "clock" (i.e the pacemaker and the transcription factor). This is not a specific problem to this manuscript, but it would be helpful for the readability to try to improve this.

      Could the gene/transcript/protein be spelled: clk and Clk?

      Alternatively, for clarity- how about talking about "core pacemaker genes," "CLOCK-dependent rhythmic genes" and "CLOCK-independent rhythmic genes"?

      Response:

      Clock/CLOCK > NvClk / NvCLK and the mutant is NvClk1-/-

      Core clock genes > candidate pacemaker genes.

      CLOCK-dependent CCG > this notion no longer exists in the revised manuscript.

      CLOCK-independent CCG > this notion no longer exists in the revised manuscript.

      (3) The dismissal of the 12h rhythmicity in Clock-/- animals is not really convincing and should be reconsidered. LD6:6 cycles (before free-running animals in DD) are likely not a particularly robust way to entrain tidal animals. Recent papers show inundation/mechanical agitation are more reliable cues (Kwiatkowski ER, et al. Curr Biol. 2023, 2;33(10):1867-1882.e5. doi: 10.1016/j.cub.2023.03.015; Zhang L., et al Curr Biol. 2013, 23;19, 1863-1873 doi.org/10.1016/j.cub.2013.08.038.) and might be more effective in revealing endogenous 12h rhythms in the absence of 24h cues.

      Response: We removed the proposition of using 6:6h LD as tidal entrainment. Instead, the LD 6:6 experiment reveals the direct light-dependency of the NvClk1-/- mutant.

      (4) There are significant questions raised on the validity of BMAL1-independent rhythms in mammals as suggested by the Ray et al study. See DOI: 10.1126/science.abe9230 and DOI: 10.1126/science.abf0922

      These technical comments should also be taken into account and the discussion adjusted accordingly to better reflect the ongoing discussions in the chronobiology field.

      Response: We modified our rhythmic analysis. As we could not use BHQ or adjusted p-values, which resulted in very few genes, we defined genes as 24h-rhythmic if p<0.01 with two different algorithms (RAIN and JTK). We propose this compromise to reduce the risk of false positives. Furthermore, we discussed our methodology in light of the significant questions raised by the papers you cited. We thank the reviewer for this important point.

      (5) The HCR stainings for clk are not very convincing. Normally, HCR should have more dots. In principle, the logic of HCR is such that it detects individual mRNA molecules in the cell. Thus, having only one strong dot/cell like in Fig.1b doesn't make much sense.

      Response: We were ourselves surprised by this single-dot signal. We are experienced users of HCRv.3 across different species. We decided to remove the close-up (pending further investigation) but to keep the full-animal signal. By our approach, it is a convincing signal. However, because of the dotted nature of the signal, it is not easy to make it highly visible in a picture of the whole animal. We did our best to make the mRNA signal visible without altering the pattern.

      Furthermore, the controls for the HCR in situ hybridization are unclear. In the methods, there are two Clock probes described (B3 & B5) and two control probes (B1 & B3), however, in the negative control image, a combination of one Clock (B1) and one control (B3) probes is used and is unclear what "redundant detection" means in the legend of figure S2.

      Response: Considering the nature of the signal (single or few dots), we decided to use two probes with two different fluorophores. Noise is by nature random. Our hypothesis was that only overlapping fluorescent dots are true NvClk mRNA signal.

      For Control probes we used two zebrafish probes labelling hypothalamic peptides.

      Based on experience with non-Drosophila, non-mouse animal model systems, the reviewers assume that nonsense-mediated mRNA decay (NMD) is not strongly initiated upon CRISPR-induced premature STOP codons. If this assumption is correct, it would be worth mentioning. Alternatively, it would be worth testing if Nematostella induces NMD, as this would be a great control for the HCR and the mutation itself. At which ZT was the HCR done?

      Response: We performed the HCR at ZT10 when NvClk is described to be at peak. It is now indicated in the Fig. 1b. The RNAseq detected a higher quantity of NvClk1 mRNA in the NvClk1-/- (see Fig. 4a). mRNA quantity regulation involves transcription, stabilization, and degradation. At this stage, we cannot identify which specific step is affected.

      For Fig.1c- please provide the binding site and sequence in the figure, simply include EDF 1 in the main figure.

      Response: We generated a clear indication in the new Fig.1c and EDF. 1b about the protein domains, the CRISPR binding site and the consequences on the DNA and AA sequences.

      (6) Please provide the individual trace data for the behavioral analyses either as supplementary files or as a link to an openly accessible database like DRYAD (see also comment 7 in the public review of reviewer 2). Maybe this is what is shown in Supplementary Table 1, but it is really not clear what is actually shown.

      Response: Fig.1 is updated. Table 1 is added. Supplementary Table 1 contains the individual normalized locomotor data of each polyp for each genotype and light condition. Supplementary Table 2 contains the individual Cosinor rhythmic behavior analysis based on Supplementary Table 1.

      (7) It is not really clear if the mutation is a true loss-of-function or could also be dominant negative. While this is raised in the discussion, it should be more carefully considered. The reason why a dominant negative would be unlikely is unclear. More specifically also see comment 8) in the public review of reviewer 2.

      Response: Indeed, the results cannot tell us if it is a true loss of function, a dominant negative or non-functional allele. We addressed it in the first part of the discussion.

      (8) The pretty small overlap of rhythmic transcripts in LD and DD could reflect the true biology of a more core clock driven-process under constant conditions and a more light-driven process under LD. But still- wouldn't one expect that similar processes should be rhythmic? If not, why not?

      It would certainly add strength to the data if for one or two transcripts these results were independently verified by qPCR from an independent sampling. This could even be done for just two time points with the most extreme differences.

      Response: We appreciate the reviewer's comments and concerns regarding the overlap of rhythmic transcripts in different conditions. In response to the reviewer's query, we revised our interpretation of the transcriptomic data, acknowledging the limited overlap between light and genotype conditions in our study. This prompted us to reconsider the underlying biological processes driving rhythmic gene expression under constant conditions versus light-dark cycles.

      Regarding the suggestion for independent verification of our RNAseq results, we agree that such validation would enhance the robustness of our findings. To address this, we chose to overlap our identified rhythmic genes under WT LD conditions with those from another transcriptomic study that shared similarities in experimental design. Notably, the majority of overlapping rhythmic genes between the studies are candidate pacemaker genes. We believe that this replication of biologically significant rhythmic genes strengthens the validity and reliability of our results (see Extended Data Fig. 2).

      (9) Expression of myh7 : Checking for co-expression should be pretty straightforward by HCR. This is what this type of staining technique is really good for. Please do clk and myh7 co-staining if you want to claim co-expression. Otherwise don't make such a claim.

      Response: We agree that checking for co-expression should be straightforward by HCR. However, due to time constraints during the revision period, we are unable to conduct the double in-situ experiment. Additionally, upon careful consideration, we recognize that including myhc-st (mistakenly named myh7) staining and co-expression analysis would not significantly contribute to the main conclusions of our study. Therefore, we have decided to remove this analysis from the revised manuscript.

      (10) Missing methodological details:

      • The false discovery rate for each analysis should be included (see Hughes et al.,: "Guidelines for Genome-Scale Analysis of Biological Rhythms," 2017).

      Response: The FDR is indicated for each gene in Supplementary Table 3.

      • Fig.1f- continuous light- please provide a spectrum (if there is no good spectrophotometer available, please provide at least manufacturer information).

      Response: Unfortunately, we did not have a good spectrophotometer available at the time of the revision. We added the reference of the lamp to the Methods. We found the light spectrum provided by the supplier; however, we did not add it to the revised manuscript.

      Author response image 1.

      Spectrum of the Aquastar t8

      Also, it would be easier for the reader, if the measurements of light intensity are provided in photons, because this is what the light receptors ultimately measure.

      Response: Modified.
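      As a rough illustration of the conversion the reviewer asks for, the sketch below turns an irradiance into a photon flux. It assumes monochromatic light at a single wavelength, which is only an approximation for a broad-spectrum lamp; the function name and the example values are hypothetical and are not taken from the manuscript.

```python
PLANCK_J_S = 6.62607015e-34      # Planck constant (J s)
LIGHT_SPEED_M_S = 2.99792458e8   # speed of light in vacuum (m/s)
AVOGADRO = 6.02214076e23         # photons per mole

def photon_flux_umol(irradiance_w_m2, wavelength_nm):
    """Convert irradiance (W m^-2) to photon flux (umol photons m^-2 s^-1).

    Assumes all energy is carried at one wavelength (monochromatic approximation).
    """
    energy_per_photon = PLANCK_J_S * LIGHT_SPEED_M_S / (wavelength_nm * 1e-9)  # joules
    photons_per_m2_s = irradiance_w_m2 / energy_per_photon
    return photons_per_m2_s / AVOGADRO * 1e6

# Hypothetical example: 1 W m^-2 of 500 nm light
flux = photon_flux_umol(1.0, 500.0)
```

      For a real lamp, the same calculation would be integrated over the measured spectrum rather than evaluated at one wavelength.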

      • Fig.2E- please add the consensus sequence used for circadian E-box vs. E-box to the figure.

      Response: In Fig. 4c of the revised manuscript, we show which E-box motifs we extracted for our promoter analysis. We also changed our analysis: we no longer use HOMER, but instead directly extracted the promoter sequences, searched them for the canonical E-box (CANNTG) and the circadian E-box (CACGTG), and generated a circadian E-box enrichment output per gene promoter.
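      The motif search described in this response can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the function name `ebox_counts` is hypothetical, and the "circadian fraction" is one possible per-promoter enrichment measure, since the response does not define the exact output. Both motifs are their own reverse complement, so scanning one strand is sufficient.

```python
import re

# Canonical E-box CANNTG; the lookahead lets overlapping matches be reported.
CANONICAL_EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")
CIRCADIAN_EBOX = "CACGTG"  # a subset of the canonical motif

def ebox_counts(promoter):
    """Count canonical and circadian E-boxes in one promoter sequence."""
    seq = promoter.upper()
    canonical = [m.group(1) for m in CANONICAL_EBOX.finditer(seq)]
    circadian = sum(1 for motif in canonical if motif == CIRCADIAN_EBOX)
    # Assumed enrichment measure: share of E-boxes that are circadian E-boxes.
    fraction = circadian / len(canonical) if canonical else 0.0
    return {
        "canonical": len(canonical),
        "circadian": circadian,
        "circadian_fraction": fraction,
    }

# Hypothetical promoter fragment with one CACGTG and one CAGCTG site
example = ebox_counts("ttCACGTGaaCAGCTGcc")
```

      Running such a function over every extracted promoter gives a per-gene table that can then be compared between rhythmic and non-rhythmic gene sets.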

      (11) There has been some discussion about the evolutionary statement as stated by the authors. It appears that depending on the background of the reader, this can be misunderstood. We thus suggest to more clearly point out where the author thinks there is evolutionary conservation (a function for clk in the circadian oscillator under constant light or dark conditions) versus where there is no apparent evolutionary conservation (the situation under light-dark conditions).

      Response: In the revised manuscript we proposed a conserved function of NvCLK in constant darkness, and a light-response pathway compensating in LD conditions in the mutant.

      Please also consider the major comments 8 and 9 of the common review from reviewer 2.

      Reviewer #1 (Recommendations For The Authors):

      The hybridization chain-reaction ISH is OK but, I'm not sure I understand the control condition-this should be clarified. I would also welcome the use of Clock-/- animals in HCR as another, more direct level of control. In addition, the authors state that the Myh7 probes hybridise in anatomical regions resembling those for Clock (Fig 3e). It would be better to duplex these two probe sets with different fluors for a better representation of the relative spatial distributions of each transcript.

      Response: We agree that checking for co-expression should be straightforward by HCR. However, due to time constraints during the revision period, we are unable to conduct the double in-situ experiment. Additionally, upon careful consideration, we recognize that including myhc-st (mistakenly named myh7) staining and co-expression analysis would not significantly contribute to the main conclusions of our study. Therefore, we have decided to remove this analysis from the revised manuscript.

      We clarified the design of the control probes in the Methods.

      Minor points:

      Figure legends do not all convey sufficient detail. For instance, Figure 1c needs a better explanation. Figure 3e- are these images both WT? Fig 3f doesn't exist and other figure text references do not align with figures and need an overhaul.

      Response: All errors have been fixed.

      Reviewer #2 (Recommendations For The Authors):

      Major issues:

      (1) The authors need to introduce their model system better for a broad audience. What are the tissues/cells that express Clock at a higher level? What is their function, does this provide a potential explanation for their specific Clock expression, and how CLOCK might regulate behavior? Terms such as "tentacle endodermis and mesenteries" (line 132), "late planula stage" (line 133), "bisected physa" (line 149) would need some explanation.

      Response: We modified terms such as planula to larvae, and bisected physa to tissue samples.

      2) Some of the terminology used is quite confusing, because of the double-meaning of the word "clock" (i.e the pacemaker and the transcription factor). The authors use terms such as "clock-controlled genes", "core clock genes", "CLOCK-dependent clock-controlled genes", "neo-clock-controlled genes". Is there any way to help the reader? Here are several suggestions: "core pacemaker genes," "CLOCK-dependent rhythmic genes" and "CLOCK-independent rhythmic genes".

      Response: all the terminology has been clarified, see previous comments

      3) Also in the abstract, there is mention of "hierarchal light- and Clock-signaling" (52-3) - is this related to the statement on line 219 that light is epistatic to Clock? I do not quite understand what epistatic would mean here. Who is upstream of whom? LD modifies rhythmicity in Clock mutant animals, but Clock mutations also impact rhythmicity in LD. Also, as epistasis is defined as the effect of gene interactions on phenotypes - what is the secondary gene impacting the phenotype of the Clock mutants? I am not sure the term epistatic is appropriate in the present context.

      Response: Indeed, epistatic is a genetic term which might be unclear in this context. We removed it.

      4) The control for the in situ hybridization is unclear. In the methods, there are two Clock probes described (B3 & B5) and two control probes (B1 & B3); however, in the negative control image, a combination of one Clock (B1) and one control (B3) probe is used. I am not sure what "redundant detection" means in the legend of figure S2. Also, the sequences of each Clock probe should be provided. It might be worth testing the Clock mutant the authors generated. Clock mRNA could be reduced due to nonsense-mediated RNA decay, since the mutation causes a premature stop codon. This would be a great additional control for the in situ hybridization. Even better would be if, by chance, the probes target the mutated sequence. The signal should then be completely lost.

      Response: HCR relies on tiling probes, which means that the target transcript is covered by dozens of successive "primer-like" DNA sequences, as required by the HCR v3 technology. We cannot design a mutant-specific probe with this technology.

      (5) I have concerns with rhythmic-expression calls, particularly as there is so little overlap between LD and DD, and that a completely different set of rhythmic genes is observed in Clock mutant and wild-type animals. I am not an expert in whole-genome expression studies, so I hope one of my colleague reviewers can weigh in.

      When describing rhythmicity analysis in the Methods, it states that Benjamini-Hochberg corrections were applied to account for multiple comparisons. However, the false discovery rate for each analysis should be included (see Hughes et al.,: "Guidelines for Genome-Scale Analysis of Biological Rhythms," 2017).

      Response: As explained before, we cannot use Benjamini-Hochberg corrections, as only a few genes (mostly oscillator genes) pass the threshold. We therefore combined two different algorithms (RAIN and JTK) with p<0.01 to confidently detect rhythmic genes while reducing the risk of false positives.
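      To make the two strategies discussed here concrete, the sketch below contrasts a Benjamini-Hochberg adjustment with a call that requires two algorithms to agree. It assumes that per-gene p-values have already been computed by RAIN and JTK; it does not call either tool, and the function names, gene labels, and the use of raw p-values at 0.01 are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    p = np.asarray(pvalues, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(n)
    q[order] = np.clip(scaled, 0.0, 1.0)
    return q

def rhythmic_by_both(rain_p, jtk_p, alpha=0.01):
    """Genes whose p-value is below alpha in both result sets."""
    shared = rain_p.keys() & jtk_p.keys()
    return sorted(g for g in shared if rain_p[g] < alpha and jtk_p[g] < alpha)

# Hypothetical p-values for three genes
rain = {"geneA": 0.001, "geneB": 0.020, "geneC": 0.005}
jtk = {"geneA": 0.004, "geneB": 0.001, "geneC": 0.500}
called = rhythmic_by_both(rain, jtk)
```

      Requiring agreement between two methods reduces false positives in practice, but unlike the Benjamini-Hochberg procedure it does not provide a formal bound on the false discovery rate, which is the point raised by the reviewer.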

      Minor issues:

      (1) Environmental inputs are not "circadian", as written in the title.

      Response: Title modified

      (2) In the abstract, the description of the Clock mutant behavioral phenotypes is hard to follow, with no mention of whether or not Clock mutant animals are behaviorally rhythmic or arrhythmic in constant conditions.

      Response: corrected

      (3) Abstract: A 6/6 h LD cycle is not a compressed tidal cycle as written in the abstract. Light is not an input to tidal rhythms.

      Response: corrected

      (4) Line 101: timeout is not a core clock gene in animals.

      Response: we removed it from the candidate pacemaker genes.

      (5) What is the evidence for the role of PAR-Zip proteins in the Nematostella clock? The reference provided does not mention those.

      Response: There are no functional data in Nematostella yet to support their role within the pacemaker. However, based on their rhythmicity in LD and their protein conservation, we included them in the list of candidate pacemaker genes. The references have been corrected.

      (6) Line 125. should refer to Fig 1C when describing the Clock protein.

      Response: corrected

      (7) Line 143-4. based on the figure, the region targeted by gRNA was not "close to the 5' end" as stated, it is closer to the middle of the gene sequence as shown in Figure 1C. A more accurate description would be a region in between the PAS domains.

      Response: Indeed, we modified the figure and the text.

      (8) Line 150. The mutant allele is described as Clock1 initially, then for the rest of the paper as Clock-. Since it is not clear that the allele is a null (see major comment #8), Clock1 should be used throughout the manuscript.

      Response: the allele is named NvClk1 in the revised manuscript

      (9) Figure 2A, the second CT/ZT0 is misplaced.

      Response: Fig. 2 modified in the revised manuscript

      (10) Figure legend for 2E and 3B. "The 1000bp upstream ATG" is unclear. I guess it means that 1000bp upstream of the putative initiation codon was used.

      Response: Right, and in the revised version we analyzed 5 kb upstream of the putative ATG.

      (11) Line 164. The authors write "We discovered..." , but wasn't it already known that these animals are behaviorally rhythmic?

      Response: Fixed

      (12) It would be worth mentioning in the results section the reduced amplitude of rhythms in LL compared to DD (in WT and seemingly also in Clock mutants).

      Response: Indeed, we observed a significant reduction in the mean amplitude in the NvClk1-/- in DD and LL compared to WT and NvClk1-/- in LD, DD and LL. However, as rhythmicity is lost by virtually all mutants in LL and DD, we do not think these results add to the current interpretation of the gene function.

      (13) Please correct the figure numbers in the main text, there are several mistakes.

      Response: Done

      (14) Line 196, most genes in the quoted study did not cycle on day 2, so whether they are truly clock controlled is questionable.

      Response: We agree; identifying free-running cycling genes in cnidarians remains a challenge to overcome. One of the limitations of this study was detecting genes that are rhythmic in LD and conserve their rhythmicity in DD. However, considering different transcriptomic studies (cited in the discussion), it seems that in the phylum Cnidaria the genes that are rhythmic in LD are not necessarily those identified as rhythmic in DD.

      (15) Line 204-206 needs to be rephrased. It is confusing.

      Response: rephrased

      (16) Line 216. Rephrase to something like: "A similar finding was made for."

      Response: rephrased

      (17) "Clock regulates genetic pathways" sounds quite odd. Do you mean it regulates preferentially specific genetic (or maybe better, molecular) pathways?

      Response: rephrased

      (18) Figure 4 and legend: Dashed lines indicating threshold are missing. Do the black and red dots represent WT and Clock-/-, as indicated in the legend, or up/down, as indicated in the figures?

      Response: Fig.5 modified accordingly. Colors in the Volcano plot indicate Up- (black) versus Down- (red) regulated. It is now coherent within the figure.

      (19) Legend for Extended figure 1. "Immature peptide sequence" is incorrect.

      Response: rephrased

      (20) Extended data Figure 4. What the asterisks label is unclear.

      Response: EDF4 was modified and became EDF2, with different content. The * indicates NvClk mRNA.

      (21) Line 228. Gene "isoforms". I guess the authors mean "paralogs".

      Response: corrected.

      (22) Line 232-3/Figure 3e. Please include a comparable image of the Clk ISH to facilitate the comparison of the spatial expression pattern. In addition, where and what is the "analysis" referred to - "the spatial expression pattern of Myh7 closely resembled that of Clock, as evidenced by our analysis"?

      Response: The analysis has been removed from the revised manuscript because we currently cannot perform the double ISH.

      (23) Line 282-3. As mentioned above, it is difficult to be sure that circadian behavior is lost, if only looking at a population of animals.

      Response: Fig.1 corrected

      (24) Line 301-5. Rephrase.

      Response: Rephrased

      (25) Line 325. I am not convinced that the author can say that their mutant is amorphic. See Major comment 8.

      Response: corrected.

      (26) Line 351 "simplifying interactions with the environment". Please explain what is meant here.

      Response: this confusing sentence has been removed from the revised manuscript

    2. Reviewer #2 (Public Review):

      In this revised manuscript, Aguillon and collaborators convincingly demonstrate that CLK is required for free-running behavioral rhythms under constant conditions in the cnidarian Nematostella. The results also convincingly show that CLK impacts rhythmic gene expression in this organism. This original work thus demonstrates that CLK was recruited very early during animal evolution in the circadian clock mechanism to optimize behavior and gene expression with the time-of-day. The manuscript could still benefit from some improvements so that it is more accessible for a wide readership.

    3. eLife assessment

      This fundamental study for the first time defines genetically the role of the Clock gene in basal metazoa, using the cnidarian Nematostella vectensis. With convincing evidence, the study provides insight into the early evolution of circadian clocks. Clock in this species is important for daily rhythms under constant conditions, but not under a rhythmic light/dark cycle, suggesting that the major role of the circadian oscillator in this species could be a stabilizing function under non-rhythmic environmental conditions.

    1. eLife assessment

      This important study provides previously unappreciated insights into the functions of protist eIF4E 5'mRNA cap-binding protein family members, thereby contributing to a better understanding of translation regulation in these organisms. The authors provide solid evidence to support the major conclusions of the article. However, the study may further benefit from establishing whether all of the eIF4E family members are indeed involved in translation and more direct evidence for the selectivity of their binding.

    2. Reviewer #1 (Public Review):

      Using A. carterae as a model system, this work investigates the properties of the trans-spliced SL leader sequences and the dinoflagellate eIF4E protein family members.

      Analysis was performed to identify the 5' cap type of the SL leader. Variation in the SL leader sequence and an abundance of modified bases was documented.

      Various aspects of the sequence and expression of the eIF4E family members were examined. This included phylogeny, mRNA, and protein expression levels in A. carterae, and the ability of eIF4E proteins to bind cap structures. Differences in expression levels and cap-binding capacity were characterized, leading to the proposition that eIF4E-1a serves as the major cap-binding protein in A. carterae.

      A major discussion point is the potential for differential eIF4E binding to specific SL leader sequences as a regulatory mechanism, which is an exciting prospect. However, despite indications of sequence variability and the presence of various nucleotide modifications in the SL, and the several eIF4E variants, direct evidence to support this hypothesis is lacking.

      It is an extensive and highly descriptive study. The work is presented clearly, although it is rather lengthy and contains repetition across the introduction, results, and discussion sections. Its style leans more towards a review format. As a non-expert in the field, I appreciated the extensive background however I do believe the paper would benefit from a more concise format.

    3. Reviewer #2 (Public Review):

      Summary:

      Jones et al. extend their previous work on the translation machinery in dinoflagellates. In particular, they study the species Amphidinium carterae. They characterize the type of cap structure mRNAs possess in this species, as well as the eight eIF4E family members A. carterae possesses and their affinity to the mRNA cap. They also establish the leader sequences of the trans-spliced mRNAs that A. carterae generates during gene expression.

      Strengths:

      The authors performed a solid phylogenetic and biochemical study to understand the structure of dinoflagellate mRNAs at the 5'-UTR as well as the divergence and biochemical features of eIF4Es across dinoflagellates. They also establish eIF4E-1a as the prototypical paralog of the eIF4E family of proteins. The scientific questions they ask are very relevant to the gene expression field across eukaryotes. The experiments and the phylogenetic analysis are of very high quality. They use a wide spectrum of experimental approaches and techniques to answer the questions.

      Weaknesses:

      The authors assume that all eIF4Es from dinoflagellates are involved in translation, i.e., mRNA recruitment to the ribosome. Indeed, they think that the diverse biochemical features of the eIF4Es in A. carterae reflect the possible recruitment of different subsets of mRNAs to the ribosome for translation. I think that the biochemical differences among the paralogs might also be due to the involvement of some of them in processes of RNA metabolism other than translation. For instance, some of them could be involved only in RNA processing in the nucleus or mRNA storage in cytoplasmic foci.

    4. Reviewer #3 (Public Review):

      Summary:

      In this article, the authors provide an inventory of the 5' spliced leader sequences, cap structures, and eIF4E isoforms present in the model dinoflagellate species A. carterae. They provide evidence that the 5' cap structure is m7G, as it is in most characterized eukaryotes that do not employ trans-splicing for mRNA maturation, and that there are additional methylated nucleotides throughout the spliced leader RNAs. They then show that of the 8 different eIF4E species in A. carterae, only a subset of eIF4E1 and eIF4E2 proteins are detected and that the levels change according to time of day. Interestingly, while the eIF4E1 proteins bind a canonical cap nucleotide and are able to complement eIF4E-deficiency in yeast, an eIF4E2 paralog does not bind the traditional cap.

      Strengths:

      A strength of the article is that the authors have clearly presented the findings and, by straying away from traditional model organisms, have highlighted unique and interesting features of an understudied system for translational control. They provide complementary evidence for most findings using multiple techniques. For example, the evidence that eIF4E1A binds m7GTP is supported both by pulldowns using m7GTP sepharose and by SPR experiments that directly monitor binding of recombinant protein with affinity measurements. The methods are extremely detailed, noting the cell numbers, volumes, concentrations, etc. used in the experiments so that they can be easily replicated.

      Weaknesses:

      While not necessary to support the authors' conclusions, the significance of the work would be further enhanced by additional experiments to gain insights into mechanisms for translational control and to link specific SLs to organismal functions or mechanisms of mRNA recruitment.

      -Monitoring diel expression of SLs and direct sequencing of mature mRNA would yield insights into whether there is regulated expression of RNAs with different SLs or the SLs themselves. This would also allow the authors to perform gene ontology to link SL expression at different points in the diel cycle to related functions, e.g. photosynthesis.

      -In addition, the work would be strengthened by polysome sequencing or ribosome profiling as a function of the diel cycle, with analyses of when various spliced leader sequences are recruited to ribosomes in parallel with western blotting of polysome fractions to determine when various eIF4E isoforms are present on polysomes. This is a substantial expansion though from what the authors focused on in this manuscript, and not having these experiments does not undermine the findings presented. Alternatively, they could attempt to make bioinformatic comparisons with existing ribosome profiling datasets from a related dinoflagellate, Lingulodinium polyedrum, discussed briefly, if there were sufficient overlap between SL RNAs in these organisms.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      (1) Figures 1B, S4, and S5, Tibia sections would be more informative and promising as the growth plate is flat. Otherwise, histology of the knee would be preferred.

      We have added the tibia section images in Figures 1B, S4, and S5 (New Figure 1B, Figure 2-figure supplement 3A, and Figure 3-figure supplement 1A).

      (2) Figure 1C, The authors performed immunostaining for vimentin, alpha-SMA, Col1a1 and Col1a2. The authors should use adjusted sections for the immunostaining for different antibodies. It would avoid region-specific variations in the size and shape of sections and the data would be more reliable. Please correct and revise.

      We have provided immunostaining results using consecutive sections at similar locations of the external ear (Figure 1C).

      (3) Figure 2A and throughout the manuscript where authors performed p-smad1/5/9 fluorescent immunostaining, the authors should also show non-phospho levels of p-smad1/5/9. Please correct and revise.

      We tried different anti-Smad1/5/9 antibodies, but the signals had a very high background and were not presentable. We instead performed western blots on auricle samples; the results, shown in Figure 2-figure supplement 1A, suggest that ablation of Bmpr1a led to loss of activation of Smad1/5/9 without affecting their expression. For the different segments of the external ear, we also provide WB results in Figure 2-figure supplement 4B. In addition, we added RNA-seq data on the Smad1, 5, and 9 mRNA levels, which were not affected by Bmpr1a ablation (Figure 4-figure supplement 1B). Overall, these results suggest that Bmpr1a ablation does not affect the expression of Smad1/5/9.

      (4) Result 2, lines 131-134, the authors mentioned in the text that they observed no ear phenotype of Prrx1CreERT or Bmpr1af/f mice compared with wild-type mice (Figures S2A and S2B). However, the figures did not show histology pictures of wild-type mice. Please correct and revise.

      We have provided histological pictures of wild type mice (Figure 2-figure supplement 2C).

      (5) Result 5, lines 173-174 "We generated....Bmpr1a floxed mice". How did authors generate Col1a2-CreERT; Bmpr1af/f mice by crossing Prrx1Cre-ERT and Bmpr1af/f mice? Please correct and revise.

      It is a typo and has been corrected.

      (6) In the previous study by Soma Biswas et al., (Scientific Reports 2018, PMID 29855498) the authors mentioned in the result section that the mice with deletion of Bmpr1a using Prx1Cre looked morphologically normal. They did not mention the ear phenotype/microtia. Please explain how this study differs from current work and what are the limitations in the discussion.

      We did not observe an obvious ear phenotype in the adult transgenic Prrx1-CreERT; Bmpr1af/f mice. The reason could be that the transgene labels too few auricle chondrocytes, as has been reported for endosteal and periosteal bones in adult mice (Liu et al. Nat Genet 2022; Wilk, K. et al. Stem Cell Rep 2017; Julien A et al. J Bone Miner Res 2022). The difference is likely caused by the fact that the transgenic CreERT line was driven by a 2.3 kilobase promoter of Prrx1 that was inserted at an unknown location in the genome. Since we no longer carry the transgenic line, we cannot directly test its labelling efficiency in the auricle. We have discussed this point in the revised manuscript.

      Reviewer #2 (Recommendations For The Authors):

      Chondrocytes are present in many parts of the body; some components are replaced by osteoblast cells, but others stay with their morphology. These cells are in different morphological and cellular conditions throughout the body. Is there any human variant study of Prrx1 and its association with auricle chondrocytes?

      We searched the literature and found no study on Prrx1 in auricle chondrocytes in humans.

      Do auricle chondrocytes have Prrx1+ through their developmental stage, and what's the expression situation of Prrx1+ at articular cartilage and growth plates throughout development? Only a small population is positive throughout the development, or they lose as they develop.

      We traced Prrx1 lineage cells in Prrx1-CreERT; R26tdTomato mice that received TAM at E8.5, E13.5, or p21. We found that auricle chondrocytes were Tomato+ under these conditions, even when only one dose of TAM (1/10 of the dose for adult mice) was given to the pregnant mice at E8.5 or E13.5 (Figure 1-figure supplement 1). However, while E8.5 mice showed Tomato+ chondrocytes at both articular cartilage and growth plate, E13.5 or p21 mice showed much fewer Tomato+ chondrocytes at articular cartilage and growth plate (Figure 1-figure supplement 1). These results indicate that Prrx1 expression differs in cartilages during development, growth, and maintenance.

      What's your rationale for studying Bmpr1a ablation at the adult stage?

      Organ development and maintenance are different processes, especially for slow-turnover tissues. Organ maintenance is also important since it accounts for 90% of the lifetime of mice. While previous studies have uncovered essential roles for BMP signaling in chondrogenic differentiation during development, it remains unclear whether BMP signaling plays a role in cartilage maintenance in adult mice.

      Line no 128: Chondrocytes are shrunk but still have normal proliferation; what's the author's thought about it?

      We are sorry that we did not make this clear enough. In fact, there were very few cells undergoing proliferation in auricle cartilage, and Bmpr1a ablation did not alter that. We have rephrased these sentences.

      Do chondrocytes have protein trafficking defects or ER/Golgi stress?

      We checked the expression of proteins involved in protein trafficking and found that some were up-regulated and some were down-regulated (Figure 4-figure supplement 1D), which may reflect the shift from chondrocytes to osteoblasts and warrants further investigation. However, the expression of ER or Golgi stress-related genes, which play critical roles in chondrocyte differentiation and survival (Wang et al. 2018; Horigome et al. 2020), was not altered by Bmpr1a ablation (Figure 4-figure supplement 1E and 1F).

      How many Prrx paralogs are there in the system? Are all associated with auricle chondrocytes and similar mechanisms?

      There is one Prrx1 paralog, Prrx2. While Prrx1-/- mice lived for up to 24 hours after birth with low-set ears (Martin JF. et al. Genes Dev. 1995), Prrx2-/- mice are perfectly normal. Prx1-/-Prx2-/- double mutant mice died within an hour after birth and the pups showed no external ears (ten Berge D. et al. Development. 1998). We have added this information to the revised manuscript.

      Extracellular matrix (ECM) provides cell-to-cell interaction and environment for cell growth. Does Bmpr1a ablation lead to any changes in ECM at the auricle or growth plate chondrocytes?

      Our analysis showed that the expression of many ECM proteins was down-regulated in auricle cartilage of Prrx1-CreERT; Bmpr1af/f mice (Figure 4-figure supplement 1A). This may reflect the shift from chondrocytes to osteoblasts and warrants further investigation. However, immunostaining revealed that the expression of Aggrecan and Col10 in the growth plates was unaltered in adult Prrx1-CreERT; Bmpr1af/f mice compared to control mice (Figure 4-figure supplement 1C), likely due to the lack of marking of chondrocytes in growth plates.

      Microtia usually develops during the first trimester of pregnancy in humans. What's your view about studying at the adult stage compared to intrauterine development?

      Congenital microtia is a problem with the formation of the external ear, whereas microtia development in adult mice is a problem with the maintenance of the auricle chondrocytes. Organ maintenance is also an important process, as it starts from 3 months of age and lasts for 90% of the lifetime of mice.

      In RNA sequencing protocol, Wikipedia pages keep updating, so it is very strange to cite the Wikipedia pages. Cite a research article for it.

      We have replaced this reference.

      Why do the authors have a very low FDR value for this study? How does this value strengthen the study?

      It was a typo that has been corrected.

      It needs further validation to show that Prrx1 marked cells are a good model for auricular chondrocyte-related studies.

      We show that Prrx1 marks auricle chondrocytes but few growth plate or articular chondrocytes in adult mice, suggestive of its specificity. However, the use of the Prrx1-CreERT line in auricle cartilage studies is complicated by the labelling of dermal cells in the external ear by Prrx1. We have discussed this point in the revised manuscript.

    2. eLife assessment

      BMP signaling plays a vital role in skeletal tissues, and the importance of its role in microtia prevention is novel and promising. This important study sheds light on the role of BMP signaling in preventing microtia in the ear, with solid data broadly supporting the claims of the authors.

    3. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Ruichen Yang et al. investigated the importance of BMP signaling in preventing microtia. The authors showed that Cre recombinase-mediated deletion of Bmpr1a using the skeletal stem cell-specific Cre Prx1Cre leads to microtia in adult and young mice. In these mice, the distal auricle is more affected than the middle and proximal parts. In these Bmpr1a floxed Prx1Cre mice, auricle chondrocytes start to differentiate into osteoblasts through an increase in PKA signaling. The authors showed human single-cell RNA-Seq data sets in which they observed increased PKA signaling in microtia patients, which resembles their animal model experiments.

      Strengths:

      Although the importance of BMP signaling in skeletal tissues has been previously reported, the importance of its role in microtia prevention is novel and very promising to study in detail. The authors satisfied the experimental questions by performing correct methods and explaining the results in detail.

    4. Reviewer #2 (Public Review):

      The authors (Yang et al.) present a well-executed study of a mouse model of Bmpr1a focusing on microtia development and pathogenesis.

      The authors report that the generation of the Bmpr1a in Prrx1+ cells in adult mice helps characterize the developmental progression of the external ear.

      The authors explain how auricular chondrocytes differ from growth plates or other chondrocytes and BMP-Smad1/5/9 activation, which is required to maintain chondrocyte fate in the distal part of the ear. The authors explain with evidence how BMP signaling actively maintains auricle cartilage in the post-developmental stage.

      Elegant immunofluorescence staining, excellent histology preparations and dissections, excellent microscopy, sufficient experimental sample size, and good statistical analyses support the results. The study is well grounded in extensively reviewed and cited existing literature. This report sets the stage for a comprehensive interrogation of Bmpr1a deficiency and ear defects.

    1. eLife assessment

      This study uses ex vivo live imaging of the uterus, uterotubal junction, and oviduct post-mating to test the role of the sperm hook in the house mouse (Mus musculus) in sperm movement, which could be interesting to evolutionary biologists. The work is useful as the live imaging revealed sperm behaviors in the female tract that have not been previously reported. However, the strength of evidence is incomplete since the quantification of the data is limited and the extensive speculation on the functions of these sperm behaviors is not supported by sufficient experimental evidence.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors want to determine the role of the sperm hook of the house mouse sperm in movement through the uterus. The authors are trying to distinguish between two hypotheses put forward by others on the role of the sperm hook: (1) the sperm cooperation hypothesis (the sperm hook helps to form sperm trains) vs (2) the migration hypothesis (that the sperm hook is needed for sperm movement through the uterus). They use transgenic lines with fluorescent labels to sperm proteins, and they cross these males to C57BL/6 females in pathogen-free conditions. They use 2-photon microscopy on ex vivo uteri within 3 hours of mating and the appearance of a copulation plug. There are a total of 10 post-mating uteri that were imaged with 3 different males. They provide 10 supplementary movies that form the basis for some of the quantitative analysis in the main body figures. Their data suggest that the role of the sperm hook is to facilitate movement along the uterine wall.

      Strengths:

      Ex vivo live imaging of fluorescently labeled sperm with 2-photon microscopy is a powerful tool for studying the behavior of sperm.

      Weaknesses:

      The paper is descriptive and the data are correlations.

      The data are not properly described in the figure legends.

      When statistical analyses are performed, the authors do not comment on the trend that sperm from the three males behave differently from each other. This weakens confidence in the results. For example, in Figure 1 the sperm from male 3613 (blue squares) look different from male 838 (red circles), but all of these data are considered together. The authors should comment on why sperm across males are considered together when the individual data points appear to be different across males.

      Movies S8-S10 are single data points and no statistical analyses are performed. Therefore, it is unclear how penetrant the sperm movements are.

      Movie S1B - did the authors also track the movement of sperm located in the middle of the uterus (not close to the wall)? Without this measurement, they can't be certain that sperm close to the uterus wall travel faster.

      Movie S5A - is of lower magnification (200 µm scale bar) while the others have 50 and 20 µm scale bars. Individual sperm movement can be observed at the 20 µm scale (Movie S5C). If the authors want to prove that there is no upsucking movement of sperm by the uterine contractions, they need to provide a high magnification image.

      Movie S8 - if the authors want to make the case that clustered sperm do not move faster than unclustered sperm, then they need to show Movie S8 at higher magnification. They also need to quantify these data.

      Movie S9C - what is the evidence that these sperm are dead or damaged?

      Movie S10 - both slow- and fast-moving sperm are seen throughout the course of the movie, which does not support the authors' conclusion that sperm tails beat faster over time.

    3. Reviewer #2 (Public Review):

      Summary:

      The specific objective of this study was to determine the role of the large apical hook on the head of mouse sperm (Mus musculus) in sperm migration through the female reproductive tract. The authors used a custom-built two-photon microscope system to obtain digital videos of sperm moving within the female reproductive tract. They used sperm from genetically modified male mice that produce fluorescence in the sperm head and flagellar midpiece to enable visualization of sperm moving within the tract. Based on various observations, the authors concluded that the hook serves to facilitate sperm migration by hooking sperm onto the lining of the female reproductive tract, rather than by hooking sperm together to form a sperm train that would move them more quickly through the tract. The images and videos are excellent and inspirational to researchers in the field of mammalian sperm migration, but interpretations of the behaviors are highly speculative and not supported by controlled experimentation.

      Strengths:

      The microscope system developed by the authors could be of interest to others investigating sperm migration.

      The new behaviors shown in the images and videos could be of interest to others in the field, in terms of stimulating the development of new hypotheses to investigate.

      Weaknesses:

      The authors stated several hypotheses about the functions of the sperm behaviors they saw, but the hypotheses were not clearly stated or tested experimentally.

      The hypothesis statements were weakened by the use of hedge words, such as "may".

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      In this study, the authors address a fundamental unresolved question in cerebellar physiology: do synapses between granule cells (GCs) and Purkinje cells (PCs) made by the ascending part of the axon (AA) have different synaptic properties from those made by parallel fibers? This is an important question, as GCs integrate sensorimotor information from numerous brain areas with a precise and complex topography.

      Summary:

      The authors argue that GCs located close to PCs essentially contact PC dendrites via the ascending part of their axons. They demonstrate that joint high-frequency (100 Hz) stimulation of distant parallel fibers and local GCs potentiates AA-PC synapses, while parallel fiber-PC synapses are depressed. On the basis of paired-pulse ratio analysis, they concluded that evoked plasticity was postsynaptic. When individual pathways were stimulated alone, no LTP was observed. This associative plasticity appears to be sensitive to timing, as stimulation of parallel fibers first results in depression, while stimulation of the AA pathway has no effect. NMDA, mGluR1 and GABAA receptors are involved in this plasticity.

      Strengths:

      Overall, the associative modulation of synaptic transmission is convincing, and the experiments carried out support this conclusion. However, weaknesses limit the scope of the results.

      Weaknesses:

      One of the main weaknesses of this study is the suggestion that high-frequency parallel-fiber stimulation cannot induce long term potentiation unless combined with AA stimulation. Although we acknowledge that the stimulation and recording conditions were different from those of other studies, according to the literature (e.g. Bouvier et al 2016, Piochon et al 2016, Binda et al, 2016, Schonewille et al 2021 and others), high-frequency stimulation of parallel fibers leads to long-term postsynaptic potentiation under many different experimental conditions (blocked or unblocked inhibition, stimulation protocols, internal solution composition). Furthermore, in vivo experiments have confirmed that high-frequency parallel fibers are likely to induce long-term potentiation (Jorntell and Ekerot, 2002; Wang et al, 2009). This article provides further evidence that long-term plasticity (LTP and LTD) at this connection is a complex and subtle mechanism underpinned by many different transduction pathways. It would therefore have been interesting to test different protocols or conditions to explain the discrepancies observed in this dataset.

      Even though this is not the main result of this study, we acknowledge that the control experiments done on PF stimulation add a puzzling result to an already contradictory literature. High frequency parallel fibre stimulation (in isolation) has been shown to induce long term potentiation in vitro, but not always, and most importantly, this has been shown in vivo. This was in fact the reason for choosing that particular stimulation protocol. Examination of in vitro studies, however, shows that the results are variable and even contradictory. Most were done in the presence of GABAA receptor antagonists, including the SK channel blocker Bicuculline, whereas in the study by Binda (2016), LTP was blocked by GABAA receptor inhibition. In some studies also, LTP was under the control of NMDAR activation only, whereas in Binda (2016), it was under the control of mGluR activation. Moreover, most experiments were done in mice, whereas our study was done in rats. Our results reveal intricate mechanisms working together to produce plasticity, which are highly sensitive to in vitro conditions. We designed our experiments to be close to physiological conditions, with inhibition preserved and a physiological chloride gradient. It is likely that experimental differences have given rise to the variability of the results and our inability to reproduce PF-LTP, but it was not the aim of this study to dissect the subtleties of the different experimental protocols and models. We will modify the Discussion to describe that point fully, including differences in experimental conditions.

      Another important weakness is the lack of evidence that the AAs were stimulated. Indeed, without filling the PC with fluorescent dye or biocytin during the experiment, and without reconstructing the anatomical organization, it is difficult to assess whether the stimulating pipette is positioned in the GC cluster that is potentially in contact with the PC with the AAs. According to EM microscopy, AAs account for 3% of the total number of synapses in a PC, which could represent a significant number of synapses. Although the idea that AAs repeatedly contact the same Purkinje cell has been propagated, to the best of the review author's knowledge, no direct demonstration of this hypothesis has yet been published. In fact, what has been demonstrated (Walter et al 2009; Spaeth et al 2022) is that GCs have a higher probability of being connected to nearby PCs, but are not necessarily associated with AAs.

      We fully agree with the reviewer that we have not identified morphologically ascending axon synapses, and we stress this fact both in the first paragraph of the Results section, and again at the beginning of Discussion. Our point is mainly topographical, given the well documented geometrical organisation of the cerebellar cortex, and strictly speaking, inputs are local (including ascending axon) or distal (parallel fibre). Similarly, the studies by Isope and Barbour (2002) and Walter et al. (2009), just like Sims and Hartell (2005 and 2006), have coined the term ‘ascending axon’ when drawing conclusions about locally stimulated inputs. Moreover, our results do not rely on or assume multiple contacts, stronger connections, or higher probability of connections between ascending axons and Purkinje cells. Our results only demonstrate a different plasticity outcome for the two types of inputs. Therefore, our manuscript could be rephrased with the terms ‘local’ and ‘distal’ granule cell inputs, but this would have no more implication for the results or the computation performed in Purkinje cells. However, in our experience, this is more confusing to the reader, and as we already stress this point in the manuscript, we do not wish to make this modification. However we will modify the abstract of the manuscript to clarify that point.

      Reviewer #2 (Public Review):

      Summary:

      The authors describe a form of synaptic plasticity at synapses from granule cells onto Purkinje cells in the mouse cerebellum, which is specific to synapses proximal to the cell body but not to distal ones. This plasticity is induced by the paired or associative stimulation of the two types of synapses because it is not observed with stimulation of one type of synapse alone. In addition, this form of plasticity is dependent on the order in which the stimuli are presented, and is dependent on NMDA receptors, metabotropic glutamate receptors and to some degree on GABAA receptors. However, under all experimental conditions described, there is a progressive weakening or run-down of synaptic strength. Therefore, plasticity is not relative to a stable baseline, but relative to a process of continuous decline that occurs whether or not there is any plasticity-inducing stimulus.

      As highlighted by the reviewer, we observed a postsynaptic rundown of the EPSC amplitude for both input pathways. Rundown could be mistaken for a depression of synaptic currents, not for a potentiation, and the progressive decrease of the EPSC amplitude during the course of an experiment leads to an underestimate of the absolute potentiation. We have taken the view to provide a strong set of control data rather than selecting experiments based on subjective criteria or applying a cosmetic compensation procedure. We have conducted control experiments with no induction (n = 17), which give a good indication of the speed and amplitude of the rundown. Comparison shows a highly significant potentiation of the ascending axon EPSC. Depression of the parallel fibre EPSC, on the other hand, was not significantly different from rundown, and we have not spoken of parallel fibre long term depression. The data thus show very clearly that ascending axon and parallel fibre synapses behave differently following the costimulation protocol.

      Strengths:

      The focus of the authors on the properties of two different synapse-types on cerebellar Purkinje cells is interesting and relevant, given previous results that ascending and parallel fiber synapses might be functionally different and undergo different forms of plasticity. In addition, the interaction between these two synapse types during plasticity is important for understanding cerebellar function. The demonstration of timing and order-dependent potentiation of only one pathway, and not another, after associative stimulation of both pathways, changes our understanding of potential plasticity mechanisms. In addition, this observation opens up many new questions on underlying intracellular mechanisms as well as on its relevance for cerebellar learning and adaptation.

      Weaknesses and suggested improvements:

      A concern with this study is that all recordings demonstrate "rundown", a progressive decrease in the amplitude of the EPSC, starting during the baseline period and continuing after the plasticity-induction stimulus. In the absence of a stable baseline, it is hard to know what changes in strength actually occur at any set of synapses. Moreover, the issues that are causing rundown are not known and may or may not be related to the cellular processes involved in synaptic plasticity. This concern applies in particular to all the experiments where there is a decrease in synaptic strength.

      We have provided an answer to that point directly below the summary paragraph. Moreover, if the phenomenon causing rundown was involved in plasticity, it should affect plasticity of both inputs, which was not the case, clearly distinguishing the ascending axon and parallel fibre inputs.

      The authors should consider changes in the shape of the EPSC after plasticity induction, as in Fig 1 (orange trace) as this could change the interpretation.

      Figure 1 shows an average response composed of evoked excitatory and inhibitory synaptic currents. The third section of Supplementary material (supplementary figure 3) shows that this complex shape is given by an EPSC followed by a delayed disynaptic IPSC. We would like to point out that while separating EPSC from IPSC might appear difficult from average traces due to the averaged jitter in the onset of the synaptic currents, boundaries are much clearer when analysing individual traces. In the same section we discuss the results of experiments in which transient applications of SR 95531 before and after the induction protocol allowed us to measure the EPSC, while maintaining the experimental conditions during induction. Analysis of the kinetics of the EPSCs during gabazine application at the beginning and end of experiments showed that there is no change in the time to peak of either the AA or the PF response. The decay times of the AA and PF EPSCs are slightly longer at the end of the experiment, even if the difference is not significant for AA inputs (we will add this analysis to the revised version of the paper). Our analysis, which uses as a template the EPSC kinetics measured at the beginning and at the end of the experiments, takes these changes directly into account. The results show clearly that the presence of disynaptic inhibition doesn’t significantly affect the measure of the peak EPSC after the induction protocol nor the estimate of plasticity.

      In addition, the inconsistency with previous results is surprising and is not explained; specifically, that no PF-LTP was induced by PF-alone repeated stimulation.

      In our experimental conditions, PF-LTP was not induced when stimulating PF only, the only condition that reproduces experiments in the literature. As discussed in our response to reviewer 1, a close look at the literature, however, reveals variabilities and contradictions behind seemingly similar results. They reveal intricate mechanisms working together to produce plasticity, which are sensitive to in vitro conditions. We designed our experiments to be close to physiological conditions, with inhibition preserved and a physiological chloride gradient. It is likely that experimental differences have given rise to the variability of the results and our inability to observe PF-LTP. We will modify the discussion section to discuss that point fully in the context of past results.

      The authors test the role of NMDARs, GABAARs and mGluRs in the phenotype they describe. The data suggest that the form of plasticity described here is dependent on any one of the three receptors. However, the location of these receptors varies between the Purkinje cells, granule cells and interneurons. The authors do not describe a convincing hypothetical model in which this dependence can be explained. They suggest that there is crosstalk between AA and PF synapses via endocannabinoids downstream of mGluR or NO downstream of NMDARs. However, it is not clear how this could lead to the long-term potentiation that they describe. Also, there is no long-lasting change in paired-pulse ratio, suggesting an absence of changes in presynaptic release.

      We suggest in the result section that the transient change in paired pulse ratio (PPR) is linked to a transient presynaptic effect only, which has been reported by others. This suggests that the long lasting changes observed are postsynaptic, like other reports with similar trains of stimulation, and we will modify the manuscript to state this clearly.

      Concerning the involvement of multiple molecular pathways, investigators often tested for the involvement of NMDAR or mGluRs in cerebellar plasticity, rarely both. Here we showed that both pathways are involved. The conjunctive requirement for NMDAR and mGluR activation can easily be explained based on the dependence of cerebellar LTP and LTD on the concentrations of both NO and postsynaptic calcium (Coesmans et al., 2004; Safo and Regehr, 2005; Bouvier et al., 2016; Piochon et al., 2016). NO production has been linked to the activation of NMDARs in granule cell axons (Casado et al., 2002; Bidoret et al., 2009; Bouvier et al., 2016), occasionally in molecular layer interneurones (Kono et al., 2019). NO diffuses to activate Guanylate Cyclase in the Purkinje cell. Based on the literature also, different mechanisms can feed a calcium increase, including mGluR activation. Therefore NMDARs and mGluRs can reasonably cooperate to control postsynaptic plasticity. The associative nature of AA-LTP is more complex to explain, i.e. the requirement for co-activation of AA and PF inputs, and indicates a necessary cross talk between synaptic sites. We propose that either one of the receptors is absent from AA synapses, and a signal needs to propagate from PF to AA synapses, or that both receptors are present but a signal is required to activate one of the receptors at AA synapses.

      We also observed an effect of GABAergic inhibition. GABAergic inhibition was elegantly shown by Binda (2016) to regulate calcium entry together with mGluRs, and control plasticity induction. A similar mechanism could contribute to our results, although inhibition might have additional effects. We will modify the discussion of the manuscript and add a diagram to highlight the links between the different molecular pathways and potential cross talk mechanisms, and the location of receptors.

      Is the synapse that undergoes plasticity correctly identified? In this study, since GABAergic inhibition is not blocked for most experiments, PF stimulation can result in both a direct EPSC onto the Purkinje cell and a disynaptic feedforward IPSC. The authors do address this issue with Supplementary Fig 3, where the impact of the IPSC on the EPSC within the EPSC/IPSC sequence is calculated. However, a change in waveform would complicate this analysis. An experiment with pharmacological blockade will make the interpretation more robust. The observed dependence of the plasticity on GABAA receptors is an added point in favor of the suggested additional experiments.

      We did consider that due to long recording times there might be kinetic changes, and that’s the reason why the experiments of Supplementary figure 3 were done with pharmacological blockade of GABAAR with gabazine, both before and again after LTP induction. The estimate of the amplitude of the EPSC is based on the actual kinetics of the response at both times.

      A primary hypothesis of this study is that proximal, or AA, and distal, or PF, synapses are different and that their association is specifically what drives plasticity. The alternative hypothesis is that the two synapse-types are the same. Therefore, a good control for pairing AA with PF would be to pair AA with AA and PF with PF, thereby demonstrating that pairing with each other is different from pairing with self.

      Pairing AA with AA would be difficult because stimulation of AA can only be made from a narrow band below the PC and we would likely end up stimulating overlapping sets of synapses. However, Figure 5 shows the effect of stimulating PF and PF, while also mimicking the sparse and dense configuration of the usual experiment. It shows that sparse PF do not behave like AA. Sims and Hartell (2006) also made an experiment with sparse PF inputs and observed clear differences between sparse local (AA) and sparse distal (PF) synapses.

      It is hypothesized that the association of a PF input with an AA input is similar to the association of a PF input with a CF input. However, the two are very different in terms of cellular location, with the CF input being in a position to directly interact with PF-driven inputs. Therefore, there are two major issues with this hypothesis: 1) how can sub-threshold activity at one set of synapses affect another located hundreds of micrometers away on the same dendritic tree? 2) There is evidence that the CF encodes teaching/error or reward information, which is functionally meaningful as a driver of plasticity at PF synapses. The AA synapse on one set of Purkinje cells is carrying exactly the same information as the PF synapses on another set of Purkinje cells further up and down the parallel fiber beam. It is suggested that the two inputs carry sensory vs. motor information, which is why this form of plasticity was tested. However, the granule cells that lead to both the AA and PF synapses are receiving the same modalities of mossy fiber information. Therefore, one needs to presuppose different populations of granule cells for sensory and motor inputs or receptive field and contextual information. As a consequence, which granule cells lead to AA synapses and which to PF synapses will change depending on which Purkinje cell you're recording from. And that's inconsistent with there being a timing dependence of AA-PF pairing in only one direction. Overall, it would be helpful to discuss the functional implications of this form of plasticity.

      We do not hypothesise that association of the AA and PF inputs is similar to the association of PF and climbing fibre inputs. We compare them because it is the only other known configuration triggering associative plasticity in Purkinje cells. We conclude that ‘The climbing fibre is not the only key to associative plasticity’, and it is indeed interesting to observe that even if the inputs are very small compared to the powerful climbing fibre input, they can be effective at inducing plasticity. Physiologically, the climbing fibre signal has been clearly linked to error and reward signals, but reward signals are also encoded by granule cell inputs (Wagner et al., 2017). We will modify the discussion to make sure that we do not suggest equivalence with CF induced LTD.

      Moreover, we fully agree that AA and PF synapses made up by a given granule cell carry the same information, and cannot encode sensory and motor information at the same time. Yet, these synapses carry different information. AA synapses from a local granule cell deliver information about the local receptive field, but PF synapses from the same granule cell will deliver contextual information about that receptive field to distant Purkinje cells. In the context of sensorimotor learning, movement is learnt with respect to a global context, not in isolation, therefore learning a particular association must be relevant. The associative plasticity we describe here could help explain this functional association. Difference in timing of the inputs therefore should represent difference in the timing of activation of different granule cells which receive either local information or information from different receptive fields. We will modify the discussion to make sure we do not suggest association between sensory and motor inputs, and clarify our view of local receptive field and context about ongoing activity.

      Reviewer #3 (Public Review):

      Granule cells' axons bifurcate to form parallel fibers (PFs) and ascending axons (AAs). While the significance of PFs on cerebellar plasticity is widely acknowledged, the importance of AAs remains unclear. In the current paper, Conti and Auger conducted electrophysiological experiments in rat cerebellar slices and identified a new form of synaptic plasticity in the AA-Purkinje cell (PC) synapses. Upon simultaneous stimulation of AAs and PFs, AA-PC EPSCs increased, while PFs-EPSCs decreased. This suggests that synaptic responses to AAs and PFs in PCs are jointly regulated, working as an additional mechanism to integrate motor/sensory input. This finding may offer new perspectives in studying and modeling cerebellum-dependent behavior. Overall, the experiments are performed well. However, there are two weaknesses. First, the baseline of electrophysiological recordings is influenced significantly by run-down, making it difficult to interpret the data quantitatively. The amplitude of AA-EPSCs is relatively small and the run-down may mask the change. The authors should carefully reexamine the data with appropriate controls and statistics. Second, while the authors show AA-LTP depends on mGluR, NMDA receptors, and GABA-A receptors, which cell types express these receptors and how they contribute to plasticity is not clarified. The recommended experiments may help to improve the quality of the manuscript.

      As highlighted by the reviewer and developed above in response to reviewer 2, we observed a postsynaptic rundown of the EPSC amplitude. Rundown could be mistaken for a depression of synaptic currents, not for a potentiation. Moreover, we have conducted control experiments with no induction (n = 17), which give a good indication of the speed and amplitude of the rundown, and provide a baseline. Comparison shows a highly significant potentiation of the ascending axon EPSC, relative to baseline and relative to these control experiments. Depression of the parallel fibre EPSC on the other hand was not significantly different from rundown. For that reason we have not spoken of parallel fibre long term depression. The data, however, show that ascending axon and parallel fibre synapses behave very differently following the costimulation protocol.

      We have discussed above in our response to reviewer 2 the potential involvement of mGluRs, NMDARs and GABAARs. We will modify the discussion of the manuscript and add a diagram to highlight the links between the different molecular pathways and potential cross talk mechanisms, and the location of receptors.

    2. eLife assessment

      This study presents useful findings on an unresolved question of cerebellar physiology: Do synapses between Purkinje cells and granule cells, made by the ascending part of the granule cells' axon, have different properties than those made by parallel fibers? The authors conducted patch-clamp recordings on rat cerebellar slices and found a new type of plasticity in the synapses of the ascending part of granule cell axons. While the finding may contribute to a better understanding of cerebellar function, the results are still incomplete because the shift in the baseline recording may have influenced the readout of long-term plasticity.

    3. Reviewer #1 (Public Review):

      In this study, the authors address a fundamental unresolved question in cerebellar physiology: do synapses between granule cells (GCs) and Purkinje cells (PCs) made by the ascending part of the axon (AA) have different synaptic properties from those made by parallel fibers? This is an important question, as GCs integrate sensorimotor information from numerous brain areas with a precise and complex topography.

      Summary:

      The authors argue that GCs located close to PCs essentially contact PC dendrites via the ascending part of their axons. They demonstrate that joint high-frequency (100 Hz) stimulation of distant parallel fibers and local GCs potentiates AA-PC synapses, while parallel fiber-PC synapses are depressed. On the basis of paired-pulse ratio analysis, they concluded that evoked plasticity was postsynaptic. When individual pathways were stimulated alone, no LTP was observed. This associative plasticity appears to be sensitive to timing, as stimulation of parallel fibers first results in depression, while stimulation of the AA pathway has no effect. NMDA, mGluR1 and GABAA receptors are involved in this plasticity.

      Strengths:

      Overall, the associative modulation of synaptic transmission is convincing, and the experiments carried out support this conclusion. However, weaknesses limit the scope of the results.

      Weaknesses:

      One of the main weaknesses of this study is the suggestion that high-frequency parallel-fiber stimulation cannot induce long-term potentiation unless combined with AA stimulation. Although we acknowledge that the stimulation and recording conditions were different from those of other studies, according to the literature (e.g. Bouvier et al 2016, Piochon et al 2016, Binda et al, 2016, Schonewille et al 2021 and others), high-frequency stimulation of parallel fibers leads to long-term postsynaptic potentiation under many different experimental conditions (blocked or unblocked inhibition, stimulation protocols, internal solution composition). Furthermore, in vivo experiments have confirmed that high-frequency parallel fiber stimulation is likely to induce long-term potentiation (Jorntell and Ekerot, 2002; Wang et al, 2009). This article provides further evidence that long-term plasticity (LTP and LTD) at this connection is a complex and subtle mechanism underpinned by many different transduction pathways. It would therefore have been interesting to test different protocols or conditions to explain the discrepancies observed in this dataset.

      Another important weakness is the lack of evidence that the AAs were stimulated. Indeed, without filling the PC with fluorescent dye or biocytin during the experiment, and without reconstructing the anatomical organization, it is difficult to assess whether the stimulating pipette is positioned in the GC cluster that is potentially in contact with the PC with the AAs. According to electron microscopy, AAs account for 3% of the total number of synapses in a PC, which could represent a significant number of synapses. Although the idea that AAs repeatedly contact the same Purkinje cell has been propagated, to the best of the review author's knowledge, no direct demonstration of this hypothesis has yet been published. In fact, what has been demonstrated (Walter et al 2009; Spaeth et al 2022) is that GCs have a higher probability of being connected to nearby PCs, but are not necessarily associated with AAs.

    4. Reviewer #2 (Public Review):

      Summary:

      The authors describe a form of synaptic plasticity at synapses from granule cells onto Purkinje cells in the mouse cerebellum, which is specific to synapses proximal to the cell body but not to distal ones. This plasticity is induced by the paired or associative stimulation of the two types of synapses because it is not observed with stimulation of one type of synapse alone. In addition, this form of plasticity is dependent on the order in which the stimuli are presented, and is dependent on NMDA receptors, metabotropic glutamate receptors and to some degree on GABAA receptors. However, under all experimental conditions described, there is a progressive weakening or run-down of synaptic strength. Therefore, plasticity is not relative to a stable baseline, but relative to a process of continuous decline that occurs whether or not there is any plasticity-inducing stimulus.

      Strengths:

      The focus of the authors on the properties of two different synapse-types on cerebellar Purkinje cells is interesting and relevant, given previous results that ascending and parallel fiber synapses might be functionally different and undergo different forms of plasticity. In addition, the interaction between these two synapse types during plasticity is important for understanding cerebellar function. The demonstration of timing and order-dependent potentiation of only one pathway, and not another, after associative stimulation of both pathways, changes our understanding of potential plasticity mechanisms. In addition, this observation opens up many new questions on underlying intracellular mechanisms as well as on its relevance for cerebellar learning and adaptation.

      Weaknesses and suggested improvements:

      A concern with this study is that all recordings demonstrate "rundown", a progressive decrease in the amplitude of the EPSC, starting during the baseline period and continuing after the plasticity-induction stimulus. In the absence of a stable baseline, it is hard to know what changes in strength actually occur at any set of synapses. Moreover, the issues that are causing rundown are not known and may or may not be related to the cellular processes involved in synaptic plasticity. This concern applies in particular to all the experiments where there is a decrease in synaptic strength.

      The authors should consider changes in the shape of the EPSC after plasticity induction, as in Fig 1 (orange trace) as this could change the interpretation.

      In addition, the inconsistency with previous results is surprising and is not explained; specifically, that no PF-LTP was induced by PF-alone repeated stimulation.

      The authors test the role of NMDARs, GABAARs and mGluRs in the phenotype they describe. The data suggest that the form of plasticity described here is dependent on any one of the three receptors. However, the location of these receptors varies between the Purkinje cells, granule cells and interneurons. The authors do not describe a convincing hypothetical model in which this dependence can be explained. They suggest that there is crosstalk between AA and PF synapses via endocannabinoids downstream of mGluR or NO downstream of NMDARs. However, it is not clear how this could lead to the long-term potentiation that they describe. Also, there is no long-lasting change in paired-pulse ratio, suggesting an absence of changes in presynaptic release.

      Is the synapse that undergoes plasticity correctly identified? In this study, since GABAergic inhibition is not blocked for most experiments, PF stimulation can result in both a direct EPSC onto the Purkinje cell and a disynaptic feedforward IPSC. The authors do address this issue with Supplementary Fig 3, where the impact of the IPSC on the EPSC within the EPSC/IPSC sequence is calculated. However, a change in waveform would complicate this analysis. An experiment with pharmacological blockade will make the interpretation more robust. The observed dependence of the plasticity on GABAA receptors is an added point in favor of the suggested additional experiments.

      A primary hypothesis of this study is that proximal, or AA, and distal, or PF, synapses are different and that their association is specifically what drives plasticity. The alternative hypothesis is that the two synapse-types are the same. Therefore, a good control for pairing AA with PF would be to pair AA with AA and PF with PF, thereby demonstrating that pairing with each other is different from pairing with self.

      It is hypothesized that the association of a PF input with an AA input is similar to the association of a PF input with a CF input. However, the two are very different in terms of cellular location, with the CF input being in a position to directly interact with PF-driven inputs. Therefore, there are two major issues with this hypothesis: 1) how can sub-threshold activity at one set of synapses affect another located hundreds of micrometers away on the same dendritic tree? 2) There is evidence that the CF encodes teaching/error or reward information, which is functionally meaningful as a driver of plasticity at PF synapses. The AA synapse on one set of Purkinje cells is carrying exactly the same information as the PF synapses on another set of Purkinje cells further up and down the parallel fiber beam. It is suggested that the two inputs carry sensory vs. motor information, which is why this form of plasticity was tested. However, the granule cells that lead to both the AA and PF synapses are receiving the same modalities of mossy fiber information. Therefore, one needs to presuppose different populations of granule cells for sensory and motor inputs or receptive field and contextual information. As a consequence, which granule cells lead to AA synapses and which to PF synapses will change depending on which Purkinje cell you're recording from. And that's inconsistent with there being a timing dependence of AA-PF pairing in only one direction. Overall, it would be helpful to discuss the functional implications of this form of plasticity.

    5. Reviewer #3 (Public Review):

      Granule cells' axons bifurcate to form parallel fibers (PFs) and ascending axons (AAs). While the significance of PFs on cerebellar plasticity is widely acknowledged, the importance of AAs remains unclear. In the current paper, Conti and Auger conducted electrophysiological experiments in rat cerebellar slices and identified a new form of synaptic plasticity in the AA-Purkinje cell (PC) synapses. Upon simultaneous stimulation of AAs and PFs, AA-PC EPSCs increased, while PFs-EPSCs decreased. This suggests that synaptic responses to AAs and PFs in PCs are jointly regulated, working as an additional mechanism to integrate motor/sensory input. This finding may offer new perspectives in studying and modeling cerebellum-dependent behavior. Overall, the experiments are performed well. However, there are two weaknesses. First, the baseline of electrophysiological recordings is influenced significantly by run-down, making it difficult to interpret the data quantitatively. The amplitude of AA-EPSCs is relatively small and the run-down may mask the change. The authors should carefully reexamine the data with appropriate controls and statistics. Second, while the authors show AA-LTP depends on mGluR, NMDA receptors, and GABA-A receptors, which cell types express these receptors and how they contribute to plasticity is not clarified. The recommended experiments may help to improve the quality of the manuscript.

    1. eLife assessment

      This study provides valuable information on the mechanism of PepT2 through enhanced-sampling molecular dynamics, backed by cell-based assays, highlighting the importance of protonation of selected residues for the function of a proton-coupled oligopeptide transporter (hsPepT2). The molecular dynamics approaches are convincing, but with limitations that could be addressed in the manuscript, including lack of incorporation of a protonation coordinate in the free energy landscape, possibility of protonation of the substrate, errors with the chosen constant pH MD method for membrane proteins, dismissal of hysteresis emerging from the MEMENTO method, and the likelihood of other residues being affected by peptide binding. Some changes to the presentation could be considered, including a better description of pKa calculations and the inclusion of error bars in all PMFs. Overall, the findings will appeal to structural biologists, biochemists, and biophysicists studying membrane transporters.

    2. Reviewer #1 (Public Review):

      The authors have performed all-atom MD simulations to study the working mechanism of hsPepT2. It is widely accepted that conformational transitions of proton-coupled oligopeptide transporters (POTs) are linked with gating hydrogen bonds and salt bridges involving protonatable residues, whose protonation triggers gate openings. Through unbiased MD simulations, the authors identified extra-cellular (H87 and D342) and intra-cellular (E53 and E622) triggers. The authors then validated these triggers using free energy calculations (FECs) and assessed the engagement of the substrate (Ala-Phe dipeptide). The linkage of substrate release with the protonation of the ExxER motif (E53 and E56) was confirmed using constant-pH molecular dynamics (CpHMD) simulations and cell-based transport assays. An alternating-access mechanism was proposed. The study was largely conducted properly, and the paper was well-organized. However, I have a couple of concerns for the authors to consider addressing.

      (1) As a proton-coupled membrane protein, the conformational dynamics of hsPepT2 are closely coupled to protonation events of gating residues. Instead of using semi-reactive methods like CpHMD or reactive methods such as reactive MD, where the coupling is accounted for, the authors opted for extensive non-reactive regular MD simulations to explore this coupling. Note that I am not criticizing the choice of methods, and I think those regular MD simulations were well-designed and conducted. But I do have two concerns.

      a) Ideally, proton-coupled conformational transitions should be modelled using a free energy landscape with two or more reaction coordinates (or CVs), with one describing the protonation event and the other describing the conformational transitions. The minimum free energy path then illustrates the reaction progress, such as OCC/H87D342- → OCC/H87HD342H → OF/H87HD342H as displayed in Figure 3. Without including the protonation as a CV, the authors tried to model the free energy changes from multiple FECs using different charge states of H87 and D342. This is a practical workaround, and the conclusion drawn (the OCC→OF transition is downhill with protonated H87 and D342) seems valid. However, I don't think the OF states with different charge states (OF/H87D342-, OF/H87HD342-, OF/H87D342H, and OF/H87HD342H) are equally stable, as plotted in Figure 3b. The concern extends to other cases like Figures 4b, S7, S10, S12, S15, and S16. While it may be appropriate to match all four OF states in the free energy plot for comparison purposes, the authors should clarify this to ensure readers are not misled.

      b) Regarding the substrate impact, it appears that the authors assumed fixed protonation states. I am afraid this is not necessarily the case. Variations in PepT2 stoichiometry suggest that substrates likely participate in proton transport, like the Phe-Ala (2:1) and Phe-Gln (1:1) dipeptides mentioned in the introduction. And it is not rigorous to assume that the N- and C-termini of a peptide do not protonate/deprotonate when transported. I think the authors should explicitly state that the current work and the proposed mechanism (Figure 8) are based on the assumption that the substrates do not take up or release proton(s).

      (2) I have more serious concerns about the CpHMD employed in the study.

      a) The CpHMD in AMBER is not rigorous for membrane simulations. The underlying generalized Born model fails to consider the membrane environment when updating charge states. In other words, the CpHMD places a membrane protein in a water environment to judge if changes in charge states are energetically favorable. While this might not be a big issue for peripheral residues of membrane proteins, it is likely unphysical for internal residues like the ExxER motif. As I recall, the developers have never used the method to study membrane proteins themselves. The only CpHMD variant suitable for membrane proteins is the membrane-enabled hybrid-solvent CpHMD in CHARMM. While I do not expect the authors to redo their CpHMD simulations, I do hope the authors recognize the limitations of their method.

      b) It appears that the authors did not make the substrate (Ala-Phe dipeptide) protonatable in holo-simulations. This oversight prevents a complete representation of ligand-induced protonation events, particularly given that the substrate ion pairs with hsPepT2 through its N- & C-termini. I believe it would be valuable for the authors to acknowledge this potential limitation.

    3. Reviewer #2 (Public Review):

      Summary:

      This is an interesting manuscript that describes a series of molecular dynamics studies on the peptide transporter PepT2 (SLC15A2). The authors examine, in particular, the effect on the transport cycle of protonation of various charged amino acids within the protein. They then validate their conclusions by mutating two of the residues that they predict to be critical for transport in cell-based transport assays. The study suggests a series of protonation steps that are necessary for transport to occur in PepT2. Comparison with bacterial proteins from the same family shows that while the overall architecture of the proteins and likely mechanism are similar, the residues involved in the mechanism may differ.

      Strengths:

      This is an interesting and rigorous study that uses various state-of-the-art molecular dynamics techniques to dissect the transport cycle of PepT2 with nearly 1 ms of sampling. It gives insight into the transport mechanism, investigating how the protonation of selected residues can alter the energetic barriers between various states of the transport cycle. The authors have, in general, been very careful in their interpretation of the data.

      Weaknesses:

      Interestingly, they suggest that there is an additional protonation event that may take place as the protein goes from occluded to inward-facing but they have not identified this residue. Some things are a little unclear. For instance, where does the state that they have defined as occluded sit on the diagram in Figure 1a? - is it truly the occluded state as shown on the diagram or does it tend to inward- or outward-facing? The pKa calculations and their interpretation are a bit unclear. Firstly, it is unclear whether they are using all the data in the calculations of the histograms, or just selected data and if so on what basis was this selection done. Secondly, they dismiss the pKa calculations of E53 in the outward-facing form as not being affected by peptide binding but say that E56 is when there seems to be a similar change in profile in the histograms.

    4. Reviewer #3 (Public Review):

      Summary:

      Lichtinger et al. have used an extensive set of molecular dynamics (MD) simulations to study the conformational dynamics and transport cycle of an important member of the proton-coupled oligopeptide transporters (POTs), namely SLC15A2 or PepT2. This protein is one of the most well-studied mammalian POT transporters that provides a good model with enough insight and structural information to be studied computationally using advanced enhanced sampling methods employed in this work. The authors have used microsecond-level MD simulations, constant-pH MD, and alchemical binding free energy calculations along with cell-based transport assay measurements; however, the most important part of this work is the use of enhanced sampling techniques to study the conformational dynamics of PepT2 under different conditions.

      The study attempts to identify links between conformational dynamics and chemical events such as proton binding, ligand-protein interactions, and intramolecular interactions. The ultimate goal is of course to understand the proton-coupled peptide and drug transport by PepT2 and homologous transporters in the solute carrier family.

      Some of the key results include:

      (1) Protonation of H87 and D342 initiates the transition from the occluded (Occ) to the outward-facing (OF) state.

      (2) In the OF state, through engaging R57, substrate entry increases the pKa value of E56 and thermodynamically facilitates the movement of protons further down.

      (3) E622 is not only essential for peptide recognition but also its protonation facilitates substrate release and contributes to the intracellular gate opening. In addition, cell-based transport assays show that mutation of residues such as H87 and D342 significantly decreases transport activity as expected from simulations.

      Strengths:

      (1) This is an extensive MD-based study of PepT2, which is beyond the typical MD studies both in terms of the sheer volume of simulations as well as the advanced methodology used. The authors have not limited themselves to one approach and have appropriately combined equilibrium MD with alchemical free energy calculations, constant-pH MD, and geometry-based free energy calculations. Each of these 4 methods provides a unique insight regarding the transport mechanism of PepT2.

      (2) The authors have not limited themselves to computational work and have performed experiments as well. The cell-based transport assays clearly establish the importance of the residues that have been identified as significant contributors to the transport mechanism using simulations.

      (3) The conclusions made based on the simulations are mostly convincing and provide useful information regarding the proton pathway and the role of important residues in proton binding, protein-ligand interaction, and conformational changes.

      Weaknesses:

      (1) Some of the statements made in the manuscript are not convincing and do not abide by the standards that are mostly followed in the manuscript. For instance, on page 4, it is stated that "the K64-D317 interaction is formed in only ≈ 70% of MD frames and therefore is unlikely to contribute much to extracellular gate stability." I do not agree that 70% is negligible. Particularly, Figure S3 does not include the time series so it is not clear whether the 30% of the time where the salt bridge is broken is in the beginning or the end of simulations. For instance, it is likely that the salt bridge is not initially present and then it forms very strongly. Of course, this is just one possible scenario but the point is that Figure S3 does not rule out the possibility of a significant role for the K64-D317 salt bridge.

      (2) Similarly, on page 4, it is stated that "whether by protonation or mutation - the extracellular gate only opens spontaneously when both the H87 interaction network and D342-R206 are perturbed (Figure S5)." I do not agree with this assessment. The authors need to be aware of the limitations of this approach. Consider "WT H87-prot" and "D342A H87-prot": when the D342 residue is mutated, in one out of 3 simulations, we see the opening of the gate within 1 µs. When the D342 residue is not mutated, we do not see the opening in any of the 3 simulations within 1 µs. It is quite likely that if, rather than 3, we had 10 simulations, or rather than 1 µs we had 10 µs simulations, the 0/3 to 1/3 comparison would change significantly. I do not find this argument and conclusion compelling at all.

      (3) While the MEMENTO methodology is novel and interesting, the method is presented as flawless in the manuscript, which is not true at all. It is stated on Page 5 with regards to the path generated by MEMENTO that "These paths are then by definition non-hysteretic." I think this is too big of a claim to say the paths generated by MEMENTO are non-hysteretic by definition. This claim is not even mentioned in the original MEMENTO paper. What is mentioned is that linear interpolation generates a hysteresis-free path by definition. There are two important problems here: (a) MEMENTO uses the linear interpolation as an initial step but modifies the intermediates significantly later so they are no longer linearly interpolated structures and thus the path is no longer hysteresis-free; (b) a more serious problem is the attribution of by-definition hysteresis-free features to the linearly interpolated states. This is based on conflating the hysteresis-free and unique concepts. The hysteresis in MD-based enhanced sampling is related to the presence of barriers in orthogonal space. For instance, one may use a non-linear interpolation of any type and get a unique pathway, which could be substantially different from the one coming from the linear interpolation. None of these paths will be hysteresis-free necessarily once subjected to MD-based enhanced sampling techniques.
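
      The sampling argument in point (2) can be made concrete with a short calculation. The sketch below is illustrative only and is not part of the manuscript or the review: it applies a two-sided Fisher exact test (a technique chosen here for illustration, written by hand with the Python standard library) to the counts quoted in point (2), namely 1 of 3 gate openings with D342A against 0 of 3 without. The function name `fisher_exact_two_sided` and the 3-of-10 versus 0-of-10 scenario are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are conditions (e.g. mutant, wild type); columns are outcomes
    (e.g. gate opened, gate stayed closed).
    """
    row1 = a + b
    row2 = c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, row1)

    def prob(k):
        # Hypergeometric probability of k first-column outcomes in row 1,
        # with all row and column totals held fixed.
        return comb(col1, k) * comb(n - col1, row1 - k) / denom

    k_min = max(0, row1 - (n - col1))
    k_max = min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more probable than the observed one.
    total = sum(
        prob(k) for k in range(k_min, k_max + 1) if prob(k) <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)


if __name__ == "__main__":
    # Counts quoted in the review: 1/3 openings (D342A) vs 0/3 (wild type).
    print("3 replicas per condition:", fisher_exact_two_sided(1, 2, 0, 3))
    # Hypothetical larger sample, assuming a similar opening rate.
    print("10 replicas per condition:", fisher_exact_two_sided(3, 7, 0, 10))
```

      With three replicas per condition, such a test cannot separate a 1/3 outcome from a 0/3 outcome, which is in line with the concern that more or longer simulations could change the picture.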

    1. Reviewer #2 (Public Review):

      Summary:

      In this article, Kumar et al. report on a previously unappreciated mechanism of translational regulation whereby p130Cas induces LLPS condensates that then traffic out from focal adhesions into the cytoplasm to modulate mRNA translation. Specifically, the authors employed EGFP-tagged p130Cas constructs, endogenous p130Cas, and p130Cas knockouts and mutants in cell-based systems. These experiments, in conjunction with various imaging techniques, revealed that p130Cas drives assembly of LLPS condensates in a manner that is largely independent of tyrosine phosphorylation. This was followed by in vitro EGFP-tagged p130Cas-dependent induction of LLPS condensates and determination of their composition by mass spectrometry, which revealed enrichment of proteins involved in RNA metabolism in the condensates. The authors excluded the plausibility that p130Cas-containing condensates co-localize with stress granules or p-bodies. Next, the authors determined the mRNA compendium of p130Cas-containing condensates, which revealed that they are enriched in transcripts encoding proteins implicated in cell cycle progression, survival, and cell-cell communication. These findings were followed by the authors demonstrating, using a puromycylation assay, that p130Cas-containing condensates may be implicated in the suppression of protein synthesis. Altogether, it was found that this study significantly advances the knowledge pertinent to the understanding of molecular underpinnings of the role of p130Cas and more broadly focal adhesions on cellular function, and to this end, it is likely that this report will be of interest to a broad range of scientists from a wide spectrum of biomedical disciplines including cell, molecular, developmental and cancer biologists.

      Strengths:

      Altogether, this study was found to be of potentially broad interest inasmuch as it delineates a hitherto unappreciated link between p130Cas, LLPS, and regulation of mRNA translation. More broadly, this report provides unique molecular insights into the previously unappreciated mechanisms of the role of focal adhesions in regulating protein synthesis. Overall, it was thought that the provided data sufficiently supported most of the authors' conclusions. It was also thought that this study incorporates an appropriate balance of imaging, cell and molecular biology, and biochemical techniques, whereby the methodology was found to be largely appropriate.

      Weaknesses:

      Two major weaknesses of the study were noted. The first issue is related to the experiments establishing the role of p130Cas-driven condensates in translational suppression, whereby it remained unclear whether these effects are affecting global mRNA translation or are specific to the mRNAs contained in the condensates. Moreover, some of the results in this section (e.g., experiments using cycloheximide) may be open to alternative interpretation. The second issue is the apparent lack of functional studies, and although the authors speculate that the described mechanism is likely to mediate the effects of focal adhesions on e.g., quiescence, experimental testing of this tenet was lacking.

    2. eLife assessment

      In this valuable study, Kumar et al. provide evidence suggesting that p130Cas drives the formation of condensates that sprout from focal adhesions into the cytoplasm and suppress translation. Pending further substantiation, this study was found to be likely to provide previously unappreciated insights into the mechanisms linking focal adhesions to the regulation of protein synthesis and was thus considered to be of broad general interest. However, the evidence supporting the proposed model was incomplete; additional evidence is warranted to substantiate the relationship between p130Cas condensates and mRNA translation and establish corresponding functional consequences.

    3. Reviewer #1 (Public Review):

      Summary:

      The authors demonstrated the phenomenon of p130Cas, a protein primarily localized at focal adhesions, and its formation of condensates. They identified the constituents within the condensates, which include other focal adhesion proteins, paxillin, and RNAs. Furthermore, they proposed a link between p130Cas condensates and translation.

      Strengths:

      Adhesion components undergo rapid exchange with the cytoplasm for some unclear biological functions. Given that p130Cas is recognized as a prominent mechanical focal adhesion component, investigating its role in condensate formation, particularly its impact on the translation process, is intriguing and significant.

      Weaknesses:

      The authors identified the disordered region of p130Cas and investigated the formation of p130Cas condensate. They attempted to demonstrate that p130Cas condensates inhibit translation, but the results did not fully support this assertion. There are several comments below:

      (1) Despite isolating p130Cas-GFP protein using GFP-trap beads, the authors cannot conclusively eliminate the possibility of isolating p130Cas from focal adhesions. While the characterization of the GFP-tagged pulls can reveal the proteins and RNAs associated with p130Cas, they need to clarify their intramolecular mechanism of localization within p130Cas droplets. Whether the protein condensates retain their liquid phase or these GFP-p130Cas pulls represent protein aggregate remains uncertain.

      (2) The authors utilized hexanediol and ammonium acetate to highlight the phenomenon of p130Cas condensates. Although hexanediol is an inhibitor for hydrophobic interactions and ammonium acetate is a salt, a more thorough explanation of the intramolecular mechanisms underlying p130Cas protein-protein interaction is required. Additionally, given that the size of p130Cas condensates can exceed 100 µm², classification is needed to differentiate between p130Cas condensates and protein aggregation.

      (3) The connection between p130Cas condensates and translation inhibition appears tenuous. The data only suggests a correlation between p130Cas expression and translation inhibition. Further evidence is required to bolster this hypothesis.

    1. eLife assessment

      The manuscript presents a useful model for the field of endosome maturation, providing perspective on the role of the deubiquitinating enzyme USP-50/USP8 in the process. The evidence presented in the paper is clear, incorporating well-designed experiments that suggest the dual actions of USP-50/USP8 in the conversion of early endosomes into late endosomes. Overall, the work is solid and centers on an intriguing subject.

    2. Reviewer #1 (Public Review):

      Summary:

      The manuscript focuses on the role of the deubiquitinating enzyme USP-50/USP8 in endosome maturation. The authors aimed to clarify how this enzyme drives the conversion of early endosomes into late endosomes. Overall, they did achieve their aims in shedding light on the precise mechanisms by which USP-50/USP8 regulates endosome maturation. The results support their conclusions that USP-50 acts by disassociating RABX-5 from early endosomes to deactivate RAB-5 and by recruiting SAND-1/Mon1 to activate RAB-7. This work is commendable and will have a significant impact on the field. The methods and data presented here will be useful to the community in advancing our understanding of endosome maturation and identifying potential therapeutic targets for diseases related to endosomal dysfunction. It is worth noting that further investigation is required to fully understand the complexities of endosome maturation. However, the findings presented in this manuscript provide a solid foundation for future studies.

      Strengths:

      The major strengths of this work lie in the well-designed experiments used to examine the effects of USP-50 loss. The authors employed confocal imaging to visualize the consequences of USP-50 loss. Their findings indicated enlarged early endosomes and MVB-like structures in cells deficient in USP-50/USP8.

      Weaknesses:

      The anomalous structures detected in the usp-50 mutant need further investigation to be accurately characterized. In addition, the correlation between the presence of these abnormal structures and ESCRT-0 has yet to be addressed, and the current working model needs to be revised to prevent any confusion between enlarged early endosomes and MVBs.

    3. Reviewer #2 (Public Review):

      Summary:

      In this study, the authors examine how the deubiquitinase USP8 regulates endosome maturation in C. elegans and mammalian cells. The authors have isolated USP8 mutant alleles in C. elegans and used multiple in vivo reporter lines to demonstrate the impact of USP8 loss-of-function on endosome morphology and maturation. They show that in USP8 mutant cells, the early endosomes and MVB-like structures are enlarged while the late endosomes and lysosomal compartments are reduced. They show that USP8 interacts with Rabx5, a guanine nucleotide exchange factor (GEF) for Rab5, and that USP8 likely targets a specific lysine residue of Rabx5 to dissociate it from early endosomes. They also find that the localization of USP8 to early endosomes is disrupted in Rabx5 mutant cells. They observe that in both Rabx5 and USP8 mutant cells, the puncta of the Rab7 GEF SAND-1, which likely represent late endosomes, are diminished, although Rabex5 accumulates in USP8 mutant cells. The authors provide evidence that USP8 regulates endosomal maturation in a similar fashion in mammalian cells. Based on their observations, they propose that USP8 dissociates Rabex5 from early endosomes and enhances the recruitment of SAND-1 to promote endosome maturation.

      Strengths:

      The major highlights of this study include the direct visualization of endosome dynamics in a living multi-cellular organism, C. elegans. The high-quality images provide clear in vivo evidence to support the main conclusions. The authors have generated valuable resources to study mechanisms involved in endosome dynamics regulation in both the worm and mammalian cells, which would benefit many members of the cell biology community. The work identifies a fascinating link between USP8 and the Rab5 guanine nucleotide exchange factor Rabx5, which expands the targets and modes of action of USP8. The findings make a solid contribution toward the understanding of how endosomal trafficking is controlled.

      Weaknesses:

      - The authors utilized multiple fluorescent protein reporters, including those generated by themselves, to label endosomal vesicles. Although these are routine and powerful tools for studying endosomal trafficking, these results cannot tell whether the endogenous proteins (Rab5, Rabex5, Rab7, etc.) are affected in the same fashion.

      - The authors clearly demonstrated a link between USP8 and Rabx5, and they showed that cells deficient in both factors displayed similar defects in late endosomes/lysosomes. However, the authors didn't confirm whether and/or to which extent USP8 regulates endosome maturation through Rabx5. Additional genetic and molecular evidence might be required to better support their working model.

    4. Reviewer #3 (Public Review):

      Summary:

      The authors were trying to elucidate the role of USP8 in the endocytic pathway. Using C. elegans epithelial cells as a model, they observed that when USP8 function is lost, the cells have lysosomes of decreased number and size. Since USP8 was already known to be a protein linked to ESCRT components, they looked into what role USP8 might play in connecting lysosomes and multivesicular bodies (MVB). They observed fewer ESCRT-associated vesicles but an increased number of "abnormal" enlarged vesicles when USP8 function was lost. At this specific point, it's not clear what the objective of the authors was. What would have been their hypothesis addressing whether the reduced lysosomal structures in USP8 (-) animals were linked to MVB formation? Then they observed that the abnormally enlarged vesicles, marked by the PI3P biosensor YFP-2xFYVE, are bigger but in the same number in USP8 (-) compared to wild-type animals, suggesting homotypic fusion. They confirmed this result by knocking down USP8 in a human cell line, and they observed enlarged vesicles marked by YFP-2xFYVE as well. At this point, there is quite an important issue. The use of YFP-2xFYVE to detect early endosomes requires the transfection of the cells, which has already been demonstrated to produce differences in the distribution, number, and size of PI3P-positive vesicles (doi.org/10.1080/15548627.2017.1341465). The enlarged vesicles marked by YFP-2xFYVE would not necessarily be due to the loss of USP8. In any case, it appears relatively clear that USP8 localizes to early endosomes, and the authors claim that this localization is mediated by Rabex-5 (or Rabx-5). They finally propose that USP8 dissociates Rabx-5 from early endosomes, facilitating endosome maturation.

      Weaknesses:

      The weaknesses of this study are, on one side, that the results are almost exclusively dependent on the overexpression of fusion proteins. While useful in the field, this strategy does not represent the optimal way to dissect a cell biology issue. On the other side, the way the authors construct the rationale for each approach is somewhat difficult to follow. Finally, the use of two models, C. elegans and a mammalian cell line, which would strengthen the observations, contributes to the difficulty in reading the manuscript.

      The findings are useful but do not clearly support the idea that USP8 mediates Rab5-Rab7 exchange and endosome maturation. Rather, they appear to be incomplete and open new questions regarding the complexity of this process and the precise role of USP8 within it.

    1. eLife assessment

      This important study reports the formation of a new organelle, called giant unilocular vacuole (GUVac), in mammary epithelial cells through a macropinocytosis-like process. The evidence supporting conclusions is solid, using state-of-the-art cell biology techniques. This work will be of interest to cell biologists and contribute to the understanding of cell survival mechanisms against anoikis.

    2. Reviewer #1 (Public Review):

      Summary:

      The authors found that the loss of cell-ECM adhesion leads to the formation of giant unilocular vacuoles in mammary epithelial cells. This occurs through a macropinocytosis-like process and involves PI3 kinase. They further identified dynamin and septin as essential machinery for this process. Interestingly, this process is reversible and appears to protect cells from cell death.

      Strengths:

      The data are clean and convincing to support the conclusions. The analysis is comprehensive, using multiple approaches such as SIM and TEM. The discussion on lactation is plausible and interesting.

      Weaknesses:

      As the first paper describing this phenomenon, it is adequate. However, the elucidation of the molecular mechanisms is not as exciting as it does not describe anything new. It is hoped that novel mechanisms will be elucidated in the future. In particular, the molecules involved in the reversing process could be quite interesting. Additionally, the relationship to conventional endocytic compartments, such as early and late endosomes, is not analyzed.

    3. Reviewer #2 (Public Review):

      Summary:

      The manuscript "Formation of a giant unilocular vacuole via macropinocytosis-like process confers anoikis resistance" describes an interesting observation and provides initial steps towards understanding the underlying molecular mechanism.

      The manuscript describes that the majority of non-tumorigenic mammary gland epithelial cells (MCF-10A) in suspension initiate entosis. A smaller fraction of cells form a single giant unilocular vacuole (hereafter referred to as a GUVac). GUVac appeared to be empty and did not contain invading (entotic) cells. The formation of GUVac could be promoted by disrupting actin polymerisation with LatB and CytoD. The formation of GUVacs correlated with resistance to anoikis. GUVac formation was detected in several other epithelial cells from secretory tissues.

      The authors then use electron microscopy and super-resolution imaging to describe the biogenesis of GUVac. They find that GUVac formation is initiated by a macropinocytosis-like phenomenon (that is independent of actin polymerisation). This process leads to the formation of large plasma membrane invaginations that pinch off from the PM to form larger vesicles that fuse with each other into GUVacs.

      Inhibition of actin polymerisation in suspended MCF-10A cells leads to the recruitment of Septin 6 to the PM via its amphipathic helix. Treatment with FCF (a septin polymerisation inhibitor) blocked GUVac biogenesis, as did pharmacological inhibition of dynamin-mediated membrane fission. The fusion of these vesicles into GUVacs required (perhaps not surprisingly) PI3P.

      Strengths:

      The authors have made an interesting and potentially important observation. They describe the formation of an endo-lysosomal organelle (a giant unilocular vacuole - GUVac) in suspended epithelial cells and correlate the formation of GUVacs with resistance to anoikis.

      Weaknesses:

      My major concern is the experimental strategy that is used throughout the paper to induce and study the formation of GUVacs. Almost every experiment is conducted in suspended cells that were treated with actin depolymerising drugs (e.g. LatB) and thus almost all key conclusions are based on the results of these experiments. I only have a few suggestions that would improve these experiments or change their outcome and interpretation.

      Yet, I believe it is essential to identify the endogenous pathway leading to the actin depolymerisation that drives the formation of GUVacs in detached epithelial cells (or alternatively to figure out how it is suppressed in most detached cells). A first step in that direction would be to investigate the polymerization status of actin in MCF-10A cells that 'spontaneously' form GUVacs and to test if these cells also become resistant to anoikis.

      Also, it would be great (and I believe reasonably easy) to better characterise molecular markers of GUVacs (LAMPs, Rabs, cathepsins, etc.) to discriminate them from other endosomal organelles.

    4. Reviewer #3 (Public Review):

      Summary:

      Loss of cell attachment to extracellular matrix (ECM) triggers anoikis (a type of programmed cell death), and resistance to anoikis plays a role in cancer development. However, mechanisms underlying anoikis resistance, and the precise role of F-actin, are not fully known.

      Here the authors describe the formation of a new organelle, giant unilocular vacuole (GUVac), in cells whose F-actin is disrupted during loss of matrix attachment. GUVac formation (diameter >500 nm) resulted from a previously unrecognised macropinocytosis-like process, characterized by inwardly curved micron-sized plasma membrane invaginations, dependent on F-actin depolymerization, septin recruitment, and PI(3)P. Finally, the authors show GUVac formation after loss of matrix attachment promotes resistance to anoikis.

      From these results, the authors conclude that GUVac formation promotes cell survival in environments where F-actin is disrupted and conditions of cell stress.

      Strengths:

      The manuscript is clear and well-written, and the figures are all presented at a very high level.

      A variety of cutting-edge cell biology techniques (eg time-lapse imaging, EM, super-resolution microscopy) are used to study the role of the cytoskeleton in GUVac formation. It is discovered that: (i) a macropinocytosis-like process dependent on F-actin depolymerisation, SEPT6 recruitment, and PI(3)P contributes to GUVac formation, and (ii) GUVac formation is associated with resistance to cell death.

      Weaknesses:

      The manuscript is highly reliant on the use of drugs, or combinations of drugs, for long periods of time (6 h, 18 h, etc.). Wherever possible, the authors should also test conclusions drawn from experiments involving drugs using other canonical cell biology approaches (e.g. siRNA, CRISPR). Although suggestive as a first approach, it is not reliable to draw conclusions from experiments that rely only on drug combinations (e.g. LatB + FCF).

      F-actin is well known to play a wide variety of roles in cell death and other canonical cell death pathways (PMID: 26292640). The authors show using pharmacological inhibition that F-actin is key for GUVac formation. However, especially when testing for physiological relevance, how can these other roles for F-actin be ruled out?

      To test the role of septins in GUVac formation, only recruitment studies are performed, with no direct functional work. The drug forchlorfenuron (FCF) is used, but this is well known to have off-target effects (PMID: 27473917).

      Cells that possess GUVac are resistant to anoikis, but how are these cells resistant? This report is focused on mechanisms underlying GUVac formation and does not directly test for mechanisms underlying anoikis resistance.

    1. eLife assessment

      This study provides a useful strategy for treating mouse cutaneous squamous cell carcinoma (mCSCC) with serum derived from mCSCC-exposed mice. The exploration of serum-derived antibodies as a potential therapy for curing cancer is particularly promising, but the study provides inadequate evidence for specific effects of mCSCC-binding serum antibodies. This study will be of interest to scientists seeking a novel immunotherapeutic strategy in cancer therapy.

    2. Joint Public Review:

      Summary:

      This study presents an immunotherapeutic strategy for treating mouse cutaneous squamous cell carcinoma (mCSCC) using serum from mice inoculated with mCSCC. The author hypothesizes that antibodies in the generated serum could aid the immune system in tumor volume reduction. The study results showed a reduction in tumor volume and altered expression of several cancer markers (p53, Bcl-xL, NF-κB, Bax) suggesting the potential effectiveness of this approach.

      Strengths:

      The approach shows a potential effect in preventing tumor progression, judged from both tumor size and cancer biomarker expression levels, drawing attention to the potential role of antibodies and B cell responses in cancer therapy.

      Weaknesses:

      These are some of the specific things that the author could consider to strengthen the evidence supporting the claims in their study.

      (1) The study fails to provide evidence of the specific effect of mCSCC-antibodies on mCSCC. The study utilized serum which also contains many immune response factors like cytokines that could contribute to tumor reduction. There is no information on serum centrifugation conditions, which makes it unclear whether immune components like antigen-specific T cells, activated NK cells, or other immune cells were removed from the serum. The study does not provide evidence of neutralizing antibodies through isolation, analysis of B cell responses, or efficacy testing against specific cancer epitopes. To affirm the specific antibodies' role in the observed immune response, isolating antibodies rather than employing whole serum could provide more conclusive evidence. Purifying the serum to isolate mCSCC-binding antibodies, such as through protein A purification, and ELISA would have been more useful to quantify the immune response. It would be interesting to investigate the types of epitopes targeted following direct tumor cell injection. A more thorough characterization of the antibodies, including B cell isolation and/or hybridoma techniques, would strengthen the claim.

      (2) In the study design, the control group does not account for the potential immunostimulatory effects of serum injection itself. A better control would be tumor-bearing mice receiving serum from healthy non-mCSCC-exposed mice. Additionally, employing a completely random process for allocating the treatment groups would be preferable. Also, the study does not explain why intravenous injection of tumor cells would produce superior antibodies compared to those naturally generated in mCSCC-bearing mice.

      (3) In Figure 2B, it would be more helpful if the author could provide raw data/figures of the tumor rather than just the bar graph. Similarly, in Figure 3, the author should show individual data points in addition to the error bars to visualize the actual distribution.

      (4) The author mentioned that different stages of tumor cells have different surface biomarkers. Therefore, experimenting with injecting tumor cells at various stages could reveal the most immunogenic stage. Such an approach would allow for a comparative analysis of immune responses elicited by tumor cells at different stages of development.

      (5) In the abstract, the author mentioned that using mCSCC is a proof-of-concept for this potential cancer treatment strategy. The Discussion section should extend to how this strategy might apply to other cancer types beyond carcinoma.

    1. eLife assessment

      This paper presents a valuable automated method to track individual mammalian cells as they progress through the cell cycle using the FUCCI system. The authors have developed a technique for analyzing cells that grow in suspension and used their method to look at different tumor cell lines that grow in suspension and determine the effect of drugs that directly affect the cell cycle. They show solid evidence that the method can be applied to both adherent and non-adherent cell lines. This paper will be of interest to cell biologists investigating cell cycle effects.

    2. Reviewer #1 (Public Review):

      Summary:

      The manuscript proposes a series of steps using the FIJI environment. The authors have created a plugin for the initial steps of the process: merging images into an RGB stack, converting to HSV, and then using brightness for reference and hue to distinguish the phases of the cycle. Then, the well-known TrackMate plugin was used to identify single cells and extract intensities. The data was further post-processed in R, where a series of steps (smoothing, scaling, and addressing missing frames) were used to train a random forest. Hard-coded values of hue were used to distinguish G1, S, and G2/M. The process was validated with a score comparing the quality of the tracks, and the authors reported the successful measurement of the cell cycles.
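
The hue-based step summarized above can be illustrated with a minimal Python sketch. It is not the authors' plugin: the function name `classify_pixel`, the hue cut-offs, the brightness floor, and the neutral output labels are all hypothetical and chosen only for illustration. It assumes a two-channel (red/green) reporter with the blue channel set to zero, so that hue runs from 0° (pure red) to 120° (pure green).

```python
import colorsys

# Hypothetical cut-offs for illustration only; they are NOT the values used in the manuscript.
RED_MAX_DEG = 40.0      # hue below this is treated as red-dominant
YELLOW_MAX_DEG = 80.0   # hue below this (and above RED_MAX_DEG) is treated as mixed
MIN_BRIGHTNESS = 0.1    # HSV value below this is treated as too dim to classify


def classify_pixel(red, green, min_brightness=MIN_BRIGHTNESS,
                   red_max=RED_MAX_DEG, yellow_max=YELLOW_MAX_DEG):
    """Classify one measurement from normalized red and green intensities in [0, 1]."""
    # Merge the two channels into an RGB triple (blue = 0) and convert to HSV.
    hue, _saturation, value = colorsys.rgb_to_hsv(red, green, 0.0)
    # Brightness (HSV value) serves as the reference: dim signals are not classified.
    if value < min_brightness:
        return "dim"
    hue_deg = hue * 360.0
    if hue_deg < red_max:
        return "red-dominant"
    if hue_deg < yellow_max:
        return "mixed"
    return "green-dominant"
```

Mapping these colour classes onto cell cycle phases, and choosing the cut-offs, depends on the reporter and the acquisition settings, which is the concern raised under Weaknesses.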

      Strengths:

      The implementation of the pipeline seems easy, although it requires two separate platforms: Fiji and R. A similar approach could be implemented in a single programming environment like Python or Matlab and there would not be any need to export from one to the other. However, many labs have similar setups and that is not necessarily a problem.

      Weaknesses:

      I found two important weaknesses in the proposal:

      (1) The pipeline relies on a large number of hard-coded conditions: size of Gaussian blur (Gaussian should be written in uppercase), values of contrast, size of filters, levels of intensity, etc. Presumably, the authors followed a heuristic approach and tried values of these and concluded that the ones proposed were optimal. A proper sensitivity analysis should be performed. That is, select a range of values of the variables and measure the effect on the output.

      (2) Linked to the previous comment: other researchers who want to follow the pipeline would either need exactly the same acquisition conditions as in the manuscript, or have to start adjusting values to compensate for any differences in their data (cell diameter, fluorescence intensity, etc.) to see if they can match the results of the manuscript.
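
A sensitivity analysis of the kind requested in point (1) could be sketched as follows. This is only an illustration under stated assumptions: the helpers `label` and `sensitivity`, the hue values, and the threshold pairs are hypothetical and are not taken from the manuscript or its code. The idea is to re-run the classification over a range of threshold settings and report the fraction of labels that change relative to a baseline setting.

```python
def label(hue_deg, red_max, yellow_max):
    """Assign a colour class to a hue (in degrees) given two hypothetical cut-offs."""
    if hue_deg < red_max:
        return "red"
    if hue_deg < yellow_max:
        return "mixed"
    return "green"


def sensitivity(hues, baseline, candidates):
    """Return {candidate: fraction of labels that differ from the baseline setting}.

    hues       -- list of hue measurements in degrees
    baseline   -- (red_max, yellow_max) reference thresholds
    candidates -- list of (red_max, yellow_max) settings to compare against
    """
    if not hues:
        raise ValueError("at least one hue measurement is required")
    base_labels = [label(h, *baseline) for h in hues]
    result = {}
    for cand in candidates:
        cand_labels = [label(h, *cand) for h in hues]
        changed = sum(a != b for a, b in zip(base_labels, cand_labels))
        result[cand] = changed / len(hues)
    return result


# Made-up hue values, used only to show how the function is called.
example = sensitivity([10, 35, 45, 70, 85, 110], (40, 80),
                      [(40, 80), (30, 80), (50, 90)])
```

A setting whose changed fraction stays near zero across a plausible range suggests the output is robust to that parameter. The same pattern extends to blur size, contrast, or filter size if the full pipeline is wrapped in a function.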

    3. Reviewer #2 (Public Review):

      Summary:

      This paper presents an automated method to track individual mammalian cells as they progress through the cell cycle using the FUCCI system. It applies the method to different tumor cell lines that grow in suspension to determine their cell cycle profile and the effect of drugs that directly affect the cell cycle on progression through the cell cycle over a 72-hour period.

      Strengths:

      This is a METHODS paper. The one potentially novel finding is that they can identify cells that are at the G1-S transition by the change in color as one protein starts to go up and the other one goes down, similar to the change seen as cells enter G2/M.

      Weaknesses:

      They did not clearly indicate whether the G1/S cells are identified automatically or need to be identified by the person reviewing the data. In Figures 1 and S1, the movie shows cells with no color at a time corresponding to what is about the G1/S transition. Their assigned cell cycle phase is shown in Figure 1 but not in Figure S1. None of these pictures show the G1/S cells that they talk about being able to detect with a different color.

    4. Author Response:

      We greatly appreciate the insightful feedback provided by the reviewers and the editor on our manuscript titled "Automated workflow for the cell cycle analysis of non-adherent and adherent cells using a machine learning approach". We will provide a revised version of the manuscript aiming to address the comments and recommendations provided by the reviewers to enhance the quality and clarity of our work. In detail:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript proposes a series of steps using the FIJI environment. The authors have created a plugin for the initial steps of the process: merging images into an RGB stack, converting to HSV, and then using brightness for reference and hue to distinguish the phases of the cycle. Then, the well-known TrackMate plugin was used to identify single cells and extract intensities. The data was further post-processed in R, where a series of steps (smoothing, scaling, and addressing missing frames) were used to train a random forest. Hard-coded values of hue were used to distinguish G1, S, and G2/M. The process was validated with a score comparing the quality of the tracks, and the authors reported the successful measurement of the cell cycles.

      Strengths:

      The implementation of the pipeline seems easy, although it requires two separate platforms: Fiji and R. A similar approach could be implemented in a single programming environment like Python or Matlab and there would not be any need to export from one to the other. However, many labs have similar setups and that is not necessarily a problem.

      Weaknesses:

      I found two important weaknesses in the proposal:

      (1) The pipeline relies on a large number of hard-coded conditions: size of Gaussian blur (Gaussian should be written in uppercase), values of contrast, size of filters, levels of intensity, etc. Presumably, the authors followed a heuristic approach and tried values of these and concluded that the ones proposed were optimal. A proper sensitivity analysis should be performed. That is, select a range of values of the variables and measure the effect on the output.

      (2) Linked to the previous comment: other researchers who want to follow the pipeline would either need exactly the same acquisition conditions as in the manuscript, or have to start adjusting values to compensate for any differences in their data (cell diameter, fluorescence intensity, etc.) to see if they can match the results of the manuscript.

      We thank Reviewer #1 for the insightful comments. We acknowledge the importance of ensuring the reproducibility and robustness of our pipeline across different sample types and acquisition conditions and, consequently, different image S/N ratios and resolutions. To address the concerns regarding the reliance on hard-coded conditions and the impact of varying parameter values on the output, we will complete the Methods section of the manuscript and the “Usage” section of the README file in the GitHub repository (https://github.com/ieoresearch/cellcycle-image-analysis), providing a summary of best practices that should be applied in the pre-processing part of the analysis. As an example, the usable image filter types and their settings for cells with different sizes, fluorescence intensities and acquisition conditions will be analysed in detail and general guidelines will be provided.

      Moreover, we will provide detailed documentation on the acquisition conditions required for reproducibility in the README file and Methods section.

      For the Tracking Analysis part, we will refer to the well documented TrackMate tutorial to adapt the tracking analysis to different cell types, image resolution and intensities.

      Reviewer #2 (Public Review):

      Summary:

      This paper presents an automated method to track individual mammalian cells as they progress through the cell cycle using the FUCCI system. It applies the method to different tumor cell lines that grow in suspension to determine their cell cycle profile and the effect of drugs that directly affect the cell cycle on progression through the cell cycle over a 72-hour period.

      Strengths:

      This is a METHODS paper. The one potentially novel finding is that they can identify cells that are at the G1-S transition by the change in color as one protein starts to go up and the other one goes down, similar to the change seen as cells enter G2/M.

      Weaknesses:

      They did not clearly indicate whether the G1/S cells are identified automatically or need to be identified by the person reviewing the data. In Figures 1 and S1, the movie shows cells with no color at a time corresponding to what is about the G1/S transition. Their assigned cell cycle phase is shown in Figure 1 but not in Figure S1. None of these pictures show the G1/S cells that they talk about being able to detect with a different color.

      Thank you for your valuable feedback regarding the identification of G1/S cells in our pipeline. To clarify, the G1/S phase identification process is entirely automated within our pipeline. We apologize for any confusion caused by the lack of explicit indication in our manuscript. We will update the manuscript to clearly state that the identification of G1/S cells is performed automatically by our algorithm, eliminating the need for manual intervention.

      Regarding the visualization of G1/S cells in Figures 1 and S1, we will revise the figures to include all the available frames corresponding to the G1/S transition. It is important to note that during this transition, fluorescence intensities for both the green and the red channels are dimmer than their intensity levels during the G2/M transition. This can result in frames that seem visually darker, despite both colors coexisting at the same time point. In our revised figures, we will include all available frames relevant to the G1/S transition and provide a clearer representation of this phenomenon.

      In response to Reviewer #2's recommendation, we plan to conduct additional experiments to further validate our observations. We will utilize the EdU technology to highlight the S-phase in FUCCI cells, allowing for better discrimination between the red and green fluorescence of the FUCCI reporter during the initial S-phase.

      Additionally, we acknowledge that the link to the Docker container (https://hub.docker.com/r/emanuelsoda/rf_semi_sup) was not included in the manuscript. We apologize for this oversight, and it will be included in the revised version of the paper.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      A summary of what the authors were trying to achieve.

      The authors cultured pre- and post-vaccine PBMCs with overlapping peptides encoding S protein in the presence of IL-2, IL-7, and IL-15 for 10 days, and extensively analyzed the T cells expanded during the culture, including scRNAseq, scTCRseq, and examination of reporter cell lines expressing the dominant TCRs. They were able to identify 78 S epitopes with HLA restrictions (which by itself represents a major achievement) together with their subset, based on their transcriptional profiling. By comparing T cell clonotypes between pre- and post-vaccination samples, they showed that a majority of pre-existing S-reactive CD4+ T cell clones did not expand upon vaccination. Thus, the authors concluded that highly-responding S-reactive T cells were established by vaccination from rare clonotypes.

      An account of the major strengths and weaknesses of the methods and results.

      Strengths:

      Selection of 4 "Ab sustainers" and 4 "Ab decliners" from 43 subjects who received two shots of mRNA vaccinations.

      Identification of S epitopes of T cells together with their transcriptional profiling. This allowed the authors to compare the dominant subsets between sustainers and decliners.

      Weaknesses were properly addressed in the revised manuscript, and I do not have any additional concerns.

      We appreciate the reviewer for the constructive comments and recommendations, which were a great help for us to improve our manuscript.

      Reviewer #3 (Public Review):

      Summary:

      The paper aims to investigate the relationship between anti-S protein antibody titers and the phenotypes and clonotypes of S-protein-specific T cells in people who receive SARS-CoV-2 mRNA vaccines. To do this, the authors recruited a cohort of COVID-19-naive individuals who received SARS-CoV-2 mRNA vaccines and collected sera and PBMC samples at different timepoints. They then generated three main sets of data: (1) anti-S protein antibody titers at all timepoints; (2) a single-cell RNAseq/TCRseq dataset for divided T cells after stimulation with S protein for 10 days; (3) the corresponding epitopes for each expanded TCR clone. After analyzing these results, the paper reports two major findings and claims: (A) individuals with a sustained anti-S protein antibody response also have more so-called Tfh cells in their single-cell dataset; (B) S-reactive T cells do exist before vaccination, but they seem unable to respond properly to COVID-19 vaccination.

      The paper's strength is that it uses a very systematic and thorough strategy to dissect the relationship between antibody titers, T cell phenotypes, TCR clonotypes and corresponding epitopes, and it indeed reports several interesting findings about the relationship between Tfh clonotypes and sustained antibody titers, and about the S-reactive clones that exist before vaccination. The conclusions are solid in general, but some claims are overstated. My suggestion is that the authors further limit their claims in the abstract, for example:

      ”Even before vaccination, S-reactive CD4+ T cell clonotypes did exist, most of which (MAY) cross-reacted with environmental or symbiotic bacteria" -- The paper don't have experimental evidence to show these TCR clones respond to these epitopes.

      We thank the reviewer for pointing out the insufficient demonstration of experimental evidence. We have added the relevant data to Fig. S5 in the newly revised manuscript.

      "These results suggest that de novo acquisition of memory Tfh-like cells upon vaccination (LIKELY) contributes to the longevity of anti-S antibody titers." --Given the small sample size and the statistical analysis was not significant, this claim was overstated.

      "S-reactive T cell clonotypes detected immediately after 2nd vaccination polarized to follicular helper T (Tfh)-like cells (UNDER IN VITRO CULTURE)." -- The conclusion is based on in vitro cultured cells, which is a limitation.

      We thank the reviewer for the helpful suggestion. We have corrected some sentences in line with these suggestions in the newly revised manuscript.

      Recommendations for the authors:

      Please note: Though most of the overstatement was removed from the original manuscript, authors still need to modify some of the statements in "Abstract".

      We thank the reviewer for carefully reading our manuscript and giving us detailed suggestions. We have modified these statements in “Abstract” accordingly in the newly revised manuscript.

    2. Reviewer #1 (Public Review):

      • A summary of what the authors were trying to achieve.

      The authors cultured pre- and post-vaccine PBMCs with overlapping peptides encoding S protein in the presence of IL-2, IL-7, and IL-15 for 10 days, and extensively analyzed the T cells expanded during the culture, using scRNAseq, scTCRseq, and examination of reporter cell lines expressing the dominant TCRs. They were able to identify 78 S epitopes with HLA restrictions (which by itself represents a major achievement) together with their subsets, based on their transcriptional profiling. By comparing T cell clonotypes between pre- and post-vaccination samples, they showed that a majority of pre-existing S-reactive CD4+ T cell clones did not expand upon vaccination. Thus, the authors concluded that highly responding S-reactive T cells were established by vaccination from rare clonotypes.

      • An account of the major strengths and weaknesses of the methods and results.

      Strengths

      • Selection of 4 "Ab sustainers" and 4 "Ab decliners" from 43 subjects who received two shots of mRNA vaccinations.
      • Identification of S epitopes of T cells together with their transcriptional profiling. This allowed the authors to compare the dominant subsets between sustainers and decliners.

      Weaknesses were adequately addressed in the revised manuscript, and I do not have any additional concerns.

    3. eLife assessment

      This important study by Lu et al aimed to determine the key factors of T cell responses associated with durable antibody responses following the initial two shots of COVID-19 mRNA vaccinations. By comparing the SARS-CoV-2 spike protein (S)-specific T cell subsets between "Ab sustainers" and "Ab decliners" that were present post-vaccination, the authors concluded that S-specific CD4+ T cells in "Ab sustainers" were enriched with Tfh cells. There is solid evidence as the authors applied multiple methods and approaches to address the key questions, and the presented data are robust.

    4. Reviewer #3 (Public Review):

      The paper aims to investigate the relationship between anti-S protein antibody titers with the phenotypes & clonotypes of S-protein-specific T cells in people who receive SARS-CoV2 mRNA vaccines. The paper recruited a cohort of COVID-19 naive individuals who received the SARS-CoV2 mRNA vaccines and collected sera and PBMCs samples on different time points. Then, three sets of data were generated: 1). Anti-S protein antibody titers on all time points. 2) Single-cell RNAseq/TCRseq analysis for divided T cells after in vitro stimulation by S-protein. 3) Peptide epitopes for each expanded TCR clone. Based on these, the paper reports two major findings: A) Individuals having more sustained anti-S protein antibody response also have more Tfh-featured S-specific cells in their blood after 2nd-dose vaccination. B). S-specific cross-reactive T cells exist in COVID-19 naive individuals, but most of these T cell clones are not expanded after SARS-CoV-2 vaccination.

      The paper's strength is that it uses a very systematic strategy to dissect the relationship between antibody titers, T cell phenotypes, TCR clonotypes and corresponding epitopes. The conclusion is solid in general. However, the weaknesses include the relatively small sample size (4 sustainers vs. 4 decliners) and the use of in vitro stimulated cells for analysis, which may 'blur' the classification of T cell subsets. Nevertheless, it may have great impact on future vaccine design because it demonstrated that promoting Tfh differentiation is crucial for the longevity of antibody response. Additionally, this paper nicely showed that most cross-reactive clones that are specific to environmental/symbiotic microbes did not expand post-vaccination, providing important fundamental insights into the establishment of T-cell responses after SARS-CoV-2 vaccination.

    1. Author Response

      The following is the authors’ response to the current reviews.

      At this stage the referees had only minor comments. Referee #1 asked whether archerfish indeed generalize in egocentric rather than allocentric coordinates. The current results do not rule out the possibility that archerfish, unaware of changes in body position, simply continue with previously successful actions, which would appear as egocentric generalization. We agree with referee #1 and updated lines 255-260 in the results and added lines 329-336 in the discussion that mention this possibility. Referee #2 mentioned that a portion of the fish did not make it to the final test, which raises the question of whether all individuals are able to solve the task. We agree with referee #2 and added a paragraph to the discussion section to mention this point (lines 384-388). We also added the salinity of the water in the water tanks (line 98), as suggested by Referee #2. Referee #2 suggested using a different term than “washout” in the behavioral experiments. Since the term “washout” is standard in the field, we keep the term in the text.


      The following is the authors’ response to the original reviews.

      eLife assessment

      This useful study explores how archerfish adapt their shooting behavior to environmental changes, particularly airflow perturbations. It will be of interest to experts interested in mechanisms for motor learning. While the evidence for an internal model for adaptation is solid, evidence for adaptation to light refraction, as initially hypothesized, is inconclusive. As such, the evidence supporting an egocentric representation might be caused by alternative mechanisms to airflow perturbations.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The authors examined whether archerfish have the capacity for motor adaptation in response to airflow perturbations. Through two experiments, they demonstrated that archerfish could adapt. Moreover, when the fish flipped its body position with the perturbation remaining constant, it did not instantaneously counteract the error. Instead, the archerfish initially persisted in correcting for the original perturbation before eventually adapting, consistent with the notion that the archerfish's internal model has been adapted in egocentric coordinates.

      Evaluation:

      The results of both experiments were convincing, given the observable learning curve and the clear aftereffect. The ability of these fish to correct their errors is also remarkable. Nonetheless, certain aspects of the experiment's motivation and conclusions temper my enthusiasm.

      (1) The authors motivated their experiments with two hypotheses, asking whether archerfish can adapt to light refractions using an innate look-up table as opposed to possessing a capacity to adapt. However, the present experiments are not designed to arbitrate between these ideas. That is, the current experiments do not rule out the look-up table hypothesis, which predicts, for example, that motor adaptation may not generalize to de novo situations with arbitrary action-outcome associations. Such look-up table operations may also show set-size effects, whereas other mechanisms might not. Whether their capacity to adapt is innate or learned was also not directly tested, as noted by the authors in the discussion. Could the authors clarify how they see their results positioned in light of the two hypotheses noted in the Introduction?

      We agree with the referee that look up tables only confuse the issue. The question we tested is whether or not the fish uses adaptation mechanisms to correct its shooting. We have now changed the introduction both to eliminate the entire question of look up tables and also to clarify that both innate mechanisms and learning mechanisms can contribute to fish shooting, and that our research focuses on the question of whether the fish can adapt to a perturbation in its shooting caused by a change in its physical environment.

      (2) The authors claim that archerfish use egocentric coordinates rather than allocentric coordinates. However, the current experiments do not make clear whether the archerfish are "aware" that their position was flipped (as the authors noted, no visual cues were provided). As such, for example, if the fish were "unaware" of the switch, can the authors still assert that generalization occurs in egocentric coordinates? Or simply that, when archerfish are ostensibly unaware of changes in body position, they continue with previously successful actions.

      The fish has access to information about the body position switch: there are cues in the water tank that can help the fish orient inside the tank. Additionally, there are no cues to the presence or direction of the air flow above the water tank. Moreover, previous experience has shown that the fish is sensitive to visual cues and uses them to achieve consistent orientation within the tank when possible. These points have been added to the main text [lines 143-144, 254-257].

      (3) The experiments offer an opportunity to examine whether archerfish demonstrate any savings from one session to another. Savings are often attributed to a faster look-up table operation. As such, if archerfish do not exhibit savings, it might indicate a scenario where they do not possess a refined look-up table and must rely on implicit mechanisms to relearn each time.

      This is an important question. Indeed, we looked for the ‘saving’ effect in the data, but its noisy nature prevented us from drawing a concrete conclusion. We now mention this in lines 247-249.

      We have also eliminated the discussion of look up tables from the article.

      (4) The authors suggest that motor adaptation in response to wind may hint at mechanisms used to adapt to light refraction. However, how strong of a parallel can one draw between adapting to wind versus adapting to light refraction? This seems important given the claims in this paper regarding shared mechanisms between these processes. As a thought experiment, what would the authors predict if they provided a perturbation more akin to light refraction (e.g., a film that distorts light in a new direction, rather than airflow)?

      This is an important point. Indeed, our project started by looking for options to distort the refraction index or distort the light in a new direction. However, given the available ways of distorting the light to a new direction, it is hard to achieve that on the technical level. Initially, we tried using prism goggles, however the archerfish found it hard to shoot with the heavy load on the head. We have also explored oil on the water surface. However, given the available oils and the width of the film above water, it is hard to achieve considerable perturbation.

      Fish response to the perturbation matches the response to what would be expected for a change in light refraction. Light refraction perturbation does not change with the change in fish body position relative to the target. However, in response to (and in agreement with) the referees, we have generalized the context in which we see our results and discuss the results in terms of adaptation of the fish shooting behavior to changes in physical factors including light refraction, wind, fatigue, and others.

      (5) The number of fish excluded was greater than those included. This raises the question as to whether these fish are merely elite specimens or representative of the species in general.

      The filtering of the fish was in the training stage. The requirements were quite strict: the fish had to produce enough shots each day in the experimental setup. Very few fish succeeded. But all fish that got to the stage of perturbation exhibited the adaptation effect. We do not see a reason to think that the motivation to shoot will have a strong interaction with the shooting adaptation mechanisms.

      Reviewer #2 (Public Review):

      Summary:

      The work of Volotsky et al presented here shows that adult archerfish are able to adjust their shooting in response to their own visual feedback, taking consistent alterations of their shot, here by an air flow, into account. The evidence provided points to an internal mechanism of shooting adaptation that is independent of external cues, such as wind. The authors provide evidence for this by forcing the fish to shoot from 2 different orientations to the external alteration of their shots (the airflow). This paper thus provides behavioral evidence of an internal correction mechanism that underlies adaptive motor control of this behavior. It does not provide direct evidence of refractive index-associated shot adjustment.

      Strengths:

      The authors have used a high number of trials and strong statistical analysis to analyze their behavioral data.

      Weaknesses:

      While the introduction, the title, and the discussion are associated with the refraction index, the latter was not altered, and neither was the position of the target. The "shot" was altered; this is a simple motor adaptation task and not a question related to the refractive index. The title, abstract, and the introduction are thus misleading. The authors appear to deduce from their data that the wind is not taken into account and thus conclude that the fish perceive a different refractive index. This might be based on the assumption that fish always hit their target, which is not the case. The airflow does not alter the position of the target, thus the airflow does not alter the refractive index. The fish likely does not perceive the airflow, thus alteration of its shooting abilities is likely assumed to be an "internal problem" of shooting. I am sorry but I am not able to understand the conclusion they draw from their data.

      This is an important point. Indeed, our project started by looking for options to distort the refraction index or distort the light in a new direction. However, given the available ways of distorting the light to a new direction, it is hard to achieve that on the technical level. Initially, we tried using prism goggles, however the archerfish found it hard to shoot with the heavy load on the head. We have also explored oil on the water surface. However, given the available oils and the width of the film above water, it is hard to achieve considerable perturbation.

      Fish response to the perturbation matches the response to what would be expected for a change in light refraction. Light refraction perturbation does not change with the change in fish body position relative to the target. However, in response to (and in agreement with) the referees, we have generalized the context in which we see our results and discuss the results in terms of adaptation of the fish shooting behavior to changes in physical factors including light refraction, wind, fatigue, and others.

      Reviewer #2 (Recommendations For The Authors):

      I have had a hard time trying to understand how the authors concluded that the RI is important here as it is not altered. Thus I did not understand the conclusions drawn from this paper. The experiments are well described, but the conclusions are not to me. Maybe schematics would help to clarify. I am from outside the field and represent a naïve reader with an average intellect. The authors need to do a better job of explaining their results if they want others to understand their conclusions.

      See response to the public comments.

      Minor comments:

      Line 9: omit the "an".

      Done.

      Line 11: this sentence would fit way better if it followed the next one.

      Done.

      Line 15: and all the rest of the paper: washout is a strange term and for me associated with pharmacological manipulations - might only be me. I suggest using recovery instead throughout the manuscript.

      The term ‘washout’ is often used in the field of motor adaptation to describe the return to original condition. For example:

      Kluzik J, Diedrichsen J, Shadmehr R, Bastian AJ (2008) Reach adaptation: what determines whether we learn an internal model of the tool or adapt the model of our arm? J Neurophysiol 100:1455-64. doi: 10.1152/jn.90334.2008

      Donchin O, Rabe K, Diedrichsen J, Lally N, Schoch B, Gizewski ER, Timmann D (2012) Cerebellar regions involved in adaptation to force field and visuomotor perturbation. J Neurophysiol 107:134-47

      Line 19: the fish does not expect the flow, it expects that it shoots too short- no?

      Done.

      Line 35: fix the citation - in your reference manager.

      Done.

      Line 52: provide some examples of the mechanisms you think of or papers of it for naive readers. Otherwise, this sentence is not helpful for the reader.

      Done.

      Line 183: it's unclear which parameter you mean. Rephrase.

      Done.

      Line 197: should read to test "the" - same sentence: you repeat yourself- rephrase the sentence.

      Done.

      Figure 4: it was unclear to me why the figure was differentiating between fishes until I read the legend. Why not include direct information in the figure? A schematic maybe? Legend: you have a double "that" in C.

      We added the title for each column with the information about the direction of air.

      Figures: in all figures, perturbation is wrongly spelled! Change the term washout to recovery.

      Done. We kept the term ‘washout’.

    1. Author response:

      We are grateful to reviewer #1 for the positive evaluation of our work and for providing valuable comments that will significantly enhance the presentation of our results. We understand reviewer #2's negative assessment because we did not discuss an alternative model of dosage compensation in Drosophila. We will address this omission in the Introduction section of the revised manuscript and remove any controversial statements from other parts of the text. However, it is important to clarify that our study does not focus on the mechanisms of dosage compensation. The main goal of the manuscript was to investigate the assembly of the MSL complex and its specific binding to the Drosophila X chromosome. We utilized male survival data to demonstrate the efficacy of MSL complex binding to the X chromosome, a relationship that has been supported by numerous independent studies. We understand that Reviewer #2 agrees that disruption of MSL complex binding results in male lethality. As far as we understand, Reviewer #2 suggests that the MSL complex does not activate transcription of X chromosome genes, but instead facilitates the recruitment of the MOF protein and potentially other general transcription factors to the X chromosome. This could explain the decrease in autosomal gene expression due to a reduction in activating factors like MOF at autosomal promoters. In the upcoming revision, we aim to strike a balance between the two models that elucidate dosage compensation in Drosophila. We appreciate your feedback and look forward to enhancing the clarity and coherence of our manuscript based on your insightful comments.

      Reviewer #2 (Public Review):

      Summary:

      A deletion analysis of the MSL1 gene to assess how different parts of the protein product interact with the MSL2 protein and roX RNA to affect the association of the MSL complex with the male X chromosome of Drosophila was performed.

      Strengths:

      The deletion analysis of the MSL1 protein and the tests of interaction with MSL2 are adequate.

      We thank the reviewer for the positive assessment of the experimental work done.

      This reviewer does not adhere to the basic premise of the authors that the MSL complex is the primary mediator of dosage compensation of the X chromosome of Drosophila.

      We completely agree with this reviewer's claim. In the Introduction section we’ll attempt to make clear that there are two models for the functional role of specific recruitment of the MSL complex to the X chromosome in males.

      Several lines of evidence from various laboratories indicate that it is involved in sequestering the MOF histone acetyltransferase to the X chromosome but there is a constraint on its action there. When the MSL complex is disrupted, there is no overall loss of compensation but there is an increase in autosomal expression. Sun et al (2013, PNAS 110: E808-817) showed that ectopic expression of MSL2 does not increase expression of the X and indeed inhibits the effect of acetylation of H4Lys16 on gene expression. Aleman et al (2021, Cell Reports 35: 109236) showed that dosage compensation of the X chromosome can be robust in the absence of the MSL complex. Together, these results indicate that the MSL complex is not the primary mediator of X chromosome dosage compensation. The authors use sex-specific lethality as a measure of disruption of dosage compensation, but other modulations of gene expression are the likely cause of these viability effects.

      Sun et al (2013, PNAS 110: E808-817) showed that recruitment of the MSL complex-specific subunit MSL2 or the MOF protein to the UAS promoter resulted in recruitment of the entire MSL complex in males but not transcriptional activation. This important result argues that the MSL complex does not activate transcription. However, it must be taken into account that the GAL4 DNA binding region used to recruit the chimeric MSL2 protein to the UAS promoter was directly fused to the MSL2 RING domain, which is critical for interaction of MSL2 with MSL1 and its ubiquitination activity (this activity could potentially be involved in transcription activation). It also remains poorly understood what happens to the MSL complex after recruitment to the promoters or HAS on the X chromosome. Subcomplex MSL1/MSL3/MOF can acetylate TF and H4K16 during RNA polymerase II elongation, resulting in increased transcription. Separate roles of MSL2 and MSL1 in the activation of transcription at gene promoters have also been shown. Sun et al. showed that in females, recruitment of MOF to the UAS promoter leads to a strong increase in transcription, which is associated with the inclusion of MOF in the non-specific lethal (NSL) complex, which is bound to promoters and is required for strong transcription activation. In males, MOF is preferentially recruited to the UAS promoter in the full MSL complex or perhaps in the MSL1/MSL3/MOF subcomplex, which stimulates transcription during RNA polymerase II elongation much less strongly than the NSL complex. The same result was obtained by Prestel et al. 2010 (Mol Cell 38:815-26). In this study the GAL4 binding sites were inserted upstream of the lacZ and mini-white genes. Activation of transcription after recruitment of GAL4-MOF to the GAL4 sites was studied in males and females. As in Sun et al. 2013, strong activation of the reporter was observed in females. A weak transcriptional activation of the reporter gene in males was shown, and the MOF protein was detected not only on the promoter, but also in the coding and 3’ regions of the reporter.

      We do not understand how the paper by Aleman et al (Cell Reports 35: 109236, 2021) is consistent with the hypothesis that the MSL complex is not involved in the transcriptional activation of X chromosomal genes. The main conclusions of this paper: 1) Inactivation of Mtor leads to selective activation of the male X chromosome. 2) Mtor-driven attenuation of male X occurs in broad domains linked by the MSL complex. 3) Mtor genetically interacts with MSL components and reduces male mortality; 4) Mtor restrains dose-compensated expression at the level of nascent transcription. Thus, the paper shows that the MSL complex has an activator activity that is partially inhibited by Mtor. Accordingly, inactivation of Mtor only partially restored the survival of males in which dosage compensation was not completely inactivated.

      A detailed explanation was provided by Birchler and Veitia (2021, One Hundred Years of Gene Balance: How stoichiometric issues affect gene expression, genome evolution, and quantitative traits. Cytogenetics and Genome Research 161: 529-550).

      We agree that an alternative model of the dosage compensation mechanism is reasonable. We can assume that both mechanisms can function jointly to provide effective dosage compensation in Drosophila males. Following the reviewer's suggestion to reconsider the entire context of the article, we will make many small changes throughout the manuscript.

      Reviewer #1 (Recommendations For The Authors):

      Overall, I found the text well written and the figures logically organized (especially Figure 5, which had the potential to confuse). The authors especially excelled in bringing together the decades of literature in the Discussion.

      I offer several suggestions to improve the readability:

      Consider presenting the coiled-coil domain homology in Figure 1A as a contrast for the N-terminal region, which the authors claim is poorly conserved.

      We’ll add the coiled-coil domain homology to Figure 1A in the new version of the manuscript.

      It is difficult to visualize the red MSL2 in Figure 2; the green and red panels should be presented separately in the main text, as they are in the Supplemental Figure 2.

      We’ll prepare Figure 2 with separate green and red panels.

      The ChIP-seq experiments for MSL proteins are well presented, but in my opinion, add little to the overall conclusions:

      Figure 6 mostly recapitulates what has already been published and utilized by several groups, most recently the authors themselves (Tikhonova 2019): that MSL expressed in females targets the X/HAS, similar to in males. While these are nice supporting data for the female transgenic system, I do not believe this figure should be prominently featured as if this is a novelty of the current study.

      We fully agree with the reviewer's comment about the limitation of scientific novelty in Figure 6. It has an auxiliary meaning. Therefore, we decided to transfer this figure to Supplementary material.

      The ChIP experiments in Figure 7 agree with the conclusions in Figures 2 and 3 (polytene chromosome immunostaining) when it comes to X/autosome localization. I believe it would help with the flow of the paper if these experiments were combined or at least placed closer together in the narrative, rather than falling at the end.

      We’ll move Figure 7 closer to the polytene chromosome immunostaining. We agree with the reviewer that this placement of the figure will make it easier to follow the meaning of the article as a whole.

      I find Figure 8 difficult to understand, especially since the "clusters" are not annotated in the figure, but are described in the text. I struggled to follow the authors' conclusions based on these data. The authors could clarify the figure with annotations, although to be honest I do not currently see the value of this analysis/figure.

      In the new version of the article, we will try to make this figure more understandable: we will add explanations to the figure and a legend to it, and we will also try to place emphasis more clearly in the text of the article.

    2. eLife assessment

      In this paper, the male sex-lethal (MSL) complex of proteins and RNA is studied through a domain analysis of one of its components, MSL1, and its interaction with others. While these results could be useful to researchers in the field, several studies have shown that the view that the MSL complex mediates dosage compensation is no longer considered tenable. Since there are many ways to alter viability, claims based on sex-specific viability as a reflection of dosage compensation should be viewed with much caution, and the evidence is currently considered inadequate to support the claims.

    3. Reviewer #1 (Public Review):

      Summary:

      Babosha et al. deeply investigate the N-terminal region of the Drosophila dosage compensation protein MSL1. Much of the prior research into the dosage compensation complex has focused on the male-specific MSL2 protein. However, the authors point out prior evidence that the N-terminus of MSL1 is important for protein function, including interaction with MSL2. Through a series of transgenic deletions and substitutions, the authors pinpoint two regions: N-terminal amino acids 3-7 and 41-65, which are critical for the binding of MSL1 to the X-chromosome and recruitment of MSL2. To deepen these observations, the authors perform well-controlled immunoprecipitation experiments to test the interaction of mutant MSL1 proteins with the lncRNA roX2, which is critical for the stability and localization of the dosage compensation complex. Through immunoprecipitation, the authors discover that the interaction of their mutant MSL1 proteins with roX2 is compromised. They suggest that the roX-MSL1 interaction is mediated by the N-terminal amino acids and is also critical for interaction with MSL2 and X-specific localization. This agrees with previous models that MSL1 and MSL2 directly interact through other regions.

      This work lays the foundation for future investigations into the overall structure of the dosage compensation machinery, which allows this unique complex to specifically target the X-chromosome through still unclear mechanisms.

      Strengths:

      The data provided by the authors is of high quality and supports the authors' conclusions, which are nicely contextualized in the text with previous models. The novelty of this study is specifically pinpointing the amino acid regions of MSL1 that interact with roX. The authors point out that, surprisingly, the N-terminal region of MSL1 is not particularly well conserved, indicating that the interactions outlined in this study might be Drosophila/Diptera-specific.

      The major strength of this study is that the authors find agreement between multiple dimensions of experimentation: the regions of MSL1 that are required for roX2 interaction (immunoprecipitation experiments) are also the regions that are critical for MSL1 localization to polytene chromosomes in an artificial female in vivo system, which are also critical for male-specific survival. The authors later suggest that it is the roX2 interaction that is responsible for the latter observations, although there is no direct evidence for this suggestion.

      Weaknesses:

      A minor weakness of the study is that it largely supports, and incrementally expands, the existing model in the field: that roX RNAs mediate the assembly of the complex on chromatin. I hesitate to call this a weakness, as supporting an existing model is still strong scientifically. However, the current study does not dramatically push the model forward.

    4. Reviewer #2 (Public Review):

      Summary:

      A deletion analysis of the MSL1 gene to assess how different parts of the protein product interact with the MSL2 protein and roX RNA to affect the association of the MSL complex with the male X chromosome of Drosophila was performed.

      Strengths:

      The deletion analysis of the MSL1 protein and the tests of interaction with MSL2 are adequate.

      Weaknesses:

      This reviewer does not adhere to the basic premise of the authors that the MSL complex is the primary mediator of dosage compensation of the X chromosome of Drosophila. Several lines of evidence from various laboratories indicate that it is involved in sequestering the MOF histone acetyltransferase to the X chromosome but there is a constraint on its action there. When the MSL complex is disrupted, there is no overall loss of compensation but there is an increase in autosomal expression. Sun et al (2013, PNAS 110: E808-817) showed that ectopic expression of MSL2 does not increase expression of the X and indeed inhibits the effect of acetylation of H4Lys16 on gene expression. Aleman et al (2021, Cell Reports 35: 109236) showed that dosage compensation of the X chromosome can be robust in the absence of the MSL complex. Together, these results indicate that the MSL complex is not the primary mediator of X chromosome dosage compensation. The authors use sex-specific lethality as a measure of disruption of dosage compensation, but other modulations of gene expression are the likely cause of these viability effects.

      A detailed explanation was provided by Birchler and Veitia (2021, One Hundred Years of Gene Balance: How stoichiometric issues affect gene expression, genome evolution, and quantitative traits. Cytogenetics and Genome Research 161: 529-550). The relevant portions of that article that pertain to Drosophila are quoted below. The cited references can be found in that publication.

      "In Drosophila, the sex chromosomes consist of an X and a Y. The Y in this species contains only a few genes required for male fertility (Zhang et al., 2020). The X consists of approximately 20% of the genome. Thus, females have two X chromosomes and males have one. Muller (1932) found that the expression of genes between the two sexes was similar but when individual genes on the X were varied in dosage they exhibited a proportional dosage effect. Each copy in a male was expressed at about twice the level as each copy in a female. Females with three X chromosomes are highly inviable but when they do survive to the adult stage, Stern (1960) found that they too exhibited dosage compensation in that the expression in the triple X genotype was similar to normal females and males. Studies in triploid flies found that dosage compensation also occurred among X; AAA, XX; AAA, and XXX; AAA genotypes via upregulation of the Xs, where X indicates the dosage of the X and A indicates the triploid nature of the autosomes (see Birchler, 2016 for further discussion). Diploid and triploid females have a similar per-gene expression but the other five genotypes each must modulate gene expression by different amounts equivalent to an inverse relationship between the X versus autosomal dosage to achieve a balanced expression between the X and the A (Birchler, 1996).

      Some years ago, mutations were sought in Drosophila that were lethal to males but viable in females. A number of such mutations were found and termed Male Specific Lethal (MSL) loci (Belote and Lucchesi, 1980). Once the products of these genes were identified, they were found to be at high concentrations on the male X chromosome (Kuroda et al., 1991). One of these genes encodes a histone acetyltransferase that acetylates Lysine16 of Histone H4 (Bone et al., 1994; Hilfiker et al., 1997). The recognition of the MSL complex and its association with the male X was an important set of contributions to an understanding of sex chromosome evolution in Drosophila (Kuroda et al., 2016). Thus, the hypothesis arose that the MSL complex accumulated this chromatin modifier on the male X to activate the expression about two-fold to bring about dosage compensation. Other data that contributed to this hypothesis were that when autoradiography of nascent transcription on salivary gland polytene chromosomes was examined in the MSL maleless mutation, the ratio of the number of grains over the X versus an autosomal region was reduced compared to the normal ratio (Belote and Lucchesi, 1980).

      It has been pointed out (Hiebert and Birchler, 1994; Bhadra et al., 1999; Pal Bhadra et al., 2005; Sun et al., 2013a; Birchler, 2016), however, that the grain counts over the X and the autosomes when considered in absolute terms rather than as a ratio show that the X more or less retained dosage compensation and the autosomal numbers are about doubled, i.e. exhibit an inverse dosage effect. The same situation occurs with the msl3 mutation (Okuno et al., 1984), another MSL gene, in that the autoradiographic grain numbers as an absolute measure show retention of X dosage compensation and an autosomal increase. The data treatment to produce an X to A ratio seemed reasonable in the context of the time when all regulation in eukaryotes was considered positive. However, when studies were conducted in such a manner as to assay the absolute effect on gene expression in the maleless mutation, in adults (Hiebert and Birchler, 1994), larvae (Hiebert and Birchler, 1994; Bhadra et al., 1999; 2000; Pal Bhadra et al., 2005), and embryos (Pal Bhadra et al., 2005), the trend was for retention of dosage compensation of X linked genes and an increase in expression of autosomal genes.

      In global studies, if the X to autosomal expression does not change between mutant and normal, one can conclude that dosage compensation is operating. However, a lower X to A ratio could be a loss of compensation or an increased transcriptome size from the increase of the autosomes, as suggested by the absolute data of Belote and Lucchesi (1980) and Okuno et al (1984) and was visualized directly in embryos (Pal Bhadra et al., 2005). The transcriptome size in aneuploids can change, which cannot be detected in RNA-seq analyses alone (Yang et al., 2021), so it is an important consideration for studies of dosage compensation. It was recently acknowledged that in MSL2 knockdowns the relative X expression is decreased and a moderate autosomal increase is found (Valsecchi et al., 2021b). A similar trend is evident in the microarray data on MSL2 knockdown in SL2 tissue culture cells (Hamada et al., 2005) and in the roX RNA (noncoding RNAs essential for MSL localization on the male X) mutants (Deng and Meller, 2006). This trend is in fact consistent with the absolute data that suggest an increase in the transcriptome size (Figure 7). A global change in transcriptome size can cause a generalized dosage compensation of a single chromosome to appear as a proportional dosage effect (loss of compensation) to some degree (Figure 7).

      Examination of expression in triple X metafemales, where there is no MSL complex, found that X-linked genes generally show dosage compensation but there is a generalized inverse effect on the autosomes, which could account for the detrimental effects of metafemales (Birchler et al., 1989; Sun et al., 2013b). An examination in metafemales of alleles of the white eye color gene that do or do not exhibit dosage compensation in males, showed the same response, namely, increased expression if there was no dosage compensation in males and no difference from normal females for the male dosage-compensated alleles (Birchler, 1992). This experiment demonstrated a relationship between the mechanism of dosage compensation in males and metafemales and implicated the inverse dosage effect in both. An involvement of the inverse effect in Drosophila dosage compensation provides an explanation for how the five levels of gene expression can be explained (Birchler, 1996), whereas an all-or-none presence of a complex on the X does not. The stoichiometric relationship of regulatory gene products provides a means to read the relative dosage at multiple doses to produce the appropriate inverse level.

      What then is the function of the MSL complex? It was discovered that the MSL complex will actually constrain the effect of H4 lysine16 acetylation to prevent it from causing overexpression of genes (Bhadra et al., 1999; 2000; Pal Bhadra et al., 2005; Sun and Birchler 2009; Sun et al., 2013a). Indeed, in the chromatin remodeling Imitation Switch (ISWI) mutants, the male X chromosome was specifically overexpressed suggesting that its normal function is needed for the constraint to occur (Pal Bhadra et al., 2005). Independently, the Mtor nuclear pore component shows a similar specific male X upregulation when Mtor is knocked down and this effect was shown to operate on the transcriptional level (Aleman et al., 2021). Interestingly, the increased expression of the X in the Mtor knockdown is accompanied by an inverse modulation of a substantial subset of autosomal genes, illustrating why the constraining process evolved to counteract male X overexpression. The constraining effect might involve a number of gene products (Birchler, 2016) and is an interesting direction for further study.

      Furthermore, when the H4Lys16 acetylase was individually targeted to reporter genes, there was an increase in expression (Sun et al., 2013a). However, when other members of the MSL complex were present in normal males or ectopically expressed, this increase did not occur (Sun et al., 2013a). It thus appears that the function of the MSL complex is to sequester the acetylase from the autosomes and constrain it on the X (Bhadra et al., 1999; 2000; Pal Bhadra et al., 2005; Sun and Birchler, 2009; Sun et al., 2013a). Indeed, in the Mtor knockdowns, the X-linked genes with the greatest upregulation were those with the greatest association with the acetylase and the H4K16ac histone mark (Aleman et al 2021), supporting the idea of a constraining activity that becomes released in the Mtor knockdown. When the MSL complex is disrupted, there is an inverse effect on the autosomes that occurs but in normal circumstances the sequestration mutes this effect. The MSL complex disruption releases the acetylase to be uniformly distributed across all chromosomes as determined cytologically (Bhadra et al., 1999) or via ChIPseq for H4Lys16ac (Valsecchi et al., 2021a). Indeed, the quantity of the H4Lys16ac mark only has a proportional effect on gene expression when the constraining activity is disrupted (Aleman et al., 2021) or when the MSL complex is not present (Sun et al., 2013a). Thus, in normal flies, there is a more or less equalized expression of the X and autosomes despite the monosomy for 20% of the genome.

      The component of the complex that is expressed in males and thought to organize the complex to the male X, MSL2, was recently found to also be associated with autosomal dosage-sensitive regulatory genes (Valsecchi et al., 2018). MSL2 was found to modulate these autosomal dosage-sensitive genes in various directions, which illustrates that MSL2 has a role in dosage balance that goes beyond the X chromosome. This finding is consistent with the evolutionary scenario that the initial attraction of the complex to the X chromosome was to upregulate dosage-sensitive genes in hemizygous regions as the progenitor Y became deleted for them, with the constraining activity evolving to prevent an overexpression as the amount of acetylase on the male X increased with time (Birchler, 2016).

      The MSL hypothesis takes an X-centric view that does not accommodate what is now known about dosage effects across the whole genome. The idea that dissolution of the MSL complex would cause a reduction in expression of the male X-linked genes without any consequences for the autosomes is not consistent with current knowledge of gene regulatory networks and their dosage sensitivity. Indeed, the finding of dosage compensation in large autosomal aneuploids that operates on the transcriptional level (Devlin et al., 1982; 1984; Birchler et al., 1990; Sun et al., 2013c), as well as a predominant inverse effect by the same (Devlin, et al., 1988; Birchler et al., 1990), argues that one must consider the inverse effect for an understanding of the evolution of dosage compensation in Drosophila (and other species). Further discussion of models of Drosophila compensation has been published (Birchler, 2016).

      What is likely to be the most critical issue with sex chromosome evolution is the consequences for dosage-sensitive regulatory genes. This fact is nicely illustrated by the retention of these types of genes in different independent vertebrate sex chromosome evolutions (Bellott and Page, 2021). In Drosophila, by contrast, dosage compensation is more of a blanket effect on most but not all X-linked genes despite the fact that many genes on the X are unlikely to have dosage detrimental effects, although dosage-sensitive genes might have played a role as noted above. The particularly large size of the X in Drosophila compared to the whole genome is potentially a contributing factor because such a large genomic imbalance is likely to modulate most genes across the genome. Also, there is no evidence of a WGD in Drosophila as there is in other species for which the inverse effect has been documented (maize, Arabidopsis, yeast, mice, human). These other species have various numbers of retained duplicate dosage-sensitive regulatory genes from WGDs. Thus, the relative change of regulatory genes in aneuploids in these species will not be as great compared to some of their interactors in the remainder of the genome, which could result in lesser magnitudes of some trans-acting effects, similar to how aneuploids in ascending ploidies have fewer effects as described above. The absence of duplicate regulatory genes in Drosophila would predict a stronger inverse effect in general and that could have been capitalized upon to produce dosage compensation of most genes on the X chromosome despite many of them not being dosage critical. While sex chromosome evolution must accommodate dosage-sensitive genes for proper development and viability, it could also be capitalized upon to evolve sexual dimorphisms in expression (Sun et al., 2013c)."

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      I have only a few comments that I think will improve the manuscript and help readers better appreciate the context of the reported results.

      We would like to thank the Reviewer for their time in reviewing our manuscript. We appreciate the helpful feedback and assistance in ensuring the highest quality publication possible.

      One paradox, that the authors point out, is that the drastic effects of TALK-1 L114P on plasma membrane potential do not result in a complete loss of insulin secretion. One important consideration is the role of intracellular stores in insulin secretion at physiological levels of hyperglycemia. This needs to be discussed more thoroughly, especially in the light of recent papers like Postic et al 2023 AJP and others. The authors do show an upregulation of IP3-induced Ca release. It is not clear whether they think this is a direct or indirect effect on the ER. Is there more IP3? More IP3R? Are the stores more full?

      The reviewer brings up an important point. Although we see a significant reduction in glucose-stimulated depolarization in most islets from TALK-1 L114P mice, some glucose-stimulated calcium influx is still present (especially from female islets); this suggests that a subset of islet β-cells are still capable of depolarization. Because our original membrane potential recordings were done in whole islets without identification of the cell type being recorded, we have now repeated these electrical recordings in confirmed β-cells (see Supplemental figure 6). The new data shows that 33% of TALK-1 L114P β-cells show action potential firing in 11 mM glucose, which would be predicted to stimulate insulin secretion from a third of all TALK-1 L114P β-cells; this could be responsible for the remaining glucose-stimulated insulin secretion observed from TALK-1 L114P islets. However, ER calcium store release could also allow for some of the calcium response in the TALK-1 L114P islets. We have now detailed this in the discussion; this now details the Postic et al. study showing that glucose-stimulated beta-cell calcium increases involve ER calcium release as it occurs in the presence of voltage-dependent calcium channel inhibition. Future studies can assess this using SERCA inhibitors and determining if glucose-stimulated calcium influx in TALK-1 L114P islets is lost. We also find that muscarinic-stimulated calcium influx from ER stores is greater in TALK-1 L114P mice. We currently do not have data to support the mechanism for this enhancement of muscarinic-induced islet calcium responses from islets expressing TALK-1 L114P. Our hypothesis is that greater TALK-1 current on the ER membrane is enhancing ER calcium release in response to IP3R activation. There is an equivalent IP3R expression in control and TALK-1 L114P islets based on transcriptome analysis, which is now included in the manuscript. However, whether there is greater IP3 production, greater ER calcium storage, and/or greater ER calcium release requires further analysis. Because this finding was not directly related to the metabolic characterization of this TALK-1 L114P MODY mutation, we are planning to examine the ER functions of TALK-1 L114P thoroughly in a future manuscript.

      The authors point to the possible roles of TALK-1 in alpha and delta cells. A limitation of the global knock-in approach is that the cell type specificity of the effects can't easily be determined. This should be more explicitly described as a limitation.

      We thank the reviewer for this suggestion and have added this to the discussion. This is now included in a paragraph at the end of the discussion detailing the limitations of this manuscript.

      The official gene name for TALK-1 is KCNK16. This reviewer wonders whether it wouldn't be better for this official name to be used throughout, instead of switching back and forth. The official name is used for Abcc8 for example.

      We thank the reviewer for this suggestion and have revised the manuscript to include Kcnk16 L114P. The instances of TALK-1 L114P that remain in the manuscript are in cases where the text specifically discusses TALK-1 channel function.

      There are several typos and mistakes in editing. For example, on page 5 it looks like "PMID:11263999" has not been inserted. I suggest an additional careful proofreading.

      We have revised this reference, thoroughly proofread the revised manuscript, and corrected typos.

      The difference in lethality between the strains is fascinating. Might be good to mention other examples of ion channel genes where strain alters the severe phenotypes? Additional speculation on the mechanism could be warranted. It also offers the opportunity to search for genetic modifiers. This could be discussed.

      We thank the reviewer for this suggestion and have added details on mutations where strain alters lethality.

      The sex differences are interesting. Of course, estrogen plays a role as mentioned at the bottom of page 16, but there have been more involved analyses of islet sex differences, including a recent paper from the Rideout group. Is there a sex difference in the islet expression of KCNK16 mRNA or protein, in mice or humans?

      We thank the reviewer for the important comments on the TALK-1 L114P sex differences. We have revised the manuscript to include greater discussion about female β-cell resilience to stress, which may allow greater insulin secretion in the presence of the TALK-1 L114P channels; this is based on the Brownrigg et al. study pointed out by the reviewer (PMID: 36690328). Because these sex differences in islet function were examined in mice, we looked at KCNK16 expression in mouse beta-cells. There is a trend for greater KCNK16 expression in sorted male beta-cells (average RPKM 6296.25 +/- 953.84) compared to sorted female beta-cells (average RPKM 5148.25 +/- 1013.22). Similarly, there was a trend toward greater KCNK16 expression in male HFD-treated mouse beta-cells (average RPKM 8020.75 +/- 1944.41) compared to female HFD-treated mouse beta-cells (average RPKM 7551 +/- 2952.70). We have now added this to the text.

      Page 15-16 "Indeed, it has been well established that insulin signaling is required for neonatal survival; for example, a similar neonatal lethality phenotype was observed in mice without insulin receptors (Insr-/-) where death results from hyperglycemia and diabetic ketoacidosis by P3 (40)." Formally, the authors are not examining insulin signaling. A better comparison is that of the Ins1/Ins2 double knockout model of complete hypoinsulinemia.

      We thank the reviewer for suggesting this as the appropriate comparison model and have now revised the manuscript to detail the 48-hour average life expectancy of Ins1/Ins2 double knockout mice (PMID: 9144203).

      There are probably too many abbreviations in the paper, making it harder to read by nonspecialists. I recommend writing out GOF, GSIS, WT, K2P, etc.

      We thank the reviewer for this suggestion and have revised the manuscript to reduce the use of most abbreviations.

      Reviewer #2:

      We would like to thank the Reviewer for their time in reviewing our manuscript. We appreciate the helpful feedback and assistance in ensuring the highest quality publication possible. We have thoroughly addressed all the reviewer’s comments and revised the manuscript accordingly. These changes have strengthened the manuscript and are summarized below.

      (1) The authors perform an RNA-sequencing showing that the cAMP amplifying pathway is upregulated. Is this also true in humans with this mutation? Other follow-up comments and questions from this observation:

      a) Will this mean that the treatment with incretins will improve glucose-stimulated insulin secretion and Ca2+ signalling and lower blood glucose? The authors should at least present data on glucose-stimulated insulin secretion and/or Ca2+ signalling in the presence of a compound increasing intracellular cAMP.

      b) Will an OGTT give different results than the IPGTT performed due to the fact that the cAMP pathway is upregulated?

      c) Is the increased glucagon area and glucagon secretion a compensatory mechanism that increases cAMP? What happens if glucagon receptors are blocked?

      We thank the reviewer for the suggestions. Although cAMP pathways were upregulated in the TALK-1 L114P islets, the changes in expression were only modest as examined by qRT-PCR. Thus, we are not sure if this plays a role in secretion. Only a small number of human patients with this mutation have been identified, and no islets have been isolated from these patients. Therefore, we do not know whether the cAMP amplifying pathway is upregulated in humans with the MODY-associated TALK-1 L114P mutation. We have performed the suggested experiment assessing calcium from TALK-1 L114P islets in response to liraglutide (see Supplemental figure 10); there was no liraglutide response in TALK-1 L114P islets. We have also performed the OGTT experiments as suggested and these have now been added to the manuscript (see Supplemental figure 3). We do not believe that the increased glucagon is a compensatory response, because: 1. TALK-1 deficient islets have less glucagon secretion due to reduced SST secretion (see PMID: 29402588); 2. There is no change in insulin secretion at 7 mM glucose, however, glucagon secretion is significantly elevated from islets isolated from TALK-1 L114P mice; 3. TALK-1 is highly expressed in delta-cells, and in these cells TALK-1 L114P would be predicted to cause significant hyperpolarization and significant reductions in calcium entry as well as SST secretion. Thus, reduced SST secretion may be responsible for the elevation of glucagon secretion. We plan to investigate delta-cells within islets from TALK-1 L114P mice in future studies to determine if changes in SST secretion are responsible for the elevated glucagon secretion from TALK-1 L114P islets.

      (2) The performance of measurements in both male and female mice is praiseworthy. However, despite differences in the response, the authors do not investigate the potential reason for this. Are hormonal differences of importance?

      We thank the reviewer for this important point. It is indeed becoming clear that there are many differences between male and female islet function and responses to stress. Thus, we have revised the manuscript to include greater discussion about these differences such as female β-cell resilience to stress, which may allow greater insulin secretion in the presence of the TALK-1 L114P channels; this is based on the Brownrigg et al. study pointed out by reviewer 1 (PMID: 36690328). While the differences in islet function and GTT between male and female L114P mice are clear, they both show diminished islet calcium handling, defective hormone secretion, and development of glucose intolerance. This manuscript was intended to demonstrate how the MODY-causing TALK-1 L114P mutation causes glucose dyshomeostasis, which we have determined in both male and female mice. The differences between male and female mice and islets with TALK-1 L114P could be due to multiple potential causes (as detailed in PMID: 36690328). We therefore believe that comprehensive studies are required to thoroughly determine how the TALK-1 L114P mutation differently impacts male and female mice and islets, which we plan to complete in a future manuscript.

      (3) MINOR: Page 5 .." channels would be active at resting Vm PMID:11263999.." The actual reference has not been added using the reference system.

      We thank the reviewer for noticing this mistake, which has now been corrected.

      Reviewer #3:

      The manuscript is overall clearly presented and the experimental data largely support the conclusions. However, there are a number of issues that need to be addressed to improve the clarity of the paper.

      We would like to thank the Reviewer for their time in reviewing our manuscript. We appreciate the helpful feedback and assistance in ensuring the highest quality publication possible. We have thoroughly addressed all the reviewer’s comments and revised the manuscript accordingly. These changes have strengthened and improved the clarity of the manuscript.

      Specific comments:

      (1) Title: The terms "transient neonatal diabetes" and "glucose dyshomeostasis in adults" are used to describe the TALK-1 L114P mutant mice. Transient neonatal diabetes gives the impression that diabetes is resolved during the neonatal period. The authors should clarify the criteria used for transient neonatal diabetes, and the difference between glucose dyshomeostasis and MODY. Longitudinal plasma glucose and insulin data would be very informative and help readers to follow the authors' narrative.

      We appreciate the helpful comment and have added longitudinal plasma glucose from neonatal mice to address this (see Supplemental figure 2). The new data shows that the TALK-1 L114P mutant mice undergo transient hyperglycemia that resolves by P10 and then occurs again at week 15. Insulin secretion from P4 islets is also included and shows that male animals homozygous for the TALK-1 L114P mutation have the largest impairment in glucose-stimulated insulin secretion, followed by male heterozygous TALK-1 L114P P4 islets that also have impaired insulin secretion (see Figure 1). The amount of hyperglycemia correlates with the defects in neonatal islet insulin secretion.

      (2) Another concern for the title is the term "α-cell overactivity." This could be taken to mean that individual α-cells are more active and/or that there are more α-cells to secrete glucagon. The study does not provide direct evidence that individual α-cells are more active. This should be clarified.

      We appreciate the helpful comment and have revised the manuscript title accordingly.

      (3) In the Introduction, it is stated that because TALK-1 activity is voltage-dependent, the GOF mutation is less likely to cause neonatal diabetes, yet the study shows the L114P TALK-1 mutation actually causes neonatal diabetes by completely abolishing glucose-stimulated Ca2+ entry. This seems to imply TALK-1 activity (either in the plasma membrane or ER membrane) has more impact on Vm or cytosolic Ca2+ in neonates than initially predicted. Some discussion on this point is warranted.

      These are important points and we have added details to the discussion about this. For example, the discussion now states that, “This suggests a greater impact of TALK-1 L114P in neonatal islets compared to adult islets. Future studies during β-cell maturation are required to determine if TALK-1 activity is greater on the plasma membrane and/or ER membrane compared with adult β-cells.” The introduction has also been revised to clarify the voltage-dependence of TALK-1.

      (4) What is the relative contribution of defects in plasma membrane depolarization versus ER Ca2+ handling on defective insulin secretion response?

      We thank the reviewer for bringing up this important point. TALK-1 L114P islets show blunted glucose-stimulated depolarization and glucose-stimulated calcium entry; however, the L114P islets show equivalent Ca2+ entry as control islets in response to high KCl (Figure 5G, H). As the KCl-stimulated Ca2+ influx is similar between control and TALK-1 L114P islets, this indicates that plasma membrane TALK-1 L114P has a hyperpolarizing role that significantly blunts glucose-stimulated depolarization and reduces activation of voltage-dependent calcium channels. We have further tested this by looking at glucose-stimulated β-cell membrane potential depolarization in TALK-1 L114P islets, which is significantly blunted (Figure 4A and B; Supplemental figure 6). However, 33% of TALK-1 L114P β-cells showed glucose-stimulated electrical excitability (Supplemental figure 6), which likely accounts for the modest GSIS from TALK-1 L114P islets. New data has also been included showing that KCl stimulation causes a significant depolarization of β-cells from TALK-1 L114P islets (Supplemental figure 6). Because plasma membrane TALK-1 L114P is largely responsible for the hyperpolarized membrane potential and blunted glucose-stimulated Ca2+ entry, this suggests that TALK-1 L114P on the plasma membrane is primarily responsible for the altered insulin secretion. The discussion has been revised to reflect this.

      (5) The Jacobson group has previously shown that another K2P channel TASK-1 is also involved in ER Ca2+ homeostasis and that TASK inhibitors restored ER Ca2+ in TASK-1 expressing cells. Is TASK-1 expressed in β-cell ER membrane? Can the mishandling of Ca2+ caused by TALK-1 L114P be reversed by TASK-1 inhibitors?

      We thank the reviewer for bringing up this important point in relation to ER calcium handling by K2P channels. We have found that TASK-1 channels expressed in alpha-cells enhance ER calcium release and that inhibitors of TASK-1 channels elevate alpha-cell ER calcium storage. We did not observe any significant changes in the gene (Kcnk3) encoding TASK-1 between islets from control or TALK-1 L114P mice, which has now been added to the manuscript. However, because the TALK-1 L114P-mediated reduction of glucose-stimulated depolarization and inhibition of calcium entry are both prevented in the presence of high KCl (see Figure X), this strongly suggests that TALK-1 L114P K+ flux at the membrane is hyperpolarizing the membrane potential and limiting depolarization and calcium entry. This suggests that TALK-1 L114P control of ER calcium handling is not the primary contributor to the blunted glucose-stimulated calcium handling. Furthermore, acetylcholine stimulation of islets from both control and TALK-1 L114P mice elicited ER calcium release, which indicates that for the most part ER calcium release is still responsive to the cues that control release, although these responses are altered. Taken together, this suggests that the TALK-1 L114P impact on ER calcium is not the primary mediator of blunted glucose-stimulated islet calcium entry and insulin secretion.

      (6) The electrical recording experiments were conducted using whole islets. The authors should comment on how the cells were identified as β-cells, especially in mutant islets in which there is an increased number of α-cells.

      The reviewer brings up an important point. As indicated, the original membrane potential recordings were conducted using whole islets. While the recorded cells could mostly be β-cells based on mouse islets typically containing >80% β-cells, there is a possibility that some of the cells included in these recordings were α-cells or δ-cells (especially because of the noted α-cell hyperplasia in TALK-1 L114P islets). Thus, we have now included data from β-cells that were identified with an adenoviral construct containing a rat insulin promoter driving a fluorescent reporter. This allowed the fluorescent β-cells to be monitored with electrophysiological membrane potential recordings. The new data (see Supplemental figure 6) show a significant reduction in glucose-stimulated depolarization in 67% of β-cells with the L114P mutation compared to controls.

      Minor:

      (1) Some references need formatting.

      The references have been revised accordingly.

      (2) Please define glucose-stimulated phase 0 Ca2+ response for non-expert readers.

      This has been defined accordingly.

      (3) Page 14 bottom: The sentence "Unlike the only other MODY-associated.........., TALK-1 is not inhibited by sulfonylureas" seems out of place and lacks context.

      We thank the reviewer for this suggestion and have deleted this sentence.

      (4) Figure 6: It would be helpful to provide a protein name for the genes shown in panel D.

      The protein names for the genes have now been included in the discussion of these genes.

    2. Reviewer #2 (Public Review):

      Summary:

      This work follows previous work from the group where they have demonstrated the role of TASK-1 in the regulation of glucose-stimulated insulin secretion. Moreover, a recent study links a mutation in KCNK16, the gene encoding TALK-1 channels, to MODY. Here the authors have constructed a mouse model with the specific mutation (TALK-1 L114P mutation) and investigated the phenotype. They have to perform a couple of breeding tricks to find a model that is lethal in adult which might complicate the conclusions; however, the heterozygote model used has a MODY-like phenotype. The study is convincing and solid.

      Strengths:

      (1) The work is a natural follow-up from previous studies from the groups.

      (2) The authors present convincing and solid data that in the long perspective will help patients with this mutation.

      (3) Both in vivo and in vitro data are presented to give the full picture of the phenotype.

      (4) Data from both female and male mice are presented.

      Weaknesses:

      The authors have answered all my comments in the revised version and I find no more weaknesses. Some questions still remain but have been clearly discussed in the new version of the manuscript.

    3. eLife assessment

      This study characterizes how a point mutation in the TALK-1 potassium channel, encoded by the KCNK16 gene, causes MODY diabetes. The mutation, L114P, causes a gain-of-function to increase K+ currents and inhibit glucose-stimulated insulin secretion. Increased glucagon likely results from paracrine effects in the islets. The data are convincing and the work will be valuable for understanding islet function.

    4. Reviewer #1 (Public Review):

      Summary:

      This paper focuses on the effects of a L114P mutation in the TALK-1 channel on islet function and diabetes. This mutation is clinically relevant and a cause of MODY diabetes. This work employs a mouse model with heterozygous and homozygous mutants. The homozygous mice are homozygous lethal from severe hyperglycemia. The work shows that the mutation increases K+ currents and inhibits insulin secretion. This is a very nice paper with mechanistic insight and clear clinical importance. It is generally well written and the data is well presented.

      Comments on revision:

      I have no further comments to add at this time. The authors have adequately addressed my concerns.

    5. Reviewer #3 (Public Review):

      Summary

      The L114P gain of function mutation in the K2P channel TALK-1 encoded by KCNK16 has been associated with maturity-onset diabetes of the young (MODY). In this study, Nakhe et al. generated mice carrying L114P TALK-1 and evaluated the impact of the mutation on pancreatic islet functions and glucose homeostasis. The authors report that the mutation increases neonatal lethality, owing to hyperglycemia caused by a lack of glucose-stimulated Ca2+ influx and insulin secretion. Adult mutant mice showed glucose intolerance and fasting hyperglycemia, which is attributed to blunted glucose-stimulated insulin secretion as well as increased glucagon secretion. Interestingly, male mice were more affected than female mice. Islets from adult mutant mice were found to have reduced Ca2+ entry upon glucose stimulation but also enhanced IP3-induced ER Ca2+ release, consistent with previous studies from the group showing a role of TALK-1 in ER Ca2+ homeostasis. Finally, comparison of bulk RNA sequencing results from WT and mutant islets revealed altered expression of genes involved in β-cell identity, function and signaling, which also contributes to the observed islet dysfunction.

      Strengths

      This is a well-executed and rigorous study that will be of great interest to the diabetes and islet biology communities. The findings provide convincing evidence supporting a causal role of the L114P gain of function TALK-1 mutation in glucose-stimulated insulin secretion defects and diabetes. The neonatal diabetes phenotype and the gender difference uncovered by the study have important clinical implications. The complexity of TALK-1 expression and hormone secretion in different endocrine cell types and how it impacts glucose homeostasis is elegantly illustrated in the L114P TALK-1 mouse model. The authors carefully and thoroughly addressed limitations of their study and discussed future directions. The importance of TALK-1 in β-cell and islet function demonstrated by this study will prompt future efforts targeting this important channel for diabetes treatment.

    1. Author response:

      The following is the authors’ response to the original reviews.

      We appreciate the thoughtful review of our manuscript by the reviewers, along with their valuable suggestions for enhancing our work. In response to these suggestions, we conducted additional experiments and made significant revisions to both the text and figures. In the following sections, we first highlight the major changes made to the manuscript, and thereafter address each reviewer's comments point-by-point. We hope these additional data and revisions have improved the robustness and clarity of the study and manuscript. Please note that as part of a suggested revision we have changed the manuscript title to be: Bacterial vampirism mediated through taxis to serum.

      Major revisions and new data:

      (1) We conducted additional experiments testing taxis to serum using a swine ex vivo enterohemorrhagic lesion model in which we competed wildtype versus chemotaxis-deficient strains (Fig. 8). We selected swine for these experiments due to their similarity in gastrointestinal physiology to humans. In these experiments we see that chemotaxis, and the chemoreceptor Tsr, mediate localization to, and migration into, the lesion. We also tested, and confirmed, taxis to serum from swine and serum from horse, supporting that serum attraction is relevant in other host-pathogen systems.

      (2) We present additional experimental data and quantification of chemotaxis responses to human serum treated with serine-racemase (Fig. S3). This treatment reduces wildtype chemoattraction and the wildtype no longer possesses an advantage over the tsr strain, providing further evidence that L-serine is the specific chemoattractant responsible for Tsr-mediated attraction to serum.

      (3) We present additional data in the form of 17 videos of chemotaxis experiments with norepinephrine and DHMA showing null-responses under various conditions. These data provide additional support to the conclusion that these chemicals are not responsible for bacterial attraction to serum. We have included these raw data as a new supplementary file (Data S1) for those in the field that are interested in these chemicals.

      (4) Based on comments from Reviewer 2 regarding whether the position of the ligand and ligand-binding site residues in the previously-reported EcTsr LBD structure are incorrect, or whether these differences are due to the proteins being from different organisms, we performed paired crystallographic refinements to determine which positions result in model improvement (Fig. 7J). Altering the EcTsr structure to have the ligand and ligand-binding site positions from our new higher resolution and better-resolved structure of Salmonella Typhimurium Tsr results in a demonstrably better model, with both Rwork and Rfree lower by about 1% (Fig. 7J). These data support our conclusion that the correct positions for both structures are as we have modeled them in the S. Typhimurium Tsr structure. We also solved an additional crystal structure of SeTsr LBD captured at neutral pH (7-7.5) that confirms our structure captured with elevated pH (7.5-9.7) has no major changes in structure or ligand-binding interactions (Fig. S6, Table S2).

      (5) Based on comments from Reviewer 2 on the accuracy of the diffusion calculations, we present a new analysis (Fig. S2) comparing the experimentally-determined diffusion of A488 compared to its calculated diffusion. We found that:

      [line 111]: “As a test case of the accuracy of the microgradient modeling, we compared our calculated values for A488 diffusion to the normalized fluorescence intensity at time 120 s. We determined the concentration to be accurate within 5% over the distance range 70-270 µm (Fig. S2). At smaller distances (<70 µm) the measured concentration is approximately 10% lower than that predicted by the computation. This could be due to advection effects near the injection site that would tend to enhance the effective local diffusion rate.”

      (6) Both reviewers asked us to better justify why we focused on the chemoreceptor Tsr, and had questions about why we did not investigate Tar. The low concentration of Asp in serum suggests Tar could have some effect, but less so than Trg or Tsr (see Fig. 4A). We have revised the text throughout to better convey that we agree multiple chemoreceptors are involved in the response and clarify our rationale for studying the role of Tsr:

      [line 178]: “We modeled the local concentration profile of these effectors based on their typical concentrations in human serum (Fig. 4B). Of these, by far the two most prevalent chemoattractants in serum are glucose (5 mM) and L-serine (100-300 µM) (Fig. 4B-F). This suggested to us that the chemoreceptors Trg and/or Tsr could play important roles in serum attraction.”

      [line 186]: “Since tsr mutation diminishes serum attraction but does not eliminate it, we conclude that multiple chemoattractant signals and chemoreceptors mediate taxis to serum. To further understand the mechanism of this behavior we chose to focus on Tsr as a representative chemoreceptor involved in the response, presuming that serum taxis involves one, or more, of the chemoattractants recognized by Tsr that is present in serum: L-serine, NE, or DHMA.”

      [line 468] “Serum taxis occurs through the cooperative action of multiple bacterial chemoreceptors that perceive several chemoattractant stimuli within serum, one of these being the chemoreceptor Tsr through recognition of L-serine (Fig. 4).”
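
      The comparison in point (5) between calculated and measured A488 diffusion can be illustrated with a minimal sketch. It assumes a continuous point source releasing into an unbounded three-dimensional medium, for which the standard solution is C(r, t) = q / (4πDr) · erfc(r / (2√(Dt))). This is not necessarily the model used in the manuscript; the function names, the diffusion coefficient, and the release rate below are illustrative assumptions, not values from the study.

```python
import math


def point_source_concentration(r_um, t_s, d_um2_per_s, q_per_s):
    """Concentration at distance r_um (µm) and time t_s (s) from the source.

    Hypothetical model: continuous point source, free diffusion in an
    unbounded 3D medium. d_um2_per_s is the diffusion coefficient (µm^2/s)
    and q_per_s is the release rate (amount per second).
    """
    if r_um <= 0 or t_s <= 0:
        raise ValueError("r_um and t_s must be positive")
    # Long-time (steady-state) profile, which falls off as 1/r.
    steady_state = q_per_s / (4.0 * math.pi * d_um2_per_s * r_um)
    # Time-dependent correction: erfc approaches 1 as t grows.
    return steady_state * math.erfc(r_um / (2.0 * math.sqrt(d_um2_per_s * t_s)))


def relative_deviation(measured, predicted):
    """Signed fractional difference between a measurement and the model."""
    return (measured - predicted) / predicted


# Assumed values for illustration only (not taken from the manuscript).
D = 400.0  # µm^2/s, an assumed order of magnitude for a small dye
Q = 1.0    # arbitrary release rate
profile = {r: point_source_concentration(r, 120.0, D, Q) for r in (70, 150, 270)}
```

      With measured, normalized intensities in hand, `relative_deviation` expresses the kind of agreement quoted above (within 5% at 70-270 µm, about 10% lower closer to the source) as a signed fraction at each distance.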

      Point-by-point responses to reviewer comments:

      Reviewer #1:

      (1) Presumably in the stomach, any escaping serum will be removed/diluted/washed away quite promptly? This effect is not captured by the CIRA assay but perhaps it might be worth commenting on how this might influence the response in vivo. Perhaps this could explain why, even though the chemotaxis appears rapid and robust, cases of sepsis are thankfully relatively rare.

      To clarify, the Enterobacteriaceae species we have tested here are colonizers of the intestines, not the stomach, and cases of bacteremia from these species are presumably due to bloodstream entry through intestinal lesions. Whether or not intestinal flow acts as a barrier to bloodstream entry is not something we test here, and so we have not commented on this idea in the manuscript. We do demonstrate that attraction to serum occurs within seconds-to-minutes of exposure. We expect that the major protective effects against sepsis are the host antibacterial factors in serum, which are well-described in other work. We have been careful to state throughout the text that we see attraction responses, and growth benefits, to serum that is diluted in an aqueous media, which is different than bacterial growth in 100% serum or in the bloodstream.

      (2) The authors refer to human serum as a chemoattractant numerous times throughout the study (including in the title). As the authors acknowledge, human serum is a complex mixture and different components of it may act as chemoattractants, chemo-repellents (particularly those with bactericidal activities) or may elicit other changes in motility (e.g. chemokinesis). The authors present convincing evidence that cells are attracted to serine within human serum - which is already a well-known bacterial chemoattractant. Indeed, their ability to elucidate specific elements of serum that influence bacterial motility is a real strength of the study. However, human serum itself is not a chemoattractant and this claim should be re-phrased - bacteria migrate towards human serum, driven at least in part by chemotaxis towards serine.

      Throughout the text we have changed these statements, including in the title, to either be ‘taxis to serum’ or ‘serum attraction.’ On the timescales we tested, our data support that chemotaxis, not chemokinesis or other forms of directed motility, is what drives rapid serum attraction, since a motile but non-chemotactic cheY mutant cannot localize to serum (Fig. 4). We present evidence of one of these chemotactic interactions (L-Ser).

      (3) Linked to the previous point, several bacterial species (including E. coli - one of the bacterial species investigated here) are capable of osmotaxis (moving up or down gradients in osmolality). Whilst chemotaxis to serine is important here, could movement up the osmotic gradient generated by serum injection play a more general role? It could be interesting to measure the osmolality of the injected serum and test whether other solutions with similar osmolality elicit a similar migratory response. Another important control here would be to treat human serum with serine racemase and observe how this impacts bacterial migration.

      As addressed above, we have added additional experiments of serum taxis treated with serine racemase showing competition between WT and cheY, and WT and tsr (Fig. S3). These data support a role for L-serine as a chemoattractant driving attraction to serum. The idea of osmotaxis is interesting, but outside the scope of this work since we focus on chemoattraction to L-serine as one of the mechanisms driving serum attraction, and have multiple lines of evidence to support that.

      (4) The migratory response of E. coli looks striking when quantified (Fig. 6C) but is really unclear from looking at Panel B - it would be more convincing if an explanation was offered for why these images look so much less striking than analogous images for other species (E.g. Fig. 6A).

      We agree that the E. coli taxis to serum response is less obvious. We have brightened those panels to hopefully make it clearer to interpret (more cells in field of view over time). Also, as stated in the y-axes of these plots, this quantification was performed by enumerating the number of cells in the field of view, and the Citrobacter and Escherichia responses are shown on separate y-axes (now Fig. 8C). As indicated, the experiments have different numbers of starting motile cells, which we presume accounts for the difference in attraction magnitude. When investigating diverse bacterial systems we found there to be differences in motility under the culturing and experimental conditions we employed, for multiple reasons, and so for these data we thought it best to report raw cell numbers rather than data normalized to the starting number of bacteria, as we do elsewhere. In the specific case of these E. coli responding to serum, please view Supplementary Movie S3, which clearly shows both the attraction response and that the bacteria grew in a longer, semi-filamentous form that seems to impair their swimming speed.

      (5) It is unclear why the fold-change in bacterial distribution shows an approximately Gaussian shape with a peak at a radial distance of between 50 -100 um from the source (see for example Fig. 2H). Initially, I thought that maybe this was due to the presence of the microcapillary needle at the source, but the CheY distribution looks completely flat (Fig. 3I). Is this an artifact of how the fold-change is being calculated? Certainly, it doesn't seem to support the authors' claim that cells increase in density to a point of saturation at the source. Furthermore, it also seems inappropriate to apply a linear fit to these non-linear distributions (as is done in Fig. 2H and in the many analogous figures throughout the manuscript).

      We have revised the text to address this point, and removed the comment about cells increasing in density to a point of saturation: [Line 138] “We noted that in some experiments the population peak is 50-75 µm from the source, possibly due to a compromise between achieving proximity to nutrients in the serum and avoidance of bactericidal serum elements, but this behavior was not consistent across all experiments. Overall, our data show S. enterica serovars that cause disease in humans are exquisitely sensitive to human serum, responding to femtoliter quantities as an attractant, and that distinct reorganization at the population level occurs within minutes of exposure (Fig. 3, Movie 2).”

      We can confirm that this is not an artifact of quantification. Please refer to the videos of these responses, which demonstrate this point (Movies 1-5).

      (6) The authors present several experiments where strains/ serovars competed against each other in these chemotaxis assays. As mentioned, these are a real strength of the study - however, their utility is not always clear. These experiments are useful for studying the effects of competition between bacteria with different abilities to climb gradients.

      However, to meaningfully interpret these effects, it is first necessary to understand how the different bacteria climb gradients in monoculture. As such, it would be instructive to provide monoculture data alongside these co-culture competition experiments.

      Thank you for this suggestion. We agree that the coculture experiments showing strains competing for the same source of effector give a different perspective than monoculture. These experiments allow us to confirm taxis deficiencies or advantages with greater sensitivity, and ensure that the bacteria in competition have experienced the same gradient. This type of competition experiment is often used in in vivo experimentation for the same advantages. We note that in the gut the bacteria are not in monoculture and chemotactic bacteria do have to compete against each other for access to nutrients. Repeating all of the experiments we present to show both the taxis responses in coculture and monoculture would be an extraordinary amount of work that we do not believe would meaningfully change the conclusions of this study.

      (7) Linked to the above point, it would be especially instructive to test a tsr mutant's response in monoculture. Comparing the bottom row of Fig. 3G to Fig. 3I suggests that when in co-culture with a cheY mutant, the tsr mutant shows a higher fold-change in radial distribution than the WT strain. Fig. 4G shows that a tsr mutant can chemotaxis towards aspartate at a similar, but reduced rate to WT. This could imply that (like the trg mutant), a tsr mutant has a more general motility defect (e.g. a speed defect), which could explain why it loses out when in competition with the WT in gradients of human serum, but actually seems to migrate strongly to human serum when in co-culture with a cheY mutant. This should be resolved by studying the response of a tsr mutant in monoculture.

      Addressed above.

      (8) In Fig. 4, the response of the three clinical serovars to serine gradients appears stronger than the lab serovar, whilst in Fig. 1, the response to human serum gradients shows the opposite trend with the lab serovar apparently showing the strongest response. Can the authors offer a possible explanation for these slightly confusing trends?

      We suspect this relates to the fact that pure L-serine is a chemoattractant, whereas treatment with serum exposes the bacteria both to chemoattractants and, likely, chemorepellents. Strains may navigate the landscape of these stimuli differently for a variety of reasons that are not simple to tease apart. The final magnitude of change in bacterial localization depends on multiple factors including swimming speed, adaptation, sensitivity of chemoattraction, and cooperative signaling of the chemoreceptor nanoarray. Thus, we cannot state with certainty how and why these strains are different across all experiments, but we can state that they are attracted to both serum and L-serine.

      (9) In Fig. S2, it seems important to present quantification of the effect of serine racemase and the reported lack of response to NE and DHMA - the single time-point images shown here are not easy to interpret.

      As suggested, we present quantification of the serine racemase-treated samples (now Fig. S3). To assist in the interpretation of these max projections, Fig. S3 now notes the chemotactic response (chemoattraction for L-serine, null-response for NE/DHMA). Further, we revised the text to state: [line 209] “We observed robust chemoattraction responses to L-serine, evident by the accumulation of cells toward the treatment source (Fig. S3E, Movie 4), but no response to NE or DHMA, with the cells remaining randomly distributed even after 5 minutes of exposure (Fig. S3F-I, Movie 5, Movie S1).”

      (10) Importantly, the authors detail how they controlled for the effects of pH and fluid flow (Line 133-136). Did the authors carry out similar controls for the dual-species experiments where fluorescent imaging could have significantly heated the fluid droplet driving stronger flow forces?

      Most of our microfluidics experiments were performed in a temperature-controlled chamber (see Methods). Since the strains in the coculture experiments experienced the same experimental conditions, we have no evidence that fluorescence-imaging-induced temperature changes have impacted whether or not the bacteria are attracted to serum or the effectors we investigated.

      (11) The inference of the authors' genetic analysis combined with the migratory response of E. coli and C. koseri to human serum shown in Fig. 6 is that Tsr drives movement towards human serum across a range of Enterobacteriaceae species. The evidence for the importance of Tsr here is currently correlative - more causal evidence could be presented by either studying the response of tsr mutants in these two species (certainly these should be readily available for E. coli) or by studying the response of these two species to serine gradients.

      We have revised the text to state: [line 402] “Without further genetic analyses in these strain backgrounds, the evidence for Tsr mediating serum taxis for these bacteria remains circumstantial. Nevertheless, taxis to serum appears to be a behavior shared by diverse Enterobacteriaceae species and perhaps also Gammaproteobacteria priority pathogen genera that possess Tsr such as Serratia, Providencia, Morganella, and Proteus (Fig. 8B).”

      We note that other work has thoroughly investigated E. coli serine taxis.

      Figure Suggestions

      (1) Fig. 2 - The inset bar charts in panels H-J and the font size in their axes labels are too small - this suggestion also applies to all analogous figures throughout the manuscript.

      We have increased the size of the text for these inset plots. We have also broken up some of the larger figures.

      (2) Panel 2F - the cartoon bacterial cell and 'number of bacteria' are confusing and seem to contradict the y-axis label. This also applies to several other figures throughout the manuscript where the significance of this cartoon cell is quite hard to interpret.

      As suggested, we have removed this cartoon.

      (3) Panels G-I in Fig. 3 are currently tricky to interpret - it would be easier if the authors were to use three different colours for the three different strains shown across these panels.

      We have broken up Figure 2 (which also had these types of plots) so that hopefully these labels are more clear. For the Figure in question (now Fig. 4), due to the many figures and different types of data and comparisons it was difficult to find a color scheme for these strains that would be consistent across the manuscript. These colors also reflect the fluorescence markers. We note that not only do we use color to indicate the strain but also text labels.

      (4) Panels 3B-F would be best moved to a supplementary figure as this figure is currently very busy. Similarly, I would potentially consider presenting only the bottom row of panels in Panels G-I in the main figure (which would then be consistent with analogous data presented elsewhere).

      We have opted to keep these panels in the main text (now Fig. 4) as they are relevant to understanding (1) our justification for why to pursue certain chemoeffector-chemoreceptor interactions and not others, and (2) how the chemoattraction response can be understood both in terms of bacterial population distribution and relevant cells over time.

      (5) Fig. 4 and possibly elsewhere - perhaps best not to use Ser as an abbreviation for Serine here because it could potentially be confused with an abbreviation for serum.

      It is unfortunate that these two words are so similar. However, Ser is the canonical abbreviation for the amino acid serine. Serum does not have a canonical abbreviation.

      (6) Fig. 4 - I would move panels H - K to a separate supplementary figure - currently, they are too squished together and it is hard to make out the x-axis labels. I would also consider moving panels E-G to supplementary as well so that the microscopy images presented elsewhere in the figure can be presented at an appropriate size.

      Since we are allowed more figures, we could also break some of these figures up into multiple ones.

      (7) Similarly, I would move some panels from Fig. 5 to supplementary as the figure is currently quite busy.

      We have rearranged the figure (now Fig. 7) to move the bioinformatics data to Fig. 8 to allow more space for the panels.

      Other suggestions

      (8) Line 179 - how do the concentrations quote for serine and glucose compare to aspartate? This would be helpful to justify the authors' decision not to investigate Tar as a potential chemoreceptor.

      This is addressed in our comments above and in Fig. 4A and Fig. 4B-F. Human serum L-Asp is present at a much lower concentration (about 20-fold).

      (9) Line 282 - Serine levels in serum are quantified at 241 uM, but this is only discussed in the context of serum growth effects. Could this information be better used to design/ inform the serine gradients that were tested in chemotaxis assays?

      We tested a wide range of serine concentrations and show that sources of serine at much lower concentrations than are present in serum are sufficient for chemoattraction. Also, the K1/2 for serine is 105 uM (Fig. S4), which is surpassed by the concentration in serum (Fig. S5).

      (10) The word 'potent' in the title might be too vague, especially as the strength of the response varies between strains/species. It may perhaps be more useful to focus on the rapidity/sensitivity of the response. However, presumably the sensitivity of the response will be driven by the sensitivity of the response to serine (which is already known for E. coli at least). Also, as noted in the public review, human serum itself is not a chemoattractant so I would consider re-phrasing this in the title and elsewhere.

      As suggested, and discussed above, we have implemented this change.

      (11) Typo line 59 'context of colonizing of a healthy gut'.

      Addressed.

      (12) Typo line 538 - there is an extra full stop here.

      Addressed.

      Reviewer #2:

      (1) This study is well executed and the experiments are clearly presented. These novel chemotaxis assays provide advantages in terms of temporal resolution and the ability to detect responses from small concentrations. That said, it is perhaps not surprising these bacteria respond to serum as it is known to contain high levels of known chemoattractants, serine certainly, but also aspartate. In fact, the bacteria are shown to respond to aspartate and the tsr mutant is still chemotactic. The authors do not adequately support their decision to focus exclusively on the Tsr receptor. Tsr is one of the chemoreceptors responsible for observed attraction to serum, but perhaps, not the receptor. Furthermore, the verification of chemotaxis to serum is a useful finding, but the work does not establish the physiological relevance of the behavior or associate it with any type of disease progression. I would expect that a majority of chemotactic bacteria would be attracted to it under some conditions. Hence the impact of this finding on the chemotaxis or medical fields is uncertain.

      We agree that the data we show are mostly mechanistic and further work is required to learn whether this bacterial behavior is relevant in vivo and during infections. We present new data using an ex vivo intestinal model which supports the feasibility of serum taxis mediating invasion of enterohemorrhagic lesions (Fig. 8).

      (2) The authors also state that "Our inability to substantiate a structure-function relationship for NE/DHMA signaling indicates these neurotransmitters are not ligands of Tsr." Both norepinephrine (NE) and DHMA have been shown previously by other groups to be strong chemoattractants for E. coli (Ec), and this behavior was mediated by Tsr (e.g. single residue changes in the Tsr binding pocket block the response). Given the 82% sequence identity between the Se and Ec Tsr, this finding is unexpected (and potentially quite interesting). To validate this contradictory result the authors should test E. coli chemotaxis to DHMA in their assay. It may be possible that Ec responds to NE and DHMA and Se doesn't. However, currently, the data is not strong enough to rule out Tsr as a receptor to these ligands in all cases. At the very least the supporting data for Tsr being a receptor for NE/DHMA needs to be discussed.

      Addressed above. The focus of this study is serum attraction and the mechanisms thereof. We never saw any evidence to support the idea that NE/DHMA drive attraction to serum, or that they are chemoeffectors for Salmonella, and provide these null-results in Data S2.

      (3) The authors also determine a crystal structure of the Se Tsr periplasmic ligand binding domain bound to L-Ser and note that the orientation of the ligand is different than that modeled in a previously determined structure of lower resolution. I agree that the SeTsr ligand binding mode in the new structure is well-defined and unambiguous, but I think it is too strong to imply that the pose of the ligand in the previous structure is wrong. The two conformations are in fact quite similar to one another and the resolution of the older structure, is, in my view, insufficient to distinguish them. It is possible that there are real differences between the two structures. The domains do have different sequences and, moreover, the crystal forms and cryo-cooling conditions are different in each case. It's become increasingly apparent that temperature, as manifested in differential cooling conditions here, can affect ligand binding modes. It's also notable that full-length MCPs show negative cooperativity in binding ligands, which is typically lost in the isolated periplasmic domains. Hence ligand binding is sensitive to the environment of a given domain. In short, the current data is not convincing enough to say that a previous "misconception" is being corrected.

      Thank you for this comment, which spurred us to investigate this idea more rigorously. As described above we performed new refinements of the E. coli structure edited to have the positions of the ligand and ligand-binding site as modeled in our new Tsr structure from Salmonella (Fig. 7J). The best model is obtained with these poses. Along with the poor fit of the E. coli model to the density, the best interpretations for these positions, for both structures, are as we have modeled them in the Salmonella Tsr structures.

      Figure suggestions

      (1) Figure 2 looks busy and unorganized. Fig 2C could be condensed into one image where there are different colored rings coming from the source point that represent different time points.

      Addressed above. Fig. 2 has been broken apart to help improve clarity.

      (2) What is the second (bottom) graph of 2D? I think only the top graph is necessary.

      We have added an explanation to the figure legend that the top graph shows the means and the bottom shows SEM. The plots cannot easily be overlaid.

      (3) Similarly, Fig 2E doesn't need to have so many time points. Perhaps 4 at maximum.

      As the development of the response over time is a key take-home of the study, we do not wish to reduce the timepoints shown.

      (4) The legend for Figure 2F uses the unit 'µM' to mean micrometers but should use 'µm'.

      Corrected.

      (5) In Figures 2H-J, the lime green text is difficult to read. The word "serum" does not need to be at the top of each panel. I recommend shortening the y-axis titles on the graphs so you can make the graphs themselves larger.

      Addressed above.

      (6) In Figures 2H-J, I am confused about what is being shown in the inset graph. The legend says it's the AUC for the data shown. However, in the third panel (S. Typhimurium vs. S. Enteritidis) the data appears to be much more disparate than the inset indicates. I don't think that this inset is necessary either.

      The point of this inset graph is to quantify the response through integration of the curve, i.e., area under the curve, which is a common way to quantify complex curves and compare responses as single values. We are using this method to calculate the statistical significance of the response compared to a null response. We have added further clarification to the figure legend regarding these plots: Inset plots show fold-change AUC of strains in the same experiment relative to an expected baseline of 1 (no change). p-values shown are calculated with an unpaired two-sided t-test comparing the means of the two strains, or a one-sided t-test to assess statistical significance in terms of change from 1-fold (stars).
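      The quantification described in this reply can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the time points, replicate curves, and the function names `trapezoid_auc`, `fold_change_auc`, and `one_sample_t` are all hypothetical, and the sketch stops at the t statistic because a p-value additionally requires the t distribution with n - 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def trapezoid_auc(times, values):
    """Area under a sampled curve, using the trapezoidal rule."""
    return sum(
        (t1 - t0) * (v0 + v1) / 2.0
        for t0, t1, v0, v1 in zip(times, times[1:], values, values[1:])
    )


def fold_change_auc(times, response, baseline):
    """AUC of a response curve divided by the AUC of a baseline curve."""
    return trapezoid_auc(times, response) / trapezoid_auc(times, baseline)


def one_sample_t(samples, mu=1.0):
    """t statistic for testing whether the sample mean differs from mu."""
    n = len(samples)
    return (mean(samples) - mu) / (stdev(samples) / sqrt(n))


# Hypothetical data: relative cell density near the source over time (s).
times = [0, 100, 200, 300]
baseline = [1.0, 1.0, 1.0, 1.0]  # "no change" reference curve
replicates = [
    [1.0, 1.5, 2.0, 2.5],
    [1.0, 1.4, 2.1, 2.6],
    [1.0, 1.6, 1.9, 2.4],
]

fold_changes = [fold_change_auc(times, r, baseline) for r in replicates]
t_stat = one_sample_t(fold_changes, mu=1.0)  # compare against 1-fold
print(fold_changes, t_stat)
```

      Reducing each curve to one number in this way is what makes a comparison against the 1-fold null possible, at the cost of discarding the shape of the time course.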

      (7) Line 154, change "relevant for" to "observed in".

      Changed.

      (8) Line 171, according to the Mist4 database, Salmonella enterica has seven chemoreceptors. Why are only Tar, Tsr, and Trg mentioned? Why were only Tsr and Trg tested?

      Addressed above.

      (9) Line 192, be clear that you are referring to genes and not proteins, as italics are used.

      Revised to make this distinction clear.

      (10) Line 193, have other studies found a Trg deletion strain to be non-chemotactic? If so, cite this source here.

      We state that the Trg deletion strain had deficiencies in motility, and also have revised the text to include the clarification that this was not noted in earlier work with this strain: [line 173]: We were surprised to find that the trg strain had deficiencies in swimming motility (data not shown). This was not noted in earlier work but could explain the severe infection disadvantage of this mutant 34. Because motility is a prerequisite for chemotaxis, we chose not to study the trg mutant further, and instead focused our investigations on Tsr.

      (11) Why wasn't a Tar deletion mutant also analyzed? The authors say that based on the known composition of serum, serine and glucose are the most abundant. However, the serum does have aspartate at 10s of micromolar concentrations.

      Addressed above.

      (12) “The Tsr deletion strain still exhibits an obvious chemoattraction to serum. There are other protein(s) involved in chemoattraction to serum but the text does not discuss this.”

      Addressed above.

      (13) “In Figure 3B-F, the text is very difficult to read even when zoomed in on.”

      We have increased the font size of these panels.

      (14) “All of the text in Figure 5 is extremely small and difficult to read.”

      Addressed above. We split this figure in two to help improve clarity.

      (15) “I wonder about the accuracy of the concentration modeling. It seems like there are a lot of variables that could affect the diffusion rates, including the accuracy of the delivery system. Could the concentrations be verified by the dye experiments?”

      Addressed above. We provide a new analysis comparing experimental diffusion of A488 dye with calculations (Fig. S2).

    2. eLife assessment

      This work uses an interdisciplinary approach combining microfluidics, structural biology, and genetic analyses to provide important findings that show that pathogenic enteric bacteria exhibit taxis toward human serum. The data are compelling and show that the behavior utilizes the bacterial chemotaxis system and the chemoreceptor Tsr, which senses the amino acid L-serine. The work provides an ecological context for the role of serine as a bacterial chemoattractant and could have clinical implications for bacterial bloodstream invasion during episodes of gastrointestinal bleeding.

    3. Reviewer #1 (Public Review):

      Updated summary:

      Glenn et al. present solid evidence that both lab and clinical Salmonella enterica serovars rapidly migrate towards human serum using an exciting approach that combines microfluidics, structural biology and genotypic analysis. The authors succeed in bringing to light a novel context for the role of serine as a bacterial chemoattractant as well as documenting what is likely to be a key step in bloodstream entry for some of the main sepsis-associated pathogens during gastrointestinal bleeding. They illustrate the generality of their findings through phylogenetic analysis, testing additional species within the Enterobacteriaceae family and showing attraction towards swine and equine serum. Their interdisciplinary approach here greatly increases the scope of their findings.

      I would also like to note that, whilst I enjoyed the interdisciplinary scope of this study, I am personally not well placed to review the protein structural aspects of this work.

      Additional strengths of the revised manuscript:

      All weaknesses raised in my review of the original manuscript have been satisfactorily addressed in the revised manuscript. It is interesting to note that the accumulation pattern of the bacteria 50-75 µm from the source of serum could, as the authors now note, be due to the avoidance of bactericidal serum elements. Alternative explanations, however, could include chemoreceptor saturation (i.e. close to the serum source, high ligand concentrations could saturate chemoreceptors, preventing further chemotaxis) or Weber's Law considerations (a cell's ability to detect a given change in chemical concentration diminishes with increasing background concentration; thus, as cells get closer to the serum source, their ability to chemotax decreases).
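      The two alternative explanations above can be made concrete with a small numerical sketch. It assumes a simple one-site binding curve for receptor occupancy, which is an illustrative simplification and not a model taken from the manuscript; the dissociation constant `KD`, the concentrations, and the function names are hypothetical.

```python
def receptor_occupancy(c, kd):
    """Fraction of receptors bound at ligand concentration c (one-site binding)."""
    return c / (c + kd)


def occupancy_change(c, delta_c, kd):
    """Change in occupancy produced by raising the concentration from c to c + delta_c."""
    return receptor_occupancy(c + delta_c, kd) - receptor_occupancy(c, kd)


def weber_fraction(c, delta_c):
    """Relative change in concentration, the quantity Weber's Law refers to."""
    return delta_c / c


KD = 10.0     # hypothetical dissociation constant (µM)
DELTA = 1.0   # same absolute concentration step at every background (µM)

for background in (1.0, 10.0, 100.0, 1000.0):
    print(
        background,
        round(occupancy_change(background, DELTA, KD), 5),
        weber_fraction(background, DELTA),
    )
```

      In this sketch the same absolute step in concentration produces a progressively smaller change in occupancy, and a smaller relative change, as the background rises. Either effect would weaken the chemotactic signal close to the source, which is why the two explanations are hard to separate from the accumulation pattern alone.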

      The authors have also added new experimental data and analyses and these constitute major new strengths of the revised manuscript:
      - The authors show that the competitive advantage of WT cells relative to a tsr mutant is removed when serum is treated with serine-racemase and this provides strong evidence that chemotaxis towards serine is responsible for the reduced attraction of the tsr mutant towards serum (i.e. rather than any possible pleiotropic effects).
      - New experimental data showing Salmonella enterica is also attracted to swine and equine serum (including an ex vivo swine model) is a useful addition that hints at the potential generality of the response reported here.
      - The authors now include additional data to back up the intriguing lack of a movement response towards norepinephrine and DHMA reported here.

      Additional weaknesses of the revised manuscript:

      - The addition of an ex vivo swine model is an exciting new inclusion in the updated manuscript. However, information regarding biological and technical replication here is currently unclear or missing.

    4. Reviewer #3 (Public Review):

      Summary:

      This manuscript characterizes a chemoattractant response to human serum by pathogenic bacteria, focusing on pathogenic strains of Salmonella enterica (Se). The researchers conduct the chemotaxis assays using a micropipette injection method that allows real-time tracking of bacterial population densities. They found that clinical isolates of several Se strains present a chemoattractant response to human serum. The specific chemoattractant within the serum is identified as L-serine, a highly characterized and ubiquitous chemoattractant that is sensed by the Tsr receptor. They further show that chemoattraction to serum is impaired with a mutant strain devoid of Tsr. X-ray crystallography is then used to determine the structure of L-serine in the Se Tsr ligand binding domain, which differs slightly from a previously determined structure of a homologous domain. They went on to identify other pathogens that have a Tsr domain through a bioinformatics approach and show that these identified species also present a chemoattractant response to serum.

      Strengths and Weaknesses:

      This study is well executed and the experiments are clearly presented. These novel chemotaxis assays provide advantages in terms of temporal resolution and ability to detect responses from small concentrations. That said, it is perhaps not surprising these bacteria respond to serum as it is known to contain high levels of known chemoattractants, serine certainly, but also aspartate. In fact, the bacteria are shown to respond to aspartate and the tsr mutant is still chemotactic. The authors do not adequately support their decision to focus exclusively on the Tsr receptor. Tsr is one of the chemoreceptors responsible for observed attraction to serum, but perhaps, not the receptor. Furthermore, the verification of chemotaxis to serum is a useful finding, but the work does not establish the physiological relevance of the behavior or associate it with any type of disease progression. I would expect that a majority of chemotactic bacteria would be attracted to it under some conditions. Hence the impact of this finding on the chemotaxis or medical fields is uncertain.

      The authors also state that "Our inability to substantiate a structure-function relationship for NE/DHMA signaling indicates these neurotransmitters are not ligands of Tsr." Both norepinephrine (NE) and DHMA have been shown previously by other groups to be strong chemoattractants for E. coli (Ec), and that this behavior was mediated by Tsr (e.g. single residue changes in the Tsr binding pocket block the response). Given the 82% sequence identity between the Se and Ec Tsr, this finding is unexpected (and potentially quite interesting). To validate this contradictory result the authors should test E. coli chemotaxis to DHMA in their assay. It may be possible that Ec responds to NE and DHMA and Se doesn't. However, currently the data is not strong enough to rule out Tsr as a receptor to these ligands in all cases. At the very least the supporting data for Tsr being a receptor for NE/DHMA needs to be discussed.

      The authors also determine a crystal structure of the SeTsr periplasmic ligand binding domain bound to L-Ser and note that the orientation of the ligand is different than that modeled in a previously determined structure of lower resolution. I agree that the SeTsr ligand binding mode in the new structure is well-defined and unambiguous, but I think it is too strong to imply that the pose of the ligand in the previous structure is wrong. The two conformations are in fact quite similar to one another and the resolution of the older structure is, in my view, insufficient to distinguish them. It is possible that there are real differences between the two structures. The domains do have different sequences and, moreover, the crystal forms and cryo-cooling conditions are different in each case. It's become increasingly apparent that temperature, as manifested in differential cooling conditions here, can affect ligand binding modes. It's also notable that full-length MCPs show negative cooperativity in binding ligands, which is typically lost in the isolated periplasmic domains. Hence ligand binding is sensitive to the environment of a given domain. In short, the current data is not convincing enough to say that a previous "misconception" is being corrected.

    1. Reviewer #2 (Public Review):

      In this study, Wang et al. report the significance of XAP5L and XAP5 in spermatogenesis, where they are involved in transcriptional regulation of ciliary genes in the testes. In previous studies, the authors demonstrated that XAP5 is a transcription factor required for flagellar assembly in Chlamydomonas. Continuing from their previous study, the authors examine the conserved role of XAP5 and XAP5L, which form an orthologue pair in mammals.

      XAP5 and XAP5L are expressed ubiquitously and testis-specifically, respectively, and their absence in the testes causes male infertility with defective spermatogenesis. Interestingly, XAP5 deficiency arrests germ cell development at the pachytene stage, whereas XAP5L absence causes impaired flagellar formation. RNA-seq analyses demonstrated that XAP5 deficiency suppresses ciliary gene expression, including Foxj1 and Rfx family genes, in early testis. By contrast, in XAP5L deficiency, expression of Foxj1 and Rfx genes abnormally persists in mature sperm. From the results, the authors conclude that XAP5 and XAP5L are antagonistic transcription factors that function upstream of Foxj1 and Rfx family genes.

      This reviewer thinks the overall experiments are performed well and that the manuscript is clear. However, the current results do not directly support the authors' conclusion. For example, the transcriptional function of XAP5 and XAP5L requires more evidence. In addition, this reviewer wonders about the conserved XAP5 function of ciliary/flagellar gene transcription in mammals - the gene is ubiquitously expressed despite its functional importance in flagellar assembly in Chlamydomonas. Thus, this reviewer thinks authors are required to show more direct evidence to clearly support their conclusion with more descriptions of its role in ciliary/flagellar assembly.

    2. eLife assessment

      This study reports useful data suggesting the critical roles of two ancient proteins, XAP5 and XAP5L, in controlling the transcriptional program of ciliogenesis during mouse spermatogenesis. However, this study is considered incomplete because the data only partially support the conclusion. This work will be of interest to biomedical researchers who work on ciliogenesis and reproduction.

    3. Reviewer #1 (Public Review):

      Summary:

      Wang et al. generate XAP5 and XAP5L knockout mice and find that they are male infertile due to meiotic arrest and reduced sperm motility, respectively. RNA-Seq was subsequently performed and the authors concluded that XAP5 and XAP5L are antagonistic transcription factors of ciliogenesis (in XAP5-KO P16 testis: 554 genes were upregulated and 1587 genes were downregulated; in XAP5L-KO sperm: 2093 genes were upregulated and 267 genes were downregulated).

      Strengths:

      Knockout mouse models provided strong evidence to indicate that XAP5 and XAP5L are critical for spermatogenesis and male fertility.

      Weaknesses:

      The key conclusions are not supported by evidence. First, the authors claim that XAP5 and XAP5L transcriptionally regulate sperm flagella development; however, detailed molecular experiments related to transcription regulation are lacking. How do XAP5 and XAP5L regulate their targets? Only RNA-Seq is not enough. Second, the authors declare that XAP5 and XAP5L are antagonistic transcription factors; however, how do XAP5 and XAP5L regulate sperm flagella development antagonistically? Only RNA-Seq is not enough. Third, I am concerned about whether XAP5 really regulates sperm flagella development. XAP5 is specifically expressed in spermatogonia and XAP5-cKO mice are in meiotic arrest, indicating that XAP5 regulates meiosis rather than sperm flagella development.

    1. eLife assessment

      This important study demonstrated that ablation of astrocytes in the lumbar spinal cord not only reduced neuropathic pain but also caused microglia activation. The findings presented add considerable value to the current understanding of the role of astrocyte elimination in neuropathic pain, offering convincing evidence that supports existing hypotheses and insights into the interactions between astrocytes and microglial cells, likely through IFN-mediated mechanisms. This study may also offer a new therapeutic strategy for the treatment of debilitating neuropathic pain in patients with SCI.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study the authors demonstrated that ablation of astrocytes in lumbar spinal cord not only reduced neuropathic pain but also caused microglia activation. Furthermore, RNA sequencing and bioinformatics revealed an activation of STING/type I IFNs signal pathway in spinal cord microglia after astrocyte ablation.

      Strengths:

      The findings are novel and interesting and provide new insights into astrocyte-microglia interaction in neuropathic pain. This study may also offer a new therapeutic strategy for the treatment of debilitating neuropathic pain in patients with SCI.

      Weaknesses:

      More details are needed to justify the sample size, statistics, and sex of animals.

    3. Reviewer #2 (Public Review):

      Summary:

      In the manuscript, Zhao et al. have carried out a thorough examination of the effects of targeted ablation of resident astrocytes on behavior, cellular responses, and gene expression after spinal cord injury. Employing transgenic mouse models alongside pharmacogenetic techniques, the authors have successfully achieved the selective removal of these resident astrocytes. This intervention led to a notable reduction in neuropathic pain and induced a shift in microglial cell reactivation states within the spinal cord, significantly altering transcriptome profiles predominantly associated with interferon (IFN) signaling pathways.

      Strengths:

      The findings presented add considerable value to the current understanding of the role of astrocyte elimination in neuropathic pain, offering convincing evidence that supports existing hypotheses and valuable insights into the interactions between astrocytes and microglial cells, likely through IFN-mediated mechanisms. This contribution is highly relevant and suggests that further exploration in this direction could yield meaningful results.

      Weaknesses:

      The methodology and evidence underpinning the study are solid, yet some areas would benefit from further clarification, particularly concerning methodological details and the choice of statistical analyses. Additionally, the manuscript's organization and clarity could be improved, as certain figures and schematics appear inconsistent or misleading.

    1. eLife assessment

      This study presents a valuable finding that the blood-brain barrier functionality changes with age and differs between males and females. The analysis is solid, comprising a large and racially diverse dataset, and utilizes a contrast-agent-free MRI method. Since limited work has been done in the MRI field on the blood-brain barrier using this method, this study is of great interest to neuroimaging researchers and clinicians.

    2. Reviewer #1 (Public Review):

      Summary:

      This work revealed an important finding that the blood-brain barrier (BBB) functionality changes with age and is more pronounced in males. The authors applied a non-invasive, contrast-agent-free approach of MRI called diffusion-prepared arterial spin labeling (DP-pCASL) to a large cohort of healthy human volunteers. DP-pCASL works by tracking the movement of magnetically labeled water (spins) in blood as it perfuses brain tissue. It probes the molecular diffusion of water, which is sensitive to microstructural barriers, and characterizes the signal coming from fast-moving spins as blood and slow-moving spins as tissue, using different diffusion gradients (b-values). This differentiation is then used to assess the water exchange rates (kw) across the BBB, which acts as a marker for BBB functionality. The main finding of the authors is that kw decreases with age, and in some brain regions, kw decreases faster in males. The neuroprotective role of the female sex hormone, estrogen, on BBB function is discussed as one of the explanations for this finding, supported by literature. The study also shows that BBB function remains stable until the early 60s and remarkably decreases thereafter.

      Strengths:

      The two main strengths of the study are the MRI method used and the amount of data. The authors employed a contrast-agent-free MRI method called ASL, which offers the opportunity to repeat such experiments multiple times without any health risk - a significant advantage of ASL. Since ASL is an emerging field that requires further exploration and testing, a study evaluating blood-brain barrier functionality is of great importance. The authors utilized a large dataset of healthy humans, where volunteer data from various studies were combined to create a substantial pool. This strategy is effective for statistically evaluating differences in age and gender.

      Weaknesses:

      Gender-related differences are only present in some brain regions, not in the whole brain or gray matter - which is usually the assumption unless stated otherwise. From the title, this was not clear. Including simulations could increase readers' understanding related to model fitting and the interdependence of parameters, if present. The discussion follows a clear line of argument supported by literature; however, focusing solely on AQP4 channels and missing a critical consideration of other known/proven changes in transport mechanisms through the BBB and their effects substantially weakens the discussion.

    3. Reviewer #2 (Public Review):

      Summary:

      This study used a novel diffusion-weighted pseudo-continuous arterial spin labelling (pCASL) technique to simultaneously explore age- and sex-related differences in brain tissue perfusion (i.e., cerebral blood flow (CBF) & arterial transit time (ATT) - a measure of CBF delivery to brain tissue) and blood-brain barrier (BBB) function, measured as the water exchange (kw) across the BBB. While age- and sex-related effects on CBF are well known, this study provides new insights to support the growing evidence of these important factors in cerebrovascular health, particularly in BBB function. Across the brain, the decline in CBF and BBB function (kw) and elevation in ATT were reported in older adults, after the age of 60, and more so in males compared to females. This was also evident in key cognitive regions including the insular, prefrontal, and medial temporal regions, stressing the consideration of age and sex in these brain physiological assessments.

      Strengths:

      Simultaneous assessment of CBF with BBB along with transit time and at the voxel-level helped elucidate the brain's vulnerability to age and sex-effects. It is apparent that the investigators carefully designed this study to assess regional associations of age and sex with attention to exploring potential non-linear effects.

      Weaknesses:

      It appears that no brain region showed concurrent CBF and BBB dysfunction (kw), based on the results reported in the main manuscript and supplemental information. Was an association analysis between CBF and kw performed? There is a potential effect of the level of formal education on CBF (PMID: 12633147; 15534055), which could have been considered and accounted for as well, especially for a cohort with stated diversity (age, race, sex).

    1. eLife assessment

      This work substantially advances our understanding of pharmacological inhibition of SWI/SNF as a therapeutic approach for cancer. The study is well-written and provides compelling evidence, including comprehensive datasets, compound screens, gene expression analysis, epigenetics, as well as animal studies. This study provides a fundamental advance for the uveal melanoma research field that might be exploited to target this deadly cancer and more generally for targeting transcriptional dependency in cancers.

    2. Reviewer #1 (Public Review):

      Summary:

      The presented study by Centore and colleagues investigates the inhibition of BAF chromatin remodeling complexes. The study is well-written, and includes comprehensive datasets, including compound screens, gene expression analysis, epigenetics, as well as animal studies. This is an important piece of work for the uveal melanoma research field, and sheds light on a new inhibitor class, as well as a mechanism that might be exploited to target this deadly cancer for which no good treatment options exist.

      Strengths:

      This is a comprehensive and well-written study.

      Weaknesses:

      There are minimal weaknesses.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors generate an optimized small molecule inhibitor of SMARCA2/4 and test it in a panel of cell lines. All uveal melanoma (UM) cell lines in the panel are growth-inhibited by the inhibitor, making them the focus of the paper. This inhibition is correlated with the loss of promoter occupancy of key melanocyte transcription factors, e.g. SOX10. SOX10 overexpression and a point mutation in SMARCA4 can rescue growth inhibition exerted by the SMARCA2/4 inhibitor. Treatment of a UM xenograft model results in growth inhibition and regression, which correlates with reduced expression of SOX10 but no discernible toxicity in the mice. Collectively the data suggest a novel treatment of uveal melanoma.

      Strengths:

      There are many strengths of the study including the strong challenge of the on-target effect, the assays used, and the mechanistic data. The results are compelling as are the effects of the inhibitor. The in vivo data is dose-dependent and doses are low enough to be meaningful and associated with evidence of target engagement.

      Weaknesses:

      The authors introduce the field stating that SMARCA4 inhibitors are more effective in SMARCA2 deficient cancers and the converse. Since the desirable outcome of cancer therapy would be synthetic lethality it is not clear why a dual inhibitor is desirable. Wouldn't this be associated with more side effects? It is not known how the inhibitor developed here impacts normal cells, in particular T cells which are essential for any durable response to cancer therapies in patients. Another weakness is that the UM cell lines used do not molecularly resemble metastatic UM. These UM most frequently have mutations in the BAP1 tumor suppressor gene. It is not clear if the described SMARCA2/4 inhibitor is efficacious in BAP1 mutant UM cell lines in vitro or BAP1 mutant patient-derived xenografts in vivo.

    4. Reviewer #3 (Public Review):

      Summary:

      This manuscript reports the discovery of new compounds that selectively inhibit SMARCA4/SMARCA2 ATPase activity and that work through a different mode than previously developed SMARCA4/SMARCA2 inhibitors. They also demonstrate the anti-tumor effects of the compounds on uveal melanoma cell proliferation and tumor growth. The findings indicate that the drugs exert their effects by altering chromatin accessibility at binding sites for lineage-specific transcription factors within gene enhancer regions. In uveal melanoma, altered expression of the transcription factor SOX10 and of SOX10 target genes underlies the anti-proliferative effects of the compounds. This study is significant because the discovery of new SMARCA4/SMARCA2 inhibitory compounds that can abrogate uveal melanoma tumorigenicity has therapeutic value. In addition, the findings provide evidence for the therapeutic use of these compounds in other transcription factor-dependent cancers.

      Strengths:

      The strengths of this manuscript include biochemical evidence that the new compounds are selective for SMARCA4/SMARCA2 over other ATPases and that the mode of action is distinct from a previously developed compound, BRM014, which binds the RecA lobe of SMARCA2. There is also strong evidence that FHT1015 suppresses uveal melanoma proliferation by inducing apoptosis. The in vivo suppression of tumor growth without toxicity validates the potential therapeutic utility of one of the new drugs. The conclusion that FHT1015 primarily inhibits SMARCA4 activity and thereby suppresses chromatin accessibility at lineage-specific enhancers is substantiated by ATAC-seq and ChIP-seq studies.

      Weaknesses:

      The weaknesses include a lack of more precise information on which SMARCA4/SMARCA2 residues the drugs bind. Although the I1173M/I1143M mutations are evidence that the critical residues for binding reside outside the RecA lobe, this site is conserved in CHD4, which is not affected by the compounds. Hence, this site may be necessary but not sufficient for drug binding or specifying selectivity. A more precise evaluation of the region specifying the effect of the new compounds would strengthen the evidence that they work through a novel mode and that they are selective. Another concern is that the mechanisms by which FHT1015 promotes apoptosis rather than simply cell cycle arrest are not clear. Does SOX10 or another lineage-specific transcription factor underlie the apoptotic effects of the compounds?

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      Major comments:

      (1) It is nice that the authors compared their model to the one "without lookahead" in Figure 4, but this comparison requires more evidence in my opinion, as I explain in this comment. The model without lookahead is closely related or possibly equivalent to the standard predictive coding. In predictive coding, one can make the network follow the stimulus rapidly by reducing the time constant tau. However, as the time constant decreases, the network would become unstable both in simulations (due to limited integration time step) and physical implementation (due to noise). Therefore I wonder if the proposed model has an advantage over standard predictive coding with an optimized time constant. Hence I suggest to also add a comparison between the proposed model, and the predictive coding with parameters (such as tau) optimized independently for each model. Of course, we know that the time-constant of biological neurons is fixed, but biological neurons might have had different time constants (by changing leak conductance) and such analysis could shed light on the question of why the neurons are organized the way they are.

      The comparison with a predictive network for which the neuronal time constants shrink towards 0 is in fact helpful. We added two new subsections in the SI that formally compare the NLA with other approaches, Equilibrium Propagation and the Latent Equilibrium, with a version of Equilibrium Propagation also covering the standard predictive coding you describe (SI, Sect. C and D). Subsection C concludes: “In the Equilibrium propagation we cannot simply take the limit τ → 0 since then the dynamics either disappears (when τ remains on the left, τ Δu → 0) or explodes (when τ is moved to the right, dt/τ → ∞), leading to either too small or too big jumps.”

      We have also expanded the passage on predictive coding in the main text, comparing our instantaneous network processing (up to a remaining time constant τ_in) with experimental data from humans (see page 10 of the revised ms). The new paragraph ends with:

      “Notice that, from a technical perspective, making the time constants of individual cortical neurons arbitrarily short leads to network instabilities and is unlikely the option chosen by the brain (see SI Sect. C, Comparison to the Equilibrium Propagation).”
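
      The simulation side of this instability can be made concrete with a minimal sketch. It is not the authors' NLA model and not Equilibrium Propagation; it assumes a single leaky integrator, τ du/dt = −u + x, integrated with the explicit Euler method at a fixed step dt (the function name and parameter values are hypothetical). In this scheme each step multiplies the distance to the fixed point by (1 − dt/τ), so once τ falls below dt/2 the simulated voltage diverges instead of tracking the input faster.

```python
def euler_leaky_integrator(tau, dt=1.0, target=1.0, steps=50):
    """Forward-Euler integration of tau * du/dt = -u + target, starting at u = 0.

    Illustrative assumption only: one scalar unit with a constant input.
    """
    u = 0.0
    trace = []
    for _ in range(steps):
        # Explicit Euler update; the effective gain per step is dt / tau.
        u += (dt / tau) * (target - u)
        trace.append(u)
    return trace


# Slow unit (dt / tau = 0.1): converges smoothly towards the target.
slow = euler_leaky_integrator(tau=10.0)
# Very fast unit (dt / tau = 2.5): each step overshoots and the error grows.
fast = euler_leaky_integrator(tau=0.4)

final_error_slow = abs(slow[-1] - 1.0)
final_error_fast = abs(fast[-1] - 1.0)
```

      Reducing dt restores stability in the sketch, but only at a proportionally higher simulation cost, which is one way to read the reviewer's remark about a limited integration time step.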

      A new formal definition of the moving equilibrium in the Methods (Sect. F) helps to understand this notion of being in a balanced equilibrium state during the dynamics. This formal definition leads directly to the contraction analysis in the SI, Sect. D, showing why the Latent Equilibrium is always contractive, while the current form of the NLA may show jumps at the corner of a ReLU (since a second-order derivative of the transfer function enters the error propagation).

      The reviewer perhaps has additional simulations in mind that compare the robustness of the different models. However, as this paper is more about presenting a novel concept with a comprehensive theory (summing up to 45 pages), we prefer to not add more than the simulations necessary to check the statements of the theorems.

      (2) I found this paper difficult to follow, because the Results sections went straight into details, and various elements of the model were introduced without explaining why they are necessary. Furthermore, the neural implementation was introduced after the model simulations. I suggest reorganizing the manuscript, to describe the model following Marr's levels of description and then presenting the results of simulations. In particular, I suggest starting the Results section by explaining what computation the network is trying to achieve (describe the setup, function L, define its integral over time, and explain that the goal is to find a model minimizing this integral). Then, I suggest presenting the algorithm the neurons need to employ to minimize this integral, i.e. their dynamics and plasticity (I wonder if r=rho(u) + tau rho(u)' is a consequence of action minimization or a necessary assumption - please clarify it). Next please explain how the algorithms could be implemented in biological neurons. Afterward please present the results of the simulation.

      We are sorry to realize that we could not convey the main message clearly enough. After rewriting the paper and straightening the narrative, we hope it is simpler to understand now.

      The paper does not suggest a new model to solve a task, and writing down the function to be minimized is not enough. The point of the NLA is that the time integral of our Lagrangian is minimized with respect to the prospective coordinates, i.e. the discounted future voltage. It is about the question of how dynamic equations in biology are derived. Of course, we also solve these equations, prove theorems and perform simulations. But the main point is that biology seems to deal with time differently than physics does. Biology “thinks” in terms of future quantities, physics “thinks” in terms of current quantities. We have tried to explain this better in the Introduction, the Results (e.g. after Eq. 5) and the Methods.

      (3) Understanding the paper requires background knowledge that most readers of eLife are unlikely to have, even if they are mathematically minded. For example, I am from the field of computational neuroscience, and I have never heard about Least Action principle from physics or the Euler-Lagrange equation. I felt lost after reading this paper, and to be able to write this review I needed to watch videos on the Euler-Lagrange equation. To help other readers, I have two suggestions: First, I feel that Eq 4-6 could be moved to the methods, because I found the concept of u~ difficult to understand, and it does not appear in the algorithm. Second, I advise to write in the Introduction, what knowledge is required to follow this paper, and point the readers to resources where they can find the required information. The authors may specify what background is required to follow the main text, and what is required to understand the methods.

      We hope that after explaining the rationale better, it becomes clear that we cannot skip the equations for the prospective coordinates. Likewise, the Euler-Lagrange equations need to be presented in the abstract form, since these are the equations that are eventually transformed into the “model”. We tried to give the basic intuition for this in the main text. As we explained above, the equations we were asked to skip represent the essence of the proposal, which is about how to derive the model equations.

      Moreover, we give more explanations in the Methods to help understand the derivations, and we refer to the specific sections in the SI for further details. We are aware that a full understanding of the theory requires some basic knowledge of the calculus of variations.

      We hesitate to write in the Introduction what type of knowledge is required to understand the paper. An understanding can be on various levels. Moreover, the materials that are considered helpful depend on the reader's background: for some it is YouTube, for some Wikipedia, and for others a textbook from which specific ingredients can be extracted. But we do cite two textbooks in the Results, and more in the SI, Sect. F, including weblinks, when referring to the principle of least action in physics and its mathematics.

      Minor comments

      Eq.3: The Authors refer to this equation as a Lagrangian. Could you please clarify why? Is the logic to minimize the energy subject to a constraint that Cost = 0?

      Thanks for asking. The cost is not really a constraint, it is globally minimized, in parallel steps. We are explaining this right after Eq. 3. “We `prospectively' minimize L locally across a voltage trajectory, so that, as a consequence, the local synaptic plasticity for W will globally reduce the cost along the trajectory (Theorem 1 below).”

      We added two sentences that explain why the function in Eq. 3 is called a Lagrangian: “While in classical energy-based approaches L is called the total energy, we call it the `Lagrangian' because it will be integrated along real and virtual voltage trajectories as done in variational calculus (leading to the Euler-Lagrange equations, see below and SI, Sect. F)”

      p.4, below Eq. 5 - Please explain the rationale behind NLA, i.e. why is it beneficial that "the trajectory u˜(t) keeps the action A stationary with respect to small variations δu˜"? I guess you wish to minimize L integrated over time, but this is not evident from the text.

      Hmm, yes and no. We wish to minimize the cost, and on the way there minimize the action. Since the global minimization of C is technically difficult, one looks for a stationary trajectory as defined in the cited sentence, while minimizing L with respect to W, to eventually minimize the cost.

      In the text we now explain after Eq. 5:

      “The motivation to search for a trajectory that keeps the action stationary is borrowed from physics. The motivation to search for a stationary trajectory by varying the near-future voltages ũ instead of u is assigned to the evolutionary pressure in biology to 'think ahead of time'. To not react too late, internal delays involved in the integration of external feedback need to be considered and eventually need to be overcome. In fact, only for the 'prospective coordinates' defined by looking ahead into the future, even when only virtually, will a real-time learning from feedback errors become possible (as expressed by our Theorems below).”

      Bottom of page 8. The authors say that in the case of single equilibrium and strong nudging the model reduced to the Least Control Principle. Does it also reduce to Predictive coding for supervised learning? If so, it would be helpful to state so.

      Yes, in this case the prediction error in the apical dendrite becomes the one of predictive coding. We are stating this now right at the end of the cited sentence:

      “In the case of strong nudging and a single steady-state equilibrium, the NLA principle reduces to the Least-Control Principle (Meulemans et al., 2022) that minimizes the mismatch energy E^M for a constant input and a constant target, with the apical prediction error becoming the prediction error from standard predictive coding (Rao & Ballard, 1999).”

      In the Discussion we also added a further point (iv) to compare the NLA principle with predictive coding. Both “improve” the sensory representation, but the NLA does so in favor of an output, and predictive coding in favor of the sensory prediction itself (see Discussion).

      Whenever you refer to supplementary materials, please specify the section, so it is easier for the reader to find it.

      Done. Sorry to not have done it earlier. We now also indicate specific sections when referring to the Methods.

      Reviewer #2 (Recommendations For The Authors):

      There are no major issues with this article, but I have several considerations that I think would greatly improve the impact, clarity, and validity of the claims.

      (1) Unifying the narrative. There are many many ideas put forward in what feels like a deluge. While I appreciate the enthusiasm, as a reader I found it hard to understand what it was that the authors thought was the main breakthrough. For instance, the abstract, results, introduction, and discussion all seem to provide different answers to that question. The abstract seems to focus on the motor error idea. The introduction seems to focus on the novel prospective+predictive setup of the energy function. The discussion lists the different perks of the theory (delay compensation, moving equilibrium, microcircuit) without referring to the prospective+predictive setup of the energy function.

      Thanks much for these helpful hints. Yes, the paper became an agglomerate of many ideas, also owing to the fact that we wish to show how the NLA principle can be applied to explain various phenomena in neuroscience. We have now simplified the narrative to the single point of providing a novel theoretical framework for neuroscience, and explaining why this is novel and why it “suddenly works” (the prospective minimization of the energy).

      As you can see from the dominating red in the revised pdf, we did fully rewrite Abstract, Introduction and Discussion under the narrative of the NLA and prospective coding.

      (2) Laying out the organization of the notation clearly. There are quite a few subtle distinctions of what is meant by the different weight matrices (omnibus matrix then input vs recurrent then layered architecture), different temporal horizon formalisms (bar, not bar, tilde), different operators (L, curly L, derivative version, integral version). These different levels are introduced on the fly, which makes it harder to grasp. The fact that there are many duplicate notations for the same quantities does not help the reader. For instance u_0 becomes equal to u_N at one point (above Eq 25). Another example is the constant flipping between integrated and 'current input' pictures. So laying out the multiple layers early, making a table or a figure for the notation, or sticking with one level would help convey the idea to a wide readership.

      Thanks for the hints. We included the table you suggested, but put it in the SI as it became a full page by itself. We removed the curly L that abbreviated the look-ahead operator.

      The “change of notation” you are alluding to is tricky, though. In a recurrent layer, the index of the output neuron is called o. In a forward network with N layers, the index of the output neurons becomes the last layer N. One has to introduce the layer index l anyway for the deeper layers l < N, and we found it more consistent to explain that, when switching from the recurrent to the forward network, the voltage of the output layer now becomes u_o = u_N. There are more of these examples, like the weight matrix W splitting into an intrinsic network part W_net across which errors backpropagate, and a part conveying the input, W_in, that has to be excluded when writing the backpropagation formula for general networks. Again, in the case of the feedforward networks, the notation reduces to W_l, with index l coding for the layer. Presenting the general approach and a specific example may appear as if we duplicated notations; we haven’t found a solution here.

      (3) Separate the algorithm from the implementation level. I particularly struggled with separating the ideas that belonged to the algorithm level (cost function, optimization objectives) and the biophysics. The two are interwoven in a way that does not have to be. Particularly, some of the normative elements may be implemented by other types of biophysics than the authors have in mind. It is for this reason that I think that separating more clearly what belongs to the implementation and algorithm levels would help make the ideas more widely understood. On this point, a trigger point for me was the definition of the 'prospective input rates' e_i, which comes in the second paragraph.

      We are very sorry to have made you think that the 'prospective input rates' would be e_i. The prospective input rates are r_i. The misunderstanding likely arose from an unclear formulation on our side that is now corrected (see first and second paragraph of the Results where we introduce r_i and e_i).

      From a biophysical perspective, it is quite arbitrary to define the input to be the difference between the basal input and the somatic (prospective) potential. It sounds like it comes from some unclear normative picture at this point. But the authors seem to have in mind to use the fact that the somatic potential is the sum of apical and basal input, that's the biophysical picture.

      We hope to have disentangled the normative and biophysical view in the 2nd and 3rd paragraph of the Results, respectively. We introduce the prospective error e_i as an abstract notion in the first paragraph, while explaining that it will be interpreted as the somato-dendritic mismatch error in neuron i in the next paragraph. The second paragraph contains the biophysical details with the apical and basal morphology.

      (4) Experts and non-experts would appreciate an explanation of why/how the choice of state variables matters in the NLA. The prospective coding state variables cannot be said to be the naïve guess. Why does the simple u, dot{u} not work as state variables applied on the same energy function, as would be a naïve application of the Lagrangian ideas?

      We are very glad for this hint to present an intuition behind the variation of the action with respect to a prospective state, instead of the state itself. The simple L(u, dot{u}) does not work because one does not obtain the first-order voltage dynamics compatible with the biophysics. We made an effort to explain the intuition to non-experts and experts in an additional paragraph right after presenting the voltage and error dynamics (Eq. 7 on page 4).

      Here is how the paragraph starts (not displaying the formulas here):

      “From the point of view of theoretical physics, where the laws of motion derived from the least-action principle contain an acceleration term (as in Newton's law of motion, like … for a harmonic oscillator), one may wonder why no second-order time derivative appears in the NLA dynamics. As an intuitive example, consider driving into a bend. Looking ahead in time helps us to reduce the lateral acceleration by braking early enough, as opposed to braking only when the lateral acceleration is already present. This intuition is captured by minimizing the neuronal action A with respect to the discounted future voltages ũ_i instead of the instantaneous voltages u_i.

      Keeping up an internal equilibrium in the presence of a changing environment requires to look ahead and compensate early for the predicted perturbations.

      Technically, …”

      More details are given in the Methods after Eq. 20. Moreover, in the last part of the SI, Sect. F, we have made the link to the least-action principle in physics more explicit. There we show how the voltage dynamics can be derived from the physical least-action principle by including the Rayleigh dissipation (Eq. 92 and 95).
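
      For readers less familiar with variational calculus, the textbook case mentioned in the quoted paragraph can be written out. This is standard classical mechanics, not the NLA derivation itself; the symbols m, k and x (mass, spring constant and position of a harmonic oscillator) are assumptions for illustration and are not taken from the manuscript.

```latex
\begin{align}
A[x] &= \int L(x, \dot{x})\, dt, \qquad
L = \tfrac{1}{2} m \dot{x}^2 - \tfrac{1}{2} k x^2, \\
\delta A = 0 \;&\Rightarrow\;
\frac{d}{dt}\frac{\partial L}{\partial \dot{x}} - \frac{\partial L}{\partial x} = 0
\;\Rightarrow\; m \ddot{x} + k x = 0 .
\end{align}
```

      The term m ẍ is the second-order (acceleration) term that, according to the response above, does not appear when the action is instead varied with respect to the discounted future voltages.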

      (5) Specify that the learning rules have not been observed. Though the learning rules are Hebbian, the details of the rules have not to my knowledge been observed. Would be worth mentioning as this is a sticking point of most related theories.

      We agree, and we do now explicitly write in the Discussion that the learning rule still awaits to be experimentally tested.

      (6) Some relevant literature. Chalk et al. PNAS (2018) have explored the relationship between temporal predictive coding and Rao & Ballard predictive coding based on the parameters of the cost function. Harkin et al. eLife (2023) have shown that 'prospective coding' also takes place in the serotonergic system, while Kim ... Ma (2021) have put forward similar ideas for dopamine, both may participate in setting the cost function. Instantaneous voltage propagation is also a focus of Greedy et al. (2023). The authors cite Zenke et al. for spiking error propagation, but there are biological references to that end.

      Thanks much for these hints. We do now cite the book of Gerstner & Kistler on spiking neurons, and more specifically the spike-based approach for learning to represent signals (Brendel, .., Machens, Denève, PLoS CB, 2020). Otherwise, we had difficulty incorporating the other literature, which seems to us not directly related to our approach, even when related notions come up (like predictive coding and temporal processing in Chalk et al. (2018), where the coding efficiency of various temporal coding schemes is studied as a function of the signal-to-noise ratio, or the apical activities in Greedy et al. (2022), where bursting, multiplexing and synaptic facilitation arise). We found that citing these papers too would confuse more than it would help (we already cite 95 papers).

      (7) In the main text, theorem two is presented as proof without assumptions on the level of nudging, but the actual proof uses strong assumptions in that respect, relying on numerical ad hoc observations for the general case.

      Thanks for pointing this out. We agree it is better style to state all the critical assumptions in the Theorem itself, rather than deferring them to the Methods. We now state: “Then, for suitable top-down nudging, learning rates, and initial conditions, the ….weights …evolve such that…”.

      (8) In the discussion regarding error-backpropagation, it seems to me that it could be clarified that the current algorithm asks for a weight alignment between FF and FB matrices as well as between FB and interneuron circuit matrices. Whether all of these matrices can be learned together remains to be shown; neither Akrout, Kunin nor Max et al. have shown this explicitly. Particularly when there are other inputs to the apical dendrites from other areas.

      Yes, it is difficult to learn to align all in parallel. Nevertheless, our simulations do in fact align the lateral and vertical circuits, as is also claimed in Theorem 2. Yet this holds, as specified in the theorem, only “for suitable learning rates” (which were all the same, but were jointly reduced after some training time, as previously explained in the Methods, Details for Fig. 5).

      In the Discussion we now emphasize that, in general, simulating all the circuitries jointly from scratch in a single phase is tricky. We write:

      “A fundamental difficulty arises when the neuronal implementation of the Euler-Lagrange equations requires an additional microcircuit with its own dynamics. This is the case for the suggested microcircuit extracting the local errors. Formally, the representation of the apical feedback errors first needs to be learned before the errors can teach the feedforward synapses on the basal dendrites. We showed that this error learning can itself be formulated as minimizing an apical mismatch energy. What the lateral feedback through interneurons cannot explain away from the top-down feedback remains as apical prediction error.

      Ideally, while the network synapses targeting the basal tree are performing gradient descent on the global cost, the microcircuit synapses involved in the lateral feedback are performing gradient descent on local error functions, both at any moment in time.

      The simulations show that this intertwined system can in fact learn simultaneously with a common learning rate that is properly tuned. The cortical model network of inter- and pyramidal neurons learned to classify handwritten digits on the fly, with 10 digit samples presented per second. Yet, the overall learning is more robust if the error learning in the apical dendrites operates in phases without output teaching but with corresponding sensory activity, as may arise during sleep (see e.g. Deperrois et al., 2022 and 2023).”

      (9) The short-term depression model is assuming a slow type of short-term depression, not the fast types that are the focus of much recent experimental literature (like Campagnola et al. Science 2022).

      This assumption should be specified.

      Thanks for pointing us to this literature, which we were not aware of. We now cite the release-independent plasticity (Campagnola et al. 2022) in the context of our synaptic depression model.

      (10) There seems to be a small notation issue: Eq 21 combines vectors of the size of the full network (bar{e}) and the size of the readout network (bar{e}star).

      Well, for notational convenience we set the target error to e*=0 for non-output neurons. This way we can write the total error for an arbitrary network neuron as the sum of the backpropagated error plus the putative target error (if the neuron is an output neuron). Otherwise we would always have to distinguish between network neurons that may be output neurons and those that are not. We did say this in the main text, but now repeat it right after Eq. 21. Notations are often the result of a trade-off.

    1. Author response:

      Reviewer #1 (Public Review):

      Summary:

      The manuscript presents a compelling model to explain the impact of mosaicism in preimplantation genetic testing for aneuploidies.

      Strengths:

      A new view of mosaicism is presented with a computational model, that brings new insights into an "old" debate in our field. It is a very well-written manuscript.

      Weaknesses:

      Although the manuscript is very well written, this is in a way that assumes that the reader has existing knowledge about specific terms and topics. This was apparent through a lack of definitions and minimal background/context to the aims and conclusions for some of the authors' findings.

      There is a need for some examples to connect real evidence and scenarios from clinical reports with the model.

      We thank the reviewer for their assessment. Some background was condensed for space, and we wrote the manuscript to be understood by readers with existing reproductive genetics background. We will add more detail and explain terminology more clearly. There are a number of published case studies that can link real-life clinical data with the model’s findings. We will include a summary of them in the text.

      Reviewer #2 (Public Review):

      Summary:

      Although an oversimplification of the biological complexities, this modeling work does add, in a limited way, to the current knowledge on the theoretical difficulties of detecting mosaicism in human blastocysts from a single trophectoderm biopsy in PGT. However, many of the premises that the modeling was built on are theoretical and based on unproven biological and clinical assumptions that could yet prove to be untrue. Therefore, the work should be considered only as a simplified model that could assist in further understanding of the complexities of preimplantation embryo mosaicism, but assumptions of real-world application are, at this stage, premature and should not be considered as evidence in favour of any clinical strategies.

      Strengths:

      The work has presented an intriguing theoretical model for elaborating on the interpretation of complex and still unclear biological phenomena such as chromosomal mosaicism in preimplantation embryos.

      We thank the reviewer for this detailed review, and are glad that they see the value of theoretical modelling. We agree that this model makes simplifications; we took this simplified approach to focus on the core contradiction between clinical experience and previous modelling. Expanding the model to consider additional aspects of balanced mitotic nondisjunctions and technical accuracy is something we want to address; we are discussing whether this can be practically added to this manuscript, or will involve enough work that it should be developed as a further study.

      Weaknesses:

      Lines 134-138: The spatial modeling of mitotic errors in the embryo was oversimplified in this manuscript. There is only limited (and non-comprehensive) evidence that mitotic errors leading to chromosome mosaicism arise from chromosome loss or gain only (e.g. anaphase lag). This work did not take into account the (more recognised) possibility of mitotic nondisjunction where following the event there would be clones of cells with either one more or one less of the same chromosome. Although addressed in the discussion (lines 572-574), not including this in the most basic of modeling is a significant oversight that, based on the simple likelihood, could significantly affect results.

      As above, we certainly plan to address this in future modelling, by developing the model to account for nondisjunction while also incorporating the technical uncertainty, arising from sequencing, in the state of each cell in the biopsy.

      General comment: the premise of the manuscript is that an embryologist (embryology laboratory) is aware of and can accurately quantify the number of cells in a blastocyst or TE biopsy. The reality is that it is not possible to accurately do this without the destruction of the sample which is obviously not clinically applicable. Based on many assumptions the findings show that taking small biopsies poorly classifies mosaic embryos, which is not disputed. However, extrapolating this to the clinic and making suggestions to biopsy a certain amount of cells (lines 539-540) is careless and potentially harmful by suggesting the introduction of potential change in clinical practice without validation. Additionally, no embryologist in the field can tell how many cells are present in a clinical TE biopsy, making this suggestion even more impractical.

      We will revise this to make the technical limitations of clinical TE biopsies clearer.
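
      The undisputed point that small biopsies classify mosaic embryos poorly can be illustrated with a minimal sketch. It is not the manuscript's model: it assumes that aneuploid cells are randomly intermixed (no spatial clustering), that the biopsied cell count is known exactly, and that sequencing is error-free, all of which the reviewer questions. The function name and the cell numbers are hypothetical.

```python
from math import comb


def biopsy_distribution(total_cells, aneuploid_cells, biopsy_size):
    """Hypergeometric probabilities of finding k aneuploid cells in a biopsy.

    Returns a list indexed by k = 0..biopsy_size. Assumes cells are sampled
    uniformly at random without replacement (an idealisation).
    """
    denom = comb(total_cells, biopsy_size)
    euploid_cells = total_cells - aneuploid_cells
    return [
        comb(aneuploid_cells, k) * comb(euploid_cells, biopsy_size - k) / denom
        for k in range(biopsy_size + 1)
    ]


# Hypothetical embryo: 200 cells, 20% of them aneuploid, 5-cell biopsy.
probs = biopsy_distribution(total_cells=200, aneuploid_cells=40, biopsy_size=5)
p_looks_euploid = probs[0]     # biopsy contains no aneuploid cell
p_looks_aneuploid = probs[-1]  # biopsy contains only aneuploid cells
```

      Under these idealised assumptions roughly a third of such biopsies would contain no aneuploid cell at all; clustering of aneuploid cells and technical noise, neither of which is modelled here, would change these numbers.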

      On a more general clinical consideration, the authors should acknowledge that when reporting findings of unproven clinical utility and unknown predictive values this inevitably results in negative consequences for infertile couples undergoing IVF. It is proven and established that when couples face the decision on how to manage a putative mosaicism finding, the vast majority decide on embryo disposal. It was recently reported in an ESHRE survey that about 75% of practitioners in the field consider discarding or donating to research embryos with reported mosaicism. A prospective clinical trial showed that about 30% live birth rate reduction can be expected if mosaic embryos are not considered (Capalbo et al., AJHG 2021). The real-world experience is that when mosaicism is reported, embryos with almost normal reproductive potential are discarded. The authors should be more careful with the clinical interpretation and translation of these theoretical findings.

      The clinical potential of mosaic embryos is much more nuanced than a simple ‘they should be discarded’ or ‘they should be treated like euploid embryos’. While the study mentioned by the reviewer (Capalbo et al., AJHG 2021) does indeed suggest that embryos with putative low level mosaicism have good potential, it also suggests that embryos with putative high level mosaicism are largely to be considered aneuploid and should therefore be discarded. Therefore, even the mentioned study supports a ‘ranking’ of embryos by their mosaic result. Furthermore, large controlled retrospective studies have indicated that even high level mosaic embryos have reproductive potential (Viotti Fertility & Sterility 2021 and Viotti F&S 2023). Recent case reports have shown that mosaicism can occasionally persist from embryo to late gestation and even birth, at times associating with negative medical findings. Therefore, while the true clinical potential of embryos classified as mosaic is still being defined, here we are merely suggesting that from a modelling standpoint, the features of mosaicism detected with PGT-A can help guide clinical decisions (complementing the observations reported in the clinical studies).

      There is a robust consensus within the field of clinical genetics and genomics regarding the necessity to exclusively report findings that possess well-established clinical validity and utility. This consensus is grounded in the imperative to mitigate misinterpretation and ineffective actions in patient care. However, the clinical framework delineated in this manuscript diverges from the prevailing consensus in clinical genetics. Clinical genetics and genomics prioritize the dissemination of findings that have undergone rigorous validation processes and have demonstrated clear clinical relevance and utility. This emphasis is crucial for ensuring accurate diagnosis, prognosis, and therapeutic decision-making in patient care. By adhering to established standards of evidence and clinical utility, healthcare providers can minimize the potential for misinterpretation and inappropriate interventions. The framework proposed in this manuscript appears to deviate from the established principles guiding clinical genetics practice. It is imperative for clinical frameworks to align closely with the consensus guidelines and recommendations set forth by professional organizations and regulatory bodies in the field. This alignment not only upholds the integrity and reliability of genetic testing and interpretation but also safeguards patient well-being and clinical outcomes.

      References:

      ACMG Board of Directors. (2015). Clinical utility of genetic and genomic services: a position statement of the American College of Medical Genetics and Genomics. Genetics in Medicine, 17(6), 505-507. https://doi.org/10.1038/gim.2014.194.

      Richards, S., Aziz, N., Bale, S., Bick, D., Das, S., Gastier-Foster, J., ... ACMG Laboratory Quality Assurance Committee. (2015). Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genetics in Medicine, 17(5), 405-424. https://doi.org/10.1038/gim.2015.30

      We will update where necessary to match these references.

      Line 61: "Self correction" - This terminology is unfortunately indiscriminately used in the field for PGT when referring to mosaicism and implies that the embryo can actively correct itself from a state of inherent abnormality. Apart from there being no evidence to suggest that there is an active process by which the embryo itself can correct chromosomal errors, most presumed euploid/aneuploid mosaic embryos will have been euploid zygotes and therefore "self-harm" may be a better explanation. True self-correction in the form of meiotic trisomy/monosomy rescue is of course theoretically possible but not at all clinically significant. The concept being conveyed in this part of the manuscript is not disputed but it is strongly suggested that the term "self correction" is not used in this context, nor in the rest of the manuscript, to prevent the perpetuation of misinformation in the field and instead use a better description.

      This is a good point. We have used ‘self correction’ as a shorthand, but the reality is more nuanced. It will often be a passive process in which aneuploid cell lineages fail to proliferate over time (‘aneuploidy depletion’). The idea of ‘self harm’ is interesting; aneuploidy arising from a healthy euploid embryo. We can also see a further situation where the gametes suffered damage (e.g. DNA fragmentation, unresolved crossovers, persistence of meiotic breaks) leading to mitotic errors. In that case, the embryo would suffer the consequences of harm in the gametes, and ‘aneuploidy rescue’ may be a useful term also. We will discuss this further and reword the terminology along these lines.

      Lines 69-73: The ability to quantify aneuploidy in known admixtures of aneuploid cells is indeed well established. However, the authors claim that the translation of this to embryo biopsy samples is inferred with some confidence and that if a biopsy shows an intermediate chromosome copy number (ICN), that the biopsy and the embryo are mosaic. There are no references provided here and indeed the only evidence in the literature relating to this is to the contrary. Multifocal biopsy studies have shown that an ICN result in a single biopsy is often not seen in other biopsies from the same embryo (Capalbo et al 2021; Kim et al., 2022; Girardi et al., 2023; Marin, Xu, and Treff 2021). Multifocal biopsies showing reciprocal gain and loss which would provide stronger validation for the presence of true mosaicism are also rare. In this work, the entire manuscript is based on the accuracy of ICN in a biopsy being reflective of mosaicism in the embryo. The evidence however points to a large proportion of ICN detected in embryo biopsy potentially being technical artifacts (misdiagnosing both constitutionally normal and abnormal (meiotic aneuploid) embryos as mosaic). Therefore, although results from the modelling provide insight into theoretical results, these cannot be used to inform clinical decision-making at all.

      We thank the reviewer for raising this important conceptual point, which needs to be addressed. The fact that mosaicism is often not observed in serial biopsies of the same embryo is precisely an inherent feature of mosaicism and is an invalid argument to discount the original diagnosis as false. The detection of ICN is not trivial and certain PGT-A platforms might not have the capability to discern noise from true ICN, hence the need for proper validation of the technology. The most stringent validation method for mosaicism detection remains the admixture experiment, such that when ICN patterns are detected the most obvious conclusion is that the biopsy contained a mosaic mix of cells. We aim to add wording regarding these points in the manuscript.

      Lines 87-89: The authors make the claim that emerging evidence is suggestive that the majority of embryos are mosaic to some degree. If in fact, mosaicism is the norm, the clinical importance may be limited.

      If the majority of embryos are mosaic to some degree, it is important to understand the impacts that this may have on PGT-A biopsies and how informative such biopsies may be. Returning to the point the reviewer made above about mitotic aneuploidies as an important consideration: a mitotic nondisjunction at the first cleavage would result in an embryo that was entirely aneuploid. A mitotic nondisjunction occurring at the second cleavage would result in an embryo with 50% aneuploid cells, at the third cleavage, 25% aneuploid cells. If these aneuploid cells fail to proliferate, or are removed (either actively or passively), the level of aneuploidy will fall over time. While mosaicism is a binary (an embryo is or is not a mosaic of karyotypes), even if most embryos are mosaic, the clinical importance will depend on the level of aneuploidy.
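
      The percentages in this response follow a simple halving pattern, which can be sketched in a few lines of Python. This is only an illustration, not part of the authors' model: the function name `aneuploid_fraction` is hypothetical, and the sketch assumes a single nondisjunction event in one cell, with all lineages dividing at the same rate and no cell loss afterwards.

```python
from fractions import Fraction


def aneuploid_fraction(cleavage: int) -> Fraction:
    """Fraction of aneuploid cells after one nondisjunction at a given cleavage.

    Assumptions (illustrative only): exactly one cell missegregates at the
    given cleavage division, both of its daughters are aneuploid, and all
    lineages then proliferate equally with no cell death.
    """
    if cleavage < 1:
        raise ValueError("cleavage must be 1 or greater")
    # Before cleavage n there are 2**(n-1) cells; one of them missegregates,
    # so 2 of the resulting 2**n cells are aneuploid: 2 / 2**n = 1 / 2**(n-1).
    return Fraction(1, 2 ** (cleavage - 1))


if __name__ == "__main__":
    for n in range(1, 5):
        print(f"cleavage {n}: {float(aneuploid_fraction(n)):.1%} aneuploid")
```

      Under these assumptions the first three cleavages give 100%, 50%, and 25%, matching the figures quoted above. Any selective loss of aneuploid cells would lower the fraction further.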

      Line 102-103: The statement that data shows that the live birth rate per ET is generally lower in mosaic embryos than euploid embryos is from retrospective cohort studies that suffer from significant selection bias. The authors have ignored non-selection study results (Capalbo et al, ajhg 2021) that suggest that putative mosaicism has limited predictive value when assessed prospectively and blinded.

      We will add the referenced multifocal biopsy study, but in contrast to the reviewer we see the data it contains as supporting our position in this paper. Capalbo et al. performed rebiopsies of trophectoderm and a biopsy of inner cell mass and found that high level mosaic or aneuploid trophectoderm tended to correlate with abnormal karyotypes in the inner cell mass while low level mosaics correlated with a normal inner cell mass. This supports our point that measuring levels of aneuploidy in the trophectoderm is relevant, and that this gives useful information for ranking embryos.

      Lines 94-98: The authors have misrepresented the works they have presented as evidence for biopsy result accuracy (Kim et al., 2023; Victor et al 2019; Capalbo et al., 2021; Girardi et al., 2023, and any others). These studies show that a mosaic biopsy is not representative of the whole embryo and can actually be from embryos where the remainder of the embryo shows no evidence of mosaicism. There is also a missing key reference of Capalbo et al, AJHG 2021, and Girardi et al., HR 2023 where multifocal biopsies were taken.

      As above, we will add more information on these multifocal biopsy studies; we believe these studies also support our position: that individual biopsies are not predictive of aneuploidy level in an embryo. If mosaicism is detected in the biopsy, then the embryo is mosaic, but if the remainder of the embryo is euploid then that single biopsy was not an accurate representation of the embryo. This could also apply in reverse - if mosaicism is not detected in the biopsy, it does not mean there is no mosaicism in the embryo, only that mosaicism could not be identified.

      Lines 371-372: "Selecting the embryo with the lowest number of aneuploid cells in the biopsy for transfer is still the most sensible decision". Where is the evidence for this other than the modeling which is affected by oversimplification and unproven assumptions? Although the statement seems logical at face value, there is no concrete evidence that the proportion of aneuploid cells within a biopsy is valuable for clinical outcomes, especially when co-evaluated with other more relevant clinical information.

      We made this statement as part of a thought experiment to explain the difference between the concepts of absolute measurements versus embryo ranking. This section is not a result of the model, or clinical advice; it is a statement that in the specific example embryos given, the embryo with the fewest aneuploid cells in the biopsy would still be the embryo with the fewest aneuploid cells overall, and thus transferring this embryo (in the absence of any other differences of embryo quality) would remain sensible.

      Lines 431-463: In this section, the authors discuss clinical outcome data from the transfer of putative mosaic embryos and make conclusions about the relationship between ICN level in biopsy and successful pregnancy outcomes. The retrospective and selective nature of the data used in forming the results has the potential to lead to incorrect conclusions when applied to prospective unselected data.

      We believe the clinical data is a useful biological reality check, and we are discussing how to integrate it better with the modelling.

      Reviewer #3 (Public Review):

      Unfortunately, this study fails to incorporate the most important variable impacting the ability to predict mosaicism, the accuracy of the test. The fact is that most embryos diagnosed as mosaic are not mosaic. There may be 4 cases out of thousands and thousands of transfers where a confirmation was made. Mosaicism has become a category of diagnosis in which embryos with noisy NGS profiles are placed. With VeriSeq NGS it is not possible to routinely distinguish true mosaicism from noise. An analysis of NGS noise levels (MAPD) versus the rate of mosaics by clinic using the registry will likely demonstrate this is the case. Without accounting for the considerable inaccuracy of the method of testing the proposed modeling is meaningless.

      We disagree with the reviewer that the modelling is meaningless; we disagree that mosaicism is rare (see our other points). However, if we grant that mosaicism is rare, that almost all embryos are euploid or aneuploid, and that technical noise is the primary factor generating intermediate copy number values, then it is still important to understand how to interpret such intermediate values. Low-level mosaics would more likely represent miscalled euploid embryos, and high-level mosaics would more likely represent miscalled aneuploid embryos. We demonstrate that ranking on these intermediate values correlates with implantation rates and live birth rates, supporting their use. We do agree that technical accuracy of the NGS is an important consideration, and we will be incorporating this into our modelling in the future.

      Recent data using more accurate methods of identifying mosaicism indicate that the prevalence of true preimplantation embryonic mosaicism is only 2%, which is also consistent with findings made post-implantation. This model fails to account for the possibility that, because so few embryos are actually mosaic, there is actually no relevance to clinical care whatsoever. In fact, differences in clinical outcomes of embryos designated as mosaic could be entirely attributed to poor embryo quality resulting in noise levels that make NGS results fall into the "mosaic" category.

      As we also wrote in the point above, we disagree; it is possible that a euploid embryo may be misinterpreted as a mosaic. It is also possible that an aneuploid embryo is misinterpreted as a mosaic. Whether the intermediate copy number values arise through biological or technical reasons, they contain information that is useful to decisions on whether to transfer. We also note a recent paper that performed single-cell dissociation of trophectoderm versus inner cell mass which found that mosaicism in human embryos is very common (Chavli et al, 2024, DOI:10.1172/JCI174483).

      Additional comments:

      “Indeed, as more data emerges, it appears that the majority of embryos from both healthy and infertile couples are mosaic to some degree (Coticchio et al., 2021; Griffin et al., 2022).”

      This statement should be softened as all embryos will be considered mosaic when a method with a 10% false positive rate is applied to 10 more parts of the same embryo. The distinction between artifact and true mosaicism cannot be made with nearly all current methods of testing. When virtually no embryos display uniform aneuploidy in a rebiopsy study, there should be great concern over the accuracy of the testing used. The vast majority of aneuploidy is meiotic in origin.
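
      The compounding effect the reviewer describes can be illustrated with a short calculation. This is a sketch under a strong assumption that is not stated in the review: each biopsy is treated as an independent test with the same per-biopsy false positive rate. The function name `p_any_false_positive` is hypothetical.

```python
def p_any_false_positive(rate: float, n_biopsies: int) -> float:
    """Probability that at least one of n independent tests is a false positive.

    Assumes (for illustration only) that biopsies are independent and share
    the same per-biopsy false positive rate.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    if n_biopsies < 0:
        raise ValueError("n_biopsies must be non-negative")
    # Complement of "every biopsy is correctly called non-mosaic".
    return 1.0 - (1.0 - rate) ** n_biopsies


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(f"{n:>2} biopsies: {p_any_false_positive(0.10, n):.1%}")
```

      With a 10% rate and ten biopsies, roughly two in three uniform embryos would receive at least one spurious mosaic call under this independence assumption, and the proportion keeps rising as more parts are sampled.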

      We note that reviewer 2 wrote that mitotic aneuploidy was the key concern, whereas reviewer 3 states meiotic aneuploidy is more common; we argue that both are relevant; a recent study by McCoy et al, 2023 (DOI:10.1186/s13073-023-01231-1) found that both drive arrest of human IVF embryos.

      “Experimental data provides strong evidence that, for the most part, the biopsy result obtained accurately represents the chromosome constitution of the rest of the embryo (Kim et al., 2022; Navratil et al., 2020; Victor et al., 2019).”

      This statement is incorrect given published systematic review of the literature indicates a 10% false positive rate based on rebiopsy results.

      This shows that accurately classifying a mosaic embryo based on a single biopsy is not robust.

      This is exactly why the practice of designating embryo mosaics with intermediate copy numbers should not exist.

      We agree that accurately classifying a mosaic embryo based on a single biopsy is not robust. That is one of the main messages of this paper. What we show here is that biopsies from a mosaic embryo are indeed likely to disagree with each other - but we find that there is still enough information at a population level for this to be an indicator of embryo outcomes. We have not yet performed modelling to explore the effect of technical error, so we will not speculate on the impact, but we reiterate a point made earlier: the most stringent validation method for mosaicism detection remains the admixture experiment, such that when intermediate copy number patterns are detected the most obvious conclusion is that the biopsy contained a mosaic mix of cells.

    1. eLife assessment

      In this useful study, the authors report the efficacy, hematological effects, and inflammatory response of the BPaL regimen (containing bedaquiline, pretomanid, and linezolid) compared to a variation in which linezolid is replaced with the preclinical development candidate spectinamide 1599, administered by inhalation in tuberculosis-infected mice. The authors provide convincing evidence that supports the replacement of linezolid in the current standard of care for drug-resistant tuberculosis. However, a limitation of the work is the lack of control experiments with bedaquiline and pretomanid only, to further dissect the relevant contributions of linezolid and spectinamide in efficacy and adverse effects. Although the manuscript is well written overall, a re-formulation of some of the stated hypotheses and conclusions, as well as the addition of text to contextualize translatability, would improve its value.

    2. Reviewer #2 (Public Review):

      Summary:

      Replacing linezolid (L) with the preclinical development candidate spectinamide 1599, administered by inhalation, in the BPaL standard of care regimen achieves similar efficacy, and reduces hematological changes and pro-inflammatory responses.

      Strengths:

      The authors not only measure efficacy but also quantify histological changes, hematological responses, and immune responses, to provide a comprehensive picture of treatment response and the benefits of the L to S substitution.

      The authors generate all data in two mouse models of TB infection, each reproducing different aspects of human histopathology.

      Extensive supplementary figures ensure transparency.

      Weaknesses:

      The articulation of objectives and hypotheses could be improved.

    3. Reviewer #3 (Public Review):

      Summary:

      In this paper, the authors sought to evaluate whether the novel TB drug candidate, spectinamide 1599 (S), given via inhalation to mouse TB models, and combined with the drugs B (bedaquiline) and Pa (pretomanid), would demonstrate similar efficacy to that of BPaL regimen (where L is linezolid). Because L is associated with adverse events when given to patients long-term, and one of those is associated with myelosuppression (bone marrow toxicity) the authors also sought to assess blood parameters, effects on bone marrow, immune parameters/cell effects following treatment of mice with BPaS and BPaL. They conclude that BPaL and BPaS have equivalent efficacy in both TB models used and that BPaL resulted in weight loss and anemia (whereas BPaS did not) under the conditions tested, as well as effects on bone marrow.

      Strengths:

      The authors used two mouse models of TB that are representative of different aspects of TB in patients (which they describe well), intending to present a fuller picture of the activity of the tested drug combinations. They conducted a large body of work in these infected mice to evaluate efficacy and also to survey a wide range of parameters that could inform the effect of the treatments on bone marrow and on the immune system. The inclusion of BPa controls (in most studies) and also untreated groups led to a large amount of useful data that has been collected for the mouse models per se (untreated) as well as for BPa - in addition to the BPaS and BPaL combinations which are of particular interest to the authors. Many of these findings related to BPa, BPaL, untreated groups, etc corroborate earlier findings and the authors point this out effectively and clearly in their manuscript. To go further, in general, it is a well-written and cited article with an informative introduction.

      Weaknesses:

      The authors performed a large amount of work with the drugs given at the doses and dosing intervals stated, but at present, there is no exposure data available in the paper. It would be of great value to understand the exposures achieved in plasma at least (and in the lung if more relevant for S) in order to better understand how these relate to clinical exposures that are observed at marketed doses for B, Pa, and L as well as to understand the exposure achieved at the doses being evaluated for S. If available as historical data this could be included/cited. Considering the great attempts made to evaluate parameters that are relevant to clinical adverse events, it would add value to understand at what drug exposures effects such as anemia, weight loss, and bone marrow effects are being observed.

      It would also be of value to add an assessment of whether the weight loss, anemia, or bone marrow effects observed for BPaL are considered adverse, and the extent to which we can translate these effects from mouse to patient (i.e. what are the limitations of these assessments made in a mouse study?). For example, is the small weight loss seen as significant, or is it reversible? Is the magnitude of the changes in blood parameters similar to the parameters seen in patients given L?

      In addition, it is always challenging to interpret findings for combinations of drugs, so the addition of language to explain this would add value: for example, how confident can we be that the weight loss seen for only the BPaL group is due to L as opposed to a PK interaction leading to an elevated exposure and weight loss due to B or Pa?

      Turning to the evaluations of activity in mouse TB models, unfortunately, the evaluations of activity in the BALB/c mouse model as well as the spleens of the Kramnik model resulted in CFU below/at the limit of detection and so, to this reviewer's understanding of the data, comparisons between BPaL and BPaS cannot be made and so the conclusion of equivalent efficacy in BALB/c is not supported with the data shown. There is no BPa control in the BALB/c study, therefore it is not possible to discern whether L or S contributed to the activity of BPaL or BPaS; it is possible that BPa would have shown the same efficacy as the 3 drug combinations. It would be valuable to conduct a study including a BPa control and with a shorter treatment time to allow comparison of BPa, BPaS, and BPaL. In the Kramnik lungs, as the authors rightly note, the studies do not support any contribution of S or L to BPa - i.e. the activity observed for BPa, BPaL, and BPaS did not significantly differ. Although the conclusions note equivalency of BPaL and BPaS, which is correct, it would be helpful to also include BPa in this statement; it would be useful to conduct a study dosing for a longer period of time or assessing a relapse endpoint, where it is possible that a contribution of L and/or S may be seen - thus making a stronger argument for S contributing an equivalent efficacy to L. The same is true for the assessment of lesions - unfortunately, there was no BPa control meaning that even where equivalency is seen for BPaL and BPaS, the reader is unable to deduce whether L or S made a contribution to this activity.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      The manuscript by Jingsong Zhou and colleagues tries to uncover the reasons for the resistance of extraocular muscles (EOMs) to degenerative changes induced by amyotrophic lateral sclerosis (ALS). The findings of the study offer valuable information that EOMs are spared in ALS because they produce protective factors for the NMJ and, more specifically, factors secreted by EOM-derived satellite cells. While most of the experimental approaches are convincing, the use of sodium butyrate (NaBu) in this study needs further investigation, as NaBu might have a variety of biological effects. Overall, this work may help develop future therapeutic interventions for patients with ALS.

      We agree with the editor that NaBu has a variety of biological effects that require further investigation. Our team has previously explored the effect of NaBu treatment on intestinal microbiota and intestinal epithelial permeability (DOI: 10.1016/j.clinthera.2016.12.014), on the mitochondrial respiratory function of NSC-34 motor neuron cell line overexpressing hSOD1G93A (DOI: 10.3390/biom12020333) and on the mitochondrial function of skeletal muscle myofibers of G93A mice (DOI: 10.3390/ijms22147412). Other research teams have also explored the role of NaBu (or HDAC inhibition) in neuronal survival and axonal transport (DOIs: 10.1073/pnas.0907935106; 10.1038/s41467-017-00911-y; 10.15252/embj.2020106177; 10.1093/hmg/ddt028).

      Since the theme of this manuscript is the transcriptomic characteristics of EOM SCs, including data on how NaBu affects cellular/molecular processes of other tissues would deviate somewhat from the theme. It would be more appropriate to develop a separate manuscript focusing on other tissues.

      We appreciate the feedback from the Editors and reviewers. We realized that our previous description of butyrate’s beneficial role might be overstated in the Abstract Section. We have made two changes to avoid potential overstatement of our finding: (1) We modified the Abstract to state that “the NaBu-induced transcriptomic changes resembling the patterns of EOM SCs ‘may contribute to’ (instead of ‘underlie’) the beneficial effects observed in G93A mice” (Page 1, Line 29); (2) We have edited the corresponding paragraph in the Discussion section to emphasize that the effect of NaBu treatment is multi-faceted (Page 11, Line 459-461).

      Recommendations for the authors:

      Reviewer #3 (Recommendations For The Authors):

      Line 388-389. The sentence has been corrected but is still not clear. What do the authors mean by ".....resulting in higher proportion of COX-deficient myofibers than other muscles"? What other muscles do they refer to?

      Other muscles refer to muscles whose stem cells remain dormant under physiological conditions (uninjured, innervated), such as EDL. We have edited the sentence accordingly. (Page 10, Line 431-432)

      In reference to the results shown in Fig. 2, 7, 8 and 9. Since the experimenters were not blinded, this should be explicitly stated in the Methods section.

      We have added the disclaimer in the current “Data analysis and statistics” section in Methods as follows: “The experimenters were not blinded to the samples in data collection and analysis.” (Page 15, Line 636)

      Figure 7C has been amended but now the inserted ANOVA values interfere with the correct visualization of Fig. 7D. Can the panels in Fig. 7D be moved down so that they are better separated from the panels in Fig. 7C?

      Thanks for the comment and we have edited Figure 7 accordingly.

      Reviewer #4 (Recommendations For The Authors):

      The authors have revised the manuscript per the reviewer's comments in this study. While most of the concerns were addressed, a few concerns remain.

      The molecular basis of how AAV-mediated delivery of Cxcl12 improves the phenotype of satellite cells is still unclear.

      Thanks for the comment. As one of the earliest discovered chemokines, the chemotactic role of Cxcl12-Cxcr4 axis on cells and cellular processes (such as axons) has been comprehensively investigated in different types of tissues by functional assays ranging from overexpression, protein application, and inhibitor application to knockdown by shRNAs. To list a few examples: the establishment of the correct routing trajectories of mammalian motor axons and oculomotor axons during embryonic development (DOIs: 10.1016/j.neuron.2005.08.011; 10.1167/iovs.18-25190); the regeneration of injured motor axon terminals guided by terminal Schwann cells in adult mice (DOI: 10.15252/emmm.201607257); the migration of neural crest cells to sympathetic ganglia in the formation of sympathetic nerve system during embryogenesis (DOI: 10.1523/JNEUROSCI.0892-10.2010); and the migration of myoblasts in the process of fusion into myotubes (DOIs: 10.1242/jcs.066241; 10.1111/boc.201200022; 10.1074/jbc.M706730200).

      Because so many detailed mechanistic studies already exist, our goal for this manuscript is not to identify a novel mechanism of how Cxcl12-mediated chemotaxis is achieved. Rather, we used it as one of the proof-of-concept mechanisms contributing to the resistance of EOMs against ALS and benefits of NaBu treatment. Certainly, it is not the sole mechanism.

      To address the reviewer’s concern, we have expanded discussion about the previous studies regarding the chemotactic effect of Cxcl12 in the discussion section. (Page 10, Line 435-436, Page 11, Line 445-446)

      The NaBu experiments may need additional support from other approaches. NaBu effects may not be directly related to satellite cells or muscle cells. Thus, the animal experiment results need to be carefully interpreted.

      We agree that NaBu has a variety of biological effects that require further investigation. Our team has previously explored the effect of NaBu treatment on intestinal microbiota and intestinal epithelial permeability (DOI: 10.1016/j.clinthera.2016.12.014), on the mitochondrial respiratory function of NSC-34 motor neuron cell line overexpressing hSOD1G93A (DOI: 10.3390/biom12020333) and on the mitochondrial function of skeletal muscle myofibers of G93A mice (DOI: 10.3390/ijms22147412). Other research teams have also explored the role of NaBu (or HDAC inhibition) in neuronal survival and axonal transport (DOIs: 10.1073/pnas.0907935106; 10.1038/s41467-017-00911-y; 10.15252/embj.2020106177; 10.1093/hmg/ddt028).

      Since the theme of this manuscript is the transcriptomic characteristics of EOM SCs, including data on how NaBu affects cellular/molecular processes of other tissues would deviate somewhat from the theme. It would be more appropriate to develop a separate manuscript specifically addressing the impact of NaBu on other tissues.

      We appreciate the feedback from the reviewers. We realized that our previous description of butyrate’s beneficial role might be overstated in the Abstract Section. In response, we have made two changes to avoid potential overstatement of our finding: (1) We modified the Abstract to state that “the NaBu-induced transcriptomic changes resembling the patterns of EOM SCs ‘may contribute to’ (instead of ‘underlie’) the beneficial effects observed in G93A mice” (Page 1, Line 29); (2) We edited the corresponding paragraph in the Discussion section to emphasize that the effect of NaBu treatment is multi-faceted (Page 11, Line 459-461).

    2. eLife assessment

      The manuscript by Jingsong Zhou and colleagues uncovers why the extraocular muscles (EOMs) are preserved while other muscles undergo degenerative changes in amyotrophic lateral sclerosis (ALS). In this work, the authors have used a mouse model of familial ALS that carries a G93A mutation in the Sod1 gene to demonstrate that NaBu treatment partially restores the integrity of NMJ in the limb and diaphragm muscles of G93A mice. The findings of the study offer important information that EOMs are spared in ALS because they produce protective factors for the NMJ and, more specifically, factors secreted by EOM-derived satellite cells. While most of the experimental approaches are convincing, the use of sodium butyrate (NaBu) in this study needs further investigation, as NaBu might have a variety of biological effects. Overall, this work may help develop future therapeutic interventions for patients with ALS.

    3. Joint Public Review:

      Summary:

      In their paper Li et al. investigate the transcriptome of satellite cells obtained from different muscle types including hindlimb, diaphragm and extraocular muscles (EOM) from wild type and G93A transgenic mice (end stage ALS) in order to identify potential factors involved in the maintenance of the neuromuscular junction. The underlying hypothesis is that, since EOMs are largely spared from this debilitating disease, they may secrete NMJ-protective factors. The results of their transcriptome analysis identified several axon guidance molecules including the chemokine Cxcl12, which are particularly enriched in EOM-derived satellite cells. Transduction of hindlimb-derived satellite cells with AAV encoding Cxcl12 reverted hindlimb-derived myotubes from the G93A mice into myotubes sharing phenotypic characteristics similar to those of EOM-derived satellite cells. Additionally, the authors were able to demonstrate that EOM-derived satellite cell myotube cultures are capable of enhancing axon extensions and innervation in co-culture experiments.

      Strengths:

      The strength of the paper is that the authors successfully isolated and purified different populations of satellite cells, compared their transcriptomes, identified specific factors released by EOM-derived satellite cells, and overexpressed one of these factors (the chemokine Cxcl12) by AAV-mediated transduction of hindlimb-derived satellite cells. The transduced cells were then able to support axon guidance and NMJ integrity. They also show that administration of Na butyrate to mice decreased NMJ denervation and satellite cell depletion in hind limbs. Furthermore, addition of Na butyrate to hindlimb-derived satellite cell myotube cultures increased Cxcl12 expression. These are impressive results providing important insights for the development of therapeutic targets to slow the loss of neuromuscular function characterizing ALS.

      Comments on latest version:

      The authors have sufficiently acknowledged and discussed the limitations of experiments involving NaBu treatment. The authors have also addressed the use of AAV-mediated delivery of Cxcl12.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Ngo et al. report a peculiar effect where a single base mismatch (CC) can enhance the mechanical stability of a nucleosome. In previous studies, the same group used a similar state-of-the-art fluorescence-force assay to study the unwrapping dynamics of 601-DNA from the nucleosome and observed that force-induced unwrapping happens more slowly for DNA that is more bendable because of changes in sequence or chemical modification. This manuscript appears to be a sequel to this line of projects, where the effect of CC is tested. The authors confirmed that CC is the most flexible mismatch using the FRET-based cyclization assay and found that unwrapping becomes slower when CC is introduced at three different positions in the 601 sequence. The CC mismatch only affects the local unwrapping dynamics of the outer turn of nucleosomal DNA.

      Strengths:

      These results are in good agreement with the previously established correlation between DNA bendability and nucleosome mechanical stability by the same group. This well-executed, technically sound, and well-written experimental study contains novel nucleosome unwrapping data specific to the CC mismatch and 601 sequence, the cyclizability of DNA containing all base pair mismatches, and the unwrapping of 601-DNA from Xenopus and yeast histones. Overall, this work will be received with great interest by the biophysics community and is definitely worth attention.

      Weaknesses:

      The scope and impact of this study are somewhat limited due to the lack of sequence variation. Whether the conclusion from this study can be generalized to other sequences and other bendability-enhancing mismatches needs further investigation.

      Major questions:

      (1) As pointed out by the authors, the FRET signal is not sensitive to nucleosome position; therefore, the increasing unwrapping force in the presence of CC can be interpreted as the repositioning of the nucleosome upon perturbation. It is then also possible that CC-containing DNA is not positioned exactly the same as normal DNA from the start upon nucleosome assembly, leading to different unwrapping trajectories. What is the experimental evidence that supports identical positioning of the nucleosomes before the first stretch?

      We added the following and refer to our recent publication1 to address this question.

      “This is consistent with a previous single nucleotide resolution mapping of dyad position from a library of mismatches in all possible positions along the 601 sequence or a budding yeast native sequence which showed that a single mismatch (A-A or T-T) does not affect the nucleosome position27.”

      (2) The authors chose a constant stretching rate in this study. Can the authors provide a more detailed explanation or rationale for why this rate was chosen? At this rate, the authors found hysteresis, which indicates that stretching is faster than quasi-static. But it must have been slow and weak enough to allow for reversible unwrapping and wrapping of a CC-containing DNA stretch longer than one helical turn. Otherwise, such a strong effect of CC at a single location would not be seen. I am also curious about the biological relevance of the magnitude of the force. Can such force arise during nucleosome assembly in vivo?

      To address the comment about the magnitude of force, we added the following paragraph to Introduction. “RNA polymerase II can initiate transcription at 4 pN of hindering force2 and its elongation activity continues until it stalls at ~ 10 pN of hindering force3,4. Therefore, the transcription machinery can generate picoNewtons of force on chromatin as long as both the machinery and the chromatin segment in contact are tethered to stationary objects in the nucleus. Another class of motor protein, chromatin remodeling enzymes, was also shown to induce processive and directional sliding of single nucleosomes when the DNA is under a similar amount of tension (~ 5 pN)5. Therefore, measurements of nucleosomes at a few pN of force will expand our knowledge of the physiological roles of nucleosome structure and dynamics.”

      To address the comment about the stretching rate, we added the following to Results. We note that the physiological loading rate has been challenging to determine for any biomolecular interactions, and the only quantitative measurement we are aware of is that of an integrin that we are citing.

      “The force increases nonlinearly and the loading rate, i.e. the rate at which the force increases, was approximately in the range of 0.2 pN/s to 6 pN/s, similar to the cellular loading rates for a mechanosensitive membrane receptor6.”
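
      The loading rate defined in the quoted sentence (the rate at which the force increases) can be estimated from a force-versus-time trace by finite differences. The following is a minimal sketch, not the authors' analysis: the function name `loading_rates` and the trace values are hypothetical and chosen only for illustration.

```python
def loading_rates(times_s, forces_pN):
    """Return the loading rate (pN/s) between consecutive samples.

    Uses a simple forward difference dF/dt; the input lists are assumed
    to be the same length with strictly increasing times.
    """
    if len(times_s) != len(forces_pN):
        raise ValueError("times and forces must have the same length")
    rates = []
    for i in range(len(times_s) - 1):
        dt = times_s[i + 1] - times_s[i]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        rates.append((forces_pN[i + 1] - forces_pN[i]) / dt)
    return rates


# Made-up, nonlinear force ramp (illustrative values only, not measured data).
example_times_s = [0.0, 1.0, 2.0, 3.0]
example_forces_pN = [0.0, 0.2, 1.0, 7.0]
example_rates = loading_rates(example_times_s, example_forces_pN)
print(example_rates)
```

      Because the force rises nonlinearly, the loading rate is a range rather than a single number, which is why the response reports lower and upper bounds.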

      (3) In this study, the CC mismatch is the only change made to the 601 sequence. For readers to truly appreciate its unique effect on unwrapping dynamics as a base pair defect, it would be nice to include the baseline effects of other minor changes to the sequence. For example, how robust is the unwrapping force or dynamics against a single-bp change (e.g., AT to GC) at the three chosen positions?

      Unfortunately, we are unable to perform the suggested unwrapping experiment in a timely manner because the instrument has been disassembled during our recent move. However, we previously performed unwrapping experiments not only as a function of sequence but also as a function of cytosine modification and showed that we can detect even more subtle effects7,8. In addition, please note that we are not claiming that simply changing a base pair at the chosen sites changes the mechanical stability of a nucleosome, so we do not believe the requested experiment is necessary.

      (4) The last section introduces yeast histones. Based on the theme of the paper, I was expecting to see how the effect of CC is or is not preserved with a different histone source. Instead, the experiment only focuses on differences in the unwrapping dynamics. Although the data presented are important, it is not clear how they fit or support the narrative of the paper without the effect of CC.

      We apologize for giving the reviewer a wrong impression. We included the data because we believe that information on how the histone core can determine the translation of DNA mechanics into nucleosome mechanical stability will be of interest to the readers of this manuscript. We now mention explicitly that the observation was made using intact DNA, i.e. no mismatch, in the abstract and elsewhere.

      (5) It is stated that tRNA was excluded in experiments with yeast-expressed nucleosomes. What is the reason for excluding it for yeast nucleosomes? Did the authors rule out the possibility that tRNA causes the measured difference between the two nucleosome types?

      We normally include tRNA because we found that it reduces sticking of beads to the surface over several hours of experiments. In yeast nucleosomes, we found that tRNA causes the nucleosome to disassemble. Therefore, we did not include tRNA in yeast nucleosome experiments. We now mention this in Methods as reproduced below.

      “tRNA, which we normally include to reduce sticking of beads to the surface over the hours of single molecule experiments in a sealed chamber, was excluded in experiments with yeast-expressed nucleosomes because tRNA induced disassembly of nucleosomes assembled using yeast histones.”

      We cannot formally rule out the possibility that tRNA causes the measured difference between Xenopus and yeast nucleosomes. However, we have shown in our previous publication7 that the asymmetric unwrapping in Xenopus nucleosomes was modulated by the DNA sequence. When we swapped the sequence of the inner turn between the two sides, while tRNA was included in all experiments, we observed stochastic unwrapping instead. As part of our response to another reviewer’s comments, we also added the following on the relevant differences between the species in Discussion.

      “The crystal structure of the yeast nucleosome suggests that yeast nucleosome architecture is subtly destabilized in comparison with nucleosomes from higher eukaryotes9. Yeast histone protein sequences are not well conserved relative to vertebrate histones (H2A, 77%; H2B, 73%; H3, 90%; H4, 92% identities), and this divergence likely contributes to differences in nucleosome stability. Substitution of three residues in yeast H3 α3-helix (Q120, K121, K125) very near the nucleosome dyad with corresponding human H3.1/H3.3 residues (QK…K replaced with MP…Q) caused severe growth defects, elevated nuclease sensitivity, reduced nucleosome positioning and nucleosome relocation to preferred locations predicted by DNA sequence alone 10. The yeast histone octamer harboring wild type H3 may be less capable of wrapping DNA over the histone core, leading to reduced resistance to the unwrapping force for the more flexible half of the 601 positioning sequence.”

      Reviewer #2 (Public Review):

      Summary:

      Mismatches occur as a result of DNA polymerase errors, chemical modification of nucleotides, during homologous recombination between near-identical partners, as well as during gene editing on chromosomal DNA. Under some circumstances, such mismatches may be incorporated into nucleosomes but their impact on nucleosome structure and stability is not known. The authors use the well-defined 601 nucleosome positioning sequence to assemble nucleosomes with histones on perfectly matched dsDNA as well as on dsDNA with defined mismatches at three nucleosomal positions. They use the R18, R39, and R56 positions situated in the middle of the outer turn, at the junction between the outer turn and inner turn, and in the middle of the inner turn, respectively. Most experiments are carried out with CC mismatches and Xenopus histones. Unwrapping of the outer DNA turn is monitored by single-molecule FRET in which the Cy3 donor is incorporated on the 68th nucleotide from the 5'-end of the top strand and the Cy5 acceptor is attached to the 7th nucleotide from the 5' end of the bottom strand. Force is applied to the nucleosomal DNA as FRET is monitored to assess nucleosome unwrapping. The results show that a CC mismatch enhances nucleosome mechanical stability. Interestingly, yeast and Xenopus histones show different behaviors in this assay. The authors use FRET to measure the cyclization of the dsDNA substrates to test the hypothesis that mismatches enhance the flexibility of the 601 dsDNA fragment and find that CC, CA, CT, TT, and AA mismatches decrease looping time, whereas GA, GG, and GT mismatches had little to no effect. These effects correlate with the results from DNA buckling assays reported by Euler's group (NAR 41, 2013) using the same mismatches as an orthogonal way to measure DNA kinking.
The authors discuss that substitution rates are higher towards the middle of the nucleosome, suggesting that mismatches/DNA damage at this position are less accessible for repair, consistent with the nucleosome stability results.

      Strengths:

      The single-molecule data show clear and consistent effects of mismatches on nucleosome stability and DNA persistence length.

      Weaknesses:

      It is unclear in the looping assay how the cyclization rate relates to the reported looping time. The biological significance and implications such as the effect on mismatch repair or nucleosome remodelers remain untested. It is unclear whether the mutational pattern reflects the behavior of the different mismatches. Such a correlation could strengthen the argument that the observed effects are relevant for mutagenesis.

      Reviewer #3 (Public Review):

      Summary:

      The mechanical properties of DNA wrapped in nucleosomes affect the stability of nucleosomes and may play a role in the regulation of DNA accessibility in eukaryotes. In this manuscript, Ngo and coworkers study how the stability of a nucleosome is affected by the introduction of a CC mismatched base pair, which has been reported to increase the flexibility of DNA. Previously, the group has used a sophisticated combination of single-molecule FRET and force spectroscopy with an optical trap to show that the more flexible half of a 601 DNA segment provides for more stable wrapping as compared to the other half. Here, it is confirmed with a single-molecule cyclization assay that the introduction of a CC mismatch increases the flexibility of a DNA fragment. Consistent with the previous interpretation, it also increased the unwrapping force for the half of the 601 segment in which the CC mismatch was introduced, as measured with single-molecule FRET and force spectroscopy. Enhanced stability was found up to 56 bp into the nucleosome. The intricate role of mechanical stability of nucleosomes was further investigated by comparing force-induced unwrapping profiles of yeast and Xenopus histones. Intriguingly, asymmetric unwrapping was more pronounced for yeast histones.

      Strengths:

      (1) High-quality single-molecule data.

      (2) Novel mechanism, potentially explaining the increased prominence of mutations near the dyads of nucleosomes.

      (3) A clear mechanistic explanation of how mismatches affect nucleosome stability.

      Weaknesses:

      (1) Disconnect between mismatches in nucleosomes and measurements comparing Xenopus and yeast nucleosome stability.

      (2) Convoluted data in cyclization experiments concerning the phasing of mismatches and biotin site.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Specific comments:

      In Figure 1 legend, "the black diamonds on the DNA bends represent the mismatch position with R18 and R39 on minor grooves and R56 on a major groove." Minor and major grooves should be phrased as histone-facing minor and major grooves.

      We fixed the problem.

      In Materials and Methods, the sentence that describes the stretching rate cites reference 1, which does not seem to be relevant.

      We fixed the problem.

      Reviewer #2 (Recommendations For The Authors):

      (1) In the introduction, the authors should also discuss the context of mismatches occurring during homologous recombination in meiosis or somatic cells in non-allelic recombination between near identical repeats.

      Introduction now has the following.

      “DNA base-base mismatches are generated by nucleotide misincorporation during DNA synthesis, meiotic recombination, somatic recombination between nearly identical repeats, or chemical modification such as hydrolytic deamination of cytosine.”

      (2) Generally, it seems counter-intuitive in terms of biology that mismatch-containing nucleosomes are more stable, as mismatches require repair and/or detection for heteroduplex rejection during recombination. Some discussion of this apparent paradox should be added.

      To address this comment, we added the following to Discussion.

      “The higher frequency of substitutions in the nucleosomal DNA may be attributed to the difficulty of accessing the extra-stable nucleosomes. We also note that even without an enhanced stability, a mismatch within a nucleosome would be more difficult to detect for mismatch repair machineries compared to a mismatch in a non-nucleosomal DNA. Because mismatch repair machineries accompany the replisome, most nascent mismatches may be detected for repair before nucleosome deposition. Therefore, the decrease in accessibility predicted based on our data here may be important only in rare cases where a mismatch is not detected prior to the deposition of a nucleosome on the nascent DNA or in cases where a mismatch is generated via a non-replicative mechanism.”

      (3) The authors discuss that the substitution rate is higher while the indel (insertion and deletion) rate is lower nearer the center of a positioned nucleosome. Are the differences between individual mismatches reported in Figure 6 reflected in the mutagenic profile?

      We cannot currently compare them because the mutagenic profile, even when it is available, is a complex convolution of mismatch generation, mismatch repair and selection. Mismatch generation occurs through several different processes, and how they are affected by nucleosomes and their mismatch type and sequence context is unknown. The mismatch repair process itself depends on mismatch type and sequence context, as recently shown by a high-throughput in vivo study11. And because population genetics does not simply reflect de novo mutation profiles due to selection, comparison between mismatch-induced DNA mechanical changes and mutagenic profiles is further complicated. We added the following to the revision.

      “If and how the mismatch type-dependent DNA mechanics affects the sequence-dependent mismatch repair efficiency in vivo, as recently determined in a high-throughput study in E. coli11, remains to be investigated. Comparison of mismatch-type dependent DNA mechanics to population genetics data is challenging because mutation profiles reflect a combined outcome of mismatch-generation, mismatch repair and selection in addition to other mutational processes.”

      (4) The looping assay should be explained better, especially how the cyclization rate is related to the reported looping time.

      We modified Figure 5 to include examples of looping time determination through fitting of the looped fraction vs time, and added the following to the figure caption.

      “To calculate the looping time, the fraction of looped molecules (high FRET) as a function of time is fitted to an exponential function, $e^{-t/(\text{looping time})}$ (right panel for one run of experiments).”

      Furthermore, we added the following sentence to Results.

      “The rate of loop formation, which is the inverse of looping time determined from an exponential fitting of loop fraction vs time, was used as a measure of apparent DNA flexibility influenced by a mismatch 12,13.”
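
      The relationship between looping time and looping rate described in this response can be illustrated with a short fit. This is a hedged sketch rather than the authors' analysis code: it assumes the fitted quantity follows exactly the quoted form, exp(-t / looping time), uses a log-linear least-squares fit through the origin in place of whatever fitting routine the authors used, and runs on synthetic data. The name `fit_looping_time` is hypothetical.

```python
import math


def fit_looping_time(times, fractions):
    """Estimate tau in fraction(t) = exp(-t / tau).

    Taking logs gives ln(fraction) = -t / tau, a line through the origin,
    so the least-squares slope is sum(t * ln f) / sum(t * t).
    """
    num = 0.0
    den = 0.0
    for t, f in zip(times, fractions):
        if t <= 0 or f <= 0:
            # t = 0 carries no slope information; the log needs f > 0.
            continue
        num += t * math.log(f)
        den += t * t
    if den == 0.0 or num >= 0.0:
        raise ValueError("data do not describe an exponential decay")
    slope = num / den          # equals -1 / tau
    return -1.0 / slope


# Synthetic, noise-free data generated with an assumed looping time.
assumed_tau = 20.0
sample_times = [0.0, 5.0, 10.0, 20.0, 40.0, 80.0]
sample_fractions = [math.exp(-t / assumed_tau) for t in sample_times]

looping_time = fit_looping_time(sample_times, sample_fractions)
looping_rate = 1.0 / looping_time   # rate is the inverse of looping time
print(looping_time, looping_rate)
```

      With real, noisy data a nonlinear fit that weights points by their uncertainty would usually be preferable; the log-linear form is used here only to keep the example dependency-free.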

      Reviewer #3 (Recommendations For The Authors):

      I have some concerns that, when addressed upon revision, would improve the manuscript:

      (1) Page 6 and Supplementary Figure S1C: Though the FRET levels are the same for all nucleosomes, the distribution between the two levels is not. The nucleosomes with CC mismatches appear to have a larger fraction in the low-FRET population. This seems to contradict the higher mechanical stability. A comment on this should clarify it, or make this conundrum explicit.

      Thank you for the comment. The low FRET population also includes the nucleosomes that do not have an active acceptor, the fraction of which varies between preparations. We now note this in the supplementary figure caption.

      (2) It is intriguing that a more stable nucleosome forms after several pulling cycles and it is argued that this might be due to shifting of the nucleosome. This seems reasonable and has important consequences both for the interpretation of the current experimental data and for the general mechanisms involved in nucleosome maintenance and remodeling. It is puzzling though how this would work mechanistically since it only seems to happen when nucleosomes are half-wrapped and when the unwrapped half contains the mismatch. From the previous work of the group and the current manuscript, it seems that shift does not occur in DNA without mismatches (Correct?). Does shifting happen for the 601-R18 and 601-R56 nucleosomes as well?

      The mismatch-containing half is the half that is mechanically less stable in an intact, mismatch-free 601 nucleosome. So indeed, that is the half that is unwrapped in an intact nucleosome. But because the introduction of mismatch makes that half more mechanically stable, it can stay wrapped until higher forces, and the resulting structural distortion may cause the shift although we acknowledge that this interpretation remains speculative. Shifting occurs for all three constructs with a mismatch but not for the intact nucleosome without a mismatch.

      (3) Could the shifting be related to the differences in sub-population distribution observed in Supplementary Figure S1C?

      See our response to comment (1) above.

      (4) The paper would have more impact if the mechanism of possible shifting could be clarified. This can be done experimentally with a fluorescent histone, as suggested in the manuscript. But having a FRET pair on positions in the DNA that would shift to closer proximity upon shifting, either at the ED2 or at the ED1 site will also work, is in line with the current experiments and seems feasible.

      We revised the text as follows in order not to exclude labeling configurations with both fluorophores on the DNA while reporting on the shift. We are also happy to add an appropriate reference if the reviewer can help us identify an existing study that measured dyad position shifts through such a labeling configuration.

      “However, since the FRET values in our DNA construct are not sensitive to the nucleosome position, further experiments with fluorophores conjugated to strategic positions that allow discrimination between different dyad positions14 will be required to test this hypothesis.”

      (5) Figures 5 and 6: To appreciate the quality of the data, state the number of molecules that contributed to the cyclization assay, or better, share a figure of the number of looped molecules as a function of time as supplementary data.

      We added the requested figures to Figure 5 and a new supplementary Figure 2, and added the following to Methods.

      “Approximately 2500 – 3500 molecules were quantified at each timestamp during the experiment, and three independent experiments were performed for each sequence (Supplemental Figure S2).”

      (6) Page 8/9: A control is added to confirm that the phasing of the biotin relative to the end affects the observed cyclization rate. However, the mismatch sites were chosen such that they included 5 bp phase shifts. This convolutes the outcomes, as the direction of flexibility due to the phasing of the mismatch relative to the biotin may also influence the rate. Was this checked?

      We would like to clarify that the phasing of the biotin matters not so much with respect to the end as with respect to the full molecule. Static curvature and poloidal angle associated with the DNA molecule (which is something that is ultimately determined by the full chemical composition of the molecule, including its sequence and the mismatch) could make the molecule prefer a looped configuration where the biotin points towards the “inside” of the molecule. Such a configuration would be sterically unfavoured during the single molecule looping reaction where the biotin is attached to a surface via avidin. However, if the biotin is moved by half the helical repeat (or an odd multiple of half the helical repeat, essentially 16 nt as done in the manuscript), it would now point to the “outside” of the molecule. Therefore, to make sure that the difference between the looping rates of any two DNA constructs (say the 601-RH and 601-R18-RH) is a better reflection of differences in dynamic flexibility, we ensure that the difference persists even when the biotin is moved by an odd multiple of half the helical repeat. We revised the section as follows.

      “For example, moving the location of the biotin tether by half the helical repeat (~ 5 bp) can lead to a large change in cyclization rate15, likely due to the preferred poloidal angle of a given DNA16 that determines whether the biotin is facing towards the inside of the circularized DNA, thereby hindering cyclization due to steric hindrance caused by surface tethering.”
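
      The phasing argument above can be checked with simple arithmetic: moving a tether along the helix rotates it about the helical axis, and a shift close to an odd multiple of half the helical repeat turns it to the opposite face. The sketch below assumes a B-DNA helical repeat of 10.5 bp per turn (consistent with the "~ 5 bp" half repeat quoted above, but not stated in the source) and ignores sequence-dependent twist. The function names and the 45-degree tolerance are hypothetical.

```python
HELICAL_REPEAT_BP = 10.5  # assumed B-DNA helical repeat, bp per turn


def rotation_deg(shift_bp, helical_repeat=HELICAL_REPEAT_BP):
    """Rotation about the helix axis (0-360 degrees) for a shift in bp."""
    return (360.0 * shift_bp / helical_repeat) % 360.0


def faces_opposite_side(shift_bp, tolerance_deg=45.0):
    """True if the shift rotates the tether to roughly the opposite face."""
    return abs(rotation_deg(shift_bp) - 180.0) <= tolerance_deg


for shift in (5, 10, 16, 21):
    print(shift, round(rotation_deg(shift), 1), faces_opposite_side(shift))
```

      Under these assumptions a 16 nt shift is about three half-repeats (roughly 189 degrees), so the biotin ends up on approximately the opposite face, whereas a 21 nt shift (two full turns) leaves its orientation unchanged.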

      (7) Page 9/10: The comparison of yeast vs Xenopus is interesting, albeit a bit disconnected. Since the single-molecule statistics are relatively small, did the nucleosomes show similar bulk FRET distributions, or did they also show a shift in FRET levels?

      We included the data because we believe that information on how the histone core can determine the translation of DNA mechanics into nucleosome mechanical stability will be of interest to the readers of this manuscript. The FRET values were similarly distributed.

      (8) The discussion calls for a more detailed analysis of the structural differences of the histones of the two species to rationalize the observed asymmetry in flexibility dependence: why would yeast nucleosomes be less sensitive to sequence asymmetries?

      We added the following to Discussion to address this comment.

      “The crystal structure of the yeast nucleosome suggests that yeast nucleosome architecture is subtly destabilized in comparison with nucleosomes from higher eukaryotes9. Yeast histone protein sequences are not well conserved relative to vertebrate histones (H2A, 77%; H2B, 73%; H3, 90%; H4, 92% identities), and this divergence likely contributes to differences in nucleosome stability. Substitution of three residues in yeast H3 α3-helix (Q120, K121, K125) very near the nucleosome dyad with corresponding human H3.1/H3.3 residues (QK…K replaced with MP…Q) caused severe growth defects, elevated nuclease sensitivity, reduced nucleosome positioning and nucleosome relocation to preferred locations predicted by DNA sequence alone 10. The yeast histone octamer harboring wild type H3 may be less capable of wrapping DNA over the histone core, leading to reduced resistance to the unwrapping force for the more flexible half of the 601 positioning sequence.”

      (9) It would also be interesting if the increased stability due to the introduction of mismatches observed on Xenopus nucleosomes holds in yeast. Or does the reduced stability remove this effect? This is relevant to substantiate the broad claims in the context of evolution and cancer that are discussed in the manuscript.

      Unfortunately, we are unable to perform the suggested unwrapping experiment in a timely manner because the instrument has been disassembled during our recent move. However, in terms of cancer relevance, our mismatch dependence experiments were performed using vertebrate nucleosomes (Xenopus) so repeating this for yeast nucleosomes would not provide relevant information.

      Minor comments:

      (1) Supplementary Figure S1 misses the label '(C)' in its caption.

      We fixed it.

      (2) The supplementary data sequences for the fleezer measurements contain entries 'R39 construct' and miss the positions of the Cy3 and Cy5 labels; the color code (levels of grey) is not explained.

      We fixed the labeling mistake and added detailed annotations of the highlighted features.

      References

      (1) Park, S., Brandani, G.B., Ha, T. & Bowman, G.D. Bi-directional nucleosome sliding by the Chd1 chromatin remodeler integrates intrinsic sequence-dependent and ATP-dependent nucleosome positioning. Nucleic Acids Res 51, 10326-10343 (2023).

      (2) Fazal, F.M., Meng, C.A., Murakami, K., Kornberg, R.D. & Block, S.M. Real-time observation of the initiation of RNA polymerase II transcription. Nature 525, 274-7 (2015).

      (3) Galburt, E.A., Grill, S.W., Wiedmann, A., Lubkowska, L., Choy, J., Nogales, E., Kashlev, M. & Bustamante, C. Backtracking determines the force sensitivity of RNAP II in a factor-dependent manner. Nature 446, 820-3 (2007).

      (4) Schweikhard, V., Meng, C., Murakami, K., Kaplan, C.D., Kornberg, R.D. & Block, S.M. Transcription factors TFIIF and TFIIS promote transcript elongation by RNA polymerase II by synergistic and independent mechanisms. Proc Natl Acad Sci U S A 111, 6642-7 (2014).

      (5) Kim, J.M., Carcamo, C.C., Jazani, S., Xie, Z., Feng, X.A., Yamadi, M., Poyton, M., Holland, K.L., Grimm, J.B., Lavis, L.D., Ha, T. & Wu, C. Dynamic 1D Search and Processive Nucleosome Translocations by RSC and ISW2 Chromatin Remodelers. bioRxiv (2024).

      (6) Jo, M.H., Meneses, P., Yang, O., Carcamo, C.C., Pangeni, S. & Ha, T. Determination of single-molecule loading rate during mechanotransduction in cell adhesion. Science (in press).

      (7) Ngo, T.T., Zhang, Q., Zhou, R., Yodh, J.G. & Ha, T. Asymmetric unwrapping of nucleosomes under tension directed by DNA local flexibility. Cell 160, 1135-44 (2015).

      (8) Ngo, T.T., Yoo, J., Dai, Q., Zhang, Q., He, C., Aksimentiev, A. & Ha, T. Effects of cytosine modifications on DNA flexibility and nucleosome mechanical stability. Nat Commun 7, 10813 (2016).

      (9) White, C.L., Suto, R.K. & Luger, K. Structure of the yeast nucleosome core particle reveals fundamental changes in internucleosome interactions. EMBO J 20, 5207-18 (2001).

      (10) McBurney, K.L., Leung, A., Choi, J.K., Martin, B.J., Irwin, N.A., Bartke, T., Nelson, C.J. & Howe, L.J. Divergent Residues Within Histone H3 Dictate a Unique Chromatin Structure in Saccharomyces cerevisiae. Genetics 202, 341-9 (2016).

      (11) Kayikcioglu, T., Zarb, J.S., Lin, C.-T., Mohapatra, S., London, J.A., Hansen, K.D., Rishel, R. & Ha, T. Massively parallel single molecule tracking of sequence-dependent DNA mismatch repair in vivo. bioRxiv, 2023.01.08.523062 (2023).

      (12) Jeong, J., Le, T.T. & Kim, H.D. Single-molecule fluorescence studies on DNA looping. Methods 105, 34-43 (2016).

      (13) Jeong, J. & Kim, H.D. Base-Pair Mismatch Can Destabilize Small DNA Loops through Cooperative Kinking. Phys Rev Lett 122, 218101 (2019).

      (14) Blosser, T.R., Yang, J.G., Stone, M.D., Narlikar, G.J. & Zhuang, X. Dynamics of nucleosome remodelling by individual ACF complexes. Nature 462, 1022-7 (2009).

      (15) Basu, A., Bobrovnikov, D.G., Qureshi, Z., Kayikcioglu, T., Ngo, T.T.M., Ranjan, A., Eustermann, S., Cieza, B., Morgan, M.T., Hejna, M., Rube, H.T., Hopfner, K.P., Wolberger, C., Song, J.S. & Ha, T. Measuring DNA mechanics on the genome scale. Nature 589, 462-467 (2021).

      (16) Yoo, J., Park, S., Maffeo, C., Ha, T. & Aksimentiev, A. DNA sequence and methylation prescribe the inside-out conformational dynamics and bending energetics of DNA minicircles. Nucleic Acids Res 49, 11459-11475 (2021).

    2. eLife assessment

      This manuscript reports important data on the stability of nucleosomes with dsDNA substrates containing defined mismatches at three defined nucleosomal positions. Compelling evidence obtained by single-molecule FRET experiments shows that certain mismatches lead to more stable nucleosomes likely because mismatches kink to enhance DNA flexibility leading to higher nucleosome stability. The biological significance and implications of the findings remain unclear.

    3. Reviewer #1 (Public Review):

      In this manuscript, Ngo et al. report a peculiar effect where a single base mismatch (CC) can enhance the mechanical stability of a nucleosome. In previous studies, the same group used a similar state-of-the-art fluorescence-force assay to study the unwrapping dynamics of 601-DNA from the nucleosome and observed that force-induced unwrapping happens more slowly for DNA that is more bendable because of changes in sequence or chemical modification. This manuscript appears to be a sequel to this line of projects, where the effect of CC is tested. The authors confirmed that CC is the most flexible mismatch using the FRET-based cyclization assay and found that unwrapping becomes slower when CC is introduced at three different positions in the 601 sequence. The CC mismatch only affects the local unwrapping dynamics of the outer turn of nucleosomal DNA.

    4. Reviewer #2 (Public Review):

      Mismatches occur as a result of DNA polymerase errors, chemical modification of nucleotides, during homologous recombination between near-identical partners, as well as during gene editing on chromosomal DNA. Under some circumstances, such mismatches may be incorporated into nucleosomes but their impact on nucleosome structure and stability is not known. The authors use the well-defined 601 nucleosome positioning sequence to assemble nucleosomes with histones on perfectly matched dsDNA as well as on dsDNA with defined mismatches at three nucleosomal positions. They use the R18, R39, and R56 positions situated in the middle of the outer turn, at the junction between the outer turn and inner turn, and in the middle of the inner turn, respectively. Most experiments are carried out with CC mismatches and Xenopus histones. Unwrapping of the outer DNA turn is monitored by single-molecule FRET in which the Cy3 donor is incorporated on the 68th nucleotide from the 5'-end of the top strand and the Cy5 acceptor is attached to the 7th nucleotide from the 5' end of the bottom strand. Force is applied to the nucleosomal DNA as FRET is monitored to assess nucleosome unwrapping. The results show that a CC mismatch enhances nucleosome mechanical stability. Interestingly, yeast and Xenopus histones show different behaviors in this assay. The authors use FRET to measure the cyclization of the dsDNA substrates to test the hypothesis that mismatches enhance the flexibility of the 601 dsDNA fragment and find that CC, CA, CT, TT, and AA mismatches decrease looping time, whereas GA, GG, and GT mismatches had little to no effect. These effects correlate with the results from DNA buckling assays reported by Euler's group (NAR 41, 2013) using the same mismatches as an orthogonal way to measure DNA kinking.
The authors discuss that substitution rates are higher towards the middle of the nucleosome, suggesting that mismatches/DNA damage at this position are less accessible for repair, consistent with the nucleosome stability results.

    5. Reviewer #3 (Public Review):

      The mechanical properties of DNA wrapped in nucleosomes affect the stability of nucleosomes and may play a role in the regulation of DNA accessibility in eukaryotes. In this manuscript, Ngo and coworkers study how the stability of a nucleosome is affected by the introduction of a CC mismatched base pair, which has been reported to increase the flexibility of DNA. Previously, the group has used a sophisticated combination of single-molecule FRET and force spectroscopy with an optical trap to show that the more flexible half of a 601 DNA segment provides for more stable wrapping as compared to the other half. Here, it is confirmed with a single-molecule cyclization assay that the introduction of a CC mismatch increases the flexibility of a DNA fragment. Consistent with the previous interpretation, it also increased the unwrapping force for the half of the 601 segment in which the CC mismatch was introduced, as measured with single-molecule FRET and force spectroscopy. Enhanced stability was found up to 56 bp into the nucleosome. The intricate role of mechanical stability of nucleosomes was further investigated by comparing force-induced unwrapping profiles of yeast and Xenopus histones. Intriguingly, asymmetric unwrapping was more pronounced for yeast histones.

      Note from Reviewing Editor:

      The authors addressed the points in the reviews by making appropriate text additions and clarifications.

    1. eLife assessment

      This important study identifies the anti-inflammatory function of PEGylated PDZ peptides that are derived from the ZO-1 protein. Results from cellular and in vivo experiments tracking key inflammatory markers are compelling. Although the mechanism of action remains largely unknown, this study provides a proof of concept for developing novel strategies against acute inflammatory conditions such as sepsis.

    2. Reviewer #1 (Public Review):

      Summary:

      In this study, the authors investigate the potential therapeutic effects of the PEGylated PDZ peptide, derived from the ZO-1 protein, in suppressing LPS-induced systemic inflammation. The authors found that the pretreatment of PEGylated PDZ peptide led to a restoration of tissue injuries in the kidney, liver, and lung, and diminished alterations in biochemical plasma markers induced by LPS. This was accompanied by decreased production of pro-inflammatory cytokines in the plasma and lung BALF of the PDZ-administered mice.

      Strengths:

      - The data presented here is solid and the results provide the groundwork for developing novel anti-inflammatory therapeutic strategies.
      - The authors employ various cells and in vivo models to test the efficacy of the peptide.

      Weaknesses:

      The mechanism of action remains largely unknown.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors investigated systemic inflammation induced by LPS in various tissues and also examined immune cells of the mice using tight junction protein-based PDZ peptide. They explored the mechanism of anti-systemic inflammatory action of PDZ peptides, which enhanced M1/M2 polarization and induced the proliferation of M2 macrophages. Additionally, they proposed a physiological mechanism in which the peptide inhibits the production of ROS in mitochondria, thereby preventing systemic inflammation.

      Strengths:

      In the absence of specific treatments for septic shock or sepsis, the study demonstrating that tight junction-based PDZ peptides inhibit systemic inflammation caused by LPS is highly commendable. Whereas previous research focused on antibiotics, this study proves that modifying parts of intracellular proteins can significantly suppress symptoms caused by septic shock. The authors expanded the study of localized inflammation caused by LPS or PM2.5 in the respiratory tract, to systemic inflammation, presenting promising results. They not only elucidated the physiological mechanism by identifying the transcriptome through RNA sequencing but also demonstrated that PDZ peptides inhibit the production of ROS in mitochondria and prevent mitochondrial fission. This research is highly regarded as an excellent study with potential as a treatment for septic shock or sepsis.

      Weaknesses:

      (1) The authors focused intensively on acute inflammation for a short duration instead of chronic inflammation.

      (2) LPS was used to induce septic shock, but administering actual microbes such as E. coli would yield more accurate results.

      (3) The authors used pegylated peptides, but future research should utilize the optimized peptides to derive the optimal peptide, and further, PK/PD studies are also necessary.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Recommendations For The Authors):

      Beyond my general review, some descriptions of the results and methods could be further clarified, which I've outlined below:

      (1) Page 3, Line 118-120: Based on results from Fig 1A, the authors reported 15 nanobodies neutralized both delta and BA.1 out of the 41 tested. However, I only counted 14. Could the authors double check?

      We recounted the nanobodies and confirmed there are 15 as follows:

      (1) RBD-15

      (2) RBD-22

      (3) RBD-24

      (4) RBD-9S1-4

      (5) S1-35

      (6) RBD-6

      (7) RBD-5

      (8) RBD-21

      (9) RBD-16

      (10) S1-46

      (11) S1-49dimer

      (12) S2-10dimer

      (13) S2-3

      (14) S2-62

      (2) Page 5, Lines 134-135: the authors described that the heatmap reflects the neutralizing strength of the representative nanobodies from each group. For groups where multiple nanobodies were selected for visualization, how was the neutralization strength calculated? Was the IC50 averaged first before being converted into the neutralization strength?

      This has been made clear in the legend for Fig. 1 as follows “For groups with multiple nanobodies, the average -log10 (IC50) is first calculated for the nanobodies within that group, then normalized to a neutralization score within the 0–100 range using the min and max average -log10 (IC50) for that group. A higher score indicates more potent neutralization of the variant relative to the wild type.”
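
      The two-step calculation described in the legend can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `neutralization_scores`, the input layout, and all IC50 values are hypothetical, and the handling of a group whose averages are all identical is an assumption the legend does not cover.

```python
import math

def neutralization_scores(ic50_by_variant):
    """Return a 0-100 neutralization score per variant for one epitope group.

    ic50_by_variant maps a variant name to a list of IC50 values (molar),
    one value per nanobody in the group. The values used below are made up.
    """
    # Step 1: average -log10(IC50) over the nanobodies in the group.
    avg = {
        variant: sum(-math.log10(v) for v in values) / len(values)
        for variant, values in ic50_by_variant.items()
    }
    lo, hi = min(avg.values()), max(avg.values())
    if hi == lo:
        # Assumption: with no spread between variants, assign the top score.
        return {variant: 100.0 for variant in avg}
    # Step 2: min-max normalize the group averages to the 0-100 range.
    return {variant: 100.0 * (a - lo) / (hi - lo) for variant, a in avg.items()}

# Hypothetical group with two nanobodies measured against three variants.
group = {
    "wild type": [1e-9, 1e-8],
    "delta": [1e-8, 1e-7],
    "BA.1": [1e-7, 1e-6],
}
scores = neutralization_scores(group)
```

      Averaging on the -log10 scale before normalizing keeps a single very weak nanobody from dominating the group mean, which would happen if raw IC50 values were averaged instead.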

      (3) Page 5, Lines 138-139: What was the authors' rationale for selecting certain nanobodies over others for structural modeling and visualizing the neutralization heatmap in Fig 1B? Does it introduce bias to the neutralizing epitope map on the spike protein?

      We only focused on nanobodies for which we had enough epitope mapping data to unambiguously generate docked nanobody-spike models, as explained in our previous study (Mast et al., eLife 2021). When multiple nanobodies within the same group had sufficient epitope mapping data available, we selected only representative candidates that had better binding affinity and/or neutralization potency. As epitope mapping via escape mutants relied largely on random point mutagenesis of Spike, there should be little introduced bias.

      Overall, groups I-VII cover an exhaustive set of target areas on the RBD (including the lone glycan site in Group-II), while groups VII and IX are representative areas on NTD and S2. Using group-average IC50s and suitable normalization as mentioned in point 3 above further prevents potential biases due to unequal numbers of Nbs modeled from each group.

      We have modified the text with the following:

      “For computational epitope modeling, we selected nanobody candidates using a series of experimentally obtained structural restraints, as described in Mast, Fridy et al. 2021.”

      (4) Page 5, Lines 161-167: It would be good to include Fig S1 as a main figure as it places the epitope landscape of nanobodies being investigated in this manuscript into the broader context of clinically approved monoclonal antibody therapeutics for COVID-19.

      We have amended the Figures to accommodate the reviewer's suggestion. Figure S1 is now Figure 2.

      (5) Page 6, Lines 173-175: The neutralization breadth for S1-46 is quite encouraging. Any speculations on why this particular nanobody is so broadly targeting? Any additional thoughts on why its high binding affinity (nM) did not translate into strong neutralization (as it is in the 0.1-1 uM range)?

      S1-46 binds a region on spike that is conserved across all variants observed to date. Its epitope is difficult to access unless the RBD is in the up conformation, which may explain why monoclonal antibodies rarely bind. We state this in the text as follows:

      “S1-46 binds a region on spike that is conserved across all variants to date, but which may be relatively inaccessible and is not targeted by any of the mAbs that previously received EUA by the FDA (Cox, Peacock et al. 2023).”

      Relating neutralization activity to binding activity requires more insight into the mechanisms of binding and activity. Nonetheless, we are also encouraged by S1-46’s breadth and numerous avenues can be pursued to greatly improve its neutralizing activity (e.g. synergistic combinations).

      (6) Page 6, Lines 173-175: For the remaining two nanobodies S1-31 and S1-RBD-11 in group VII, the target epitopes on the spike proteins of either delta or BA.1 do not seem to bear any mutations, at least based on the mutation maps in Fig 1B. Yet their neutralizing capacities against delta and BA.1 variants were abolished. Do the authors have any idea about what is going on here?

      For group VII, only the epitope of S1-46 was mapped whereas S1-31 and S1-RBD-11 were assigned to group VII based on our lower resolution binning experiments. Thus, without knowing precisely where they bind, we can make only limited conclusions at this time. In the absence of supporting structural information, we speculate that the epitopes of RBD-11 and S1-31 may be in a region that overlaps with or is in close proximity to a mutation that could affect the binding of the nanobody enough to result in loss of neutralizing ability.

      (7) Page 7, Line 195-200: Please provide PRNT50 or logPRNT50 for the five nanobodies selected for BA.4/5 PRNT assay.

      We have added this suggested information. Additionally, a supporting table (Table S1) is now provided.

      (8) Page 8, Lines 223-224: Similar to comment 3, what was the rationale here for choosing certain nanobodies over others for structural modeling and visualizing the binding heatmap in Fig 2B?

      The set of nanobodies chosen for structural modeling and visualization of neutralization data is identical to the set of anti-RBD nanobodies chosen for binding.

      (9) Page 11, Lines 326-328: Can the authors include mutation maps as part of Fig 4C to show the mutation distributions on the XBB/BQ.1/BQ/1.1 spikes?

      We have updated and added a supplemental figure to accompany Fig. 5 (called “supplement for Figure 5”) showing the mutation maps.

      (10) Page 14, Line 409-418: This paragraph is well considered. Given the large number of nanobodies assessed in this manuscript, it would be helpful if the authors could highlight some candidate nanobodies as lead candidates for further optimization.

      While our intention in this manuscript was not to provide targeted recommendations for lead candidates, but rather to reiterate the collective potential of a Nb pool originally targeted towards the 2019 Wuhan variant, the reviewer's point is interesting. We speculate that any of the Nbs we have demonstrated to show pan-VoC activity would be prime candidates for further optimization.

      We have added a statement to this effect as follows: “We propose that any of the Nbs we have demonstrated to show pan-VoC activity, would be prime candidates for further optimization.”

      Reviewer #2 (Recommendations For The Authors):

      Major concerns:

      (1) The main message of the article is the prediction that nanobodies that retain binding to the different SARS-CoV-2 variants including early Omicron strains will retain binding and neutralization against currently circulating strains such as XBB and BQ. However, no evidence either via modeling or experimental testing has been provided for that prediction. The study will benefit from mapping amino acid mutations in RBD of XBB and BQ lineages compared to BA.4/5 and demonstrating via computational docking that epitopes of the five nanobodies that retain binding to BA.4/5 RBD are not affected. For example, the crystal structure of XBB.1 RBD PDB:8OIV is available. Binding/neutralization experiment with currently circulating SARS-CoV-2 strains would still be the gold standard test given the fact that only five out of 41 nanobodies retained binding and neutralization to BA.4/5 lineage. Loss of neutralization ability against BA.4/5 without a significant decrease in binding affinity for nanobodies S1-46 and S1-RBD-22 further indicates that neutralization of XBB and BQ lineage should be performed.

      The docking protocol used to predict the spike epitopes uses a C-alpha resolution to represent protein residues, and is data-driven, i.e. it assumes that binding happens in the first place, and then utilizes experimentally obtained structural restraints. So, concluding possible binding from such a docking protocol alone would be noisy. In our revised manuscript we have a new Figure 3B, which shows epitopes of 4 out of the 5 pan-VoC nanobodies, i.e. S1-RBD-{9, 22, 40} and S1-46 mapped to the RBD structures of XBB.1 (8IOU) and BQ.1.1 (8FXC), and we have updated Figure 4 with a supplemental showing the mutation maps.

      (2) Described nanobodies are positioned as very potent neutralizers of SARS-CoV-2. However, they are much less potent in neutralization of ancestral strain as well as early VOCs compared to the mAbs that were approved for COVID-19 treatment. For example, IC50 for casirivimab and imdevimab are 37.4 pM and 42.1 pM, respectively. That is about 27-fold more than IC50 for the most potent nanobody reported in the article, S1-RBD-15.

      This comparison is fraught for several reasons. 1. Experimental differences in pseudovirus assay systems usually result in significant differences in reported IC50s, as IC50 is not an absolute measure, or ultimately comparable to clinical IC50 values. For this reason, in our original publication (Mast et al., 2021) we tested other nanobodies in our experimental set-up as benchmarks. 2. A typical monoclonal has two binding sites with a large structural Fc linker that is combined ~10 times the size of a nanobody. In a therapeutic setting where monoclonal therapy is provided in g per kg of patient body weight, there is a 5-fold excess of Nb binding to antibody binding capacity. 3. We have previously shown that dimerizing our nanobodies (to produce two antigen binding sites) can dramatically increase potency over 100 fold (Mast et al., 2021).

      In order to make this even clearer in the manuscript, we have added the following: “We note that IC50s are not directly comparable across different experimental set-ups because measured values are highly dependent on the experimental conditions. For this reason, we included other published nanobodies as benchmarks in our original publication and have subsequently maintained standard experimental conditions (Mast, Fridy et al. 2021)”.

      (3) Figure 1A. If each dot represents an independent measurement of the same nanobody, IC50 variation seems too high. For some nanobodies it ranges over almost an order of magnitude, e.g., S1-RBD-24, S1-RBD-46, S2-3. Why is that?

      We have deliberately explored the full range of effects that could contribute to experimental variability in our pseudovirus assay, using different batches of nanobody and pseudovirus in each replicate to provide as impartial and comprehensive analysis as possible. While the activity of some nanobodies is remarkably stable from batch to batch, others show the variation noticed by the Reviewer, hence why we performed multiple replicates to define the average IC50 value for our nanobodies.

      (4) The drop in IC50 for BA.1 neutralization is about one log for the majority of tested nanobodies. This should be outlined in the text. For example, for the most potent neutralizer, S1-RBD-15, the drop in IC50 for BA.1 is about 100-fold compared to IC50 for the Delta and Wuhan strains. It is important to note that out of the 9 nanobodies for which the drop in neutralizing capacity against BA.1 and Delta variants is less than one log, 2 have epitopes in the S2 domain of the SARS-CoV-2 spike. Resistance of mAbs targeting the S2 part of the spike has been extensively described in the literature as being due to the highly conserved structure of this region that facilitates membrane fusion. Presented data demonstrate that >80% of the nanobody repertoire is affected by mutations on spike protein. Additionally, it can be helpful for readers if the fold-change in IC50 between Wuhan, Delta, and BA.1 is presented in the text or added to Figure 1 or a table.

      We agree with the Reviewer and to make this more explicit we have made the following change: “In comparison, groups I, I/II, I/IV, V, VII, VIII and the anti-S2 nanobodies contained the majority of omicron BA.1 neutralizers, though here the neutralization potency of many nanobodies was generally decreased tenfold compared to wild-type (emphasis added).”

      (5) The authors should either present the results of the formal correlation analysis or avoid using misleading verbiage such as: "the decrease in neutralization potency largely correlates with the accumulation of omicron BA.1 specific mutations throughout the RBD" or "significant decrease in binding affinity correlated to decreases neutralization potency".

      We thank the Reviewer for this constructive feedback. To address this question, we have performed a correlation analysis using Pearson and Spearman's methods to quantitatively assess the relationship between nanobody neutralization potency (IC50) and binding affinity (KD) across SARS-CoV-2 variants, including the wildtype, delta, and omicron BA.1 variants. Our results indicate a statistically significant correlation for the delta variant (Pearson's PCC: 0.71, p-value: 0.01; Spearman's rho: 0.63, p-value: 0.07), supporting our statement regarding the correlation between decreased neutralization potency and reduced binding affinity for this variant. However, for the wildtype and omicron BA.1 variants, the correlations were not statistically significant (wildtype Pearson's: 0.10, p-value: 0.70; omicron BA.1 Pearson's: 0.27, p-value: 0.31), which we acknowledge does not fully align with the verbiage used in the manuscript. Therefore, we have revised the manuscript to present the correlation analysis data accurately and ensure the discussion is reflective of the statistical evidence as follows:

      “SPR binding assessments to the spike S1 domain or RBD of delta revealed a pattern: nanobodies maintaining binding affinity generally also neutralized the virus with a statistically significant correlation between binding affinity and neutralization efficacy (Pearson's Correlation Coefficient: 0.71, p-value: 0.01; Spearman's rho: 0.63, p-value: 0.07). However, this correlation was not statistically significant for omicron BA.1 (Pearson's Correlation Coefficient: 0.27, p-value: 0.31) (Fig. 3A, Table 1). Notably, while some nanobodies bound to the variants, they did not consistently neutralize them, suggesting additional factors influence neutralization beyond mere binding.”
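
      For readers who want to reproduce this kind of analysis, the two coefficients can be computed as sketched below. This is a minimal pure-Python illustration under stated assumptions: the helper names `pearson`, `ranks`, and `spearman` are introduced here, the input values are hypothetical rather than the manuscript's data, and p-values are not computed (a statistics package would normally supply them).

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rho: the Pearson coefficient of the rank-transformed data."""
    return pearson(ranks(x), ranks(y))

# Hypothetical -log10(KD) and -log10(IC50) values for five nanobodies.
neg_log_kd = [9.1, 8.4, 8.9, 7.6, 8.0]
neg_log_ic50 = [8.2, 7.1, 7.9, 6.5, 7.3]
r = pearson(neg_log_kd, neg_log_ic50)
rho = spearman(neg_log_kd, neg_log_ic50)
```

      Pearson's coefficient measures linear association on the chosen scale, whereas Spearman's rho depends only on rank order, so the two can disagree when the relationship between affinity and potency is monotonic but not linear.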

      (6) Figure 3 shows approximated curves for live virus neutralization assay with quite a broad 90% CI. It will be helpful to present, at least, in supplementary, primary data for live-virus neutralization that were used to perform non-linear regression.

      We have added the reviewer’s suggestion.

      (7) It is not clear what are the "variant-specific nanobody groups" exactly? A definition/description of the term is not provided. If the nanobody library was generated with the Wuhan strain, how did strain-specific nanobodies that bind/neutralize only Delta, BA.1 or BA.4/5 appear in the repertoire and were isolated? This statement also contradicts data in Table 4 where all nanobodies listed bind and neutralize Wuhan strain.

      We agree with the reviewer. All nanobodies tested bind/neutralize the Wuhan strain as they were selected from our original repertoire of 116 nanobodies (Mast, et al., 2021). To clarify, variant-specific nanobodies are nanobodies that bind only one variant that arose from the original Wuhan strain. They were categorized into variant-specific groups based on whether they were able to bind each variant (other than Wuhan).

      We have thus added to the manuscript, “we define variant-specific nanobodies as nanobodies that bind a single additional variant alongside the original Wuhan strain...”

      (8) Describing the categorization of nanobody epitope groups presented in Figure 4, the authors state that binding to Wuhan, Delta, BA/1, and BA.4/5 predicts that these nanobodies will be "effective binders against current circulating strains of the virus including XBB and BQ lineages"? How exactly is this conclusion corollary to the data shown?

      The epitopes of XBB and BQ.1 are not divergent enough within the regions we propose the nanobodies to bind, to suggest that nanobodies that bind in those regions will lose binding ability. We hypothesize that the region at which these nanobodies bind represents regions on spike that are vulnerable to our specified nanobodies in Fig. 4. We have generated a new Fig. 3B and added a supporting figure for Fig. 4 to address this.

      (9) Figures 4C and 6 describe how the nanobodies will retain binding to currently circulating strains of XBB lineage. However, epitopes are mapped on the same Wuhan, Delta, BA.1, and BA.4/5 virus strains. The predicted binding of nanobodies to XBB lineage RBD is not actually shown in Figure 6. It is clear from the figure that the nanobody binding footprint (red area) decreases with antigenic distance in every spike projection from Wuhan through the BA.4/5 strain. It is unclear how this indicates that nanobodies will remain active against even more distant XBB, BQ, EU, and CH strains accumulating more mutations in spike protein.

      We have added the following to the manuscript to clarify: “Strikingly, we have in our cohort 8 nanobodies able to bind delta, and the omicron lineages BA.1/BA.4/BA.5/XBB/BQ.1.1 (Fig. 5B). We further predict these 8 nanobodies will be effective binders against current circulating strains of the virus including omicron EG.5 and HV.1 as the epitope regions (or predicted epitopes) of these nanobodies do not vary significantly from omicron lineages XBB and BQ.1.1 (Fig. 5C and Supplement to Fig. 5).”

      (10) Despite major advances in the development of nanobodies as therapeutic molecules there are only a few nanobody-based drugs that have so far been approved for clinical use and all of them are nanobody fusions to immunoglobulin Fc fragment. It is dictated by the small size of the nanobody itself, 15 kDa molecule, that leads to rapid kidney clearance within hours post-injection, and also by the necessity of having antibody effector functions allowing for example killing of malignant cells. It is hard to predict how each individual nanobody will tolerate multimerization and if it will still retain binding ability as its size dramatically increases. It should be noted that IC50 for BA.4/5 is in the submicromolar range for the 5 nanobodies retaining neutralization of this strain. From a therapeutic perspective, this is quite a high IC50 that dictates a high dosage to achieve a therapeutic effect. Furthermore, it can be expected that additional mutations in the SARS-CoV-2 spike will further affect binding affinity and therefore reduce the neutralization ability of these nanobodies resulting in even higher doses required to achieve therapeutic effect. Therefore, authors should discuss the limitations of the nanobody approach as a therapeutic intervention more granularly.

      While Fc fusions are not strictly required for clinical use (for instance Caplacizumab is not an Fc fusion, being a multimer containing an albumin-binding nanobody), we agree that reformulation would indeed be required to optimize pharmacokinetics for eventual clinical use. Increased valency through multimerization is in fact one of several strategies, which also include synergistic combinations, for significantly enhancing effective IC50. Preclinical nanobody engineering is not within the scope of this paper, but we acknowledge this challenge.

      Minor points:

      (1) Table S1 is missing.

      This is an .xlsx file uploaded as Supplementary File 3. Labeled now as “Figure 6–Source data 2. Neutralization data from synergy experiment”.

      (2) Because Table 1 summarizes all neutralization and binding data, it will be helpful to refer to it while describing data presented in Figure 1.

      This has been added to the revised manuscript.

      (3) Live SARS-CoV-2 PRNT is not described in Materials and Methods.

      This has been added to the revised manuscript.

    2. eLife assessment

      This study presents important insights on the impact of SARS-CoV-2 variants on the binding and neutralization of a small library of nanobodies. The authors should be applauded for their comprehensive in vitro and in silico analyses of nanobody targeting of SARS-CoV-2 variants. The evidence supporting the claims of the authors is now convincing. This work will be of great interest to researchers in the fields of antibody/nanobody engineering and SARS-CoV-2 therapeutics.

    3. Reviewer #1 (Public Review):

      Summary:

      In this manuscript, Ketaren, Mast, Fridy et al. assessed the ability of a previously generated llama nanobody library (Mast, Fridy et al. 2021) to bind and neutralize SARS-CoV-2 delta and omicron variants. The authors identified multiple nanobodies that retain neutralizing and/or binding capacity against delta, BA.1 and BA.4/5. Nanobody epitope mapping on spike proteins using structural modeling revealed possible mechanisms of immune evasion by viral variants as well as mechanisms of cross-variant neutralization by nanobodies. The authors additionally identified two nanobody pairs involving non-neutralizing nanobodies that exhibited synergy in neutralization against the delta variant. These results enabled the refinement of target epitopes of the nanobody repertoire and the discovery of several pan-variant nanobodies for further preclinical development.

      Strengths:

      Overall, this study is well executed and provides a valuable framework for assessing the impact of emerging SARS-CoV-2 variants on nanobodies using a combination of in vitro biochemical and cellular assays as well as computational approaches. There are interesting insights generated from the epitope mapping analyses, which offer possible explanations for how delta and omicron variants escape nanobody responses, as well as how some nanobodies exhibit cross-variant neutralization capacity. These analyses laid out a clear path forward for optimizing these promising next-gen therapeutics, particularly in the face of rapidly emerging SARS-CoV-2 variants. This work will be of interest to researchers in the fields of antibody/nanobody engineering, SARS-CoV-2 therapeutics, and host-virus interaction.

      Weaknesses:

      A main weakness of the study is that the efficacy statement is not thoroughly supported. While the authors comprehensively characterized the neutralizing ability of nanobodies in vitro, there is no animal data involving mice or hamsters to demonstrate the real protective efficacy in vivo. Yet, in the title and throughout the manuscript, the authors repeatedly used phrases like "retains efficacy" or "remains efficacious" to describe the nanobodies' neutralization or binding capacities. This claim is not well supported by the data and underestimates the impact of variants on the nanobodies, especially the omicron sublineages. For example, the authors showed that S1-RBD-15 had a ~100-fold reduction in neutralization titer against Omicron, with an IC50 at around 1 uM. This is much higher than the IC50 value of a typical anti-ancestral RBD nanobody reported in the previous study (Mast, Fridy et al. 2021). In fact, the authors themselves ascribe nanobodies with an IC50 above 1 uM as weak neutralizers. And there were many in the range of 0.1-1 uM. Furthermore, many nanobodies selected for affinity measurement against BA.4/5 had no detectable binding. Without providing in vivo protection data or including monoclonal antibodies that are known to be efficacious against variants in the in vitro assays as a benchmark, it is difficult to evaluate the efficacy just with the IC50 values.

      Comments post revision:

      The authors are to be commended for their comprehensive response to the referees' comments. In the revised manuscript, the authors made extensive changes throughout the texts and added new figures that greatly improved their clarity. While the manuscript is still limited in solely relying on in vitro data for efficacy assessment, it nicely demonstrates how the combination of experimental and computational techniques could lead to the discovery of broadly neutralizing nanobody candidates for further lead optimization.

    4. Reviewer #2 (Public Review):

      Summary:

      Interest in using nanobodies for therapeutic interventions in infectious diseases is growing due to their ability to bind hidden or cryptic epitopes that are inaccessible to conventional immunoglobulins. In the presented study, the authors set out to characterize nanobodies derived from the library produced earlier against the Wuhan strain of SARS-CoV-2, map their epitopes on the SARS-CoV-2 spike protein, and demonstrate that some nanobodies retain binding and even neutralization against antigenically distant, newly emerging Variants of Concern (VOCs).

      Strengths:

      The authors demonstrate that some nanobodies, despite being obtained against the ancestral virus strain, retain high-affinity binding to antigenically distant SARS-CoV-2 strains even though the majority of the repertoire loses binding. Despite being limited to only two nanobody combinations, the demonstration of synergy in virus neutralization between nanobodies targeting different epitopes is compelling. The ability of nanobodies to bind emerging virus strains has been demonstrated and the possible effect of mutations within epitopes has been thoroughly discussed.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Review:

      Reviewer #1:

      Summary:

      The Roco proteins are a family of GTPases characterized by the conserved presence of an ROC-COR tandem domain. How GTP binding alters the structure and activity of Roco proteins remains unclear. In this study, Galicia C et al. took advantage of conformation-specific nanobodies to trap CtRoco, a bacterial Roco, in an active monomeric state and determined its high-resolution structure by cryo-EM. This study, in combination with the previous inactive dimeric CtRoco, revealed the molecular basis of CtRoco activation through GTP-binding and dimer-to-monomer transition.

      Strengths:

      The reviewer is impressed by the authors' deep understanding of the CtRoco protein. Capturing Roco proteins in a GTP-bound state is a major breakthrough in the mechanistic understanding of the activation mechanism of Roco proteins and shows similarity with the activation mechanism of LRRK2, a key molecule in Parkinson's disease. Furthermore, the methodology the authors used in this manuscript - using conformation-specific nanobodies to trap the active conformation, which is otherwise flexible and resistant to single-particle average - is highly valuable and inspiring.

      Weakness:

      Though written with good clarity, the paper will benefit from some clarifications.

      (1) The angular distribution of particles for the 3D reconstructions should be provided (Figure 1 - Sup. 1 & Sup. 2).

      Figure 1 – Figure supplements 1 and 2 now contain particle distribution plots.

      (2) The B-factors for protein and ligand of the model, Map sharpening factor, and molprobity score should be provided (Table 1).

      Table 1 now contains B-factors and molprobity scores.

      The map used to interpret the model was post-processed by density modification, and therefore no data concerning sharpening factors are provided in the output.

      (3) A supplemental Figure to Figure 2B, illustrating how the α0-helix interacts with COR-A&LRR before and after GTP binding in atomic detail, will be helpful for the readers to understand the critical role of the α0-helix during CtRoco activation.

      This is now illustrated in the new Figure 2 – Figure Supplement 1.

      (4) For the following statement, "On the other hand, only relatively small changes are observed in the orientation of the Roc α3 helix. This helix, which was previously suggested to be an important element in the activation of LRRK2 (Kalogeropulou et al., 2022), is located at the interface of the Roc and CORB domains and harbors the residues H554 and Y558, orthologous to the LRRK2 PD mutation sites N1337 and R1441, respectively." It is not surprising that the α3-helix of the ROC domain only has small changes when the ROC domain is aligned (Figure 2E). However, in the study by Zhu et al (DOI: 10.1126/science.adi9926), it was shown that the α3-helix has a "see-saw" motion when the COR-B domain is aligned. Is this motion conserved in CtRoco from inactive to active state?

      We indeed describe the conformational changes from the perspective of the Roc domain. When using the COR-B domain for structural alignment, a rotational movement of Roc (including a “seesaw”-like movement of the α3-helix around His554) with respect to COR-B is correspondingly observed.

      This is now added to Figure 2E. Additionally, the text was adapted to:

      “Interestingly, this rotational movement of CORB seems to use the H554-Y558-Y804 triad on the interface of Roc and CORB as a pivot point (Figure 2E). Mutation of either of the corresponding residues in LRRK2 (N1437, R1441, Y1699, respectively) is associated with PD and leads to LRRK2 activation. Residues H554 and Y558 are located on the Roc α3 helix, which was previously suggested to be an important element in the activation of LRRK2 (Kalogeropulou et al., 2022). Indeed, while the orientation of the α3 helix with respect to the rest of the Roc domain only undergoes small changes upon GTPγS binding, it can be observed that this helix undergoes a “seesaw-like” movement with respect to the CORB domain. A similar rearrangement was previously also observed for Rab29-mediated activation of human LRRK2 (Störmer et al., 2023; Zhu et al., 2022).”

      (5) A supplemental figure showing the positions of and distances between NbRoco1 K91 and Roc K443, K583, and K611 would help the following statement. "Also multiple crosslinks between the Nbs and CtRoco, as well as between both nanobodies were found. ... NbRoco1-K69 also forms crosslinks with two lysines within the Roc domain (K583 and K611), and NbRoco1-K91 is crosslinked to K583".

      A figure displaying these crosslinks is now provided as Figure 4–figure supplement 1. However, in interpreting these crosslinks it should be taken into consideration that the additive length of the DSSO spacer and the lysine side chains leads to a theoretical upper limit of ∼26 Å for the distance between the α carbon atoms of cross-linked lysines (and even a cut-off distance of 35 Å when taking into account protein dynamics).
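
      The distance criterion described above can be sketched as a simple check. This is a minimal illustration, not the authors' analysis: the function names and the example coordinates are hypothetical, and only the two thresholds (∼26 Å and 35 Å between the α carbon atoms) are taken from the response.

```python
import math

# Cα–Cα distance limits quoted in the response, in Å.
DSSO_STRICT_LIMIT = 26.0   # DSSO spacer plus two lysine side chains
DSSO_DYNAMIC_LIMIT = 35.0  # cut-off that allows for protein dynamics


def ca_distance(a, b):
    """Euclidean distance in Å between two (x, y, z) Cα coordinates."""
    return math.dist(a, b)


def classify_crosslink(distance):
    """Label a crosslink by how its Cα–Cα distance compares to the limits."""
    if distance <= DSSO_STRICT_LIMIT:
        return "within spacer limit"
    if distance <= DSSO_DYNAMIC_LIMIT:
        return "plausible with dynamics"
    return "violated"


# Hypothetical coordinates for illustration only; they are NOT taken
# from the deposited CtRoco model.
example_pairs = {
    "pair_A": ((0.0, 0.0, 0.0), (10.0, 10.0, 10.0)),
    "pair_B": ((0.0, 0.0, 0.0), (30.0, 0.0, 0.0)),
    "pair_C": ((0.0, 0.0, 0.0), (0.0, 40.0, 0.0)),
}

for name, (lys1, lys2) in example_pairs.items():
    d = ca_distance(lys1, lys2)
    print(f"{name}: {d:.1f} Å, {classify_crosslink(d)}")
```

      In practice the coordinates would be read from the model file for each crosslinked lysine pair; a pair in the intermediate band is not necessarily inconsistent with the structure, which is the point made in the response.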

      (6) It would be informative to show the position of CtRoco-L487 in the NF and GTP-bound state and comment on why this mutation favors GTP hydrolysis.

      L487 is located in Switch 1, which is a critical region for nucleotide binding and hydrolysis. Unfortunately, most probably due to flexibility, the Switch 1 region could not be entirely modeled in either nucleotide state. Since L487 is located on the edge of the interpretable portion of Switch 1 in both structures (see Author response image 1 below), any interpretation regarding the role of this residue would be highly speculative.

      Author response image 1.

      The following text was added to the Results section:

      “Also the Switch 1 loop could not be fully modeled in our structure, presumably indicating some flexibility in this region despite the presence of a GTP analogue. Interestingly, the Switch 1 loop harbors the site of the PD-analogous L487A mutation that leads to a stabilization of the CtRoco dimer with a concomitant decrease in GTPase activity (Deyaert et al., 2019). Unfortunately, an exact interpretation of this effect of the L487A mutation is hampered by the lack of a well resolved Switch 1 loop.”

      Reviewer #2:

      Summary

      The manuscript by Galicia et al describes the structure of the bacterial GTPyS-bound CtRoco protein in the presence of nanobodies. The major relevance of this study is in the fact that the CtRoco protein is a homolog of the human LRRK2 protein with mutations that are associated with Parkinson's disease. The structure and activation mechanisms of these proteins are very complex and not well understood. Especially lacking is a structure of the protein in the GTP-bound state. Previously the authors have shown that two conformational nanobodies can be used to bring/stabilize the protein in a monomer-GTPyS-bound state. In this manuscript, the authors use these nanobodies to obtain the GTPyS-bound structure and importantly discuss their results in the context of the mammalian LRRK2 activation mechanism and mutations leading to Parkinson's disease. The work is well performed and clearly described. In general, the conclusions on the structure are reasonable and well-discussed in the context of the LRRK2 activation mechanism.

      Strengths:

      The strong points are the innovative use of nanobodies to stabilize the otherwise flexible protein and the new GTPyS-bound structure that helps enormously in understanding the activation cycle of these proteins.

      Weakness:

      The strong point of the use of nanobodies is also a potential weak point; these nanobodies may have induced some conformational changes in a part of the protein that will not be present in a GTPyS-bound protein in the absence of nanobodies.

      Two major points need further attention.

      (1) Several parts of the protein are very flexible during the monomer-dimer activity cycle. This flexibility is crucial for protein function, but obviously hampers structure resolution. Forced experiments to reduce flexibility may allow better structure resolution, but at the same time may impede the activation cycle. Therefore, careful experiments and interpretation are critical for this type of work. This especially relates to the influence of the nanobodies on the structure, which may not occur during the "normal" monomer-dimer activation cycle in the absence of the nanobodies (see also point 2). So what is the evidence that the nanobody-bound GTPyS-bound state is biochemically a reliable representative of the "normal" GTP-bound state in the absence of nanobodies, and that the obtained structure can therefore be confidently used to interpret the activation mechanism as done in the manuscript?

      See below for an answer to remark 1 and 2.

      (2) The obtained structure with two nanobodies reveals that the nanobodies NbRoco1 and NbRoco2 bind to parts of the protein in a way that makes a dimer impossible: respectively, to the α0-helix of the linker between Roc-COR and LRR, and to the cavity of the LRR that in the dimer binds to the dimerizing domain CORB. It is likely that the open monomer GTP-bound structure is recognized by the nanobodies in the camelid, suggesting that overall the open monomer structure is a true GTP-bound state. However, it is also likely that the binding energy of the nanobody is used to stabilize the monomer structure. It is not automatically obvious that in the details the obtained nanobody-Roco-GTPyS structure will be identical to the "normal" Roco-GTPyS structure. What is the influence of nanobody binding on the conformation of the domains where they bind? The binding energy may be used to stabilize a conformation that is not present in the absence of the nanobody. For instance, NbRoco1 binds to the α0-helix of the linker; what is here the "normal" active state of the Roco protein, and is e.g. the angle between Roc-COR and LRR also rotated by 135 degrees? Furthermore, nanobody NbRoco2 in the LRR domain is expected to stabilize the LRR domain; it may allow a position of the LRR domain relative to the rest of the protein that is not present without a nanobody in the LRR domain. I am convinced that the observed open structure is a correct representation of the active state, but many important details have to be supported by e.g. their CX-MS experiments, and in the end probably need confirmation by more structures of other active Roco proteins or confirmation by a more dynamic sampling of the active states by e.g. molecular dynamics or NMR.

      Recently, nanobodies have increasingly been used successfully to obtain structural insights into protein conformational states (reviewed in Uchański et al, Curr. Opin. Struc. Biol. 2020). As reviewer #2 points out, the concern is sometimes raised that antibodies could distort a protein into non-native conformations. Here, it is important to note that the nanobodies were raised by immunizing a llama with the fully native CtRoco protein bound to a non-hydrolysable GTP analogue, after which the nanobodies were selected by phage display using the same fully native and functional form of the protein. As clearly explained in Manglik et al. Annu Rev Pharmacol Toxicol. 2017, the probability of an in vivo matured nanobody inducing a non-native conformation of the antigen is low, although it is possible that it selects a high-energy, low-population conformation of a dynamic protein. Immature B cells require engagement of displayed antibodies with antigen to proliferate and differentiate during clonal selection. Antibodies that induce non-native conformations of the antigen pay a substantial energetic penalty in this process, and B cell clones displaying such antibodies will have a significantly lower probability of proliferation and differentiation into mature antibody-secreting B lymphocytes. Hence, many recent experiments and observations give credence to the notion that nanobodies bind antigens primarily by conformational selection and not induced fit (e.g. Smirnova et al. PNAS 2015).

      Extrapolated to the case of CtRoco, which is clearly very flexible in its GTP-bound form, this means that the nanobodies are able to trap and stabilize one conformational state that is representative of the “active state” ensemble of the protein. In this respect, it is clear from our experiments (XL-MS, affinity and effect on GTPase activity) that the effects of NbRoco1 and NbRoco2 are additive (or even cooperative), meaning that both nanobodies recognize different features of the same CtRoco “active state”. Correspondingly, the monomeric, elongated “open” conformation is also observed in the structure of CtRoco bound to NbRoco1 only (Figure 1 – Figure supplement 2), although this structure still displays more flexibility. The monomerization and conformational changes that we observe and describe in the current paper at high resolution are also in very good agreement with earlier observations for CtRoco in the GTP-bound form in the absence of any nanobodies, including negative stain EM (Deyaert et al. Nature Commun, 2017), hydrogen-deuterium exchange experiments (Deyaert et al. Biochem. J. 2019) and native MS (Leemans et al. Biochem J. 2020).

      In the revised manuscript we added the following text to the discussion:

      “To decrease this flexibility, we have now used two previously developed conformation-specific nanobodies (NbRoco1 and NbRoco2) to stabilize the protein in the GTP-state (Leemans et al., 2020), allowing us to solve its structure using cryo-EM (Figure 1). Recently, Nbs have successfully been used to obtain structural insights into the conformational states of a number of highly dynamic proteins (Uchański et al, 2020). These studies established that Nbs bind antigens primarily by conformational selection rather than by induced fit (Manglik et al., 2017; Smirnova et al., 2015). Since NbRoco1 and NbRoco2 were generated by immunization with fully native CtRoco bound to a non-hydrolysable GTP analogue, and subsequently selected by phage display using the same functional protein, it is thus safe to assume that these Nbs bind to and stabilize a relevant conformation that is present within the “active” CtRoco conformational space (Leemans et al., 2020). Moreover, our current structures are also in very good agreement with previous biochemical studies and data from HDX-MS and negative stain EM (Deyaert et al., 2019; Deyaert, Wauters, et al., 2017).”

      Recommendations for the authors:

      Reviewer #1:

      (1) Figure 2C: please label the residues with meshes (switch 2).

      Labels have been added to figure 2C.

      (2) A supplemental figure for the following statement will be helpful "A remarkable feature of the CtRoco dimer structure was the dimer-stabilized orientation of the P-loop, which would hamper direct nucleotide binding on the dimer. Correspondingly, in the current structure, the P-loop changes orientation, allowing GTPγS to bind, although the EM map does not allow unambiguous placement of the entire P-loop. Surprisingly, also the Switch 1 loop could not be fully modeled, which could indicate some flexibility in this region despite the presence of a GTP analog".

      An additional Figure 2–figure supplement 2 has been added to illustrate this.

      (3) A supplemental figure for the following statement will be helpful "A final important observation in the Roc domain concerns the very C-terminal part of Switch 2 (residues 520 to 533), which could not be modeled in our GTP bound structure due to flexibility, while in the nucleotide-free dimer structure this region is structured and located at the interface of the Roc domain with the LRR-Roc linker and CORA. In this way, the conformational changes induced by GTPγS binding could be relayed via the Switch 2 toward the LRR and CORA domains, and vice versa."

      An additional Figure 2–figure supplement 2 has been added to illustrate this.

      (4) A structural comparison of each domain (LRR, ROC, COR) between NF and GTP-bound states will be greatly useful to understand statements in the manuscript, such as "In addition to the C-terminal dimerization part of CORB that becomes unstructured, also other large conformational changes are observed in the CORA and CORB domains of CtRoco upon GTPγS binding."

      We would like to clarify that with this statement we refer to changes in the relative orientation of the domains between the nucleotide-free and GTPγS-bound states, rather than to conformational changes within each domain. These changes in relative orientation are illustrated in Figure 2 and the associated Figure supplements.

      (5) The statement "to a lesser extent, also between CDR1 and the LRR-Roc linker" is not clearly illustrated in Figure 3B.

      The reviewer is correct, and we now also show CDR1 in Figure 3B.

      (6) Extra panels can be added in Figure 1 Sup. 4 to illustrate the following statement "In the density map NbRoco2 can easily be identified and placed on the concave side of the LRR domain... N-terminal and C-terminal β-strands interacting with the very C-terminal repeat of the LRR".

      We believe the density map corresponding to NbRoco2 is clearly shown in Figure 1 – supplement 4A. A reference to this figure panel is now added to the main text.

      (7) "In the presence of both Nbs, the hydrolysis rate was increased 4-fold compared to CtRoco-L487A alone and 2-fold compared to CtRoco-L487A in the presence of NbRoco1 only, again illustrating a collaboration between the Nbs (Figure 5C)" Here, is it 6-fold instead of 4-fold?

      The reviewer is correct. We changed this accordingly in the manuscript.

      Reviewer #2:

      (1) At many places in the manuscript the lack of structural details is explained by the assumed local flexibility of the protein. This may be true for many cases (such as linker regions), but is probably not always correct; several other explanations are possible to get no local structural details.

      See our answer to point 2, below.

      (2) At several other places in the manuscript the high flexibility is used to explain the lack of structural details (so the reasoning is reversed compared to point 1); this would require that it is known a priori that the region is flexible and therefore no structure can be expected. An example is found mid-page 8: "A final important observation in the Roc domain concerns the very C-terminal part of Switch 2 (residues 520 to 533), which could not be modeled in our GTP bound structure due to flexibility, while in the nucleotide-free dimer structure this region is structured and located at the interface of the Roc domain with the LRR-Roc linker and CORA." As written, there must be a reference to experiments showing the "due to flexibility".

      The reviewer is correct that additional factors might affect the interpretability of the map, such as the small size of the regions used for the focused refinements (around 50 kDa each) or a preferred orientation of the particles on the grid. Particle distribution plots are now shown in Figure 1 – Figure supplements 1 and 2. However, given the intrinsically flexible nature of the Switch 1 and Switch 2 regions, we assume this flexibility to be the major cause of the lack of features in the EM maps, especially since some of the neighboring regions display well-resolved maps.

      Nevertheless, in the manuscript we reworded our statements to be more careful. For example, on page 8:

      “Also the Switch 1 loop could not be fully modeled in our structure, presumably indicating some flexibility in this region despite the presence of a GTP analogue.”

      “… potentially due to flexibility of this region in the new position of the Switch 2…”

    2. eLife assessment

      The fundamental study by Galicia C. et al. captured the GTP-bound active structure of CtRoco, a homolog of human LRRK2, using conformation-specific nanobodies. This convincing body of work reports the first structure of a GTP-bound ROCO protein, illustrating how GTP facilitates the dimer-to-monomer transition of CtRoco and functional activation.

    3. Reviewer #1 (Public Review):

      Summary:

      The Roco proteins are a family of GTPases characterized by the conserved presence of an ROC-COR tandem domain. How GTP binding alters the structure and activity of Roco proteins remains unclear. In this study, Galicia C et al. took advantage of conformation-specific nanobodies to trap CtRoco, a bacterial Roco, in an active monomeric state and determined its high-resolution structure by cryo-EM. This study, in combination with the previous inactive dimeric CtRoco, revealed the molecular basis of CtRoco activation through GTP-binding and dimer-to-monomer transition.

      Strengths:

      The reviewer is impressed by the authors' deep understanding of the CtRoco protein. Capturing Roco proteins in a GTP-bound state is a major breakthrough in the mechanistic understanding of the activation mechanism of Roco proteins and shows similarity with the activation mechanism of LRRK2, a key molecule in Parkinson's disease. Furthermore, the methodology the authors used in this manuscript - using conformation-specific nanobodies to trap the active conformation, which is otherwise flexible and resistant to single-particle average - is highly valuable and inspiring.

    4. Reviewer #2 (Public Review):

      Summary

      The manuscript by Galicia et al describes the structure of the bacterial GTPyS-bound CtRoco protein in the presence of nanobodies. The major relevance of this study is in the fact that the CtRoco protein is a homolog of the human LRRK2 protein with mutations that are associated with Parkinson's disease. The structure and activation mechanisms of these proteins are very complex and not well understood. Especially lacking is a structure of the protein in the GTP-bound state. Previously the authors have shown that two conformational nanobodies can be used to bring/stabilize the protein in a monomer-GTPyS-bound state. In this manuscript, the authors use these nanobodies to obtain the GTPyS-bound structure and importantly discuss their results in the context of the mammalian LRRK2 activation mechanism and mutations leading to Parkinson's disease. The work is well performed and clearly described. In general, the conclusions on the structure are reasonable and well-discussed in the context of the LRRK2 activation mechanism.

      Strengths:

      The strong points are the innovative use of nanobodies to stabilize the otherwise flexible protein and the new GTPyS-bound structure that helps enormously in understanding the activation cycle of these proteins.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      The aim of the present work is to evaluate the role of BMP9 and BMP10 in liver by depleting Bmp9 and Bmp10 from the main liver cell types (endothelial cells (EC), hepatic stellate cells (HSC), Kupffer cells (KC) and hepatocytes (H)) using cell-specific cre recombinases. They show that HSCs are the main source of BMP9 and BMP10 in the liver. Using transgenic ALK1 reporter mice, they show that ALK1, the high affinity type 1 receptor for BMP9 and BMP10, is expressed on KC and EC. They have also performed bulk RNAseq analyses on whole liver, and cell-sorted EC and KC, and showed that loss of Bmp9 and Bmp10 decreased KC signature and that KC are replaced by monocyte-derived macrophages. EC derived from these Bmp9fl/flBmp10fl/flLratCre mice also lost their identity and transdifferentiated into continuous ECs. Liver iron metabolism and metabolic zonation were also affected in these mice. In conclusion, this work supports that BMP9 and BMP10 produced by HSC play a central role in mediating liver cell-cell crosstalk and liver homeostasis.

      We appreciate the comprehensive summary of reviewer 1.

      Strengths:

      This work further supports the role of BMP9 and BMP10 in liver homeostasis. Using a specific HSC-Cre recombinase, the authors show for the first time that it is the BMP9 and BMP10 produced by HSC that play a central role in mediating liver cell-cell crosstalk to maintain a healthy liver. Although the overall message of the key role of BMP9 in liver homeostasis has been described by several groups, the role of hepatic BMP10 has not been studied before. Thus, one of the novelties of this work is to have used liver cell specific Cre recombinase to delete hepatic Bmp9 and Bmp10. The second novelty is the demonstration of the role of BMP9 and BMP10 in KC differentiation/homeostasis, which has already been partly addressed by this group by knocking out ALK1, the high affinity receptor of BMP9 and BMP10 (Zhao et al. JCI, 2022).

      We appreciate the positive comment of reviewer 1.

      Weaknesses:

      This work remains rather descriptive and the molecular mechanisms are barely touched upon and could have been more explored. Some references should be added; in particular, a work that has already demonstrated, using a different approach (in situ hybridization RNAscope), that in the liver BMP9 and BMP10 are expressed by HSC (Tillet et al., J Biol Chem 2018). Another publication (Bouvard et al., Cardiovasc Res, 2021) has previously shown that deletion of Bmp9 and Bmp10 leads to liver fibrosis and could thus have been cited. There is also a reference that is not correctly cited. Ref 26 (Herrera et al., 2014) does not say that "BMP10 is mostly expressed in the heart, followed by the liver" or that "BMP9 and BMP10 also bind to ALK2" as cited in the manuscript.

      We agree with the comment of reviewer 1 that the molecular mechanisms were barely investigated in our work. Indeed, it has been reported that BMP9/10 induce the expression of ID1/3 in KCs and of GATA4 and Maf in liver ECs in an in vitro culture system. These master regulators play an important role in the differentiation of the two cell types. Thus, we think that the reduced expression of these master regulators can explain the phenotype in KCs and ECs observed in Bmp9fl/flBmp10fl/flLratCre mice. In addition, according to the reviewer’s suggestion, these references will be added or corrected in our revised manuscript.

      The gating strategies for cell sorting which is used for bulk RNAseq and FACS analyses should be better described in order to better follow the manuscript. This point is particularly important for KC gating as the authors show that Tim4 is very strongly decreased in Bmp9fl/flBmp10fl/flLratCre (Fig 2c), yet, it seems that this marker is used for gating macrophages (Suppl fig4). Same question with F4/80 which is strongly decreased in Bmp9fl/flBmp10fl/flLratCre (Fig 2d) and also used for gating. It is important to show the gating strategy for both Control and Bmp9fl/flBmp10fl/flLratCre mice.

      The authors should explain how they selected the genes shown on each heatmaps and add references that can justify the choice of the genes.

      Thank you for your suggestion. In our study, we used CD45+ Ly6C- F4/80+ CD64+ cells to define liver macrophages. We will delete the Tim4 FACS plot from Suppl fig4 to avoid misunderstanding. Although F4/80 positive cells were reduced in the livers of Bmp9fl/flBmp10fl/flLratCre mice, double staining with anti-F4/80 and anti-CD64 fluorescent antibodies can still clearly distinguish liver macrophages based on the above gating strategy. The gating strategy for both control and Bmp9fl/flBmp10fl/flLratCre mice will be presented in our revised manuscript.

      Quantifications of Immunostaining and FACS data should be added as well as statistical analyses.

      Quantitative data will be added in our revised manuscript.

      Reviewer #2 (Public Review):

      Summary:

      The authors characterized the contribution of BMP9/BMP10 expression/secretion from all different hepatic cell types and analysed their impact on the other cell types. They are able to show that HSC derived BMP9/BMP10 controls Kupffer cell and EC differentiation and functions.

      We appreciate the comprehensive summary of reviewer 2.

      Strengths:

      This is the first study to my knowledge to comprehensively analyze the contribution of BMP9/BMP10 expression in such a systematic fashion in vivo. This study therefore is a significant contribution to the field and further supports previous studies that have already implicated BMP9 and BMP10 in Kupffer cell and EC functions but did not unravel the intercellular crosstalk in such a detailed fashion.

      We appreciate the positive comment of reviewer 2.

      Weaknesses:

      Several findings, such as the impact of BMP9/10 on Kupffer cells and EC, were already known. These findings are therefore not innovative; however, I still believe that the elucidation of the cellular crosstalk makes this publication highly interesting to a broad scientific community.

      Overall the authors achieved their aims and the results support the conclusions and discussion well.

      We appreciate the positive comment of reviewer 2. We agree with the comment of reviewer 2 that although some findings in our paper are somewhat expected, the detailed investigation of the crosstalk between different liver cell types is still needed and beneficial to this field.

    2. eLife assessment

      This valuable study delineates the cellular contributions of BMP signaling in liver development and function. The findings are convincing, and the study employs state-of-the-art molecular, genetic, and cellular approaches to demonstrate that hepatic stellate cells play a central role in liver health by mediating cell-to-cell crosstalk via the production of specific BMP proteins. This study will be of interest to scientists interested in developmental biology and organ physiology.

    3. Reviewer #1 (Public Review):

      Summary:

      The aim of the present work is to evaluate the role of BMP9 and BMP10 in liver by depleting Bmp9 and Bmp10 from the main liver cell types (endothelial cells (EC), hepatic stellate cells (HSC), Kupffer cells (KC) and hepatocytes (H)) using cell-specific cre recombinases. They show that HSCs are the main source of BMP9 and BMP10 in the liver. Using transgenic ALK1 reporter mice, they show that ALK1, the high affinity type 1 receptor for BMP9 and BMP10, is expressed on KC and EC. They have also performed bulk RNAseq analyses on whole liver, and cell-sorted EC and KC, and showed that loss of Bmp9 and Bmp10 decreased KC signature and that KC are replaced by monocyte-derived macrophages. EC derived from these Bmp9fl/flBmp10fl/flLratCre mice also lost their identity and transdifferentiated into continuous ECs. Liver iron metabolism and metabolic zonation were also affected in these mice. In conclusion, this work supports that BMP9 and BMP10 produced by HSC play a central role in mediating liver cell-cell crosstalk and liver homeostasis.

      Strengths:

      This work further supports the role of BMP9 and BMP10 in liver homeostasis. Using a specific HSC-Cre recombinase, the authors show for the first time that it is the BMP9 and BMP10 produced by HSC that play a central role in mediating liver cell-cell crosstalk to maintain a healthy liver. Although the overall message of the key role of BMP9 in liver homeostasis has been described by several groups, the role of hepatic BMP10 has not been studied before. Thus, one of the novelties of this work is to have used liver cell specific Cre recombinase to delete hepatic Bmp9 and Bmp10. The second novelty is the demonstration of the role of BMP9 and BMP10 in KC differentiation/homeostasis, which has already been partly addressed by this group by knocking out ALK1, the high affinity receptor of BMP9 and BMP10 (Zhao et al. JCI, 2022).

      Weaknesses:

      This work remains rather descriptive and the molecular mechanisms are barely touched upon and could have been more explored.

      Some references should be added; in particular, a work that has already demonstrated, using a different approach (in situ hybridization RNAscope), that in the liver BMP9 and BMP10 are expressed by HSC (Tillet et al., J Biol Chem 2018). Another publication (Bouvard et al., Cardiovasc Res, 2021) has previously shown that deletion of Bmp9 and Bmp10 leads to liver fibrosis and could thus have been cited. There is also a reference that is not correctly cited. Ref 26 (Herrera et al., 2014) does not say that "BMP10 is mostly expressed in the heart, followed by the liver" or that "BMP9 and BMP10 also bind to ALK2" as cited in the manuscript.

      The gating strategies for cell sorting which is used for bulk RNAseq and FACS analyses should be better described in order to better follow the manuscript. This point is particularly important for KC gating as the authors show that Tim4 is very strongly decreased in Bmp9fl/flBmp10fl/flLratCre (Fig 2c), yet, it seems that this marker is used for gating macrophages (Suppl fig4). Same question with F4/80 which is strongly decreased in Bmp9fl/flBmp10fl/flLratCre (Fig 2d) and also used for gating. It is important to show the gating strategy for both Control and Bmp9fl/flBmp10fl/flLratCre mice.

      The authors should explain how they selected the genes shown on each heatmaps and add references that can justify the choice of the genes.

      Quantifications of Immunostaining and FACS data should be added as well as statistical analyses.

    4. Reviewer #2 (Public Review):

      Summary:

      The authors characterized the contribution of BMP9/BMP10 expression/secretion from all the different hepatic cell types and analysed their impact on the other cell types. They are able to show that HSC-derived BMP9/BMP10 controls Kupffer cell and EC differentiation and functions.

      Strengths:

      This is the first study to my knowledge to comprehensively analyze the contribution of BMP9/BMP10 expression in such systematic fashion in vivo. This study therefore is a significant contribution to the field and further supports previous studies that have already implied BMP9 and BMP10 in Kupffer cell and EC functions but did not unravel the intercellular cross talk in such detailed fashion.

      Weaknesses:

      Several findings such as the impact of BMP9/10 on Kupffer cells and EC were already known. So these findings are not innovative, however I still believe that the elucidation of the cellular crosstalk makes this publication highly interesting to a broad scientific community.

      Overall, the authors achieved their aims, and the results support the conclusions and discussion well.

    1. Reviewer #2 (Public Review):

      Summary:

      This work proposes a synaptic plasticity rule that explains the generation of learned stochastic dynamics during spontaneous activity. The proposed plasticity rule assumes that excitatory synapses seek to minimize the difference between the internal predicted activity and stimulus-evoked activity, and inhibitory synapses try to maintain the E-I balance by matching the excitatory activity. By implementing this plasticity rule in a spiking recurrent neural network, the authors show that the state-transition statistics of spontaneous excitatory activity agree with that of the learned stimulus patterns, which are reflected in the learned excitatory synaptic weights. The authors further demonstrate that inhibitory connections contribute to well-defined state transitions matching the transition patterns evoked by the stimulus. Finally, they show that this mechanism can be expanded to more complex state-transition structures including songbird neural data.

      Strengths:

      This study makes an important contribution to computational neuroscience, by proposing a possible synaptic plasticity mechanism underlying spontaneous generations of learned stochastic state-switching dynamics that are experimentally observed in the visual cortex and hippocampus. This work is also very clearly presented and well-written, and the authors conducted comprehensive simulations testing multiple hypotheses. Overall, I believe this is a well-conducted study providing interesting and novel aspects of the capacity of recurrent spiking neural networks with local synaptic plasticity.

      Weaknesses:

      This study is very well-thought-out and theoretically valuable to the neuroscience community, and I think the main weaknesses are in regard to how much biological realism is taken into account. For example, the proposed model assumes that only synapses targeting excitatory neurons are plastic, and uses an equal number of excitatory and inhibitory neurons.

      The model also assumes Markovian state dynamics, while biological systems can depend more on history. This limitation, however, is acknowledged in the Discussion.

      Finally, to simulate spontaneous activity, the authors use a constant input of 0.3 throughout the study. Different amplitudes of constant input may correspond to different internal states, so it would be more convincing if the authors tested the model with varying amplitudes of constant input.

    2. eLife assessment

      This is an important study that investigates how neural networks can learn to stochastically replay presented sequences of activity according to learned transition probabilities. The authors use error-based excitatory plasticity to minimize the difference between internally predicted activity and stimulus-driven activity, and inhibitory plasticity to maintain E-I balance. The approach is solid, but the choice of learning rules and parameters is not always justified, lacking a formal derivation and concrete experimental predictions.

    3. Reviewer #1 (Public Review):

      In the presented manuscript, the authors investigate how neural networks can learn to replay presented sequences of activity. Their focus lies on the stochastic replay according to learned transition probabilities. They show that based on error-based excitatory and balance-based inhibitory plasticity networks can self-organize towards this goal. Finally, they demonstrate that these learning rules can recover experimental observations from song-bird song learning experiments.

      Overall, the study appears well-executed and coherent, and the presentation is very clear and helpful. However, it remains somewhat vague regarding the novelty. The authors could elaborate on the experimental and theoretical impact of the study, and also discuss how their results relate to those of others (e.g., Kappel et al., doi.org/10.1371/journal.pcbi.1003511). The work could benefit from either (A) a formal analysis or derivation of the plasticity rules involved and a formal justification of the usefulness of the resulting (learned) neural dynamics; and/or (B) a clear connection of the employed plasticity rules to biological plasticity and clear testable experimental predictions. Thus, overall, this is good work with some room for improvement.

    4. Reviewer #3 (Public Review):

      Summary:

      Asabuki and Clopath study stochastic sequence learning in recurrent networks of Poisson spiking neurons that obey Dale's law. Inspired by previous modeling studies, they introduce two distinct learning rules, to adapt excitatory-to-excitatory and inhibitory-to-excitatory synaptic connections. Through a series of computer experiments, the authors demonstrate that their networks can learn to generate stochastic sequential patterns, where states correspond to non-overlapping sets of neurons (cell assemblies) and the state-transition conditional probabilities are first-order Markov, i.e., the transition to a given next state only depends on the current state. Finally, the authors use their model to reproduce certain experimental songbird data involving highly-predictable and highly-uncertain transitions between song syllables.
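      To make the first-order Markov property mentioned above concrete, the following is a minimal sketch and not the authors' network model. It samples a state sequence in which the next state depends only on the current one, then recovers the transition probabilities from the sampled sequence. The state labels and the probabilities in `TRANSITIONS` are hypothetical values chosen purely for illustration.

```python
import random

# Hypothetical transition table (illustration only, not taken from the paper).
# Each row gives P(next state | current state) and sums to 1.
TRANSITIONS = {
    "A": {"B": 0.9, "C": 0.1},
    "B": {"C": 0.5, "A": 0.5},
    "C": {"A": 1.0},
}


def sample_chain(transitions, start, n_steps, seed=0):
    """Sample a first-order Markov chain: the next state depends only on the current state."""
    rng = random.Random(seed)
    state = start
    path = [state]
    for _ in range(n_steps):
        next_states = list(transitions[state].keys())
        weights = list(transitions[state].values())
        state = rng.choices(next_states, weights=weights, k=1)[0]
        path.append(state)
    return path


def estimate_transitions(path):
    """Estimate transition probabilities by counting consecutive state pairs."""
    counts = {}
    for current, nxt in zip(path, path[1:]):
        row = counts.setdefault(current, {})
        row[nxt] = row.get(nxt, 0) + 1
    return {
        state: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for state, row in counts.items()
    }


path = sample_chain(TRANSITIONS, "A", 20000)
estimated = estimate_transitions(path)
```

      In this toy setting, the estimated table approaches `TRANSITIONS` as the sequence grows, which is the sense in which replayed activity can be said to reproduce learned transition statistics.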

      Strengths:

      This is an easy-to-follow, well-written paper, whose results are likely easy to reproduce. The experiments are clear and well-explained. The study of songbird experimental data is a good feature of this paper; finches are classical model animals for understanding sequence learning in the brain. I also liked the study of rapid task-switching, it's a good-to-know type of result that is not very common in sequence learning papers.

      Weaknesses:

      While the general subject of this paper is very interesting, I missed a clear main result. The paper focuses on a simple family of sequence learning problems that are well-understood, namely first-order Markov sequences and fully visible (no-hidden-neuron) networks, studied extensively in prior work, including with spiking neurons. Thus, because the main results can be roughly summarized as examples of success, it is not entirely clear what the main point of the authors is.

      Going into more detail, the first major weakness I see in this paper is the heuristic choice of learning rules. The paper studies Poisson spiking neurons (I return to this point below), for which learning rules can be derived from a statistical objective, typically maximum likelihood. For fully-visible networks, these rules take a simple form, similar in many ways to the E-to-E rule introduced by the authors. This more principled route provides quite a lot of additional understanding on what is to be expected from the learning process. For instance, should maximum likelihood learning succeed, it is not surprising that the statistics of the training sequence distribution are reproduced. Moreover, given that the networks are fully visible, I think that the maximum likelihood objective is a convex function of the weights, which then gives hope that the learning rule does succeed. And so on. This sort of learning rule has been studied in a series of papers by David Barber and colleagues [refs. 1, 2 below], who applied them to essentially the same problem of reproducing sequence statistics in recurrent fully-visible nets. It seems to me that one key difference is that the authors consider separate E and I populations, and find the need to introduce a balancing I-to-E learning rule.

      Because the rules here are heuristic, a number of questions come to mind. Why these rules and not others - especially, as the authors do not discuss in detail how they could be implemented through biophysical mechanisms? When does learning succeed or fail? What is the main point being conveyed, and what is the contribution on top of the work of e.g. Barber, Brea, et al. (2013), or Pfister et al. (2004)?

      The use of a Poisson spiking neuron model is the second major weakness of the study. A chief challenge in much of the cited work is to generate stochastic transitions from recurrent networks of deterministic neurons. The task the authors set out to do is much easier with stochastic neurons; it is reasonable that the network succeeds in reproducing Markovian sequences, given an appropriate learning rule. I believe that the main point comes from mapping abstract Markov states to assemblies of neurons. If I am right, I missed more analyses on this point, for instance on the impact that varying cell assembly size would have on the findings reported by the authors.

      Finally, it was not entirely clear to me what the main fundamental point in the HVC data section was. Can the findings be roughly explained as follows: if we map syllables to cell assemblies, for high-uncertainty syllable-to-syllable transitions, it becomes harder to predict future neural activity? In other words, is the main point that the HVC encodes syllables by cell assemblies?

      (1) Learning in Spiking Neural Assemblies, David Barber, 2002. URL: https://proceedings.neurips.cc/paper/2002/file/619205da514e83f869515c782a328d3c-Paper.pdf

      (2) Correlated sequence learning in a network of spiking neurons using maximum likelihood, David Barber, Felix Agakov, 2002. URL: http://web4.cs.ucl.ac.uk/staff/D.Barber/publications/barber-agakov-TR0149.pdf

    1. eLife assessment

      This useful study investigates the impact of disrupting the interaction of RAS with the PI3K subunit p110α on macrophage function in vitro and on inflammatory responses in vivo. Solid data overall support a role for RAS-p110α signalling in regulating macrophage activity and thereby inflammation; however, for many of the readouts presented, the magnitude of the phenotype is not particularly pronounced. Further analysis would be required to substantiate the claims that RAS-p110α signalling plays a key role in macrophage function. Of note, the molecular mechanisms by which p110α regulates macrophage functions have not yet been established.

    2. Reviewer #1 (Public Review):

      In this study, Alejandro Rosell et al. uncover the immunoregulatory functions of the RAS-p110α pathway in macrophages, including the extravasation of monocytes from the bloodstream and subsequent lysosomal digestion. Disrupting the RAS-p110α pathway, by mouse genetic tools or by pharmacological intervention, hampers the inflammatory response, leading to delayed resolution and more severe acute inflammatory reactions. The authors propose that activating p110α using small molecules could be a promising approach for treating chronic inflammation. This study provides insights into the roles and mechanisms of p110α in macrophage function and the inflammatory response, although some conclusions remain questionable because of the issues described below.

      (1) Fig. 1B shows that disruption of RAS-p110α decreases the activation of NF-κB, a crucial transcription factor that regulates the expression of proinflammatory genes. However, the authors observed that disruption of the RAS-p110α interaction results in an exacerbated inflammatory state in vivo, in both localized paw inflammation and systemic inflammatory mediator levels. Also, the authors state in the introduction, citing reference 12, that "this disruption leads to a change in macrophage polarization, favouring a more proinflammatory M1 state". The conclusions drawn from the signaling data and from the models seem contradictory and puzzling. In addition, it is not clear why the protein level of p65 was decreased at 10' and 30'. Was this due to degradation of p65 or to experimental variation?

      (2) In Fig 3, the authors used bone marrow-derived macrophages (BMDMs) instead of isolated monocytes to evaluate monocyte transendothelial migration, which is not sufficiently convincing. In Fig. 3B, the authors evaluated migration in Pik3caWT/- BMDMs and in Pik3caWT/WT BMDMs treated with BYL-719. Given the dose effect of gene expression, the best control is Pik3caWT/- BMDMs treated with BYL-719.

      (3) In Fig. 4E-4G, the authors observed elevated levels of serine 3-phosphorylated Cofilin in Pik3caRBD/- BMDMs in both unstimulated and proinflammatory conditions. Given that phosphorylation of Cofilin at Ser3 increases actin stabilization, it is not clear why disruption of RAS-p110α binding caused a decrease in the F-actin pool in unstimulated BMDMs.

    3. Reviewer #2 (Public Review):

      Summary:

      Cell-intrinsic signaling pathways controlling the function of macrophages in inflammatory processes, including in response to infection or injury, or in the resolution of inflammation, are incompletely understood. In this study, Rosell et al. investigate the contribution of RAS-p110α signaling to macrophage activity. p110α is a ubiquitously expressed catalytic subunit of PI3K with previously described roles in multiple biological processes, including epithelial cell growth and survival, and carcinogenesis. While previous studies have already suggested a role for RAS-p110α signaling in macrophage function, the cell-intrinsic impact of disrupting the interaction between RAS and p110α in this central myeloid cell subset is not known.

      Strengths:

      Exploiting a sound, previously described genetic mouse model that allows tamoxifen-inducible disruption of the RAS-p110α pathway and using different readouts of macrophage activity in vitro and in vivo, the authors provide data consistent with their conclusion that alteration in RAS-p110α signaling impairs the function of macrophages in a cell-intrinsic manner. The study is well designed and clearly written, with overall high-quality figures.

      Weaknesses:

      My main concern is that for many of the readouts, the difference between wild-type and mutant macrophages in vitro or between wild-type and Pik3caRBD mice in vivo is rather modest, even if statistically significant (e.g. Figure 1A, 1C, 2A, 2F, 3B, 4B, 4C). In other cases, such as for the analysis of the H&E images (Figure 1D-E, S1E), the images are not quantified, and it is hard to appreciate what the phenotype in samples from Pik3caRBD mice is or whether this is consistently observed across different animals. Also, the authors claim there is a 'notable decrease' in Akt activation but 'no discernible change' in ERK activation based on the western blot data presented in Figure 1A. I do not think the data shown support this conclusion.

      To further substantiate the extent of macrophage function alteration upon disruption of RAS-p110α signaling, the manuscript would benefit from testing macrophage activity in vitro and in vivo across other key macrophage activities such as bacteria phagocytosis, cytokine/chemokine production in response to titrating amounts of different PAMPs, inflammasome function, etc. This would be generally important overall but also useful to determine whether the defects in monocyte motility or macrophage lysosomal function are selectively controlled downstream of RAS-p110α signaling.

      Furthermore, given the key role of other myeloid cells besides macrophages in inflammation and immunity it remains unclear whether the phenotype observed in vivo can be attributed to impaired macrophage function. Is the function of neutrophils, dendritic cells or other key innate immune cells not affected?

      Compelling proof of concept data that targeting RAS-p110α signalling constitutes indeed a putative approach for modulation of chronic inflammation is lacking. Addressing this further would increase the conceptual advance of the manuscript and provide extra support to the authors' suggestion that p110α inhibition or activation constitute promising approaches to manage inflammation.

      Finally, the analysis by FACS should also include information about the total number of cells, not just the percentage, which is affected by the relative change in other populations. On this point, Figure S2B shows a substantial, albeit not significant (with fewer mice analysed), increase in the percentage of CD3+ cells. Is there an increase in the absolute number of T cells, or does this apparent relative increase reflect a reduction in myeloid cells?
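      The point about percentages versus absolute counts can be illustrated with a small worked example. The cell counts below are hypothetical numbers invented for illustration and are not taken from the manuscript; the sketch only shows that the percentage of one population can rise while its absolute count is unchanged, simply because another population shrinks.

```python
def percentages(counts):
    """Convert absolute cell counts per population into percentages of the total."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


# Hypothetical counts (illustration only): the CD3+ count is identical in both
# conditions, while the myeloid count is halved in the mutant.
control = {"CD3+": 200, "myeloid": 800}
mutant = {"CD3+": 200, "myeloid": 400}

control_pct = percentages(control)
mutant_pct = percentages(mutant)

print(f"CD3+ control: {control_pct['CD3+']:.1f}% of {sum(control.values())} cells")
print(f"CD3+ mutant:  {mutant_pct['CD3+']:.1f}% of {sum(mutant.values())} cells")
```

      Under these assumed numbers, the CD3+ fraction increases even though no T cells were gained, which is why reporting absolute numbers alongside percentages helps resolve the ambiguity.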

    1. eLife assessment

      The study is useful by attempting to present a new approach of combining two measurements (pHLA binding and pHLA-TCR binding) in order to refine predictions of which patient mutations are likely presented to and recognized by the immune system, but the evidence is incomplete. Whereas the novel methodology proposed is compelling, this article lacks a detailed explanation of the chosen model. The experimental validation confirming the computational predictions with actual immune responses is limited due to sample constraints.

    2. Reviewer #1 (Public Review):

      Summary:

      This paper reports a number of somewhat disparate findings on a set of colorectal tumours and infiltrating T cells. The main finding is a machine-learning tool that combines two previous state-of-the-art tools, MHC prediction and T-cell binding prediction, to predict immunogenicity. This is then applied to a small set of neoantigens, and there is a small-scale validation of the prediction at the end.

      Strengths:

      The prediction of immunogenic neoepitopes is an important and unresolved question.

      Weaknesses:

      The paper contains a lot of extraneous material not relevant to the main claim. Conversely, it lacks important detail on the major claim.

      (1) The analysis of T cell repertoire in Figure 2 seems irrelevant to the rest of the paper. As far as I could ascertain, this data is not used further.

      (2) The key claim of the paper rests on the performance of the ML algorithm combining NETMHC and pmtNET. In turn, this depends on the selection of peptides for training. I am unclear about how the negative peptides were selected. Are they peptides from the same databases as immunogenic peptides but randomised for MHC? It seems as though there will be a lot of overlap between the peptides used for testing the combined algorithm, and the peptides used for training MHCNet and pmtMHC. If this is so, and depending on the choice of negative peptides, it is surely expected that the tools perform better on immunogenic than on non-immunogenic peptides in Figure 3. I don't fully understand panel G, but there seems very little difference between the TCR ranking and the combined. Why does including the TCR ranking have such a deleterious effect on sensitivity?

      (3) The key validation of the model is Figure 5. In 4 patients, the authors report that 6 out of 21 neo-antigen peptides give interferon responses > 2 fold above background. Using NETMHC alone (I presume the tool was used to rank peptides according to binding to the respective HLAs in each individual, but this is not clear) identified 2; using the combined tool identified 4. I don't think this is significant by any measure. I don't understand the score shown in panel E but I don't think it alters the underlying statistic.

      In conclusion, the paper demonstrates that combining MHCNET and pmtMHC results in a modest increase in the ability to discriminate 'immunogenic' from 'non-immunogenic' peptide; however, the strength of this claim is difficult to evaluate without more knowledge about the negative peptides. The experimental validation of this approach in the context of CRC is not convincing.

    3. Reviewer #2 (Public Review):

      Summary:

      This paper introduces a novel approach for improving personalized cancer immunotherapy by integrating TCR profiling with traditional pHLA binding predictions, addressing the need for more precise neoantigen identification in CRC patients. By analyzing TCR repertoires from tumor-infiltrating lymphocytes and applying machine learning algorithms, the authors developed a predictive model that outperforms conventional methods in specificity and sensitivity. The validation of the model through ELISpot assays confirmed its potential in identifying more effective neoantigens, highlighting the significance of combining TCR and pHLA data for advancing personalized immunotherapy strategies.

      Strengths:

      (1) Comprehensive Patient Data Collection: The study meticulously collected and analyzed clinical data from 27 CRC patients, ensuring a robust foundation for research findings. The detailed documentation of patient demographics, cancer stages, and pathology information enhances the study's credibility and potential applicability to broader patient populations.

      (2) The use of machine learning classifiers (RF, LR, XGB) and the combination of pHLA and pHLA-TCR binding predictions significantly enhance the model's accuracy in identifying immunogenic neoantigens, as evidenced by the high AUC values and improved sensitivity, NPV, and PPV.

      (3) The use of experimental validation through ELISpot assays adds a practical dimension to the study, confirming the computational predictions with actual immune responses. The calculation of ranking coverage scores and the comparative analysis between the combined model and the conventional NetMHCpan method demonstrate the superior performance of the combined approach in accurately ranking immunogenic neoantigens.

      Weaknesses:

      (1) While multiple advanced tools and algorithms are used, the study could benefit from a more detailed explanation of the rationale behind algorithm choice and parameter settings, ensuring reproducibility and transparency.

      (2) While pHLA-TCR binding displayed higher specificity, its lower sensitivity compared to pHLA binding suggests a trade-off between the two measures. Optimizing the balance between sensitivity and specificity could be crucial for the practical application of these predictions in clinical settings.

      (3) The experimental validation was performed on a limited number of patients (four), which might affect the generalizability of the findings. Increasing the number of patients for validation could provide a more comprehensive assessment of the model's performance.

    4. Reviewer #3 (Public Review):

      Summary:

      This study presents a new approach of combining two measurements (pHLA binding and pHLA-TCR binding) in order to refine predictions of which patient mutations are likely presented to and recognized by the immune system. Improving such predictions would play an important role in making personalized anti-cancer vaccinations more effective.

      Strengths:

      The study combines data from the pre-existing tools pVACseq and pMTNet and applies them to a CRC patient population, which the authors show may improve the chance of identifying immunogenic, cancer-derived neoepitopes. Making the collected datasets publicly available would expand beyond the current datasets, which typically describe Caucasian patients.

      Weaknesses:

      It is unclear whether the pNetMHCpan and pMTNet tools used by the authors are entirely independent, as they appear to have been trained on overlapping datasets, which may explain their similar scores. The pHLA-TCR score seems to be driving the effects, but this is not discussed in detail.

      Due to sample constraints, the authors were only able to do a limited amount of experimental validation to support their model; this raises questions as to how generalisable the presented results are. It would be desirable to use statistical thresholds to justify cutoffs in ELISPOT data.

      Some of the TCR repertoire metrics presented in Figure 2 are incorrectly described as independent variables and do not meaningfully contribute to the paper. The TCR repertoires may have benefitted from deeper sequencing coverage, as many TCRs appear to be supported only by a single read.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Valk and Engert et al. examined the potential relations between three different mental training modules, hippocampal structure and functional connectivity, and cortisol levels over a 9-month period. They found that among the three types of mental training, the Presence (attention and introspective awareness), Affect (socio-emotional - compassion and prosocial motivation), and Perspective (socio-cognitive - metacognition and perspective taking) modules, Affect training most consistently related to changes in hippocampal structure and function, specifically in the CA1-3 subfields of the hippocampus. Moreover, decreases in diurnal cortisol correlated with bilateral increases in volume, and decreases in diurnal and chronic cortisol correlated with left CA1-3 functional connectivity. Chronic cortisol levels were also related to right CA4/DG volume and left subiculum function. The authors demonstrate that mindfulness training programs impact the hippocampus and are a potential avenue for stress interventions and for improving health. The data contribute to the literature on plasticity of hippocampal subfields during adulthood, the impact of mental training interventions on the brain, and the link between CA1-3 and both short- and long-term stress changes. Additional clarification and extension of the methods is needed to strengthen the authors' conclusions.

      We thank the Reviewer for their positive evaluation and summary of our findings and work. We made additional changes as suggested by the Reviewer and hope this clarified any open points.

      (1) The authors thoughtfully approached the study of hippocampal subfields, utilizing a method designed for T1w images that outperformed Freesurfer 5.3 and that produced comparable results to an earlier version of ASHS. However, given the use of normalized T1-weighted images to delineate hippocampal subfield volume, some caution may be warranted (Wisse et al. 2020). While the authors note the assessment of quality control processes, the difficulty in ensuring valid measurement is an ongoing conversation in the literature. This also extends to the impact of functional co-registration using segmentations. I appreciate the inclusion of Table 5 in documenting reasons for missing data across subjects. Providing additional details on the distribution of quality ratings across subfields would help contextualize the results and ensure there is equal quality of segmentations across subfields.

      We thank the Reviewer for bringing up this point. In the current work, we assessed the overall segmentation of all six subfields per individual. Thus, unfortunately, we have no data of quality of segmentation of individual subfields beyond our holistic assessment. Indeed, registration of hippocampal subfields remains a challenge and we have further highlighted this limitation in the Discussion of the current work.

      “It is of note that the current work relies on a segmentation approach of hippocampal subfields including projection to MNI template space, an implicit correction for total brain volume through the use of a stereotaxic reference frame. Some caution for this method may be warranted, as complex hippocampal anatomy can in some cases lead to over- as well as underestimation of subfield volumes, and subfield boundaries may not always be clearly demarcated (1). Future work, studying the hippocampal surface at higher granularity, for example through unfolding the hippocampal sheet (2-5), may further help with both alignment and identification of not only subfield-specific change but also alterations as a function of the hippocampal long axis, a key dimension of hippocampal structural and functional variation that was not assessed in the current work (6, 7).”

      (2) Given the consistent pattern of finding results with CA1-3, in contrast to other subfields, it would help to know if the effects of the different training modules on subfields differed from each other statistically (i.e., not just that one is significant, and one is not) to provide an additional context of the strength of results focused on Affect training and CA1-3 (for example, those shown in Figure 3).

      Our work investigated i) whether the effects of the individual Training Modules differed from each other statistically. We found that the Affect Training Module showed increases in CA1-3 volume, and that these increases remained when testing effects relative to changes in this subfield following Perspective training and in retest controls. Moreover, in CA1-3 we found changes in functional connectivity when comparing the Affect to Perspective training Module. These changes were only present in this contrast, but not significant in each of the Training Modules per se. To test for specificity, we additionally evaluated whether subfield-specific changes were present above and beyond changes in the other ipsilateral hippocampal subfields. Relative to other subfields, right CA1-3 showed increases in the Affect vs Perspective contrast (left: t-value: 2.298, p=0.022, Q>0.1; right: t-value: 3.045, p=0.0025, Q=0.015). No other subfield showed significant changes. We now include this statement in the revised Results and Supplementary Tables.

      “Moreover, associations between CA1-3 and Affect, relative to Perspective, seemed to go largely above and beyond changes in the other subfields (left: t-value: 2.298, p=0.022, Q>0.1; right: t-value: 3.045, p=0.0025, Q=0.015, see further Supplementary File 1h).”

      Author response table 1.

      Subfield-specific changes following the Training Modules, controlling for the other two ipsilateral subfields

      Reviewer #1 (Recommendations For The Authors):

      (1) In Figure 1, using different colors for subfields versus the modules (yellow, red, green) would help as it could lead the reader to try to draw connections between the two when it is namely a depiction of the delineations.

      As suggested, we updated Figure 1 accordingly and present the subfields in different shades of purple for clarity. Please find the updated figure below.

      Author response image 1.

      (2) In the Results, it was at times hard to follow when Affect or Perspective were the focus of the results. Perhaps the authors could restructure or add additional context for clarity.

      We are happy to clarify. For the first analysis on Module-specific changes in hippocampal subfield volume, we compared effects across Training Modules. Here, main contrasts were run between subjects (Presence vs active control) and within subjects (Affect versus Perspective). In additional secondary contrasts, we studied training effects vs retest control. After observing consistent increases in bilateral CA1-3 following Affect, in the following analysis, we evaluated 1) intrinsic functional networks in main and supplementary contrasts and 2) diurnal cortisol measures within the Training Modules only and all three Training Modules combined, and also adopted 3) a multivariate approach (PLS) (see comments Reviewer 2). We now also report effects of cortisol change on structural and functional subfield change in Presence and Perspective, for additional completeness and clarity.

      “To study whether there was any training module-specific change in hippocampal subfield volumes following mental training, we compared training effects between all three Training Modules (Presence, Affect, and Perspective). Main contrasts were: Presence vs Active control (between subjects) and Affect vs Perspective (within subjects). Supplementary comparisons were made vs retest controls and within training groups.”

      “Overall, for all hippocampal subfields, findings associated with volume increases in CA1-3 following the Affect training were most consistent across timepoints and contrasts (Supplementary File 1a-f).”

      “Subsequently, we studied whether hippocampal CA1-3 would show corresponding changes in intrinsic function following the Affect mental training.”

      “In particular, the moderately consistent CA1-3 volume increases following Affect training were complemented with differential functional connectivity alterations of this subfield when comparing Affect to Perspective training”

      “Last, we probed whether group-level changes in hippocampal subfield CA1-3 volume would correlate with individual-level changes in diurnal cortisol indices (Presence: n= 86; Affect: n=92; Perspective: n=81), given that the hippocampal formation is a nexus of the HPA-axis (8). We took a two-step approach. First, we studied associations between cortisol and subfield change, particularly focusing on the Affect module and CA1-3 volume based on increases in CA1-3 volume identified in our group-level analysis.”

      “We observed that increases in bilateral CA1-3 following Affect showed a negative association with change in total diurnal cortisol output […]”

      “We did not observe alterations in CA1-3 volume in relation to change in cortisol markers in Presence or Perspective. Yet, for Presence, we observed an association between slope and LCA4/DG change (t=-2.89, p=0.005, q=0.03) (Supplementary File 1uv).”

      “In case of intrinsic function, we also did not observe alterations in CA1-3 in relation to change in cortisol markers in Presence or Perspective, nor in other subfields (Supplementary File 1wx).”

      Author response table 2.

      Correlating change in subfield volume and diurnal cortisol indices in Presence. The main focus was on CA1-3, based on volumetric observations; these results are highlighted in bold.

      Author response table 3.

      Correlating change in subfield volume and diurnal cortisol indices in Perspective. The main focus was on CA1-3, based on volumetric observations; these results are highlighted in bold.

      Author response table 4.

      Association between stress-markers and within functional network sub-regions in Affect and Perspective.

      Author response table 5.

      Correlating change in subfield function and diurnal cortisol indices in Presence. The main focus was on CA1-3, based on volumetric observations; these results are highlighted in bold. For these multiple comparisons, FDRq values (corrected for two subfields) are reported if uncorrected p values are below .05.

      Author response table 6.

      Correlating change in subfield function and diurnal cortisol indices in Perspective. The main focus was on CA1-3, based on volumetric observations; these results are highlighted in bold. For these multiple comparisons, FDRq values (corrected for two subfields) are reported if uncorrected p values are below .05.

      (3) In the Methods, the authors note that corrections for multiple comparisons were used where needed. However, throughout the manuscript there is some switching between corrected and uncorrected p-values. At times, this made it difficult to follow when these corrections were needed.

      For clarity, we added explicit multiple comparisons information a) in main and supplementary results, and b) wherever extra information was needed. Also, we only included main contrasts in Table 1-3 to avoid confusion and moved the information on changes in SUB and CA4/DG to the Supplementary tables.
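
      To make the distinction between uncorrected p values and FDR-corrected q values concrete, the following is a minimal sketch of a Benjamini-Hochberg adjustment. It assumes that "FDRq" refers to the Benjamini-Hochberg procedure, which the response does not state explicitly. The function name `bh_qvalues` and the example p values are hypothetical and are not taken from the study.

```python
import numpy as np

def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted p values (q values) for one family of tests.

    Assumption: the manuscript's FDR correction is Benjamini-Hochberg;
    the source only reports "FDRq" without naming the procedure.
    """
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)                      # indices of p values, ascending
    scaled = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.clip(scaled, 0.0, 1.0)       # map back to the input order
    return q

# Hypothetical family of five uncorrected p values.
example_q = bh_qvalues([0.001, 0.01, 0.03, 0.04, 0.2])
```

      The size of the family matters: the same uncorrected p value yields a different q value depending on how many tests are corrected together, which is why the number of comparisons is stated alongside each corrected result.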

      (4) Typically, when correcting for intracranial volume the purpose is to ensure that sexual dimorphism in the size of the brain is accounted for. I would recommend the authors assess whether sex differences are accounted for by the MNI normalization approach taken. In the reading of the original Methods paper for the patch-based algorithm used, ICV was used to transform to MNI152 space. It would help to have additional information on how the normalization was done in the current study in order to draw comparisons to other findings in the literature.

      We are happy to further clarify. In the current work, we used the same approach as in the original paper. Volumes were linearly registered to the MNI template using FSL flirt. We have now provided this additional information in the revised methods.

      “Hippocampal volumes were estimated based on T1w data that were linearly registered to MNI152 using FSL flirt (http://www.fmrib.ox.ac.uk/fsl/), such that intracranial volume was implicitly controlled for.”

      We agree with the Reviewer that sex differences may still be present, and investigated this. At baseline, sex differences were found in all subfields in the left hemisphere, and right CA4/DG (FDRq<0.05). Regressing out ICV resolved remaining sex differences. We then evaluated whether main results of volumetric subfield change were impacted by ICV differences. Differences between Affect and Perspective remained stable. We have now added this additional analysis in the Supplementary Materials.

      “Although stereotaxic normalization to MNI space would in theory account for global sex differences in intra-cranial volume, we still observed sex differences in various subfield volumes at baseline. Yet, accounting for ICV did not impact our main results suggesting changes in CA1-3 following Affect were robust to sex differences in overall brain volume (Supplementary File 1j).”

      Author response table 7.

      Sex differences (female versus male) in hippocampal subfield volumes.

      Reviewer #2 (Public Review):

      In this study, Valk, Engert et al. investigated effects of stress-reducing behavioral intervention on hippocampal structure and function across different conditions of mental training and in relation to diurnal and chronic cortisol levels. The authors provide convincing multimodal evidence of a link between hippocampal integrity and stress regulation, showing changes in both volume and intrinsic functional connectivity, as measured by resting-state fMRI, in hippocampal subfield CA1-3 after socio-affective training as compared to training in a socio-cognitive module. In particular, increased CA1-3 volume following socio-affective training overlapped with increased functional connectivity to medial prefrontal cortex, and reductions in cortisol. The conclusions of this paper are well supported by the data, although some aspects of the data analysis would benefit from being clarified and extended.

      A main strength of the study is the rigorous design of the behavioral intervention, including test-retest cohorts, an active control group, and a previously established training paradigm, contributing to an overall high quality of included data. Similarly, systematic quality checking of hippocampal subfield segmentations contributes to a reliable foundation for structural and functional investigations.

      We thank the Reviewer for the thoughtful summary and appreciation of our work, as well as requests for further clarification and analyses. We addressed each of them in a point-by-point fashion below.

      Another strength of the study is the multimodal data, including both structural and functional markers of hippocampal integrity as well as both diurnal and chronic estimates of cortisol levels.

      (1) However, the included analyses are not optimally suited for elucidating multivariate interrelationships between these measures. Instead, effects of training on structure and function, and their links to cortisol, are largely characterized separately from each other. This results in the overall interpretation of results, and conclusions, being dependent on a large number of separate associations. Adopting multivariate approaches would better target the question of whether there is cortisol-related structural and functional plasticity in the hippocampus after mental training aimed at reducing stress.

      We thank the Reviewer for this suggestion. Indeed, our project combined different univariate analyses to uncover the association between hippocampal subfield structure, function, and cortisol markers. While systematic, a downside of this approach is indeed that interpretation of our results depends on a large number of analyses. To further explore the question whether there is cortisol-related structural and functional plasticity in the hippocampus, we followed the Reviewer’s suggestion and additionally adopted a multivariate partial least squares (PLS) model. We ran two complementary models. One focused on the bilateral CA1-3, as this region showed increases in volume following Affect training and differential change between Affect and Perspective training in our resting state analyses, and the other included all subfields. Both models included all stress markers. We found that both models could significantly relate stress markers to brain measures, and that in particular Affect showed strong associations with the significant latent markers. Both analyses showed inverse effects of structure and function in relation to stress markers and both slope and AUC changes showed strongest loadings. We now include these analyses in the revised manuscript.

      Abstract

      “Of note, using a multivariate approach we found that other subfields, showing no group-level changes, also contributed to alterations in cortisol levels, suggesting circuit-level alterations within the hippocampal formation.”

      Methods

      “Partial least squares analysis

      To assess potential relationships between cortisol change and hippocampal subfield volume and functional change, we performed a partial least squares analysis (PLS) (9, 10). PLS is a multivariate associative model that optimizes the covariance between two matrices, by generating latent components (LCs), which are optimal linear combinations of the original matrices (9, 10). In our study, we utilized PLS to analyze the relationships between change in volume and intrinsic function of hippocampal subfields and diurnal cortisol measures. Here we included all Training Modules and regressed out effects of age, sex, and random effects of subject on the brain measures before conducting the PLS analysis. The PLS process involves data normalization within training groups, cross-covariance, and singular value decomposition. Subsequently, subfield and behavioral scores are computed, and permutation testing (1000 iterations) is conducted to evaluate the significance of each latent factor solution (FDR corrected). We then report the correlation of the individual hippocampal and cortisol markers with the latent factors. To estimate confidence intervals for these correlations, we applied a bootstrapping procedure that generated 100 samples with replacement from subjects’ RSFC and behavioral data.”
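
      The core steps named in this Methods paragraph (normalization, cross-covariance, singular value decomposition, score computation, and permutation testing) can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' code: the function names are hypothetical, z-scoring is done across all rows rather than within training groups, confound regression and bootstrapping are omitted, and permuted singular values are compared component by component without any rotation step.

```python
import numpy as np

def zscore(a):
    """Column-wise z-scoring (assumes no constant columns)."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

def pls_svd(X, Y):
    """PLS via SVD of the cross-covariance between brain (X) and cortisol (Y) change.

    X: (n_subjects, n_brain_features), Y: (n_subjects, n_cortisol_markers).
    """
    Xz, Yz = zscore(X), zscore(Y)
    R = Yz.T @ Xz / (len(Xz) - 1)              # cross-covariance, (n_y, n_x)
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    x_scores = Xz @ Vt.T                       # subject scores, brain side
    y_scores = Yz @ U                          # subject scores, cortisol side
    return U, s, Vt.T, x_scores, y_scores

def permutation_pvalues(X, Y, n_perm=1000, seed=0):
    """Permutation p value per latent component, shuffling the rows of Y.

    Simplification: singular values are compared position by position.
    """
    rng = np.random.default_rng(seed)
    s_obs = pls_svd(X, Y)[1]
    exceed = np.zeros_like(s_obs)
    for _ in range(n_perm):
        s_perm = pls_svd(X, Y[rng.permutation(len(Y))])[1]
        exceed += s_perm >= s_obs
    return (exceed + 1) / (n_perm + 1)
```

      In this sketch, the loadings reported for individual markers would correspond to correlations between each original column and the subject scores, and the resulting permutation p values would still need to be FDR corrected across latent components as described in the text.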

      Results

      “Last, to further explore the question whether there is concordant cortisol-related structural and functional plasticity in the hippocampus we adopted a multivariate partial least squares approach, with 1000 permutations to account for stability (9, 10) and bootstrapping (100 times) with replacement. We ran two complementary models including all Training Modules whilst regressing out age, sex and random effects of subject. First, we focused on the bilateral CA1-3, as this region showed increases in volume following Affect training and differential change between Affect and Perspective training in our resting state analyses. The second model included structural and functional data of all subfields. Both models included all stress markers. We found that both models could identify significant associations between cortisol stress markers and hippocampal plasticity (FDRq<0.05), and that in particular Affect showed strongest associations with the latent markers for CA1-3 (Table 5). Both analyses showed inverse effects of subfield structure and function in relation to stress markers and both slope and AUC changes showed strongest associations with the latent factor.”

      Author response table 8.

      Multivariate PLS analyses linking cortisol markers to hippocampal subfield volume and function.

      Discussion

      “Last, performing multivariate analysis, we again observed associations between CA1-3 volume and function plasticity and stress change, strongest in Affect. Yet combining all subfields in a single model indicated that other subfields also link to stress alterations, indicating that ultimately circuit-level alterations within the hippocampal formation relate to latent changes in diurnal stress markers across Training Modules.”

      “This interpretation is also supported by our multivariate observations.”

      “In line with our observations in univariate analysis, we found multivariate associations between hippocampal subfield volume, intrinsic function and cortisol markers. Again, the contribution of volume and intrinsic function was inverse. This may possibly relate to the averaging procedure of the functional networks. Combined, outcomes of our univariate and multivariate analyses point to an association between change in hippocampal subfields and stress markers, and that these changes, at the level of the individual, ultimately reflect complex interactions within and across hippocampal subfields and may capture different aspects of diurnal stress. Future work may more comprehensively study the plasticity of the hippocampal structure, and link this to intrinsic functional change and cortisol to gain full insights in the specificity and system-level interplay across subfields, for example using more detailed hippocampal models (3). Incorporating further multivariate, computational, models is needed to further unpack and investigate the complex and nuanced association between hippocampal structure and function, in particular in relation to subfield plasticity and short and long-term stress markers.”

      “…based on univariate analysis. Our multivariate analysis further nuanced this observation, but again pointed to an overall association between hippocampal subfield changes and cortisol changes, but this time more at a systems level.”

      “Lastly, our multivariate analyses also point to a circuit level understanding of latent diurnal stress scores.”

      Author response image 2.

      Multivariate associations between changes in structure and function of hippocampal subfield volume and markers of stress change in Affect. A) Multivariate associations between bilateral CA1-3 volume and intrinsic function and stress markers. Left: Scatter of loadings, colored by Training Module; Right upper: individual correlations of stress markers; Right lower: individual correlation of subfields. B) Multivariate associations between all subfields’ volume and intrinsic function and stress markers. Left: Scatter of loadings, colored by Training Module; Right upper: individual correlations of stress markers; Right lower: individual correlation of subfields.

      (2) The authors emphasize a link between hippocampal subfield CA1-3 and stress regulation, and indeed, multiple lines of evidence converge to highlight a most consistent role of CA1-3. There are, however, some aspects of the results that limit the robustness of this conclusion. First, formal comparisons between subfields are incomplete, making it difficult to judge whether CA1-3, to a greater degree than other subfields, displays effects of training.

      We thank the Reviewer for this comment. To further test for specificity, we additionally evaluated subfield-specific changes relative to other subfields for our main contrasts (Presence versus Active Control and Affect versus Perspective). Relative to other subfields, right CA1-3 showed increases in the Affect vs Perspective contrast (left: t-value: 2.298, p=0.022, Q>0.1; right: t-value: 3.045, p=0.0025, Q=0.015); no other subfield showed significant changes. We now include this statement in Results and Supplementary Tables.

      “Moreover, associations between CA1-3 and Affect, relative to Perspective, seemed to go largely above and beyond changes in the other subfields (left: t-value: 2.298, p=0.022, Q>0.1; right: t-value: 3.045, p=0.0025, Q=0.015, see further Supplementary File 1h).”

      Author response table 9.

      Subfield-specific changes following the Training Modules, controlling for the other two ipsilateral subfields

      (3) Relatedly, it would be of interest to assess whether changes in CA1-3 make a significant contribution to explaining the link between hippocampal integrity and cortisol, as compared to structure and functional connectivity of the whole hippocampus.

      We thank the Reviewer for this comment. Please see the PLS analysis performed above (R2Q1). Indeed, not only CA1-3 but also other subfields seem to show a relationship with cortisol, in line with circuit level accounts on stress regulation and hippocampal circuit alterations (8, 11-15).

      (4) Second, both structural and functional effects (although functional to a greater degree), were most pronounced in the specific comparison of "Affect" and "Perspective" training conditions, possibly limiting the study's ability to inform general principles of hippocampal stress-regulation.

      We agree with the Reviewer that the association between stress and hippocampal plasticity, on the one hand, and mental training and hippocampal plasticity, on the other hand, makes it not very straightforward to inform general principles on hippocampal stress regulation. However, as underscored in the discussion, in previous work we could also link mental training to stress reductions (16-18). We hope that the additional analyses and explanations further explain the multilevel insights of the current work, on the one hand using group-level analysis to investigate and illustrate the association between mental training and hippocampal subfield volume and intrinsic function, and on the other hand using individual level analysis to unpack the association between cortisol change and hippocampal subfield change.

      Reviewer #2 (Recommendations For The Authors):

      (1) In the Results, the description of how the hippocampal subfields' functional networks were defined would benefit from some clarification. It is also somewhat unclear what is meant by (on page 10): "Evaluating functional connectivity changes, we found that connectivity of the right CA1-3 functional network showed differential changes when comparing Affect training to Perspective training (2.420, p=0.016, FDRq=0.032, Cohens D =0.289), but not versus retest control (Table 1 and Supplementary Table 8-14)." Were there significant changes in CA1-3 FC following both training conditions (but these differed from each other)? A description of what this difference reflected would increase the reader's understanding.

      We are happy to clarify. We included information of change of individual modules in the Supplementary materials, Supplementary Table 1 and 2, 9 and 10. Changes for functional connectivity were largely due to the differences in Modules, but did not show strong effects in one Module alone. We now include information on Affect and Perspective un-contrasted change in the main results text:

      “… which could be attributed to decreases in right CA1-3 mean FC following Perspective (t=-2.012, p=0.045, M:-0.024, std: 0.081, CI [-0.041 -0.006]), but not Affect (t=1.691, p=0.092, M: 0.010, std: 0.098, CI [-0.01 0.031]); changes were not present when comparing Affect training versus retest control (Table 1 and Supplementary File 1k-q).”

      (2) As described in the Public Review, the lack of multivariate assessments may risk selling the data short. Including analyses of concomitant functional and structural changes, in relation to cortisol, seems like an approach better adapted to characterize meaningful interrelationships between these measures.

      We thank the Reviewer for suggesting multivariate assessments. To understand the interrelation between behavioral intervention, hippocampal plasticity, and cortisol changes, the current work first evaluates a simpler operationalization of the relationship between hippocampal subfield structure and volume, and cortisol as a function of mental training. Thus, given the complex nature of the study, we initially opted for a model where we assess structural and functional changes independently, with structural changes as the basis of our investigations. Now we have also included a multivariate approach (PLS) to further test the association between hippocampal subfields and cortisol markers, please see our additions to the manuscript above. We now highlighted multivariate associations in the Discussion as well, and suggest this as an important next step for more detailed, future investigations.

      “Incorporating further multivariate, computational, models is needed to further unpack and investigate the complex and nuanced association between hippocampal structure and function, in particular in relation to subfield plasticity and short and long-term stress markers.”

      (3) A minor comment regards the Figures. Some main effects should be visualized in a clearer manner. For instance, the scatterplots in Figure 1, panel D. Also, some of the current headings within the figures could be made more intuitive to the reader.

      We thank the Reviewer for this comment. To improve clarity, we updated figure headings. For Figure 1D, the challenge is that the data are quite scattered and we aimed to visualize our observations in a naturalistic way. Therefore, we added additional y-axis information to further clarify the figures. Creating more overlap or differentiation would make other elements of the figure less clear, hence we retained the current set-up detailing the intra- and inter-individual alterations of the current model.

      (1) Wisse LEM, Chetelat G, Daugherty AM, de Flores R, la Joie R, Mueller SG, et al. (2021): Hippocampal subfield volumetry from structural isotropic 1 mm(3) MRI scans: A note of caution. Hum Brain Mapp. 42:539-550.

      (2) DeKraker J, Kohler S, Khan AR (2021): Surface-based hippocampal subfield segmentation. Trends Neurosci. 44:856-863.

      (3) DeKraker J, Haast RAM, Yousif MD, Karat B, Lau JC, Kohler S, et al. (2022): Automated hippocampal unfolding for morphometry and subfield segmentation with HippUnfold. Elife. 11.

      (4) Vos de Wael R, Lariviere S, Caldairou B, Hong SJ, Margulies DS, Jefferies E, et al. (2018): Anatomical and microstructural determinants of hippocampal subfield functional connectome embedding. Proc Natl Acad Sci U S A. 115:10154-10159.

      (5) Bernhardt BC, Bernasconi A, Liu M, Hong SJ, Caldairou B, Goubran M, et al. (2016): The spectrum of structural and functional imaging abnormalities in temporal lobe epilepsy. Ann Neurol. 80:142-153.

      (6) Vogel JW, La Joie R, Grothe MJ, Diaz-Papkovich A, Doyle A, Vachon-Presseau E, et al. (2020): A molecular gradient along the longitudinal axis of the human hippocampus informs large-scale behavioral systems. Nat Commun. 11:960.

      (7) Genon S, Bernhardt BC, La Joie R, Amunts K, Eickhoff SB (2021): The many dimensions of human hippocampal organization and (dys)function. Trends Neurosci. 44:977-989.

      (8) McEwen BS (1999): Stress and hippocampal plasticity. Annu Rev Neurosci. 22:105-122.

      (9) Kebets V, Holmes AJ, Orban C, Tang S, Li J, Sun N, et al. (2019): Somatosensory-Motor Dysconnectivity Spans Multiple Transdiagnostic Dimensions of Psychopathology. Biol Psychiatry. 86:779-791.

      (10) McIntosh AR, Lobaugh NJ (2004): Partial least squares analysis of neuroimaging data: applications and advances. Neuroimage. 23 Suppl 1:S250-263.

      (11) Paquola C, Benkarim O, DeKraker J, Lariviere S, Frassle S, Royer J, et al. (2020): Convergence of cortical types and functional motifs in the human mesiotemporal lobe. Elife. 9.

      (12) DeKraker J, Ferko KM, Lau JC, Kohler S, Khan AR (2018): Unfolding the hippocampus: An intrinsic coordinate system for subfield segmentations and quantitative mapping. Neuroimage. 167:408-418.

      (13) McEwen BS, Nasca C, Gray JD (2016): Stress Effects on Neuronal Structure: Hippocampus, Amygdala, and Prefrontal Cortex. Neuropsychopharmacology. 41:3-23.

      (14) Sapolsky RM (2000): Glucocorticoids and hippocampal atrophy in neuropsychiatric disorders. Arch Gen Psychiatry. 57:925-935.

      (15) Jacobson L, Sapolsky R (1991): The role of the hippocampus in feedback regulation of the hypothalamic-pituitary-adrenocortical axis. Endocr Rev. 12:118-134.

      (16) Engert V, Hoehne K, Singer T (2023): Specific reduction in the cortisol awakening response after socio-affective mental training. Mindfulness.

      (17) Puhlmann LMC, Vrticka P, Linz R, Stalder T, Kirschbaum C, Engert V, et al. (2021): Contemplative Mental Training Reduces Hair Glucocorticoid Levels in a Randomized Clinical Trial. Psychosom Med. 83:894-905.

      (18) Engert V, Kok BE, Papassotiriou I, Chrousos GP, Singer T (2017): Specific reduction in cortisol stress reactivity after social but not attention-based mental training. Sci Adv. 3:e1700495.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Responses to Reviewer 1:

      It wouldn't be very surprising to identify the association between PhenoAgeAccel and cancer risk, since PhenoAgeAccel was constructed as a predictor for mortality, to which cancer contributes substantially. Although cancer is an essential mediator for the association, sensitivity analyses using cancer-free mortality may provide an additional angle.

      As suggested, we retrained the PhenoAge in cancer-free participants based on mortality and recalculated PhenoAgeAccel in the UK Biobank. As expected, the re-calculated PhenoAgeAccel was still significantly associated with an increased risk of overall cancer in both men and women. The relevant results have been added to Appendix 1-table 6.

      It would be interesting to see, to what extent, PhenoAgeAccel could be reversed by environmental or lifestyle factors. G by E for PhenoAgeAccel might be worth a try.

      As suggested, we performed interaction analysis between genetic and lifestyle factors on PhenoAgeAccel, and added the methods and results in the revision as follows:

      “55 independent PhenoAgeAccel-associated SNPs (P < 5 × 10⁻⁸) and corresponding effect sizes were derived from a large-scale PhenoAgeAccel GWAS including 107,460 individuals of European ancestry (Kuo, Pilling, Liu, Atkins, & Levine, 2021). A PhenoAgeAccel PRS was created using an additive model as previously described (Dai et al., 2019). In short, the genotype dosage of each risk allele for each individual was summed after multiplying by its respective effect size of PhenoAgeAccel.” (Page 6)
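
      The additive model described in the quoted passage amounts to a weighted sum of risk-allele dosages per individual. A minimal sketch follows; the function name and the toy dosages and effect sizes are hypothetical and only illustrate the arithmetic, not the actual 55-SNP score.

```python
import numpy as np

def polygenic_risk_score(dosages, effect_sizes):
    """Additive PRS: sum over SNPs of (risk-allele dosage x effect size).

    dosages: (n_individuals, n_snps), each entry between 0 and 2.
    effect_sizes: (n_snps,), per-allele effects from the GWAS.
    """
    dosages = np.asarray(dosages, dtype=float)
    effect_sizes = np.asarray(effect_sizes, dtype=float)
    if dosages.ndim != 2 or dosages.shape[1] != effect_sizes.shape[0]:
        raise ValueError("dosages must be (n_individuals, n_snps) matching effect_sizes")
    return dosages @ effect_sizes

# Toy example: two individuals, three SNPs (values are illustrative only).
toy_prs = polygenic_risk_score([[0, 1, 2], [2, 0, 1]], [0.1, 0.2, -0.05])
```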

      “We performed additive interaction analysis between genetic risk (defined by CPRS) and PhenoAgeAccel on overall cancer risk, as well as genetic risk (defined by PhenoAgeAccel PRS) and lifestyle on PhenoAgeAccel using two indexes: the relative excess risk due to interaction (RERI) and the attributable proportion due to interaction (AP).” (Page 9)
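
      For readers unfamiliar with the two indexes, their conventional definitions from relative risks are RERI = RR11 − RR10 − RR01 + 1 and AP = RERI / RR11, where RR11 is the relative risk with both exposures present and RR10 and RR01 are the relative risks with only one exposure present, each relative to the doubly unexposed group. The sketch below only encodes these definitions. How the authors estimated the underlying risks and confidence intervals is not stated here, and the input values are hypothetical.

```python
def additive_interaction(rr11, rr10, rr01):
    """Return (RERI, AP) from relative risks versus the doubly unexposed group.

    rr11: both exposures present; rr10 and rr01: only one exposure present.
    RERI = 0 (and AP = 0) corresponds to no additive interaction.
    """
    reri = rr11 - rr10 - rr01 + 1.0
    ap = reri / rr11
    return reri, ap

# Hypothetical relative risks, for illustration only.
example_reri, example_ap = additive_interaction(rr11=3.0, rr10=1.5, rr01=1.8)
```

      In cohort analyses, hazard ratios from a regression model are commonly substituted for the relative risks in these formulas.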

      “However, we did not observe any interaction between genetic risk and lifestyle on PhenoAgeAccel in either men or women (Appendix 1-table 11).” (Page 13)

      Responses to Reviewer 2:

      Since the UK biobank has a large sample size, it should have enough power to split the dataset into discovery and validation sets. Why did the authors use 10-fold cross-validation instead of splitting the dataset?

      There may have been a misunderstanding in the interpretation of the methods: 10-fold cross-validation was applied to select biomarkers when PhenoAge was originally calculated in the previous manuscript (Levine et al., 2018). In this study, we analyzed the association between PhenoAgeAccel and incident cancer risk by dividing participants into ten groups based on the deciles of PhenoAgeAccel and assessed the associations of each group compared to the lowest decile. To avoid any confusion, we have removed the description of 10-fold cross-validation from the Methods section (Page 5).

      Recommendations for the authors:

      In addition, there is extant literature on the role of Phenotypic Age Acceleration in cancer risk and mortality that should be reviewed. Please also address possible overlap with previous work that used the UK Biobank cohort study (PMCID: PMC9958377).

      As suggested, we have reviewed the association of Phenotypic Age Acceleration with cancer risk, and added it into the Discussion section as follows:

      “Recently, several studies have confirmed the associations between PhenoAgeAccel and cancer risk. Mak et al. explored three measures of biological age, including PhenoAge, and assessed their associations with the incidence of overall cancer and five common cancers (breast, prostate, lung, colorectal, and melanoma) (Mak et al., 2023). In our previous study, we investigated the association between PhenoAgeAccel and lung cancer risk and analyzed the joint and interactive effects of PhenoAgeAccel and genetic factors on the risk of lung cancer (Ma et al., 2023). In comparison to these studies, our analysis expanded the range of cancers to 20 types and further explored the associations in different genetic and lifestyle contexts. Moreover, we also evaluated the potential implications of PhenoAge in population-level cancer screening.” (Page 15).

      Other minor comments:

      Line 216, "-4.35 to -1.25" or "-4.35, -1.25" may be better.

      As suggested, we have adjusted text accordingly.

      Line 260, please clarify the PRS used for G by E interaction testing. It could be site-specific PRS or CPRS.

      We used CPRS for G by E interaction testing, and we have changed the description of our methods as follows:

      “We performed additive interaction analysis between genetic risk (defined by CPRS) and PhenoAgeAccel on overall cancer risk, as well as genetic risk (defined by PhenoAgeAccel PRS) and lifestyle on PhenoAgeAccel using two indexes: the relative excess risk due to interaction (RERI) and the attributable proportion due to interaction (AP).” (Page 9)

      Line 223, The discussion/interpretation for "while negatively associated with risk of prostate cancer" is lacking.

      As suggested, we have discussed this as follows:

      “In addition, we observed a negative association between PhenoAgeAccel and prostate cancer risk. The unexpected association may have been confounded by diabetes and altered glucose metabolism, both of which are closely linked to aging. When we removed HbA1c and serum glucose from the biological age algorithms, the association became non-statistically significant. Similar findings were also reported by Mak et al. (Mak et al., 2023) and Dugue et al. (Dugue et al., 2021).” (Page 15).

      It is not clear how to define "biologically older" and "biologically younger". Whether the individuals fall in the "middle area" will impact the results.

      We defined "biologically older" and "biologically younger" based on Phenotypic Age Acceleration (PhenoAgeAccel), which was defined as the residual obtained from a linear model when regressing Phenotypic Age on chronological age. We categorized individuals with PhenoAgeAccel > 0 as biologically older and those with PhenoAgeAccel < 0 as biologically younger.
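
      The definition above can be illustrated with a short sketch: fit a straight line of Phenotypic Age on chronological age, take the residuals as PhenoAgeAccel, and label individuals by the sign of the residual. The function names and the four toy data points are hypothetical. The sketch uses an ordinary least-squares fit with an intercept, which is one straightforward reading of "a linear model".

```python
import numpy as np

def pheno_age_accel(pheno_age, chrono_age):
    """Residuals from regressing Phenotypic Age on chronological age."""
    pheno_age = np.asarray(pheno_age, dtype=float)
    chrono_age = np.asarray(chrono_age, dtype=float)
    slope, intercept = np.polyfit(chrono_age, pheno_age, deg=1)
    return pheno_age - (intercept + slope * chrono_age)

def classify(accel):
    """Label residuals > 0 as biologically older and < 0 as biologically younger."""
    accel = np.asarray(accel, dtype=float)
    return np.where(accel > 0, "older", np.where(accel < 0, "younger", "equal"))

# Toy data, illustrative only.
toy_accel = pheno_age_accel([42, 48, 63, 67], [40, 50, 60, 70])
toy_labels = classify(toy_accel)
```

      Because the residuals of a least-squares fit with an intercept average to zero, this definition splits a cohort around the age-expected Phenotypic Age rather than around any fixed threshold.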

      Compared with individuals at low accelerated aging (the bottom quintile of PhenoAgeAccel), we found those in the "middle area" (quintiles 2 to 4) and high accelerated aging (the top quintile) had a significantly higher risk of overall cancer (Table 2). Individuals falling in the "middle area" also had a moderate risk of overall cancer when accelerated aging levels were reclassified according to quartiles or tertiles of PhenoAgeAccel (Appendix 1-table 2).

      Do men and women have distinct biological ages, so they were analyzed separately?

      We found that men (median PhenoAgeAccel: 0.34, IQR: -2.42 to 3.53) have higher biological ages than women (median PhenoAgeAccel: -1.38, IQR: -4.26 to 1.96) (P < 0.0001). In addition, men and women have different cancer incidence patterns (Rubin, 2022). Therefore, we conducted separate analyses to investigate the associations of PhenoAgeAccel with cancer risk in men and women.

      Dai, J., Lv, J., Zhu, M., Wang, Y., Qin, N., Ma, H., . . . Shen, H. (2019). Identification of risk loci and a polygenic risk score for lung cancer: a large-scale prospective cohort study in Chinese populations. Lancet Respir Med, 7(10), 881-891. doi: 10.1016/S2213-2600(19)30144-4

      Dugue, P. A., Bassett, J. K., Wong, E. M., Joo, J. E., Li, S., Yu, C., . . . Milne, R. L. (2021). Biological Aging Measures Based on Blood DNA Methylation and Risk of Cancer: A Prospective Study. JNCI Cancer Spectr, 5(1). doi: 10.1093/jncics/pkaa109

      Kuo, C. L., Pilling, L. C., Liu, Z., Atkins, J. L., & Levine, M. E. (2021). Genetic associations for two biological age measures point to distinct aging phenotypes. Aging Cell, 20(6), e13376. doi: 10.1111/acel.13376

      Levine, M. E., Lu, A. T., Quach, A., Chen, B. H., Assimes, T. L., Bandinelli, S., . . . Horvath, S. (2018). An epigenetic biomarker of aging for lifespan and healthspan. Aging (Albany NY), 10(4), 573-591. doi: 10.18632/aging.101414

      Ma, Z., Zhu, C., Wang, H., Ji, M., Huang, Y., Wei, X., . . . Shen, H. (2023). Association between biological aging and lung cancer risk: Cohort study and Mendelian randomization analysis. iScience, 26(3), 106018. doi: 10.1016/j.isci.2023.106018

      Mak, J. K. L., McMurran, C. E., Kuja-Halkola, R., Hall, P., Czene, K., Jylhava, J., & Hagg, S. (2023). Clinical biomarker-based biological aging and risk of cancer in the UK Biobank. Br J Cancer, 129(1), 94-103. doi: 10.1038/s41416-023-02288-w

      Rubin, J. B. (2022). The spectrum of sex differences in cancer. Trends Cancer, 8(4), 303-315. doi: 10.1016/j.trecan.2022.01.013

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      We wish to thank the Reviewers for their critical analysis of the article and for their suggestions and comments.

      In addition to the point-by-point answers to the Reviewers, we wish to emphasize three essential points that have been raised. First, we never intended (nor claimed) to address the impact of the two EHT cell emergence processes on downstream fate, after release from the aortic floor (see for example the last paragraph of our initially submitted manuscript). We only wished to bring evidence of the cell biological heterogeneity of the HE, which relies in particular on cell polarity control and on polarity reestablishment/reinforcement in the case of EHT pol+ cells, thus leading to the morphodynamic complexity of emergence. In the general context of cell extrusion, in which all polarity features are generally downregulated, these are remarkable features.

      Second, we inform the Reviewers that we have performed a major revision of the work on the Pard3 proteins, the outcome of which, we hope, significantly substantiates the idea of a tuning of cell polarity features in the HE and all along the EHT time-window, supporting the EHT pol- and EHT pol+ types of emergence. To achieve this, we entirely revised the experimental strategy to increase the specificity and sensitivity of detection of the Pard3 protein isoforms expressed in the vascular system, based on endothelial FACS-sorting, qRT-PCR and single-molecule whole mount in situ hybridization using RNAscope. Importantly, we wish to stress that, by addressing Pard3 proteins, we initially aimed at substantiating our observations on the localization of our podxl2 construct (del-podxl2) used to label apical membranes. Hence, we sought to bring correlative evidence on the variation of expression of polarity proteins at early and later time points of the EHT time-window (suggesting tightly regulated expression control of polarity determinants, possibly at the mRNA level). This was clearly written and justified in the text, lines 227 or 303 of the initial manuscript. This could also have led to the identification of (a) specific isoform(s), including splice variants, as initially addressed.

      As the Reviewers will see, while performing the revision of our work, we have now been able to point at a specific isoform of Pard3, namely Pard3ba, whose mRNA expression level, in aortic cells and at single-cell resolution, is uniquely and specifically enhanced in cells contacting emergence ‘hot spots’. Using our Runx1 mutant fish line (dt-Runx1), we also show that expression of Pard3ba mRNAs, in these specific aortic regions, is sensitive to interference with Runx1 activity (i.e., dt-Runx1 increases Pard3ba expression). Altogether, our new results strongly support our initially proposed idea on the regulation of polarity features during EHT; they indicate intercellular coordination, through cooperative cross-talk between aortic and HE/EHT cells. This is compatible with the idea of a ‘tuning’ of apico-basal polarity during the entire EHT time-window (including maturation of the HE to become competent for emergence, and the emergence process per se, whose morphodynamic complexity relies on regulating apico-basal polarity associated functions (e.g., for controlling the specific junctional recycling modes of EHT pol+ and EHT pol- cells, as we suggest using JAM proteins that we have chosen owing to their function in the recruitment of Pard3 proteins for apico-basal polarity establishment)). This nicely complements our work and highlights the relevance of studying the interplay between aortic and HE/EHT cells (which we have started to dissect in the second part of our manuscript). Further work is obviously required to address local, dynamic variations of mRNAs encoding this specific isoform of Pard3, as well as specific interference with its functions at the spatial and temporal levels (hence on live tissues), which is far beyond the scope of our currently submitted work.

      Finally, this emphasizes the importance of the aortic context, at the mesoscopic level, in the regulation of the EHT.

      Third, based on these major points and the Reviewers’ suggestions, we take into account the fact that the heterogeneity in emergence morphodynamics was not highlighted, and propose the following title:

      ‘Tuning apicobasal polarity and junctional recycling in the hemogenic endothelium orchestrates the morphodynamic complexity of emerging pre-hematopoietic stem cells’

      Regarding Results and Figures, the previous Figures 3 and 4 have been entirely revised, with the support of Supplement Figures (3 and 4 supplement figures, respectively, as well as a supplement video to Figure 3). Supplement Figures have also been included in the revised version for nearly all results that appeared as data not shown (Figure 1 – figure supplement 2: illustrating the maintenance of EHT pol+ and EHT pol- cells after division; Figure 1 – figure supplement 3: illustrating the expression of the hematopoietic marker CD41 by EHT pol+ and EHT pol- cells). Also, a new supplemental figure, Figure 7 – figure supplement 7, has been added to substantiate the impact of interfering with ArhGEF11/PDZ-RhoGEF alternative splicing on hematopoiesis. Finally, a Figure for the Reviewers is added at the end of this file; it shows that virtually 100% of the aortic floor cells that we consider as hemogenic cells are positive for the hematopoietic marker Gata2b, which is upstream of Runx1 (using RNAscope, which allows achieving cellular resolution unambiguously).

      Reviewer #1 (Public Review):

      Summary:

      In this research article, the authors utilized the zebrafish embryo to explore the idea that two different cell types emerge with different morphodynamics from the floor of the dorsal aorta based on their apicobasal polarity establishment. The hypothesis that the apical-luminal polarity of the membrane could be maintained after EHT and confer different functionality to the cell is exciting, however, this could not be established. There is a general lack of data supporting several of the main statements and conclusions. In addition, the manuscript is difficult to follow and needs refinement. We present below some questions and suggestions with the goal of guiding the authors to improve the manuscript and solidify their findings.

      Here, we wish to emphasize that we do not make the hypothesis that ‘…the apical-luminal polarity of the membrane could be maintained after EHT …’ but that the apico-basal polarity establishment/maintenance controls the type of emergence and its associated cell biological features (EHT pol+ and EHT pol- cellular morphodynamics, establishment of membrane domains). Hence, our work suggests that these emergence modes, as a consequence of their intrinsic characteristics and differences, might have an impact on cellular behavior after the release (to place the work in the broader context of hematopoietic cell fate and differentiation). More specifically, the difference in the biological features of the luminal versus abluminal membrane for the two EHT types (e.g., membrane signaling territories, membrane pools devoted to specific functions) might endow the cells with specific functional properties after the release. What happens to those cells thereafter, except for illustrating the evolution of the luminal membrane for pol+ EHT cells, is beyond the scope of this paper. Here, we analyze and characterize some of the cell biological features of the EHT process per se (the emergence from the aortic floor), including the dynamic interface with adjoining endothelial cells.

      Strengths:

      New transgenic zebrafish lines developed. Challenging imaging.

      Weaknesses:

      (1) The authors conclude that the truncated version of Podxl2 fused to a fluorophore is enriched within the apical site of the cell. However, based on the images provided, an alternative interpretation is that the portion of the membrane within the apical side is less stretched than in the luminal side, and therefore the fluorophore is more concentrated and easier to identify by confocal. This alternative interpretation is also supported by data presented later in the paper where the authors demonstrate that the early HE is not polarized (membranes are not under tension and stretched yet). Could the authors confirm their interpretation with a different technique/marker like TEM?

      The argument of the apparent enrichment, or exclusion, of a marker depending on membrane stretching (and hence molecular packing) would be valid for any type of molecule embedded in these membranes, including of course endogenous ones (this is one of the general biophysical principles leading to the establishment of membrane domains, structurally and functionally speaking). Hence, using another marker would not solve the issue because the result would depend on its behavior with regard to packing (in particular lipid packing), which is difficult to anticipate and is a topic in its own right (especially in this system, whose biophysical and biochemical properties in vivo, including its exposure to hemodynamics, have been poorly investigated).

      The logic of the Reviewer's argument does not appear to be consistent with our results on the maturing HE. Indeed, in our dt-Runx1 mutants, mKate2-podxl2 is enriched at the luminal membrane of HE cells (HE cells are elongated, and the two membrane domains have relatively equal surface and bending); in comparison, HE cells have the same morphology in control animals as in mutants but, in controls, eGFP-podxl2 and mKate2-podxl2 are equally partitioned between the luminal and abluminal membranes (see Figure 3 – figure supplement 2 (for mKate2-podxl2) and Figure 2 – figure supplement 1 and 2 (for eGFP-podxl2)). In addition, while designing the eGFP and mKate2 fusions, we took care to keep the natural podxl2 sequence containing critical cysteine residues, to maintain assembly properties and distance from the transmembrane segment (hence the fluorescent protein per se is not directly exposed to membrane stretching).

      Finally, electron microscopy is not the appropriate approach for this issue because it requires tissue fixation, which always carries the risk of significantly modifying membrane properties. Along this line, when we fix embryos (and hence membranes, see our new Figure 4 and its Supplemental Figures), we do not appear to maintain obvious EHT pol+ and pol- cell shapes. In addition, to be conclusive, the work would require not TEM but immuno-EM to be able to visualize the marker(s), which is another challenge with this system.

      (2) Could the authors confirm that the engulfed membranes are vacuoles as they claimed, using, for example, TEM? Why is it concluded that "these vacuoles appear to emanate from the abluminal membrane (facing the sub-aortic space) and not from the lumen?" This is not clear from the data presented.

      The same argument regarding electron microscopy made in the previous point is valid here (in addition, if it were technically feasible, it would require serial sectioning to make sure not to miss the very tiny connection that may only suggest ultimate narrowing down of the facing adjacent bilayers, which is quite challenging). The term vacuole, which we use with caution (in fact, more often, we use the term pseudo-vacuoles in the initial manuscript, lines 140, 146, 1467 (legend to Figure 1 – figure supplement 1), or apparent vacuole-like in the same legend, lines 1465 and 1476), is legitimate here because we cannot say that they are portions of the invaginated luminal membrane, as we could be accused of not showing that these membranes are still connected to the luminal surface. We are here at the limit of the resolution that in vivo imaging currently allows with this system, and we draw the Reviewer's attention to the fact that we are reaching a sub-cellular level, which is already a challenge in itself.

      In addition, if vacuoles (or pseudo-vacuoles), i.e. membrane-bounded organelles, were not formed at some point in this system, it would be difficult to conceive how, after release of the cell, the fluid inherited from the aortic lumen would efficiently be chased from these membranes/organelles (see also our model Figure 1 – figure Supplement 1B).

      Why is it concluded that "these vacuoles appear to emanate from the abluminal membrane (facing the sub-aortic space) and not from the lumen?" This is not clear from the data presented.

      This is not referring to our data but to the Sato et al 2023 work. For EHT undergoing cells leading to aortic clusters in mammals and avians, vacuolar structures indeed appear to emanate from the ab-luminal side facing the sub-aortic space (we cannot call it basal because we do not know the polarity status of these cells). In the Revised version of the manuscript, we have moved this paragraph referring to the Sato et al work to the Discussion, which gives the possibility to expand a bit on this issue, for more clarity (see the second paragraph of our new Discussion).

      (3) It is unclear why the authors conclude that "their dynamics appears to depend on the activity of aquaporins and it is very possible that aquaporins are active in zebrafish too, although rather in EHT cells late in their emergence and/or in post-EHT cells, for water chase and vacuolar regression as proposed in our model (Figure 1 - figure supplement 1B)." In our opinion, these figures do not confirm this statement.

      This part of the text has been upgraded and moved to the Discussion (see our answer to point 2), to address the Reviewers’ concern about the clarity of the Results section and to allow us to elaborate a bit more on this issue. We only wished to draw attention to the presence of intracellular vacuolar structures recently described in the Sato et al 2023 paper, which shows EHT cell vacuoles that are proposed to contribute to cellular deformation during the emergence. We take this example to rationalize the regression of the vacuolar structures described in Figure 1 - figure supplement 1B, which is why we have written ‘… it is very possible that aquaporins are active in zebrafish too’; the first part of the sentence refers to the Sato et al 2023 paper.

      (4) Could the authors prove and show data for their conclusions "We observed that both EHT pol+ and EHT pol- cells divide during the emergence"; "both EHT pol+ and EHT pol- cells express reporters driven by the hematopoietic marker CD41 (data not shown), which indicates that they are both endowed with hematopoietic potential"; and "the full recovery of their respective morphodynamic characteristics (not shown)?".

      To the new version of our manuscript, we have added new Supplemental information to Figure 1 (two new Supplemental Figures):

      • Figure 1 - figure Supplement 2 that illustrates that both EHT pol+ and EHT pol- cells divide during the emergence as well as the maintenance of morphology for both EHT cell types. We wish also to add here that the maintenance of the EHT pol+ morphology is the most critical point, showing that dividing cells in this system do not necessarily lead to EHT pol- cells.

      • Figure 1 - figure Supplement 3 that shows that both EHT cell types express CD41.

      (5) The authors do not demonstrate the conclusion traced from Fig. 2B. Is there a fusion of the vacuoles to the apical side in the EHT pol+ cells? Do the cells inheriting less vacuoles result in pol- EHT? It looks like the legend for Fig. 2-fig supp is missing.

      As said previously, showing fusion here is not technically possible, but indeed, this is the idea, which fits with the images corresponding to timing points 0-90 minutes (Figure 2A), showing (in particular for the right cell) a large pseudo-vacuole whose membrane is heavily enriched with the polarity marker podxl2 (based on fluorescence signal in a membrane-bounded organelle that, based on its curvature radius, should be more under tension than the more convoluted EHT pol+ cell luminal membrane). Also, EHT pol- cells may be born from HE cells that inherit fewer intracellular vesicles after division, or may derive from HE cells that are less (or not) exposed to polarity-dependent signaling (see our data presented in the new Figure 4 and the new version of the Discussion, paragraphs ‘Characteristics of the HE and complexity of pre-hematopoietic stem cell emergence’ and ‘Spatially restricted control of Pard3ba mRNAs by Runx1’).

      Finally, the cartoon Figure 2B is a hypothetical model, consistent with our data, that is meant to help the reader understand the idea extrapolated from images that may not be so easy to interpret for people not working on this system. In the legend of Figure 2 that describes this issue in the first version of our manuscript (lines 1241-1243), we were cautious and wrote, in parentheses: ‘note that exocytosis of the large vacuolar structure may have contributed to increase the surface of the apical/luminal membrane (the green asterisk labels the lumen of the EHT pol + cell)’.

      The legend to Figure 2 – figure supplement 1 is not missing (see lines 1492 – 1499 of the first manuscript). The images of this supplement are not extracted from a time-lapse sequence and show that as early as 30hpf (shortly after the beginning of the EHT time-window – around 28hpf), cells on the aortic floor already exhibit podxl2-containing pseudo-vacuolar structures (which we propose is a prerequisite for HE cell maturation into EHT competent cells; see also Figure 2 – figure supplement 2).

      (6) The title of the paper "Tuning apico-basal polarity and junctional recycling in the hemogenic endothelium orchestrates pre-hematopoietic stem cell emergence complexity" could be interpreted as functional heterogeneity within the HSCs, which is not demonstrated in this work. A more conservative title denoting that there are two types of EHT from the DA could avoid misinterpretations and be more appropriate.

      There was no ambiguity, throughout our initial manuscript, on what we meant when using the word ‘emergence’; it refers only to the extrusion process from the aortic floor.

      Reducing our title only to the 2 types of EHT cells would be very reductionist with regard to our work, which also addresses essential aspects of the interplay between hemogenic cells, cells undergoing extrusion (EHT pol+ and pol- cells), and their endothelial neighbors (not to mention what we show in terms of the cell biology of the maturing HE and the regulation of its interface with endothelial cells (evidence for vesicular trafficking, specific regulation of HE-endothelial cell intercalation required for EHT progression, etc.)). However, to take this specific comment into account, we propose a slightly changed title saying that there are emergences differentially characterized by their morphodynamic characteristics:

      ‘Tuning apicobasal polarity and junctional recycling in the hemogenic endothelium orchestrates the morphodynamic complexity of emerging pre-hematopoietic stem cells’

      (7) There are several conclusions not supported by data: "Finally, we have estimated that the ratio between EHT pol+ and EHT pol- cells is of approximately 2/1". "We observed that both EHT pol+ and EHT pol- cells divide during the emergence and remain with their respective morphological characteristics". "We also observed that both EHT pol+ and EHT pol- cells express reporters driven by the hematopoietic marker CD41 (data not shown), which indicates that they are both endowed with hematopoietic potential." These conclusions are key in the paper, and therefore they should be supported by data.

      Most of the Reviewer’s requests in this point were already raised in point 4, and the corresponding data have been added to the revised version.

      Regarding the EHT pol+/pol- ratio, we will keep the ratio at approximately 2/1. The Reviewer should be aware that quantification of EHT cells is a tricky issue and a source of substantial variability, as can be assessed from the quantifications that we have been performing (see for example figures in which we compare the dt-Runx1 phenotype with Ctrl). This is inherent to this system, more specifically because the EHT process is asynchronous, ranging from approx. 28 hpf to 3 days post fertilization (we have even observed EHT at 5 dpf). We systematically observed heterogeneity in EHT numbers and EHT types between animals and also between experiments (some days we observe EHTs at 48 hpf, others more around 55 hpf or even later). In addition, emergence also proceeds on the lateral side of the aorta and, while it is relatively easy to identify EHT pol+ cells because of their highly characteristic morphology, it is more difficult for EHT pol- cells, which can be mistaken for round HE cells preparing for division. In the current revision of our work, we provide additional facts and potential explanations on the mechanisms that control this asynchrony and the apparent stochasticity of the EHT process (see results of new Figures 3 and 4).

      Reviewer #2 (Public Review):

      In this study, Torcq and colleagues make careful observations of the cellular morphology of haemogenic endothelium undergoing endothelial to haematopoietic transition (EHT) to become stem cells, using the zebrafish model. To achieve this, they used an extensive array of transgenic lines driving fluorescent markers, markers of apico-basal polarity (podocalixin-FP fusions), or tight junction markers (jamb-FP fusions). The use of the runx truncation to block native Runx1 only in endothelial cells is an elegant tool to achieve something akin to tissue-specific deletion of Runx1. Overall, the imaging data is of excellent quality. They demonstrate that differences in apico-basal polarity are strongly associated with different cellular morphologies of cells undergoing EHT from HE (EHT pol- and EHT pol+) which raises the exciting possibility that these morphological differences reflect the heterogeneity of HE (and therefore HSCs) at a very early stage. They then overexpress a truncated form of Runx1 (just the runt domain) to block Runx1 function and show that more HE cells abort EHT and remain associated with the embryonic dorsal aorta. They identify pard3aa and pard3ab as potential regulators of cell polarity. However, despite showing that loss of runx1 function leads to (late) decreases in the expression of these genes, no evidence for their role in EHT is presented. The FRAP experiments and the 2d-cartography, albeit very elegant, are difficult to interpret and not very clearly described throughout the text, making interpretation difficult for someone less familiar with the techniques. Finally, while it is clear that ArhGEF11 is playing an important role in defining cell shapes and junctions between cells during EHT, there is very little statistical evidence to support the limited data presented in the (very beautiful) images.

      As mentioned in the response to reviewer 1, we revised our whole strategy for the analysis of the role of Pard3 proteins in regulating the emergence of hematopoietic precursors. Our new data, obtained using refined gene expression analysis by qRT-PCR on FACS sorted populations and by in situ gene expression analysis at the single-cell resolution using RNAscope, show first that a unique Pard3 isoform (Pard3ba) is sensitive to runx1 activity, and that its expression is specifically localized in aortic cells contacting hemogenic(HE)/EHT cells. We show a clear correlation between the densification of Pard3ba mRNAs and the presence of contacting HE/EHT cells, suggesting a key role for Pard3ba in a cross talk between aortic and hemogenic cells. Furthermore, we show that our dt-runx1 mutant impacts on the maturation of HE cells; when this mutant is expressed, we observe, in comparison to control, an accumulation of HE cells that are abnormally polarized as well as unusually high numbers of EHT pol+ cells. This strongly suggests that the polarity status of HE cells controls the mode of emergence. Overall, our work shows that regulation of apico-basal polarity features is essential for the maturation of the HE and the proper proceeding of the EHT.

      We made efforts to explain the FRAP experiments and the analysis of the 2D-cartographies more clearly throughout the text to facilitate readers’ comprehension. 2D-cartographies are an invaluable tool to precisely discriminate between endothelial and hemogenic cells, and their usage was essential during the FRAP sessions to point at specific junctional complexes accurately. Performing FRAP at cellular junctions during aortic development was extremely challenging technically, and the outcome was subject to quite significant variability (which often leads to quantitative results at the limit of statistical significance, which is why we speak of tendencies in our Results section reporting on this type of experiment). Apart from constant movement and drifting of the embryos, which are sources of variability, the EHT process per se evolves over time and does so at a heterogeneous pace (for example, the apical closure of EHT pol+ cells is characterized by a succession of contraction and stabilization phases, see Lancino et al. 2018), which is an additional source of variability in the measurements. Despite all this, our data collectively and consistently suggest a differential regime of junctional dynamics between EHT cell types and support the critical function of ArhGEF11/PDZ-RhoGEF in the control of junctional turnover at the interface between HE and aortic cells as well as between HE cells to regulate cell-cell intercalation.

      There is a sense that this work is both overwhelming in terms of the sheer amount of imaging data, and the work behind it to generate all the lines they required, and at the same time that there is very little evidence supporting the assertion that pard3 (and even ArhGEF11) are important mediators of cell morphology and cell fate in the context of EHT. For instance, the pard3 expression data, and levels after blocking runx1 (part of Figure 3 and Figure 4) don't particularly add to the manuscript beyond indicating that the pard3 genes are regulated by Runx1.

      We thank the reviewer for the comment on the Pard3 data particularly because it led us to reconsider our strategy to address with more precision and at the cellular resolution the potential function of this protein family during the time-window of the EHT. As summarized in the header of the Public Review, we identified one specific isoform of Pard3 in the zebrafish - Pard3ba – whose sensitivity to runx1 interference and spatial restriction in expression reinforce the idea of a fine control of apico-basal polarity features and associated functions while EHT is proceeding. Our new data also reinforce the interplay between HE/EHT cells and their direct endothelial neighbors.

      Weaknesses

      The writing style is quite convoluted and could be simplified for clarity. For example, there is plenty of discussion and speculation throughout the presentation of the results. A clearer separation of the results from this speculation/discussion would help with understanding. Figures are frequently presented out of order in the text; modifying the figures to accommodate the flow of the text (or the other way around) - would make it much easier to follow the narrative. While the evidence for the different cellular morphologies of cells undergoing EHT is strong, the main claim (or at least the title of the manuscript) that tuning apico-basal polarity and junctional recycling orchestrate stem cell emergence complexity is not well supported by the data.

      We refined our text when necessary, in particular taking care of transferring and substantiating the arguments that appeared in the Results section, to the Discussion. We also made efforts, on several occasions and for clarity, to describe more precisely the results presented in the different panels of the Figures.

      As mentioned in the header of the text of the Public Review and the response to the 6th point of the Public Review of Reviewer 1, we modified slightly the title to avoid ambiguity. In addition, we added a new paragraph to the beginning of our discussion that summarizes the impact of our findings and, we believe, legitimates our title.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Embryonic stages should be indicated in all images presented for clarification.

      We thank the reviewer for this point; we added stages where they were missing on the figures (Figure 1, Figure 1 - Figure supplement 1, Figure 2, Figure 2 - Figure supplement 1, Figure 5, Figure 6, Figure 6 - Figure supplement 1, Figure 7 - Figure supplement 3, Figure 7 - Figure supplement 5, Figure 7 - Figure supplement 6).

      (2) In which anatomical site/s were images from Fig 1C and D taken? The surrounding environment looks different, for example, cells in Fig1D seem to be surrounded by other cells, resembling the endothelial plexus at the CHT, while the cells in Fig. 1C seem to be in the dorsal aorta. Is there a spatial difference depending on where cells are budding off? The authors state that there are no differences, but no quantification or data demonstrating that statement is provided.

      As mentioned in the figure legend (lines 1206-1209 of the original manuscript), images for Figure 1C and 1D were both taken at the boundary between the end of the AGM and the entry in the caudal hematopoietic tissue. As the images were acquired from different embryos, the labelling of the underlying vein differs between the two panels, with venous tissues being more sparsely labelled in panel C than in panel D. These images were chosen to illustrate the clearly opposite morphology between the two EHT types that we describe. However, for the rest of the paper, all images and all analyses were exclusively acquired / performed in the dorsal aorta in the AGM, in a region spanning approximately 10-12 intersegmental vessels, starting from the end of the elongated yolk up to the start of the balled yolk. In light of the work from the lab of Zilong Wen showing that only cells emerging anteriorly exhibit long-term replenishment potential (Tian et al. 2017), we specifically chose to limit our comparative analysis to the AGM region and did not quantitatively investigate emergences occurring in the caudal region of the aorta. Additionally, although we routinely observe both types of emergences occurring in the caudal region of the dorsal aorta, we did not quantify the frequency of either type of EHT event in this region.

      Finally, the EHT pol+ cells that we show in Figure 1C are of the highest quality ever obtained; one reason is that these two cells emerge at the entry of the CHT, a region that is much easier to image at high resolution than the trunk because the sample is thinner and because imaging is less perturbed by heartbeats.

      (3) Which figure shows "EHT pol- cells were observed in all other Tg fish lines that we are routinely imaging, including the Tg(Kdrl:Gal4;UAS:RFP) parental line that was used for transgenesis, thus excluding the possibility that these cells result from an artefact due to the expression of a deleted form of Podxl2 and/or to its overexpression."? It would be informative to include this figure.

      Other examples of EHT pol- cells were shown in Figure 5C as well as Figure 6B using the Tg(kdrl:Jam3b-eGFP; kdrl:nls-mKate2) fish line, which was routinely used for junctional dynamic analyses by FRAP. Furthermore, we now add a new figure (New Figure 1 – figure supplement 3) to illustrate the presence of EHT pol- cells using the Tg(CD41:eGFP) transgenic background, additionally illustrating that EHT pol- cells are CD41 positive.

      (4) Are the spinning disk confocal images a single plane? Or maximum projections? Sometimes this is not specified.

      We took this remark into account and went through all figure legends to specify the type of images presented (Figure 1 – figure supplement 1, Figure 2, Figure 2 – figure supplement 1, Figure 2 – figure supplement 2, Figure 7 – figure supplement 3); when relevant, we also added this information directly to the figure panels (Figure 6A – 6B).

      (5) Could the expression data by RT-qPCR for the Pard3 isoforms be shown? Additionally, it would be appreciated if this expression data could be complemented using Daniocell (https://daniocell.nichd.nih.gov/).

      As mentioned in the first paragraph of our response to Public Reviews, and based on reviewers’ comments, we revised our strategy for investigating the expression of pard3 proteins in the vascular system, their potential role in EHT and their sensitivity to runx1. First, we used FACS sorting as well as tissue dissection to enrich for aortic endothelial cells and perform our qPCR analyses (see the new Figure 4 – figure supplement 1A and Figure 4 – figure supplement 3A for the strategy). As asked by the reviewers and for more transparency, we show the expression relative to the housekeeping gene ef1a in our different control samples (new Figure 4 – figure supplement 1C). Furthermore, we used single-molecule FISH to precisely characterise in situ the expression of several of the Pard3 isoforms (Pard3aa, Pard3ab and Pard3ba, which, based on qPCR, were the most relevant for our investigation in the vascular system) (see lines 386 to 412 in text relative to Figure 4 – figure supplement 2). This new addition nicely shows the different expression patterns of three of the zebrafish Pard3 isoforms in the trunk of 2dpf embryos, outlining interesting specificities of the expression of each isoform in different tissues.

      We thank the reviewer for this suggestion to complement our data with the published Daniocell dataset. However, and potentially due to the poor annotation of the different pard3 genes in public databases, gene expression information was absent for two of our isoforms of interest (pard3aa and pard3ba), which we ultimately show to be the most enriched in the vascular system in the trunk. Daniocell gene expression data for the Pard3ab isoform at 48hpf show expression in the pronephric duct at 48-58hpf, as well as in intestine progenitors and neuronal progenitors, which is consistent with our in situ observations using RNAscope. However, pard3ab is poorly detected within the hematopoietic and vascular clusters. This observation is coherent with our data, which do not show any enrichment of this isoform in vascular tissues compared to other structures. On the other hand, pard3bb does not seem to be particularly enriched in vascular/hematopoietic clusters at 48-58hpf in the Daniocell dataset, in accordance with what we observe with our qPCR. Finally, in the Daniocell dataset, all of the pard3 variants (pard3ab, pard3bb, PARD3 and PARD3 (1 of many)) seem to be either scarcely or not detected in the hematopoietic/vascular system. In our case, for all the isoforms we studied in control condition (pard3aa, pard3ab and pard3ba), and although the technique is only semi-quantitative due to the presence of an amplification step, RNAscope assays seem to indicate a very low expression in aortic cells (with sometimes as little as one mRNA copy per cell); this explains the low detection in single-cell RNAseq datasets and is coherent with the Daniocell dataset.

      (6) It would be informative to add in the introduction some information on apico-basal polarity, tight junctions, JAMs (ArhGEF11/PDZ-RhoGEF).

      We modified the introduction to add relevant information on Pard3 proteins, their link with our JAMs reporters in the context of polarity establishment, as well as the role of ArhGEF11/PDZ-RhoGEF and its alternative splicing variants in regulating junctional integrity in the context of epithelial-to-mesenchymal transition (lines 99 to 127). This modification of the introduction also allowed us to lighten some parts of the Results section (lines 222 to 224, 345 to 349 and 454 to 456 of the original manuscript).

      Reviewer #2 (Recommendations For The Authors):

      (1) There is lots of data (and lots of work) in this paper; I feel that the pard3 data doesn't substantially add to the paper, and at the same time there is data missing (see point 10, point 11 below for an example).

      To improve clarity and substantiate our findings on Pard3, we entirely revised our investigation strategy, as mentioned in previous paragraphs. We refined the characterization of the expression of Pard3 isoforms in the vascular tissue, using both cell enrichment by FACS for gene expression analysis and single-molecule FISH (RNAscope) to access spatial information on the expression of pard3 isoforms, reaching sub-cellular resolution.

      This new strategy allowed us to show the unexpected localization of Pard3ba mRNAs in mRNA-enriched regions in the vicinity of HE/EHT cells (new Figure 4, and paragraph "Interfering with Runx1 activity unravels its function in the control of Pard3ba expression and highlights heterogeneous spatial distribution of Pard3ba mRNAs along the aortic axis", see the new manuscript). Overall, the new spatial analysis we performed allowed us to substantiate our findings on Pard3ba and suggests a direct interplay between hemogenic cells and their endothelial aortic neighbors; this interplay supposedly relies on apico-basal polarity features that are at least in part regulated by runx1 in the context of HE maturation and EHT.

      (2) Labelling of the figures could be substantially improved. In many instances, the text refers to a figure (e.g. Fig 6A), but it has several panels that are not well annotated (in the case of Fig 6A, four panels) or are labelled so sparsely that it is difficult to follow the text and identify the correct panel in the figure. Even supplementary figures are sparsely labelled. Labelling to include embryonic stages, which transgenic is being used, etc should be added to the panels to improve clarity for the reader.

      We revised the figures to add relevant information, including stages, types of images and annotations to facilitate comprehension, including Figure 6A – 6B and Figure 5B – 5C (see response to Reviewer 1, first comment, for a more complete list of all revised figures, transgenic fish lines and embryonic stage annotations). Furthermore, we revised the entirety of the manuscript so that it fits the figures as closely as possible, and added some annotations to link the text more easily to the figures and panels.

      (3) The current numbering of supplementary figures is quite confusing to follow.

      We revised the manuscript to make sure that all principal and supplementary figures are cited in the right order and that the order of appearance of the supplementary figures is coherent with the unfolding of the text. For Figure 7 only, the majority of the supplemental figures are cited before the principal figure, as they relate to our experimental strategy, which we comment on before describing the results.

      (4) Graphs in Fig 4, Fig 7 supplement 1 and some of the supplementary figures miss statistical info for some comparison (I assume when non-significant), and sometimes present a p-value of a statistical test being done between samples across stages - but these are not dealt with in the text. Throughout all graphs, the font size used in graphs for annotation (labelling of samples, x-axis, and in some cases the p values) is very small and difficult to read.

      For Figure 7 - figure supplement 1, non-significant p-values of statistical tests were not displayed (as mentioned in the Figure legend, line 1614 of the original manuscript). For the new Figure 4, all p-values are displayed. For new Figure 4 - figure Supplement 1, statistical tests were only performed to compare RFP+ and RFP- cells in the trunk condition (3 biological replicates) and not in the whole embryo condition, for which we did not perform enough replicates for statistical analysis (biological duplicates).

      (5) The results are generally very difficult to follow, with a fair amount of discussion included but then very little detail of the experiments per se.

      We thank the reviewers for these comments that helped us improve the clarity of the manuscript.

      The Results section was revised to move some of the paragraphs to the introduction (see response to Reviewer 1, 6th comment), and some of them to the Discussion (such as lines 149 to 156 or 410 to 416 in the first version of the manuscript referring to vacuolar structures or to the recycling modes of JAMs in EHT pol+ and EHT pol- cells).

      (6) The truncated version of runx1 is introduced but its expected effect is not explained until the discussion. Related to this, is it expected that blocking runx1 with this construct (leading to accumulation of cells in the aorta before they undergo EHT) then leads to increased numbers of T-cell progenitors in the thymus? Abe et al (2005, J Immunol) have used the same strategy to overexpress the runt domain in thymocytes and found a decrease in these cells, rather than an increase. Can you explain this apparent discrepancy?

      We thank the reviewer for this interesting point on the effect of runx1 interference. This phenotype (increased number of thymic cells) seems to be in agreement with the phenotype that was described in zebrafish using homozygous runx1 mutants (Sood et al. 2010 PMID: 20154212), in which the authors show an increase of lymphoid progenitors in the kidney marrow of adult runx1W84X/W84X mutants compared to controls as well as a similar number of intra-thymic lck:eGFP cells in mutants and controls. Notably, the T-lymphoid lineage seems to be the only lineage spared by the mutation of runx1. This could suggest that in this case either the T-lymphoid lineage can develop independently of runx1 or that a compensation phenomenon (for example by another protein of the runx family) occurs to rescue the generation of T-lymphocytes.

      Although our data show an impact on T-lymphopoiesis, we do not elucidate the exact mechanism leading to an increased number of thymic cells. In our case, we do not know the half-life of our dt-runx1 protein in newly generated hematopoietic cells when our transgene, expressed under the control of the kdrl vascular promoter, ceases to be produced after emergence. The effect we observe could be direct, due to the presence of our mutant protein after 3 days in thymic cells, or indirect, due to the impact of our mutant on the HE, which could lead to the preferential generation of lymphoid-biased progenitors. Similarly, we do not know whether the cells we observe at this stage in the thymus are generated from long-term HSC or short-term progenitors. Indeed, cell tracing analysis from the lab of Zilong Wen (Tian et al. 2017, see our Ref list) shows the simultaneous presence of short-term PBI-derived and long-term AGM-derived thymic cells at 5dpf. Based on this, we can imagine for example that the supernumerary cells we observe in the thymus are transient populations that could multiply faster in the absence of definitive populations. Conversely, based on our observation of an accumulation of EHT pol+ events, we can imagine that the EHT pol+ and EHT pol- cells are indeed differentially fated and that EHT pol+ may be biased toward a lymphoid lineage. We also know that at the stage we observe (5dpf), RNAscope assays for runx1 show that a vast majority of thymic cells do not express runx1 (our preliminary data), suggesting that the effect we observe would be an indirect one caused by upstream events rather than by direct interference with the endogenous expression of runx1 in thymic cells.

      The article referred to by the reviewer (Sato et al. 2005, PMID: 16177090) investigates the role of runx1 during TCR selection for thymic cell maturation and shows that runx1 signaling lowers the apoptotic sensitivity of double-positive thymocytes when artificially activated, leading to a reduced number of single-positive thymic cells. Furthermore, this paper references another study from the same lab (Hayashi et al. 2000, PMID: 11120804) that used the same strategy to study the role of runx1 in the positive and negative selection steps of T lymphocyte maturation. This paper, although showing that runx1 is important for later stages of T lymphocyte differentiation (the double-positive to single-positive stage maturation), also shows a relative increase in the amount of double-negative and double-positive thymocytes, which could be coherent with our observations. Indeed, in our case, although we show an increased number of thymic cells, we do not know the relative proportion of the different thymocyte subsets. We could explain the increased number of thymic cells by an increased number of DN/DP thymocytes, which would not preclude a decrease in single-positive thymocytes. Finally, the cells we observe in the thymus of our dt-runx1 mutants may also be different lymphoid populations, namely ILCs, that would react differently to runx1 interference.

      (7) Lines 154-155 refer to aquaporins but are missing a reference. This is a bit of speculation right in the results section and I struggled to understand what the point of it was.

      To clarify the argument and ease the flow of the text, as suggested by the reviewers, we transferred this paragraph (lines 149 to 156 of the initial manuscript) to the Discussion section (lines 763-789). We additionally made sure to add the missing reference (Sato et al. 2023, see our Ref list).

      (8) Lines 173-175, indicating that both EHTpol+ and pol- express the CD41 transgenic marker - would be useful to show this data.

      We provide a new supplementary figure (Figure 1 – figure supplement 3), where, using an outcross of the CD41:eGFP and kdrl:mKate2-podxl2 transgenic lines, we show unambiguously and for multiple cells that both polarized EHT pol+ cells and non-polarized EHT pol- cells are CD41 positive. In addition, although this is not commented on in the main text, we can also see that an HE cell, characterized by its elongated morphology (in the middle of the field), its thickened nucleus and its position on the aortic floor, is also CD41 positive.

      (9) Lines 181-201 - it's not clear how HE cells were identified in the first place - was it just morphology? Or were they identified retrospectively?

      HE cells were identified solely on morphological and spatial criteria (as mentioned in the Methods section, lines 1073-1082 and 1108-1111 of the first manuscript). Furthermore, a recent investigation by the lab of Zilong Wen (Zhao et al. 2022, see our Ref list) questioning the common origin of HE cells and of endothelial cells as well as their respective capacity to extrude from the aorta to generate hematopoietic cells showed, by single-cell tracing, that 96% of floor cells are indeed hemogenic endothelial cells. In addition, as mentioned in the response to the 8th point, we show in Figure 1 – figure supplement 3 that all floor cells express CD41. Finally, we also used an alternative method to validate the true hemogenic identity of aortic floor cells and show, using RNAscope, that virtually 100% of floor cells that we consider as typical HE cells do express a hematopoietic transcription factor upstream of Runx1, namely Gata2b (see Author response image 1).

      Author response image 1.

      All cells from the aortic floor, at 48hpf, express the hematopoietic marker Gata2b. 48 hpf Tg(Kdrl:eGFP) fixed embryos were used for RNAscope using a probe designed to detect Gata2b mRNAs. Subsequently, images were taken using spinning disk confocal microscopy. The image in the top panel is a z-projection of the entire aortic volume of one embryo and shows the full portion of the dorsal aorta from the anterior part (left side, at the limit of the balled yolk) down to the urogenital orifice (UGO, right side). The 4 boxes (1 - 4) delineate regions that have been magnified beneath (2X). The 2X images corresponding to each box are z-projections (top views) or z-sections (bottom views). The bottom views make it possible to visualize the aortic floor and to mark its position on the top views. Pink arrows point at HE cells (elongated in the anteroposterior direction) and at EHT cells (ovoid/round cells; EHT pol+ cell morphology is not preserved after fixation and RNAscope; thus, it cannot be distinguished from ovoid/round EHT pol- cells). Pink dots = RNAscope spots of various sizes. The green cells in the subaortic space that are marked by RNAscope spots are newly born hematopoietic stem and progenitor cells (see for example box 1). This embryo is representative of n = 5 embryos treated and imaged.

      (10) Line 276 - the difference between the egfp-podxl2 and mKate-podxl2 - could that be due to the fluorophore used? Also, it would be good to label Fig 3 supplement 2 better and to see a control alongside the runt overexpression.

      Line 276 does not point at a difference in control conditions between eGFP-podxl2 and mKate-podxl2 (see in new Figure 1 – figure supplement 3, Figure 2 or in new Figure 3 - figure supplement 2 several examples of non-polarized HE cells in control conditions using both fluorophores) but between control and dt-runx1 conditions, both expressing the mKate2-podxl2 transgene. Similarly, the new example that we now provide in the CD41 figure (Figure 1 – figure supplement 3) clearly shows that mKate-podxl2 is enriched at the apical/luminal membrane of EHT pol+ cells while no such enrichment is observed for EHT pol- cells. We note that EHT cells are not always the most typical in shape, in particular because cells can be squeezed by underlying tissues, for example the vein, or from the luminal side by flow and tensions on the aortic wall because of the heartbeat (the further up in the trunk we image, the more difficult the imaging and the less stable the cell shape during long time-lapse sequences). To also take into account the reviewer’s comments, we added a control condition next to the dt-runx1 condition in the new Figure 3 – figure supplement 2A.

      (11) There is no quantitation data on the number of excess EHT pol+ cells in the DA, or in the thymus data (Figs 3 Supp1 and Fig 3 Supp 3). Can you quantify this data? This would better support the claim that tuning apico-basal polarity alters the morphology of the emerging HE cells.

      We added quantifications relative to both the emergence process itself, showing the accumulation of HE and EHT pol+ cells (new Figure 3B), and hematopoiesis per se (new Figure 3 – figure supplement 1). Indeed, we show a decrease in the number of newly generated cmyb+ cells in the sub-aortic space. Furthermore, we improved our quantification of the later phenotype in the thymus (new Figure 3 – figure supplement 3), using improved segmentation methods, which indeed validate the increased number of thymic cells that we described.

      (12) The observed changes in pard3 isoforms are just reading out changes in their expression in the runt1 transgenics, rather than demonstrating a role in apico-basal polarity.

      We entirely revised our strategy regarding Pard3 expression analyses (see also the text at the beginning of this file, for the Public Review). However, we wish to stress that we did not initially intend to show directly a role of Pard3 proteins in controlling apico-basal polarity in the system; we only intended to provide correlative evidence supporting our observations with the polarity marker podxl2 (by interfering with their function, as written in the text, apico-basal polarity, which is essential for aortic lumenization and maintenance, would have been impaired, blurring interpretations).

      During the revision, we obtained the unexpected finding, using RNAscope, that one Pard3 isoform, namely Pard3ba, is the only Pard3 that is expressed non-homogeneously along the aortic axis and, in the vast majority of cases, by aortic cells in the direct vicinity of emergence domains of the aortic floor (see the new Figure 4 and Figure 4 – figure supplements 2, 3).

      This correlative relation between expression of Pard3ba in aortic endothelial cells neighbouring HE/EHT cells suggests, as we propose, that a cross talk occurs between hemogenic and aortic cells, and that this cross talk relies, at least in part, on the expression of key components of apico-basal polarity and their associated functional features. In addition, we show that junctional recycling differs between both EHT types, based on our observations on the different dynamics in the turnover of JAM molecules, in the two EHT types. As JAM molecules are also required for the recruitment of Pard3, which initiates the establishment of apico-basal polarity, these different dynamics suggest that the control of apico-basal polarity is involved in supporting the morphodynamic complexity of EHT cell types.

      (13) There is a Fig 5, Supp 2 that is neither mentioned nor described anywhere in the manuscript.

      Figure 5 - figure Supplement 2 is mentioned in lines 366-370 of the original manuscript, to describe the initial validation that was performed for our eGFP-JAM constructs in multiple cell types using a ubiquitous heat-shock promoter. We expanded our description of this supplemental figure in the new manuscript (lines 504 to 514).

      (14) Lines 445-456 - these read like a bit of discussion, not results. There are other similar parts of the results section that also read like a discussion (e.g. 526-533)

      Although we decided to keep this paragraph in the Results section, as it justifies the rationale behind the choice of ArhGEF11/PDZ-RhoGEF, we took the reviewer's comment into account and, as mentioned in the response to Reviewer 1's 6th comment, lightened the Results section by transferring some of the paragraphs to the Introduction or Discussion sections.

      (15) The description of Fig 7A (from line 505) is missing the stages at which the experiments were performed (also not labelled on the figure).

      The stages at which the experiments were performed are stated in the figure legend (line 1366) as well as in the Methods section of the original manuscript (line 1033). We added the information on top of panels A and B for more clarity.

      (16) Some figures have multiple panels (e.g. Fig 7Aa'), so when referred to in the text, it remains unclear which panel is being referred to.

      We modified the text to refer more clearly to the different panels when they are mentioned, particularly with regard to Figures 7 and 8 but also for all the other figures.

    2. eLife assessment

      This important study presents a detailed characterization of two distinct cellular morphologies of haematopoietic stem cells undergoing endothelial to haematopoietic transition in zebrafish. It brings new information on how regulation of apico-basal polarity influences cellular behaviour, shape, and interaction with neighbouring cells. The evidence supporting the existence of these two distinct morphologies is convincing, using state-of-the-art confocal microscopy and image analysis of 2D-cartography.

    3. Reviewer #2 (Public Review):

      In this study, Torcq and colleagues make careful observations of the cellular morphology of haemogenic endothelium undergoing endothelial to haematopoietic transition (EHT) to become stem cells, using the zebrafish model. To achieve this, they used an extensive array of transgenic lines driving fluorescent markers, markers of apico-basal polarity (podocalyxin-FP fusions) or tight junction markers (jamb-FP fusions). The use of the runx truncation to block native Runx1 only in endothelial cells is an elegant tool to achieve something akin to tissue-specific deletion of Runx1. Overall, the imaging data is of excellent quality. They demonstrate that differences in apico-basal polarity are strongly associated with different cellular morphologies of cells undergoing EHT from HE (EHT pol- and EHT pol+) which raises the exciting possibility that these morphological differences reflect heterogeneity of HE (and potentially HSCs, but this is not addressed in this manuscript) at a very early stage. They then overexpress a truncated form of Runx1 (just the runt domain) to block Runx1 function and show that more HE cells abort EHT and remain associated with the embryonic dorsal aorta. The revised version identifies pard3ab as differentially distributed in dtRunx mutants and correlates that distribution with a potential regulatory role on cell polarity. No direct evidence for their role in EHT is presented.

      The manuscript has now been streamlined and reference to figures made much clearer. It provides for a clearer reading, and clearly a well thought out discussion of HE, polarity and the regulation of the EHT process. The evidence for the different cellular morphologies of cells undergoing EHT is strong, and the main claim that tuning apico-basal polarity and junctional recycling underlie morphological complexity of EHT (rather than of HSCs) is well supported by the data.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This study presents valuable data on the antigenic properties of neuraminidase proteins of human A/H3N2 influenza viruses sampled between 2009 and 2017. The antigenic properties are found to be generally concordant with genetic groups. Additional analyses have strengthened the revised manuscript, and the evidence supporting the claims is solid.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary

      The authors investigated the antigenic diversity of recent (2009-2017) A/H3N2 influenza neuraminidases (NAs), the second major antigenic protein after haemagglutinin. They used 27 viruses and 43 ferret sera and performed NA inhibition. This work was supported by a subset of mouse sera. Clustering analysis determined 4 antigenic clusters, mostly in concordance with the genetic groupings. Association analysis was used to estimate important amino acid positions, which were shown to be more likely close to the catalytic site. Antigenic distances were calculated and a random forest model used to determine potential important sites.

      This revision has addressed many of my concerns of inconsistencies in the methods, results and presentation. There are still some remaining weaknesses in the computational work.

      Strengths

      (1) The data cover recent NA evolution and a substantial number (43) of ferret (and mouse) sera were generated and titrated against 27 viruses. This is laborious experimental work and is the largest publicly available neuraminidase inhibition dataset that I am aware of. As such, it will prove a useful resource for the influenza community.

      (2) A variety of computational methods were used to analyse the data, which give a rounded picture of the antigenic and genetic relationships and link between sequence, structure and phenotype.

      (3) Issues raised in the previous review have been thoroughly addressed.

      Weaknesses

      (1) Some inconsistencies and missing data in experimental methods

      Two ferret sera were boosted with H1N2, while recombinant NA protein was used for the others. This, and the underlying reason, are clearly explained in the manuscript. The authors note that boosting with live virus did not increase titres. Additionally, one homologous serum (A/Kansas/14/2017) was not generated, although this would not necessarily have impacted the results.

      We agree with the reviewer and this point was addressed in the previous rebuttal.

      (2) Inconsistency in experimental results

      Clustering of the NA inhibition results identifies three viruses which do not cluster with their phylogenetic group. Again, this is clearly pointed out in the paper and is consistent with the two replicate ferret sera. Additionally, A/Kansas/14/2017 is in a different cluster based on the antigenic cartography vs the clustering of the titres.

      We agree with the reviewer and this point was addressed in the previous rebuttal.

      (3) Antigenic cartography plot would benefit from documentation of the parameters and supporting analyses

      a. The number of optimisations used

      We used 500 optimizations. This information is now included in the Methods section.

      b. The final stress and the difference between the stress of the lowest few (e.g. 5) optimisations, or alternatively a graph of the stress of all the optimisations. Information on the stress per titre and per point, and whether any of these were outliers

      The stress was obtained from 1, 5, 500, or even 5000 optimizations (resulting in stress values of 1366.47, 1366.47, 2908.60, and 3031.41, respectively). Apart from limited variation or non-convergence of the stress values after optimization, the maps obtained were consistent across multiple runs. The final map was obtained by keeping the best optimization (stress value 1366.47, selected using the keepBestOptimization() function).
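
      The idea of running several optimizations from random starting configurations and keeping the one with the lowest stress can be sketched in base R as follows. This is only an illustrative stand-in, not the procedure used for the map: the target distances are simulated, the stress is a plain sum of squared distance differences rather than a titre-based stress, and all names (target_xy, stress, run_once) are hypothetical.

```r
# Minimal multi-start sketch in base R; data and names are illustrative assumptions.
set.seed(42)

# Hypothetical target distances among 6 points placed in 2 dimensions.
n_points <- 6
target_xy <- matrix(rnorm(n_points * 2), ncol = 2)
target_d <- as.matrix(dist(target_xy))

# Stress: sum of squared differences between target and map distances.
stress <- function(par) {
  map_d <- as.matrix(dist(matrix(par, ncol = 2)))
  sum((target_d - map_d)[upper.tri(target_d)]^2)
}

# Run one optimization from a random starting configuration.
run_once <- function() optim(rnorm(n_points * 2), stress, method = "BFGS")

# Repeat from many random starts and keep the run with the lowest stress.
runs <- replicate(50, run_once(), simplify = FALSE)
stresses <- vapply(runs, function(r) r$value, numeric(1))
best <- runs[[which.min(stresses)]]

cat("lowest stress:", best$value, "\n")
cat("range of stress across runs:", range(stresses), "\n")
```

      Comparing the lowest few values in stresses gives a quick indication of whether independent runs reach the same minimum.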

      Author response image 1.

      The stress per point is presented in the heat map below.

      The heat map indicates stress per serum (x-axis) and strain (y-axis) in blue to red scale.

      c. A measure of uncertainty in position (e.g. from bootstrapping)

      Bootstrap was performed using 1000 repeats and 100 optimizations per repeat. The uncertainty is represented in the blob plot below.

      Author response image 2.

      (4) Random forest

      The full dataset was used for the random forest model, including tuning the hyperparameters. It is more robust to have a training and test set to be able to evaluate overfitting (there are 25 features to classify 43 sera).

      Explicit cross-validation is not necessary for random forests, as the out-of-bag process with multiple trees implicitly covers cross-validation. In the random forest function in R this is done by setting the mtry argument (number of variables randomly sampled as candidates at each split). R samples variables with replacement (the same variable can be sampled multiple times) of the candidates from the training set. RF will then automatically take the data that is not selected as candidates as the test set. Overfitting may happen when all data is used for training, but the RF method implicitly does use a test set and does not use all data for training.

      Code:

      rf <- randomForest(X,y=Y,ntree=1500,mtry=25,keep.forest=TRUE,importance=TRUE)
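
      To illustrate the out-of-bag idea, here is a minimal base-R sketch. It assumes the standard bagging scheme, in which each tree is grown on a bootstrap sample of the observations (rows) and the rows left out of that sample serve as that tree's out-of-bag set. It does not call randomForest or use the study data; the counts simply mirror the 43 sera and ntree=1500 mentioned above, and the variable names are hypothetical.

```r
# Base-R sketch of in-bag versus out-of-bag rows; illustrative only.
set.seed(1)

n_obs <- 43      # number of observations, mirroring the 43 sera
n_trees <- 1500  # mirrors ntree in the call above

# For each tree, draw a bootstrap sample of rows and flag the rows left out.
is_oob <- replicate(n_trees, {
  in_bag <- sample.int(n_obs, size = n_obs, replace = TRUE)
  !(seq_len(n_obs) %in% in_bag)
})  # logical matrix: n_obs rows x n_trees columns

# Average fraction of rows that are out-of-bag for a given tree.
oob_fraction <- mean(is_oob)
expected_fraction <- (1 - 1 / n_obs)^n_obs

# Number of trees for which each row is out-of-bag.
trees_per_row <- rowSums(is_oob)

cat("observed OOB fraction:", round(oob_fraction, 3), "\n")
cat("expected OOB fraction:", round(expected_fraction, 3), "\n")
cat("fewest OOB trees for any row:", min(trees_per_row), "\n")
```

      Under this scheme every observation is left out of a large number of trees, so an out-of-bag prediction is available for each observation.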

      Reviewer #2 (Public Review):

      Summary:

      The authors characterized the antigenicity of the N2 protein of 43 selected A(H3N2) influenza A viruses isolated from 2009-2017 using ferret and mouse immune sera. Four antigenic groups were identified, which the authors claimed to be correlated with their respective phylogenetic/genetic groups. Among the 102 amino acids that differed among the 44 selected N2 proteins, the authors identified residues that differentiate the antigenicity of the four groups and constructed a machine-learning model that provides antigenic distance estimation. Three recent A(H3N2) vaccine strains were tested in the model, but there were no experimental data to confirm the model's predictions.

      Strengths:

      This study used N2 protein of 44 selected A(H3N2) influenza A viruses isolated from 2009-2017 and generated corresponding panels of ferret and mouse sera to react with the selected strains. The amount of experimental data for N2 antigenicity characterization is large enough for model building.

      Weaknesses:

      The main weakness is that the strategy for selecting the 43 A(H3N2) viruses from 2009-2017 was not explained. It is not clear whether they represent the overall genetic diversity of human A(H3N2) viruses circulating during this time. In response to the reviewer's comment, the authors have provided an N2 phylogenetic tree using 180 randomly selected N2 sequences from human A(H3N2) viruses from 2009-2017. While the 43 strains seem to scatter across the N2 tree, the four antigenic groups described by the authors did not correlate with their respective phylogenetic/genetic groups as shown in Fig. 2. The authors should show the N2 phylogenetic tree together with Fig. 2 and discuss the discrepancy observed.

      The discrepancy between Figure 2 and the provided N2 phylogenetic tree of 180 selected N2 sequences was primarily due to visualization. In the tree presented in Figure 2, the phylogeny was ordered by decreasing branch length. Further, the tree presented in the rebuttal was built with PhyML 3.0 using the JTT substitution model, while the tree in Figure 2 was built in CLC Workbench 21.0.5 using the Bishop-Friday substitution model. The tree below was built using the same methodology as Figure 2, including branch size ordering. No discrepancies are observed.

      Phylogenetic tree representing relatedness of N2 head domain. N2 NA sequences were ordered according to the branch length and phylogenetic clusters are colored as follows: G1: orange, G2: green, G3: blue, and G4: purple. NA sequences that were retained in the breadth panel are named according to the corresponding H3N2 influenza viruses. The other NA sequences are coded.

      Author response image 3.

      The second weakness is the use of double-immune ferret sera (post-infection plus immunization with recombinant NA protein) or mouse sera (immunized twice with recombinant NA protein) to characterize the antigenicity of the selected A(H3N2) viruses. Conventionally, NA antigenicity is characterized using ferret sera after a single infection. Repeated influenza exposure in ferrets has been shown to enhance antibody binding affinity and may affect the cross-reactivity to heterologous strains (PMID: 29672713). The increased cross-reactivity is supported by the NAI titers shown in Table S3, as many of the double-immune ferret sera showed the highest reactivity not against their own homologous virus but against heterologous strains. In response to the reviewer's comment, the authors agreed that the use of double-immune ferret sera may be a limitation of the study. It would be helpful if the authors could discuss in the manuscript the potential effect of using double-immune ferret sera on antigenicity characterization.

      Our study was designed to understand the breadth of the anti-NA response after the incorporation of NA as a vaccine antigen. Our data do not allow us to conclude whether the increased breadth of protection is merely due to increased antibody titers or whether an NA boost immunization was able to induce antibody responses against epitopes that were not previously recognized by the primary response to infection. However, we now mention this possibility in the discussion and cite Kosikova et al., CID 2018, in this context.

      Another weakness is that the authors used the newly constructed model to predict the antigenic distance of three recent A(H3N2) viruses, but there are no experimental data to validate their prediction (e.g., whether these viruses are indeed antigenically deviating from group 2 strains, as concluded by the authors). In response to the comment, the authors have taken two strains out of the dataset and used them for validation. The results are shown in Fig. R7. However, it may be useful to include this in the main manuscript to support the validity of the model.

      The removal of 2 strains was performed to illustrate the predictive performance of the RF modeling. However, random forest does not require cross-validation. The reason is that RF modeling already uses an out-of-bag evaluation which, in short, consists of using only a fraction of the data (2/3) for the creation of the decision trees, obviating the need to set aside a test set:

      “…In each bootstrap training set, about one-third of the instances are left out. Therefore, the out-of-bag estimates are based on combining only about one-third as many classifiers as in the ongoing main combination. Since the error rate decreases as the number of combinations increases, the out-of-bag estimates will tend to overestimate the current error rate. To get unbiased out-of-bag estimates, it is necessary to run past the point where the test set error converges. But unlike cross-validation, where bias is present but its extent unknown, the out-of-bag estimates are unbiased…” from https://www.stat.berkeley.edu/%7Ebreiman/randomforest2001.pdf
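
      As a minimal sketch of the out-of-bag idea (not the authors' analysis code), the share of instances left out of a bootstrap sample of size n drawn with replacement is (1 - 1/n)^n, which approaches 1/e, roughly the "about one-third" in the quotation. The Python below checks this analytically and by simulation. Using n = 43 (the number of sera) as the number of training rows and 1500 draws (the ntree value in the R call) is an assumption made purely for illustration, and the function names are hypothetical.

```python
import random


def expected_oob_fraction(n):
    """Probability that a given row is absent from a bootstrap draw of size n."""
    return (1.0 - 1.0 / n) ** n


def simulated_oob_fraction(n, n_trees, seed=0):
    """Average share of rows left out across n_trees bootstrap draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trees):
        # Draw n row indices with replacement; the set keeps the distinct ones.
        drawn = {rng.randrange(n) for _ in range(n)}
        total += (n - len(drawn)) / n
    return total / n_trees


n_rows = 43      # assumption: one training row per serum
n_trees = 1500   # matches ntree in the R call, used here only for illustration

analytic = expected_oob_fraction(n_rows)
simulated = simulated_oob_fraction(n_rows, n_trees)
print(f"analytic out-of-bag fraction:  {analytic:.3f}")
print(f"simulated out-of-bag fraction: {simulated:.3f}")
```

      Each tree therefore sees roughly two-thirds of the distinct rows, and the remaining rows act as that tree's held-out set. The sketch illustrates the sampling only; it says nothing about whether the out-of-bag error of a tuned model can replace an independent test set.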

      Reviewer #3 (Public Review):

      Summary:

      This paper by Portela Catani et al examines the antigenic relationships (measured using monotypic ferret and mouse sera) across a panel of N2 genes from the past 14 years, along with the underlying sequence differences and phylogenetic relationships. This is a highly significant topic given the recent increased appreciation of the importance of NA as a vaccine target, and the relative lack of information about NA antigenic evolution compared with what is known about HA. Thus, these data will be of interest to those studying the antigenic evolution of influenza viruses. The methods used are generally quite sound, though there are a few addressable concerns that limit the confidence with which conclusions can be drawn from the data/analyses.

      Strengths:

      • The significance of the work, and the (general) soundness of the methods.

      • Explicit comparison of results obtained with mouse and ferret sera.

      Weaknesses:

      • Approach for assessing influence of individual polymorphisms on antigenicity does not account for potential effects of epistasis (this point is acknowledged by the authors).

      We agree with the reviewer and this point was addressed in the previous rebuttal.

      • Machine learning analyses neither experimentally validated nor shown to be better than simple, phylogenetic-based inference.

      We respectfully disagree with the reviewer. This point was addressed in the previous rebuttal as follows.

      “This is a valid remark and indeed we have found a clear correlation between NAI cross-reactivity and phylogenetic relatedness. However, besides achieving good prediction of the experimental data (as shown in Figure 5 and in Figure R7), machine learning analysis has the potential to rank or indicate major antigenic divergences based on available sequences before they have consolidated as a new clade. ML can also support the selection and design of more broadly reactive antigens.”

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      (1) Discuss the discrepancy between Fig. 2 and the newly constructed N2 phylogenetic tree with 180 randomly selected N2 sequences of A(H3N2) viruses from 2009-2017. Specifically, please explain why the antigenic vs. phylogenetic relationship observed in Fig. 2 was not observed in the large N2 phylogenetic tree.

      Discrepancies were due to differences in method and visualization. A new tree was provided.

      (2) Include a sentence to discuss the potential effect of the use of double-immune ferret sera on antigenic characterization.

      We prefer not to speculate on this.

      (3) Include the results of the exercise run (with the use of Swe17 and HK17) in the manuscript as a way to validate the model.

      The exercise was performed to illustrate the predictive potential of the RF modeling to the reviewer. However, cross-validation is not a usual requirement for random forest, since it uses out-of-bag calculations. We prefer not to include the exercise runs in the main manuscript.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      In the manuscript titled "Disease modeling and pharmacological rescue of autosomal dominant Retinitis Pigmentosa associated with RHO copy number variation" the authors describe the use of patient iPSC-derived retinal organoids to evaluate the pathobiology of a RHO-CNV in a family with dominant retinitis pigmentosa (RP). They find significantly increased expression of rhodopsin, especially within the photoreceptor cell body, and defects in photoreceptor cell outer segment formation/maturation. In addition, they demonstrate how an inhibitor of NR2E3 (a rod transcription factor required for inducing rhodopsin expression), can be used to rescue the disease phenotype.

      Strengths:

      The manuscript is very well written, the illustrations and data presented are compelling, and the authors' interpretation/discussion of their findings is logical.

      Weaknesses:

      A weakness, which the authors have addressed in the discussion section, is the lack of an isogenic control, which would allow for direct analysis of the RHO-CNV in the absence of the other genetic sequence contained within the duplicated region. As the authors suggest, CRISPR correction of a large CNV in the absence of inducing unwanted on-target editing events in patient iPSCs is often very challenging. Given that they have used a no-disease iPSC line obtained from a family member, controlled for organoid differentiation kinetics/maturation state, and that no other complete disease-causing gene is contained within the duplicated region, it is unlikely that the addition of an isogenic control would yield significantly different results.

      Aims and conclusions:

      This reviewer is of the opinion that the authors have achieved their aims and that their results support their conclusions.

      Discussion:

      The authors have provided adequate discussion on the utility of the methods and data as well as the impact of their work on the field.

      We thank the reviewer for their insightful and encouraging review of our work, which has taken several years to reach its current stage.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Kandoi et al. describes a new 3D retinal organoid model of a mono-allelic copy number variant of the rhodopsin gene that was previously shown to induce autosomal dominant retinitis pigmentosa via a dominant negative mechanism in patients. With advancements in the low-cost genomics application to detect copy number variations, this is a timely article that highlights a potential disease mechanism that goes beyond the retina field. The evidence is relatively strong that the rod photoreceptor phenotype observed in an adult patient with RP in vivo is similar to the phenotype observed in human stem cell-derived retinal organoids. Increases in RHO expression detected by qPCR, RNA-seq, and IHC support this phenotype. Importantly, the amelioration of photoreceptor rhodopsin mislocalization and related defects using the small molecule drug photoregulin demonstrates an important potential clinical application.

      Overall, the authors succeeded in providing solid evidence that copy number variation via a genomic RHO duplication leads to abnormalities in rod photoreceptors that can be partially blocked by photoregulin. However, there are several points that should be addressed that will enhance this paper.

      Strengths:

      • The use of patient-derived organoids from patients that have visual defects is a major strength of this work and adds relevance to the disease phenotype.

      • The rod phenotype assessed by qPCR, RNA-seq, and IHC supports a phenotype that shares similarities with the patient.

      • The use of a small molecule drug that selectively targets rod photoreceptors, as opposed to cones, is a noteworthy strength.

      We thank the reviewers for highlighting the key strengths of the paper.

      Weaknesses:

      (1) The chromosomal segment that was duplicated had 3 copies of RHO in addition to three copies of each of the flanking genes (IFT122, HIF100, PLXND1). Discussion of the involvement of these genes would be helpful. Would duplication of any of these genes alone cause or contribute to adRP? As an example, a missense mutation in IFT122 was previously implicated in photoreceptor loss (PMID: 33606121 PMCID: PMC8519925).

      Thank you for your comment. The contribution of the other duplicated genes is an interesting question. Of these, IFT122 is particularly interesting, as pointed out. We did a thorough survey of the literature and of the database of our genetic testing partner, BluePrint Genetics. We did not find any human retinal degeneration cases with variants in IFT122. IFT122 has been shown to cause a recessive phenotype in dogs and in a complete knockout zebrafish model, but dominant variants or overexpression have not been shown to produce a phenotype. Interestingly, recessive biallelic IFT122 mutations can cause Cranioectodermal Dysplasia (Sensenbrenner syndrome, PMID: 24689072), and none of these patients exhibited retinal dystrophy. HIF100 is an epigenetic modifier gene, while PLXND1 is expressed in endothelial cells. We will include a discussion on this in the revised manuscript.

      (2) Related to #1, have the authors considered inserting extra copies of RHO (and/or the flanking genes) at a genomic safe harbor site? Although not required, this would allow one to study cells with isogenic-matched genetic backgrounds and would partially address the technical challenge of repairing a 188kb duplication, which as the authors note would be difficult to do. Demonstrating the effect of excess copy numbers in different genetic backgrounds would be a huge contribution to the field. At a minimum, a discussion of the role of the nearby genes should be included.


      Thank you for your suggestion. In our future mechanistic studies, we plan to test the relative role of 1-3 extra copies of RHO driven by an NRL promoter, so that expression is restricted to rods. We will include a discussion on the potential role of the other genes in the revised manuscript.

      (3) In the patient, the central foveal region was spared suggesting that cones were normal. Was there a similar assessment that cones are unaffected in retinal organoids? 


      We will include this data in our revised manuscript but overall did not see a cone defect in RHO CNV organoids. Additionally, although it is true that the central foveal region was relatively spared in this patient, the cones are definitely not normal. The macular cones that remain have been damaged by chronic edema, and photoreceptor and RPE atrophy has progressed into the macula, sparing only the foveal cones.

      (4) Pathway analysis indicated that glycosylation was perturbed and this was proposed as an explanation as to why rhodopsin was mislocalized. Have the authors verified that there is an actual decrease in glycosylation? 


      These studies are ongoing. We are currently looking into the details of cellular pathophysiology, focusing on RHO trafficking in RHO-CNV, including the role of glycosylation and other post-translational modification defects.

      (5) Line 182: by what criteria are the authors able to state that "there were no clear visible anatomical changes in apical-basal retinal cell type distribution during the early differentiation timeframe (data not shown)." Was this based on histological staining with antibodies, nuclear counter-staining, or some other evaluation?


      This was based on both IHC for various cell type markers and nuclear (DAPI) staining.

      (6) Figure 2C - the appearance of the inner segments in RC and RM looks very different from one another. Have the authors ruled out the possibility that the RC organoid cell isn't a cone? In addition, the RM structure has what appears to be a well-defined OLM which would suggest well-formed Muller glia. Do these structures also exist in RC organoids? Typically the OLM does form in older organoids. In addition, was this representative in numerous EM preparations?


      For clarification on EM data, we will include additional images in the revision as supplementary data. We have not carefully compared OLM between the patient and control organoids but do observe them in both conditions in the older organoids. The EM preparations were made from multiple organoids from two different batches with consistent results.

      (7) What criteria were used to assess cell loss? Has any TUNEL labeling been performed to confirm cell loss? From the existing data, it seems that rod outer segments appear to be affected in organoids. However, it's not clear if the photoreceptors themselves actually die in this model.

      TUNEL was used to assess cell loss and it was not significantly different between the control and patient organoids at the timepoints examined. We did not expect a change as the disease in the patient developed over decades.

      (8) Figure 5B. The RHO staining in the vehicle-treated sample is perturbed relative to the PR3 treatments as indicated in the text. In the vehicle-treated sample, the number of DAPI-positive cells that are completely negative proximal to the inner segments suggests that there might be non-rod cells there. Have the authors confirmed whether these are cones? Labels would be helpful in the left vehicle panel as the morphology looks very different than the treated samples.


      Thank you very much for the various suggestions and these will be included in the revised manuscript version. A number of the cells in the negative regions are OTX2+/NRL- and likely to be cones (Figure 4 A and B). Unfortunately, we do not have a very good cone nuclear marker as RXRγ does not consistently stain mature cones.

      (9) It is interesting that in addition to increases in RHO, and photo-transduction, there are also increases in PTPRT which is related to synaptic adhesion. Is there evidence of ectopic neurites that result from PTPRT over-expression?

      You are absolutely correct that the PTPRT data are very interesting. PTPRT requires PTMs similar to those of RHO in photoreceptors for its synaptic localization. We did not specifically look at ectopic neurites and will test that in the revision. It will be interesting to follow up on its expression pattern to see whether it is processed and localized normally, if we can find a working antibody. It is also possible that the gene-expression increase is due to feedback upregulation secondary to improper protein processing.

      Reviewer #3 (Public Review):

      This manuscript reports a novel pedigree with four intact copies of RHO on a single chromosome which appears to lead to overexpression of rhodopsin and a corresponding autosomal dominant form of RP. The authors generate retinal organoids from patient- and control-derived cells, characterize the phenotypes of the organoids, and then attempt to 'treat' aberrant rhodopsin expression/mislocalization in the patient organoids using a small molecule called photoregulin 3 (PR3). While this novel genetic mechanism for adRP is interesting, the organoid work is not compelling. There are multiple problems related to the technical approaches, the presentation of the results, and the interpretations of the data. I will present my concerns roughly in the order in which they appear in the manuscript.

      Major concerns:

      (1) Individual human retinal organoids in culture can show a wide range of differentiation phenotypes with respect to the expression of specific markers, percentages of given cell types, etc. For this reason, it can be very difficult to make rigorous, quantitative comparisons between 'wild-type' and 'mutant' organoids. Despite this difficulty, the authors of the present manuscript frequently present results in an impressionistic manner without quantitation. Furthermore, there is no indication that the investigator who performed the phenotypic analyses was blind with respect to the genotype. In my opinion, such blinding is essential for the analysis of phenotypes in retinal organoids. To give an example, in lines 193-194 the authors write "we observed that while the patient organoids developing connecting cilium and the inner segments similar to control organoids, they failed to extend outer segments". Outer segments almost never form normally in human retinal organoids, even when derived from 'wild-type' cells. Thus, I consider it wholly inadequate to simply state that outer segment formation 'failed' without a rigorous, quantitative, and blinded comparison of patient and control organoids.

      We agree that it is challenging to generate outer segments in retinal organoids, but we are not the first to show this. This has been demonstrated by multiple independent labs (Mayerl et al. (PMID: 36206764), Wahlin et al. (PMID: 28396597), West et al. (PMID: 35334217)), including ours (Chirco et al. (PMID: 34653402)). To clarify, we did not observe any OS-like tissue in the patient organoids across multiple EM preps of a number of organoids from two independent 300+ day experiments, which matched the phase microscopy data presented in Fig. 2B.

      (2) The presentation of qPCR results in Figure 3A is very confusing. First, the authors normalize expression to that of CRX, but they don't really explain why. In lines 210-211, they write "CRX, a ubiquitously expressing photoreceptor gene maintained from development to adulthood." Several parts of this sentence are misleading or incomplete. First, CRX is not 'ubiquitously expressed' (which usually means 'in all cell types') nor is it photoreceptor-specific: CRX is expressed in rods, cones, and bipolar cells. Furthermore, CRX expression levels are not constant in photoreceptors throughout development/adulthood. So, for these reasons alone, CRX is a poor choice for the normalization of photoreceptor gene expression.

      As you are aware, all housekeeping genes have shortcomings when used for normalizing PCR data. We went with CRX because, within the timepoints chosen, it is not expected to change much and thus represents a good equalizer for relative photoreceptor numbers between the organoids and conditions. While we agree that CRX is weakly expressed in bipolar cells (Yamamoto et al 2020), this is not expected to bias the data much, as neither we nor others have observed a large relative difference in bipolar cell number in organoids. We also confirm this by showing equivalent expression of OTX2, RCVRN and NRL between all conditions.

      Second, the authors' interpretation of the qPCR results (lines 216-218) is very confusing. The authors appear to be saying that there is a statistically significant increase in RHO levels between D120 and D300. However, the same change is observed in both control and patient organoids and is not unexpected, since the organoids are more mature at D300. The key comparison is between control and patient organoids at D300. At this time point, there appears to be no difference between control and patient. The authors don't even point this out in the main text.

      Thank you for the comment, and we apologize if this confused you. However, as can be seen in the graph in Figure 3A, we do compare expression of genes including RHO between control and patient organoids at two different time points. There are four conditions: D120-RC, D120-RM, D300-RC and D300-RM, with individual data points and error bars for each condition. There is a statistically significant increase in RHO at both time points when comparing the control and patient organoids. We also compared RHO expression between patient organoids at the two time points, and it was not statistically different.

      Third, the variability in the number of photoreceptor cells in individual organoids makes a whole-organoid comparison by qPCR fraught with difficulty. It seems to me that what is needed here is a comparison of RHO transcript levels in isolated rod photoreceptors.

      We agree that this makes it challenging. This was the exact reasoning for using CRX for normalization since it is predominantly present in photoreceptors. This was validated by the data showing no difference in expression of photoreceptor markers OTX2, RCVRN or NRL between the organoids.
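
      One common way to express reference-gene normalization is the 2^-ΔΔCt calculation. The sketch below assumes that method and 100% amplification efficiency; the response does not state which calculation was used, the function names are illustrative, and all Ct values are hypothetical, chosen only to show how normalizing to a reference gene such as CRX cancels a uniform difference in photoreceptor content between samples.

```python
def delta_ct(ct_target, ct_reference):
    """Ct of the target gene relative to the reference gene in one sample."""
    return ct_target - ct_reference


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of target in sample vs. control by 2^-ddCt.

    Assumes perfect doubling per cycle (100% amplification efficiency).
    """
    ddct = (delta_ct(ct_target_sample, ct_ref_sample)
            - delta_ct(ct_target_control, ct_ref_control))
    return 2.0 ** (-ddct)


# Hypothetical Ct values (not taken from the manuscript).
# Case 1: same reference Ct in both samples, target detected 2 cycles earlier.
fold_same_input = relative_expression(22.0, 24.0, 24.0, 24.0)

# Case 2: the sample holds half as many photoreceptors, so both the target
# and the reference come up one cycle later; the normalized ratio is unchanged.
fold_less_input = relative_expression(23.0, 25.0, 24.0, 24.0)

print(fold_same_input, fold_less_input)
```

      The normalization removes differences that shift the target and the reference equally. It cannot correct for a reference gene whose own expression per cell differs between conditions, which is the concern the reviewer raises.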

      (3) I cannot understand what the authors are comparing in the bulk RNA-seq analysis presented in the paragraph starting with line 222 and in the paragraph starting with line 306. They write "we performed bulk-RNA sequencing on 300-days-old retinal organoids (n=3 independent biological replicates). Patient retinal organoids demonstrated upregulated transcriptomic levels of RHO... comparable to the qRT-PCR data." From the wording, it suggests that they are comparing bulk RNA-seq of patients and control organoids at D300. However, this is not stated anywhere in the main text, the figure legend, or the Methods. Yet, the subsequent line "comparable to the qRT-PCR data" makes no sense, because the qPCR comparison was between patient samples at two different time points, D120 and D300, not between patient and control. Thus, the reader is left with no clear idea of what is even being compared by RNA-seq analysis.

      We apologize if the conditions were not obvious and will clarify this in the revised version. The conditions compared are control and patient organoids at D300. Regarding comparison to RT-PCR, as stated above, the comparison shown is between patient and control organoids at two different timepoints.

      Remarkably, the exact same lack of clarity as to what is being compared is found in the second RNA-seq analysis presented in the paragraph starting with line 306. Here the authors write "We further carried out bulk RNA-sequencing analysis to comprehensively characterize three different groups of organoids, 0.25 μM PR3-treated and vehicle-treated patient organoids and control (RC) organoids from three independent differentiation experiments. Consistent with the qRT-PCR gene expression analysis, the results showed a significant downregulation in RHO and other rod phototransduction genes." Here, the authors make it clear that they have performed RNA-seq on three types of samples: PR3-treated patient organoids, vehicle-treated patient organoids, and control organoids (presumably not treated). Yet, in the next sentence, they state "the results showed a significant downregulation in RHO", but they don't state what two of the three conditions are being compared! Although I can assume that the comparison presented in Fig. 6A is between patient vehicle-treated and PR3-treated organoids, this is nowhere explicitly stated in the manuscript.

      Thank you for the comment and we will explicitly state various comparisons in the revised version.

      (4) There are multiple flaws in the analysis and interpretation of the PR3 treatment results. The authors wrote (lines 289-294) "We treated long-term cultured 300-days-old, RHO-CNV patient retinal organoids with varying concentrations of PR3 (0.1, 0.25 and 0.5 μM) for one week and assessed the effects on RHO mRNA expression and protein localization. Immunofluorescence staining of PR3-treated organoids displayed a partial rescue of RHO localization with optimal trafficking observed in the 0.25 μM PR3-treated organoids (Figure 5B). None of the organoids showed any evidence of toxicity post-treatment."

      There are multiple problems here. First, the results are impressionistic and not quantitative. Second, it's not clear that the investigator was blinded with respect to the treatment condition. Third, in the sections presented, the organoids look much more disorganized in the PR3-treated conditions than in the control. In particular, the ONL looks much more poorly formed. Overall, I'd say the organoids looked considerably worse in the 0.25 and 0.5 microM conditions than in the control, but I don't know whether or not the images are representative. Without rigorously quantitative and blinded analysis, it is impossible to draw solid conclusions here. Lastly, the authors state that "none of the organoids showed any evidence of toxicity post-treatment," but do not explain what criteria were used to determine that there was no toxicity.

      Thank you for your critical insight. The RHO localization data are qualitative, as it is very difficult to accurately quantify rhodopsin trafficking within the cell in the organoid. Thus, for quantitative comparison, we have provided expression level changes. Regarding toxicity, we analyzed the organoids by morphology and TUNEL and did not observe a significant difference between the conditions. This closely mimics mouse data on PR3, which suppressed rod function in mice following IP injection without any obvious toxicity.

      (5) qPCR-based quantitation of rod gene expression changes in response to PR3 treatment is not well-designed. In lines 294-297 the authors wrote "PR3 drove a significant downregulation of RHO in a dose-dependent manner. Following qRT-PCR analysis, we observed a 2-to-5 log2FC decrease in RHO expression, along with smaller decreases in other rod-specific genes including NR2E3, GNAT1 and PDE6B." I assume these analyses were performed on cDNA derived from whole organoids. There are two problems with this analysis/interpretation. First, a decrease in rod gene expression can be caused by a decrease in the number of rods in the treated organoids (e.g., by cell death) or by a decrease in the expression of rod genes within individual rods. The authors do not distinguish between these two possibilities. Second, as stated above, the percentage of cells that are rods in a given organoid can vary from organoid to organoid. So, to determine whether there is downregulation of rod gene expression, one should ideally perform the qPCR analysis on purified rods.

      The reviewer is correct in pointing out the potential reasons for the reduction in RHO levels following PR3 treatment. For this reason, we have provided NRL expression levels in the graph to show that this key rod-specific gene does not change, suggesting an equivalent number of rod photoreceptor cells. The suggestion of using purified rods is not practical here, as we do not have any way to sort human rods due to the lack of a rod-specific cell surface marker.
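
      For orientation, a log2 fold change converts to a linear ratio as 2^log2FC, so the quoted 2-to-5 log2FC decrease corresponds to RHO falling to between one quarter and one thirty-second of the vehicle level. A small Python sketch of that arithmetic follows; the function name is illustrative and no values beyond the quoted range are taken from the manuscript.

```python
def ratio_from_log2fc(log2fc):
    """Linear expression ratio (treated / vehicle) for a given log2 fold change."""
    return 2.0 ** log2fc


# A "2-to-5 log2FC decrease" means log2FC values between -2 and -5.
for log2fc in (-2, -5):
    ratio = ratio_from_log2fc(log2fc)
    print(f"log2FC {log2fc}: ratio {ratio}, i.e. a {1 / ratio:.0f}-fold decrease")
```

      As the reviewer notes, a whole-organoid ratio of this kind cannot by itself separate lower expression per rod from fewer rods.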

      (6) In Figure 4B 'RM' panels, the authors show RHO staining around the somata of 'rods' but the inset images suggest that several of these cells lack both NRL and OTX2 staining in their nuclei. All rods should be positive for NRL. Conversely, the same image shows a layer of cells scleral to the cells with putative RHO somal staining which do not show somal staining, and yet they do appear to be positive for NRL and OTX2. What is going on here? The authors need to provide interpretations for these findings.

      Since RHO is a cytoplasmic marker and photoreceptors are tightly packed, it is difficult to make a 1:1 comparison between the NRL/OTX2 nuclear markers and RHO. Additionally, as the RHO+ cytoplasm extends towards the scleral surface, it is expected to pass adjacent to other nuclei. A few of the rods do still have normal rhodopsin trafficking, and it is likely that these will not have somal RHO, similar to control conditions. We do rarely observe these cells, as highlighted by the occasional RHO in the IS/OS of RM organoids in the figure. We agree that the NRL staining in Figure 4B (>D250) is not extremely crisp, and we will include an updated figure in the revised version.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study presents a new and valuable theoretical account of spatial representational drift in the hippocampus. The evidence supporting the claims is convincing, with a clear and accessible explanation of the phenomenon. Overall, this study will likely attract researchers exploring learning and representation in both biological and artificial neural networks.

      We would like to ask the reviewers to consider elevating the assessment due to the following arguments. As noted in the original review, the study bridges two different fields (machine learning and neuroscience), and does not only touch a single subfield (representational drift in neuroscience). In the revision, we also analysed data from four different labs, strengthening the evidence and the generality of the conclusions.

      Public Reviews:

      Reviewer #1 (Public Review):

      The authors start from the premise that neural circuits exhibit "representational drift" -- i.e., slow and spontaneous changes in neural tuning despite constant network performance. While the extent to which biological systems exhibit drift is an active area of study and debate (as the authors acknowledge), there is enough interest in this topic to justify the development of theoretical models of drift.

      The contribution of this paper is to claim that drift can reflect a mixture of "directed random motion" as well as "steady state null drift." Thus far, most work within the computational neuroscience literature has focused on the latter. That is, drift is often viewed to be a harmless byproduct of continual learning under noise. In this view, drift does not affect the performance of the circuit nor does it change the nature of the network's solution or representation of the environment. The authors aim to challenge the latter viewpoint by showing that the statistics of neural representations can change (e.g. increase in sparsity) during early stages of drift. Further, they interpret this directed form of drift as "implicit regularization" on the network.

      The evidence presented in favor of these claims is concise. Nevertheless, on balance, I find their evidence persuasive on a theoretical level -- i.e., I am convinced that implicit regularization of noisy learning rules is a feature of most artificial network models. This paper does not seem to make strong claims about real biological systems. The authors do cite circumstantial experimental evidence in line with the expectations of their model (Khatib et al. 2022), but those experimental data are not carefully and quantitatively related to the authors' model.

      We thank the reviewer for pushing us to present stronger experimental evidence. We now analysed data from four different labs. Two of those are novel analyses of existing data (Karlsson et al, Jercog et al). All datasets show the same trend - increasing sparsity and increasing information per cell. We think that the results, presented in the new figure 3, allow us to make a stronger claim on real biological systems.

      To establish the possibility of implicit regularization in artificial networks, the authors cite convincing work from the machine-learning community (Blanc et al. 2020, Li et al., 2021). Here the authors make an important contribution by translating these findings into more biologically plausible models and showing that their core assumptions remain plausible. The authors also develop helpful intuition in Figure 4 by showing a minimal model that captures the essence of their result.

      We are glad that these translation efforts are appreciated.
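
      The idea of directed drift along a zero-loss manifold can be illustrated with a two-parameter toy problem. The sketch below is not the authors' model or the minimal model of their Figure 4; it is a hypothetical example in the spirit of the label-noise analyses cited above. The loss is (w1*w2 - y)^2 with a noisy target y = 1 + noise. Every SGD step has the form (dw1, dw2) = c*(w2, w1), so the imbalance w1^2 - w2^2 is multiplied by exactly (1 - c^2) at each step: noise alone pushes the solution along the manifold w1*w2 = 1 towards the balanced, flatter point. All names and parameter values are assumptions chosen for illustration.

```python
import numpy as np


def label_noise_sgd(w1=3.0, w2=1.0 / 3.0, lr=0.02, noise_std=0.5,
                    steps=20000, seed=0):
    """SGD on the loss (w1*w2 - y)^2 with a noisy target y = 1 + noise.

    Returns the final weights and the trajectory of w1^2 - w2^2.
    """
    rng = np.random.default_rng(seed)
    imbalance = np.empty(steps + 1)
    imbalance[0] = w1 ** 2 - w2 ** 2
    for t in range(steps):
        target = 1.0 + noise_std * rng.standard_normal()  # label noise
        residual = w1 * w2 - target
        # Gradient of the squared error with respect to (w1, w2).
        g1, g2 = 2.0 * residual * w2, 2.0 * residual * w1
        w1, w2 = w1 - lr * g1, w2 - lr * g2
        imbalance[t + 1] = w1 ** 2 - w2 ** 2
    return w1, w2, imbalance


w1, w2, imbalance = label_noise_sgd()
print("initial imbalance:", imbalance[0])
print("final imbalance:  ", imbalance[-1])
print("final product w1*w2:", w1 * w2)
```

      The starting point already sits on the zero-loss manifold, so any systematic change in the imbalance is due to noise rather than to loss reduction. In this toy setting the shrinking imbalance plays the role of the slow, directed phase; a real network has many more directions to explore.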

      In Figure 2, the authors show a convincing example of the gradual sparsification of tuning curves during the early stages of drift in a model of 1D navigation. However, the evidence presented in Figure 3 could be improved. In particular, 3A shows a histogram displaying the fraction of active units over 1117 simulations. Although there is a spike near zero, a sizeable portion of simulations have greater than 60% active units at the end of the training, and critically the authors do not characterize the time course of the active fraction for every network, so it is difficult to evaluate their claim that "all [networks] demonstrated... [a] phase of directed random motion with the low-loss space." It would be useful to revise the manuscript to unpack these results more carefully. For example, a histogram of log(tau) computed in panel B on a subset of simulations may be more informative than the current histogram in panel A.

      The previous figure 3A was indeed confusing. In particular, it lumped together many simulations without proper curation. We redid this figure (now Figure 4), and added supplementary figures (Figures S1, S2) to better explain our results. It is now clear that the simulations with a large number of active units were either due to non-convergence, slow timescale of sparsification or simulations featuring label noise in which the fraction of active units is less affected. Regarding the log(tau) calculation, while it could indeed be an informative plot, it could not be calculated in a simple manner for all simulations. This is because learning curves are not always exponential, but sometimes feature initial plateaus (see also Saxe et al 2013, Schuessler et al 2020). We added a more detailed explanation of this limitation in the methods section, and we believe the current figure exemplifies the effect in a satisfactory manner.
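
      The limitation mentioned for log(tau) can be sketched numerically. The example below is a hypothetical illustration, not the authors' code: it estimates a time constant by a straight-line fit to the log of the loss, which is exact for a pure exponential but biased when the curve starts with a plateau. The function name and the synthetic curves are assumptions.

```python
import numpy as np


def fit_time_constant(loss, floor=0.0):
    """Estimate tau assuming loss(t) ~ floor + A * exp(-t / tau).

    Fits a line to log(loss - floor); only meaningful when the curve
    is close to a single exponential.
    """
    loss = np.asarray(loss, dtype=float)
    t = np.arange(loss.size)
    excess = loss - floor
    keep = excess > 0  # the log is undefined for non-positive values
    slope, _intercept = np.polyfit(t[keep], np.log(excess[keep]), 1)
    return -1.0 / slope


t = np.arange(200)
pure = 2.0 * np.exp(-t / 50.0)
# Same decay, but preceded by a 100-step plateau.
plateau = np.where(t < 100, 2.0, 2.0 * np.exp(-(t - 100) / 50.0))

print(fit_time_constant(pure))     # close to the true value of 50
print(fit_time_constant(plateau))  # larger than 50: the plateau biases the fit
```

      For the plateau curve a single tau is not a faithful summary, which is consistent with the response's point that the quantity could not be computed in a simple manner for all simulations.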

      Reviewer #2 (Public Review):

      Summary:

      In the manuscript "Representational drift as a result of implicit regularization" the authors study the phenomenon of representational drift (RD) in the context of an artificial network that is trained in a predictive coding framework. When trained on a task for spatial navigation on a linear track, they found that a stochastic gradient descent algorithm led to a fast initial convergence to spatially tuned units, but then to a second very slow, yet directed drift which sparsified the representation while increasing the spatial information. They finally show that this separation of timescales is a robust phenomenon and occurs for a number of distinct learning rules.

      Strengths:

      This is a very clearly written and insightful paper, and I think people in the community will benefit from understanding how RD can emerge in such artificial networks. The mechanism underlying RD in these models is clearly laid out and the explanation given is convincing.

      We thank the reviewer for the support.

      Weaknesses:

      It is unclear how this mechanism may account for the learning of multiple environments.

      There are two facets to the topic of multiple environments. First, are the results of the current paper relevant when there are multiple environments? Second, what is the interaction between brain mechanisms of dealing with multiple environments and the results of the current paper?

      We believe the answer to the first question is positive. The near-orthogonality of representations between environments implies that changes in one can happen without changes in the other. This is evident, for instance, in Khatib et al and Geva et al - in both cases, drift seems to happen independently in two environments, even though they are visited intermittently and are visually similar.

      The second question is a fascinating one, and we are planning to pursue it in future work. While the exact way in which the brain achieves this near-independence is an open question, remapping is one possible window into this process.

      We extended the discussion to make these points clear.

      The process of RD through this mechanism also appears highly non-stationary, in contrast to what is seen in familiar environments in the hippocampus, for example.

      The non-stationarity noted by the reviewer is indeed a major feature of our observations, and is indeed linked to familiarity. We divide learning into three phases (now more clearly stated in Table 1 and Figure 4C). The first, rapid phase, consists of improvement of performance - corresponding to initial familiarity with the environment. The third phase, often reported in the literature of representational drift, is indeed stationary and obtained after prolonged familiarity. Our work focuses on the second phase, which is not as immediate as the first one, and can take several days. We note in the discussion that experiments which include a long familiarization process can miss this phase (see also Table 3). Furthermore, we speculate that real life is less stationary than a lab environment, and this second phase might actually be more relevant there.

      Reviewer #3 (Public Review):

      Summary:

      Single-unit neural activity tuned to environmental or behavioral variables gradually changes over time. This phenomenon, called representational drift, occurs even when all external variables remain constant, and challenges the idea that stable neural activity supports the performance of well-learned behaviors. While a number of studies have described representational drift across multiple brain regions, our understanding of the underlying mechanism driving drift is limited. Ratzon et al. propose that implicit regularization - which occurs when machine learning networks continue to reconfigure after reaching an optimal solution - could provide insights into why and how drift occurs in neurons. To test this theory, Ratzon et al. trained a Feedforward Network to perform the oft-utilized linear track behavioral paradigm and compared the changes in hidden layer units to those observed in hippocampal place cells recorded in awake, behaving animals.

      Ratzon et al. clearly demonstrate that hidden layer units in their model undergo consistent changes even after the task is well-learned, mirroring representational drift observed in real hippocampal neurons. They show that the drift occurs across three separate measures: the active proportion of units (referred to as sparsification), spatial information of units, and correlation of spatial activity. They continue to address the conditions and parameters under which drift occurs in their model to assess the generalizability of their findings.

      However, the generalizability results are presented primarily in written form: additional figures are warranted to aid in reproducibility.

      We added figures, and a Github with all the code to allow full reproducibility.

      Last, they investigate the mechanism through which sparsification occurs, showing that the flatness of the manifold near the solution can influence how the network reconfigures. The authors suggest that their findings indicate a three-stage learning process: 1) fast initial learning followed by 2) directed motion along a manifold which transitions to 3) undirected motion along a manifold.

      Overall, the authors' results support the main conclusion that implicit regularization in machine learning networks mirrors representational drift observed in hippocampal place cells.

      We thank the reviewer for this summary.

      However, additional figures/analyses are needed to clearly demonstrate how different parameters used in their model qualitatively and quantitatively influence drift.

      We now provide additional figures regarding parameters (Figures S1, S2).

      Finally, the authors need to clearly identify how their data supports the three-stage learning model they suggest.

      Their findings promise to open new fields of inquiry into the connection between machine learning and representational drift and generate testable predictions for neural data.

      Strengths:

      (1) Ratzon et al. make an insightful connection between well-known phenomena in two separate fields: implicit regularization in machine learning and representational drift in the brain. They demonstrate that changes in a recurrent neural network mirror those observed in the brain, which opens a number of interesting questions for future investigation.

      (2) The authors do an admirable job of writing to a large audience and make efforts to provide examples to make machine learning ideas accessible to a neuroscience audience and vice versa. This is no small feat and aids in broadening the impact of their work.

      (3) This paper promises to generate testable hypotheses to examine in real neural data, e.g., that drift rate should plateau over long timescales (now testable with the ability to track single-unit neural activity across long time scales with calcium imaging and flexible silicon probes). Additionally, it provides another set of tools for the neuroscience community at large to use when analyzing the increasingly high-dimensional data sets collected today.

      We thank the reviewer for these comments. Regarding the hypotheses, these are partially confirmed in the new analyses we provide of data from multiple labs (new Figure 3 and Table 3) - indicating that prolonged exposure to the environment leads to more stationarity.

      Weaknesses:

      (1) Neural representational drift and directed/undirected random walks along a manifold in ML are well described. However, outside of the first section of the main text, the analysis focuses primarily on the connection between manifold exploration and sparsification without addressing the other two drift metrics: spatial information and place field correlations. It is therefore unclear if the results from Figures 3 and 4 are specific to sparseness or extend to the other two metrics. For example, are these other metrics of drift also insensitive to most of the Feedforward Network parameters as shown in Figure 3 and the related text? These concerns could be addressed with panels analogous to Figures 3a-c and 4b for the other metrics and will increase the reproducibility of this work.

      We note that the results from figures 3 and 4 (original manuscript) are based on abstract tasks, while in figure 2 there is a contextual notion of spatial position. Spatial position metrics are not applicable to the abstract tasks as they are simple random mapping of inputs, and there isn’t necessarily an underlying latent variable such as position. This transition between task types is better explained in the text now. In essence the spatial information and place field correlation changes are simply signatures of the movements in parameter space. In the abstract tasks their change becomes trivial, as the spatial information becomes strongly correlated with sparsity and place fields are simply the activity vectors of units. These are guaranteed to change as long as there are changes in the activity statistics. We present here the calculation of these metrics averaged over simulations for completeness.

      Author response image 1.

      (A) PV correlation between training time points averaged over 362 simulations. (B) Mean SI of units normalized to first time step, averaged over 362 simulations. Red line shows the average time point of loss convergence, the shaded area represents one standard deviation.
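
      For readers unfamiliar with the two metrics in this image, the sketch below shows one common way to compute them. It is an assumption that these definitions match the ones used for the panels: spatial information (SI) is taken here as the Skaggs measure in bits per spike, and the population vector (PV) correlation as the Pearson correlation between the vectors of unit activities at each position, averaged over positions. Function names and array layouts (units by positions) are hypothetical.

```python
import numpy as np


def spatial_information(rate_map, occupancy=None):
    """Skaggs spatial information (bits per spike) of one tuning curve."""
    rate_map = np.asarray(rate_map, dtype=float)
    if occupancy is None:
        p = np.full(rate_map.size, 1.0 / rate_map.size)  # uniform occupancy
    else:
        p = np.asarray(occupancy, dtype=float)
        p = p / p.sum()
    mean_rate = np.sum(p * rate_map)
    if mean_rate <= 0:
        return 0.0  # a silent unit carries no information
    ratio = rate_map / mean_rate
    nz = ratio > 0  # bins with zero rate contribute zero
    return float(np.sum(p[nz] * ratio[nz] * np.log2(ratio[nz])))


def pv_correlation(maps_a, maps_b):
    """Mean over positions of the correlation between population vectors.

    maps_a, maps_b: arrays of shape (units, positions) from two time points.
    """
    a = np.asarray(maps_a, dtype=float)
    b = np.asarray(maps_b, dtype=float)
    corrs = []
    for j in range(a.shape[1]):
        va, vb = a[:, j], b[:, j]
        if va.std() == 0 or vb.std() == 0:
            continue  # correlation is undefined for constant vectors
        corrs.append(np.corrcoef(va, vb)[0, 1])
    return float(np.mean(corrs)) if corrs else float("nan")


flat = [2.0, 2.0, 2.0, 2.0]    # untuned unit
sharp = [0.0, 0.0, 0.0, 8.0]   # unit active in a single bin
print(spatial_information(flat), spatial_information(sharp))
```

      A unit that fires in only one of N equally visited bins has SI = log2(N), while an untuned unit has SI = 0, which is why SI tends to rise as activity becomes sparser.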

      (2) Many caveats/exceptions to the generality of findings are mentioned only in the main text without any supporting figures, e.g., "For label noise, the dynamics were qualitatively different, the fraction of active units did not reduce, but the activity of the units did sparsify" (lines 116-117). Supporting figures are warranted to illustrate which findings are "qualitatively different" from the main model, which are not different from the main model, and which of the many parameters mentioned are important for reproducing the findings.

      We now added figures (S1, S2) that show this exactly. We also added a github to allow full reproduction.

      (3) Key details of the model used by the authors are not listed in the methods. While they are mentioned in reference 30 (Recanatesi et al., 2021), they need to be explicitly defined in the methods section to ensure future reproducibility.

      The details of the simulation are described in the Methods section. We also added a github to allow full reproducibility.

      (4) How different states of drift correspond to the three learning stages outlined by the authors is unclear. Specifically, it is not clear where the second stage ends, and the third stage begins, either in real neural data or in the figures. This is compounded by the fact that the third stage - of undirected, random manifold exploration - is only discussed in relation to the introductory Figure 1 and is never connected to the neural network data or actual brain data presented by the authors. Are both stages meant to represent drift? Or is only the second stage meant to mirror drift, while undirected random motion along a manifold is a prediction that could be tested in real neural data? Identifying where each stage occurs in Figures 2C and E, for example, would clearly illustrate which attributes of drift in hidden layer neurons and real hippocampal neurons correspond to each stage.

      Thanks for this comment, which urged us to better explain these concepts.

      The different processes (reduction in loss, reduction in Hessian) happen in parallel with different timescales. Thus, there are no sharp transitions between the phases. This is now explained in the text in relation to figure 4C, where the approximate boundaries are depicted.

      The term drift is often used to denote a change in representation without a change in behavior. In this sense, both the second and third phases correspond to drift. Only the third stage is stationary. This is now emphasized in the text and in the new Table 1. Regarding experimental data, apart from the new figure 3 with four datasets, we also summarize in Table 3 the relation between duration of familiarity and stationarity of the data.

      Recommendations for the authors:

      The reviewers have raised several concerns. They concur that the authors should address the specific points below to enhance the manuscript.

      (1) The three different phases of learning should be clearly delineated, along with how they are determined. It remains unclear in which exact phase the drift is observed.

      This is now clearly explained in the new Table 1 and Figure 4C. Note that the different processes (reduction in loss, reduction in Hessian) happen in parallel with different timescales. Thus, there are no sharp transitions between the phases. This is now explained in the text in relation to figure 4C, where the approximate boundaries are depicted.

      The term drift is often used to denote a change in representation without a change in behavior. In this sense, both the second and third phases correspond to drift. Only the third stage is stationary. This is now emphasized in the text and in the new Table 1. Regarding experimental data, apart from the new figure 3 with four datasets, we also summarize in Table 3 the relation between duration of familiarity and stationarity of the data.

      (2) The term "sparsification" of unit activity is not fully clear. Its meaning should be more explicitly explained, especially since, in the simulations, a significant number of units appear to remain active (Fig. 3A).

      We now define precisely the two measures we use - Active Fraction, and Fraction Active Units. There is a new section with an accompanying figure in the Methods section. As Figure S2 shows, the noise statistics (label noise vs. update noise) differentially affect these two measures.
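
      The precise definitions are given in the manuscript's Methods and are not reproduced in this response. The sketch below therefore uses one plausible reading, stated here as an assumption: "Fraction Active Units" is the share of units that respond to at least one input, and "Active Fraction" is, per unit, the share of inputs for which that unit is active. Function names, the threshold, and the example matrix are hypothetical.

```python
import numpy as np


def fraction_active_units(activity, threshold=0.0):
    """Share of units that are active for at least one input.

    activity: array of shape (units, inputs), e.g. hidden-layer outputs.
    """
    active = np.asarray(activity, dtype=float) > threshold
    return float(np.mean(active.any(axis=1)))


def active_fraction(activity, threshold=0.0):
    """Per-unit share of inputs for which the unit is active."""
    active = np.asarray(activity, dtype=float) > threshold
    return active.mean(axis=1)


# Three units, four inputs: one silent unit, one narrow, one broad.
activity = np.array([[0.0, 0.0, 0.0, 0.0],
                     [1.0, 0.0, 0.0, 0.0],
                     [2.0, 3.0, 0.0, 1.0]])
print(fraction_active_units(activity))
print(active_fraction(activity))
```

      Under this reading the two measures can dissociate: every unit may stay active for some input (no change in the fraction of active units) while each unit responds to fewer inputs (lower active fraction), which matches the qualitative description of the label-noise case.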

      (3) While the study primarily focuses on one aspect of representational drift-the proportion of active units-it should also explore other features traditionally associated with representational drift, such as spatial information and the correlation between place fields.

      This absence of features is related to the abstract nature of some of the tasks simulated in our paper. In our original submission the transition between a predictive coding task to more abstract tasks was not clearly explained, creating some confusion regarding the measured metrics. We now clarified the motivation for this transition.

      Both the initial simulation and the new experimental data analysis include spatial information (Figures 2,3). The following simulations (Figure 4) with many parameter choices use more abstract tasks, for which the notion of correlation between place cells and spatial information loses its meaning as there is no spatial ordering of the inputs, and every input is encountered only once. Spatial information becomes strongly correlated with the inverse of the active fraction metric. The correlation between place cells is also directly linked to increase in sparseness for these tasks.

      (4) There should be a clearer illustration of how labeling noise influences learning dynamics and sparsification.

      This was indeed confusing in the original submission. We removed the simulations with label noise from Figure 4, and added a supplementary figure (S2) illustrating the different effects of label noise.

      (5) The representational drift observed in this study's simulations appears to be nonstationary, which differs from in vivo reports. The reasons for this discrepancy should be clarified.

      We added experimental results from three additional labs demonstrating a change in activity statistics (i.e. increase in spatial information and increase in sparseness) over a long period of time. We suggest that such a change long after the environment is already familiar is an indication for the second phase, and stress that this change seems to saturate at some point, and that most drift papers start collecting data after this saturation, hence this effect was missed in previous in vivo reports. Furthermore, these effects are becoming more abundant with the advent of new calcium imaging methods, as the older electrophysiological recording methods did not usually allow recording of large numbers of cells for long periods of time. The new Table 3 surveys several experimental papers, emphasizing the degree of familiarity with the environment.

      (6) A distinctive feature of the hippocampus is its ability to learn different spatial representations for various environments. The study does not test representational drift in this context, a topic of significant interest to the community. Whether the authors choose to delve into this is up to them, but it should at least be discussed more comprehensively, as it's only briefly touched upon in the current manuscript version.

      There are two facets to the topic of multiple environments. First, are the results of the current paper relevant when there are multiple environments? Second, what is the interaction between brain mechanisms of dealing with multiple environments and the results of the current paper?

      We believe the answer to the first question is positive. The near-orthogonality of representations between environments implies that changes in one can happen without changes in the other. This is evident, for instance, in Khatib et al and Geva et al - in both cases, drift seems to happen independently in two environments, even though they are visited intermittently and are visually similar.

      The second question is a fascinating one, and we are planning to pursue it in future work. While the exact way in which the brain achieves this near-independence is an open question, remapping is one possible window into this process.

      We extended the discussion to make these points clear.

      (7) The methods section should offer more details about the neural nets employed in the study. The manuscript should be explicit about the terms "hidden layer", "units", and "neurons", ensuring they are defined clearly and not used interchangeably.

      We changed the usage of these terms to be more coherent and made our code publicly available. Specifically, “units” refer to artificial networks and “neurons” to biological ones.

      In addition, each reviewer has raised both major and minor concerns. These are listed below and should be addressed where possible.

      Reviewer #1 (Recommendations For The Authors):

      I recommend that the authors edit the text to soften their claims. For example:

      In the abstract "To uncover the underlying mechanism, we..." could be changed to "To investigate, we..."

      Agree. Done

      On line 21, "Specifically, recent studies showed that..." could be changed to "Specifically, recent studies suggest that..."

      Agree. Done

      On line 100, "All cases" should probably be softened to "Most cases" or more details should be added to Figure 3 to support the claim that every simulation truly had a phase of directed random motion.

      The text was changed in accordance with the reviewer’s suggestion. In addition, the figure was changed and only includes simulations in which we expected unit sparsity to arise (without label noise). We also added explanations and supplementary figures for label noise.

      Unless I missed something obvious, there is no new experimental data analysis reported in the paper. Thus, line 159 of the discussion, "a phenomenon we also observed in experimental data" should be changed to "a phenomenon that was recently reported in experimental data."

      We thank the reviewer for drawing our attention to this. We now analyzed data from three other labs, two of which are novel analyses on existing data. All four datasets show the same trends of sparseness with increasing spatial information. The new Figure 3 and text now describe this.

      On line 179 of the Discussion, "a family of network configurations that have identical performance..." could be softened to "nearly identical performance." It would be possible for networks to have minuscule differences in performance that are not detected due to stochastic batch effects or limits on machine precision.

      The text was changed in accordance with the reviewer’s suggestion.

      Other minor comments:

      Citation 44 is missing the conference venue, please check all citations are formatted properly.

      Corrected.

      In the discussion on line 184, the connection to remapping was confusing to me, particularly because the cited reference (Sanders et al. 2020) is more of a conceptual model than an artificial network model that could be adapted to the setting of noisy learning considered in this paper. How would an RNN model of remapping (e.g. Low et al. 2023; Remapping in a recurrent neural network model of navigation and context inference) be expected to behave during the sparsifying portion of drift?

      We now clarified this section. The conceptual model of Sanders et al includes a specific prediction (Figure 7 there) which is very similar to ours - a systematic change in robustness depending on duration of training. Regarding the Low et al model, using such mechanistic models is an exciting avenue for future research.

      Reviewer #2 (Recommendations For The Authors):

      I only have two major questions.

      (1) Learning multiple representations: Memory systems in the brain typically must store many distinct memories. Certainly, the hippocampus, where RD is prominent, is involved in the ongoing storage of episodic memories. But even in the idealized case of just two spatial memories, for example, two distinct linear tracks, how would this learning process look? Would there be any interference between the two learning processes or would they be largely independent? Is the separation of time scales robust to the number of representations stored? I understand that to answer this question fully probably requires a research effort that goes well beyond the current study, but perhaps an example could be shown with two environments. At the very least the authors could express their thoughts on the matter.

      There are two facets to the topic of multiple environments. First, are the results of the current paper relevant when there are multiple environments? Second, what is the interaction between brain mechanisms of dealing with multiple environments and the results of the current paper?

      We believe the answer to the first question is positive. The near-orthogonality of representations between environments implies that changes in one can happen without changes in the other. This is evident, for instance, in Khatib et al and Geva et al - in both cases, drift seems to happen independently in two environments, even though they are visited intermittently and are visually similar.

      The second question is a fascinating one, and we are planning to pursue it in future work. While the exact way in which the brain achieves this near-independence is an open question, remapping is one possible window into this process.

      We extended the discussion to make these points clear.

      (2) Directed drift versus stationarity: I could not help but notice that the RD illustrated in Fig.2D is not stationary in nature, i.e. the upper right and lower left panels are quite different. This appears to contrast with findings in the hippocampus, for example, Fig.3e-g in (Ziv et al, 2013). Perhaps it is obvious that a directed process will not be stationary, but the authors note that there is a third phase of steady-state null drift. Is the RD seen there stationary? Basically, I wonder if the process the authors are studying is relevant only as a novel environment becomes familiar, or if it is also applicable to RD in an already familiar environment. Please discuss the issue of stationarity in this context.

      The non-stationarity noted by the reviewer is indeed a major feature of our observations, and is indeed linked to familiarity. We divide learning into three phases (now more clearly stated in Table 1 and Figure 4C). The first, rapid, phase consists of improvement of performance - corresponding to initial familiarity with the environment. The third phase, often reported in the literature of representational drift, is indeed stationary and obtained after prolonged familiarity. Our work focuses on the second phase, which is not as immediate as the first one, and can take several days. We note in the discussion that experiments which include a long familiarization process can miss this phase (see also Table 3). Furthermore, we speculate that real life is less stationary than a lab environment, and this second phase might actually be more relevant there.

      Reviewer #3 (Recommendations For The Authors):

      Most of my general recommendations are outlined in the public review. A large portion of my comments regards increasing clarity and explicitly defining many of the terms used which may require generating more figures (to better illustrate the generality of findings) or modifying existing figures (e.g., to show how/where the three stages of learning map onto the authors' data).

      Sparsification is not clearly defined in the main text. As I read it, sparsification is meant to refer to the activity of neurons, but this needs to be clearly defined. For example, lines 262-263 in the methods define "sparseness" by the number of active units, but lines 116-117 state: "For label noise, the dynamics were qualitatively different, the fraction of active units did not reduce, but the activity of the units did sparsify." If the fraction of active units (defined as "sparseness") did not change, what does it mean that the activity of the units "sparsified"? If the authors mean that the spatial activity patterns of hidden units became more sharply tuned, this should be clearly stated.

      We now defined precisely the two measures we use - Active Fraction, and Fraction Active Units. There is a new section with an accompanying figure in the Methods section. As Figure S2 shows, the noise statistics (label noise vs. update noise) differentially affect these two measures.

      Likewise, it is unclear which of the features the authors outlined - spatial information, active proportion of units, and spatial correlation - are meant to represent drift. The authors should clearly delineate which of these three metrics they mean to delineate drift in the main text rather than leave it to the reader to infer. While all three are mentioned early on in the text (Figure 2), the authors focus more on sparseness in the last half of the text, making it unclear if it is just sparseness that the authors mean to represent drift or the other metrics as well.

      The main focus of our paper is on the non-stationarity of drift. Namely that features (such as these three) systematically change in a directed manner as part of the drift process. This is in The new analyses of experimental data show sparseness and spatial information.

      The focus on sparseness in the second half of the paper is because we move to more abstract These are also easy to study in the more abstract tasks in the second part of the paper. In our original submission the transition between a predictive coding task to more abstract tasks was not clearly explained, creating some confusion regarding the measured metrics. We now clarified the motivation for this transition.

      It is not clear if a change in the number of active units alone constitutes "drift", especially since Geva et al. (2023) recently showed that both changes in firing rate AND place field location drive drift, and that the passage of time drives changes in activity rate (or # cells active).

      Our work did not deal with purely time-dependent drift, but rather focused on experience-dependence. Furthermore, Geva et al. study the stationary phase of drift, where we do not expect a systematic change in the total number of cells active. They report changes in the average firing rate of active cells in this phase, as a function of time - which does not contradict our findings.

      "hidden layer", "units", and "neurons" seem to be used interchangeably in the text (e.g., line 81-85). However, this is confusing in several places, in particular in lines 83-85 where "neurons" is used twice. The first usage appears to refer to the rate maps of the hidden layer units simulated by the authors, while the second "neurons" appears to refer to real data from Ziv 2013 (ref 5). The authors should make it explicit whether they are referring to hidden layer units or actual neurons to avoid reader confusion.

      We changed the usage of these terms to be more coherent. Specifically, “units” refer to artificial networks and “neurons” to biological ones.

      The authors should clearly illustrate which parts of their findings support their three-phase learning theory. For example, does 2E illustrate these phases, with the first tenth of training time points illustrating the early phase, time 0.1-0.4 illustrating the intermediate phase, and 0.4-1 illustrating the last phase? Additionally, they should clarify whether the second and third stages are meant to represent drift, or is it only the second stage of directed manifold exploration that is considered to represent drift? This is unclear from the main text.

      The different processes (reduction in loss, reduction in Hessian) happen in parallel with different timescales. Thus, there are no sharp transitions between the phases. This is now explained in the text in relation to figure 4C, where the approximate boundaries are depicted.

      The term drift is often used to denote a change in representation without a change in behavior. In this sense, both the second and third phases correspond to drift. Only the third stage is stationary. This is now emphasized in the text and in the new Table 1. Regarding experimental data, apart from the new figure 3 with four datasets, we also summarize in Table 3 the relation between duration of familiarity and stationarity of the data.

      Line 45 - It appears that the acronym ML is not defined above here anywhere.

      Added.

      Line 71: the ReLU function should be defined in the text, e.g., sigma(x) = x if x > 0 else 0.

      Added.
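
      For reference, the definition the reviewer requests can be written out directly. The sketch below is only an illustration of the standard ReLU, sigma(x) = x if x > 0 else 0; the function name is an assumption and is not taken from the manuscript.

```python
def relu(x: float) -> float:
    """Rectified linear unit: returns x when x is positive, otherwise 0."""
    return x if x > 0 else 0.0


# Positive inputs pass through unchanged; zero and negative inputs map to 0.
print([relu(v) for v in (-1.5, 0.0, 2.5)])
```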

      106-107: Figures (or supplemental figures) to demonstrate how most parameters do not influence sparsification dynamics are warranted. As written, it is unclear what "most parameters" mean - all but noise scale. What about the learning rule? Are there any interactions between parameters?

      We now removed the label noise from Figure 4, and added two supplementary figures to clearly explain the effect of parameters. Figure 4 itself was also redone to clarify this issue.

      2F middle: should "change" be omitted for SI?

      The panel was replaced by a new one in Figure 3.

      116-119: A figure showing how results differ for label noise is warranted.

      This is now done in Figure S1, S2.

      124: typo, The -> the

      Corrected.

      127-129: This conclusion statement is the first place in the text where the three stages are explicitly outlined. There does not appear to be any support or further explanation of these stages in the text above.

      We now explain this earlier at the end of the Introduction section, along with the new Table 1 and marking on Figure 4C.

      132-133 seems to be more of a statement and less of a prediction or conclusion - do the authors mean "the flatness of the loss landscape in the vicinity of the solution predicts the rate of sparsification?"

      We thank the reviewer for this observation. The sentence was rephrased:

      Old: As illustrated in Fig. 1, different solutions in the zero-loss manifold might vary in some of their properties. The specific property suggested from theory is the flatness of the loss landscape in the vicinity of the solution.

      New: As illustrated in Fig. 1, solutions in the zero-loss manifold have identical loss, but might vary in some of their properties. The authors of [26] suggest that noisy learning will slowly increase the flatness of the loss landscape in the vicinity of the solution.

      135: typo, it's -> its

      Corrected.

      Line 135-136 "Crucially, the loss on the entire manifold is exactly zero..." This appears to contradict the Figure 4A legend - the loss appears to be very high near the top and bottom edges of the manifold in 4A. Do the authors mean that the loss along the horizontal axis of the manifold is zero?

      The reviewer is correct. The manifold mentioned in the sentence is indeed the horizontal axis. We changed the text and the figure to make it clearer.

      Equation 6: This does not appear to agree with equation 2 - should there be an E_t term for an expectation function?

      Corrected.

      Line 262-263: "Sparseness means that a unit has become inactive for all inputs." This should also be stated explicitly as the definition of sparseness/sparsification in the main text.

      We now define precisely the two measures we use - Active Fraction, and Fraction Active Units. There is a new section with an accompanying figure in the Methods section. As Figure S2 shows, the noise statistics (label noise vs. update noise) differentially affect these two measures.

    2. Reviewer #2 (Public Review):

      Summary:

      In the manuscript "Representational drift as a result of implicit regularization" the authors study the phenomenon of representational drift (RD) in the context of an artificial network which is trained in a predictive coding framework. When trained on a task for spatial navigation on a linear track, they found that a stochastic gradient descent algorithm led to a fast initial convergence to spatially tuned units, but then to a second very slow, yet directed drift which sparsified the representation while increasing the spatial information. They finally show that this separation of time-scales is a robust phenomenon and occurs for a number of distinct learning rules.

      This is a very clearly written and insightful paper, and I think people in the community will benefit from understanding how RD can emerge in such artificial networks. The mechanism underlying RD in these models is clearly laid out and the explanation given is convincing.

      It still remains unclear how this mechanism may account for the learning of multiple environments, although this is perhaps a topic for future study. The non-stationarity of the drift in this framework would seem, at first blush, to contrast with what one sees experimentally, but the authors provide compelling evidence that there are continuous changes in network properties during learning and that stationarity may be the hallmark of overfamiliarized environments. Future experimental work may further shed light on differences in RD between novel and familiar environments.

    1. Author response:

      The following is the authors’ response to the original reviews.

      General comments

      All three experts have raised excellent ideas and made important suggestions to extend the scope of our study and provide additional information. While we fully acknowledge that these points are valid and would provide exciting new knowledge, we also should not lose track of the fact that a single study cannot cover all bases. Sulfated steroids, for example, are clearly essential components of mouse urine. Unfortunately, however, all chemical analysis approaches are limited and the one we opted for is not suitable for analysis of such signaling molecules. Future studies should certainly focus on these aspects. The same holds true for the fact that we do not know which of the identified compounds are actually VSN ligands. These are inherent limitations of the approach, and we are not claiming otherwise.

      Reviewer #1 (Public Review):

      (1) In this manuscript, Nagel et al. sought to comprehensively characterize the composition of urinary compounds, some of which are putative chemosignals. They used urines from adult males and females in three different strains, including one wild-derived strain. By performing mass spectrometry of two classes of compounds: volatile organic compounds and proteins, they found that urines from inbred strains are qualitatively similar to those of a wild strain. This finding is significant because there is a high degree of genetic diversity in wild mice, with chemosensory receptor genes harboring many polymorphisms.

      We agree and thank the Reviewer for his / her positive assessment.

      (2) In the second part of this work, the authors used calcium imaging to monitor the pattern of vomeronasal neuron responses to these urines. By performing pairwise comparisons, the authors found a large degree of strain-specific response and a relatively minor response to sex-specific urinary stimuli. This is a finding generally in agreement with previous calcium imaging work by Ron Yu and colleagues in 2008. The authors extend the previous work by using urines from wild mice. They further report that the concentration diversity of urinary compounds in different urine batches is largely uncorrelated with the activity profiles of these urines. In addition, the authors found that the patterns of vomeronasal neuron response to urinary cues are not identical when measured using different recipient strains. This fascinating finding, however, requires an additional control to exclude the possibility that this is not due to sampling error.

      We thank Reviewer 1 for pointing this out. We agree that this is truly a “fascinating finding.” Reviewer 1 emphasizes that we need to add an “additional control to exclude […] that this is not due to sampling error”, and he / she elaborates on the required control in his / her Recommendations For The Authors (see below). Reviewer 1 states that “for Fig. 5, in order to conclude that the same urine activates a different population of VSNs in two different strains, a critical control is needed to demonstrate that this is not due to the sampling variability - as compositions of V1Rs and V2Rs could vary between different slices, one preferred control is to use VNO slices from the same strain and compare the selectivity used here across the A-P axis.” Importantly, we believe that this is already controlled for. In fact, for each experiment, we routinely prepare VNO slices along the organ’s entire anterior-to-posterior axis (not including the most anterior tip, where the VNO lumen tapers into the vomeronasal duct, and the most posterior part, where the lumen “twists” toward the ventral aspect and its volume decreases (see Figs. 7 & S7 in Hamacher et al., 2024, Current Biology)). This usually yields ~7 slices per individual experiment / session. Therefore, we routinely sample and average across the entire VNO anterior-to-posterior axis for each experiment. In Fig. 5, in which we analyzed whether the “same urine activates a different population of VSNs in two different strains”, individual independent experiments from each strain (C57BL/6 versus BALB/c) amounted to (a) n = 6 versus n = 8; (b) n = 10 versus n = 10; (c) n = 7 versus n = 9; (d) n = 9 versus n = 10; (e) n = 10 versus n = 9; and (f) n = 12 versus n = 10. Together, we conclude that it is very unlikely that the considerably different response profiles measured in different recipient strains result from a “sampling error.”

      To clarify this point in the revised manuscript, we now explain our sampling routine in more detail in the Materials and Methods. Moreover, we now also refer to this point in the Results.

      (3) There are several weaknesses in this manuscript, including the lack of analysis of the compositions of sulfated steroids and other steroids, which have been proposed to be the major constituents of vomeronasal ligands in urines and the indirect (correlational) nature of their mass spectrometry data and activity data.

      Reviewer 1 is correct to point out that our chemical profiling approach omits (sulfated) steroids. We are aware of this weakness. We deliberately decided to omit steroids as well as other non-volatile small organic molecules for three main reasons: (i) as the reviewer points out, (sulfated) steroid composition has been the focus of analysis in several previous studies and there is ample published information available on their role as VSN stimuli; (ii) the analytical tools available to us do not allow comprehensive profiling of non-volatile small organic molecules; employing two-dimensional head-space GC-MS as well as LC-MS/MS is not suitable for steroid detection; and (iii) the relatively small sample volumes forced us to prioritize and focus on specific chemical classes (in our case, VOCs and proteins). We made an effort to use the exact same stimuli as previously employed to investigate sensory representations in the accessory olfactory bulb (AOB) (Bansal et al., 2021), a feature that we consider a strength of the current study. However, this entailed that we had to effectively split our samples, further reducing the available sample volume.

      We acknowledge that we did not sufficiently describe our rationale for focusing on VOCs and proteins in the previous version of the manuscript (nor did we discuss the known role of (sulfated) steroids in VSN signaling in adequate detail). We have now made an effort to address these shortcomings in the revised manuscript. Specifically, we have added new text to the Introduction (“Prominent molecularly identified VSN stimuli include various sulfated steroids (Celsi et al., 2012; Fu et al., 2015; Haga-Yamanaka et al., 2015, 2014; Isogai et al., 2011; Nodari et al., 2008; Turaga and Holy, 2012), which could reflect the dynamic endocrine state of an individual.”) and the Discussion (“Notably, our chemical profiling approach omits (sulfated) steroids and other non-volatile small organic molecules, which have previously been identified in mouse urine as VSN stimuli (Nodari et al., 2008). Caution should thus be exerted to not attempt to fully explain VSN response specificity based on VOC and protein content alone.” & “In line with the notion of highly selective vomeronasal sampling is our observation that the concentration differences between compounds shared among strains, which are often substantial, are not reflected by similarly pronounced differences in response strength among generalist VSNs. There are several, not necessarily mutually exclusive explanations for this finding: First, concentration could simply not be a read-out parameter for VSNs, which would support previous ideas of concentration-invariant VSN activity (Leinders-Zufall et al., 2000). Second, the concentrations in freshly released urine could just exceed the dynamic tuning range of VSNs since, particularly for VOCs, natural signals (e.g., in scent marks) must be accessible to a recipient for a prolonged amount of time (sometimes days).
A similar rationale could explain the increased protein concentrations in male urine, since male mice use scent marking to establish and maintain their territories and urinary lipocalins serve as long-lasting reservoirs of VOCs (Hurst et al., 1998). Third, generalist VSNs might sample information only from a select subset of urinary compounds, which, given their role as biologically relevant chemosignals, might be released at tightly controlled (and thus similar) concentrations. In fact, in the most extreme scenario, several compounds that do display substantial strain- and/or sex-specific differences in concentration might not act as chemosignals at all. Fourth, to some extent, different response profiles could be attributed to non-volatile small organic molecules such as steroids (Nodari et al., 2008), which were beyond the focus of our chemical analysis.”).

      (4) Overall, the major contribution of this work is the identification of specific molecules in mouse urines. This work is likely to be of significant interest to researchers in chemosensory signaling in mammals and provides a systematic avenue to exhaustively identify vomeronasal ligands in the future.

      We thank the Reviewer for his / her generally positive assessment.

      Reviewer #2 (Public Review):

      (1) This manuscript by Nagel et al provides a comprehensive examination of the chemical composition of mouse urine (an important source of semiochemicals) across strain and sex, and correlates these differences with functional responses of vomeronasal sensory neurons (an important sensory population for detecting chemical social cues). The strength of the work lies in the careful and comprehensive imaging and chemical analyses, the rigor of quantification of functional responses, and the insight into the relevance of olfactory work on lab-derived vs wild-derived mice.

      We thank the Reviewer for his / her generally positive assessment.

      (2) With regard to the chemical analysis, the reader should keep in mind that a difference in the concentration of a chemical across strain or sex does not necessarily mean that that chemical is used for chemical communication. In the most extreme case, the animals may be completely insensitive to the chemical. Thus, although the repertoire of proteins and volatiles could potentially allow sex and/or strain discrimination, it is unclear to what degree both are used in different situations.

      Reviewer 2 is correct to point out that sex- and/or strain-dependent differences in urine molecular composition do not automatically attribute a signaling function to those molecules. We concur and, in fact, stress this point many times throughout the manuscript. In the Results, for example, we point out (i) that “in female urine, BALB/c-specific proteins are substantially underrepresented, a fact not reflected by VSN response profiles”, (ii) that “as observed in C57BL/6 neurons, the skewed distributions of protein concentration indices were not reflected by BALB/c generalist VSN profiles”, and (iii) that “VSN population response profiles do not reflect the global molecular content of urine, suggesting that the VNO functions as a rather selective molecular detector.” Moreover, in the Discussion, we state (i) that “caution should thus be exerted to not attempt to fully explain VSN response specificity based on VOC and protein content alone”; (ii) that, for several sex- and/or strain-specific molecules, none “has previously been attributed a chemosensory function. Challenging the mouse VNO with purified recombinant protein(s) will help elucidate whether such functions exist”; (iii) that “generalist VSNs might sample information only from a select subset of urinary compounds, which, given their role as biologically relevant chemosignals, might be released at tightly controlled (and thus similar) concentrations”; and (iv) that “to some extent, different response profiles could be attributed to non-volatile small organic molecules such as steroids (Nodari et al., 2008), which were beyond the focus of our chemical analysis.”

      In the revised manuscript, we now aim to even more strongly emphasize the point made by Reviewer 2. In the Discussion, we have deleted a sentence that read: “Sex- and strain-specific chemical profiles give rise to unique VSN activity patterns.” Moreover, we have added the following statement: “In fact, in the most extreme scenario, several compounds that do display substantial strain- and/or sex-specific differences in concentration might not act as chemosignals at all.”

      Reviewer #3 (Public Review):

      (1) One of the primary objectives in this study is to ascertain the extent to which the response profiles of VSNs are specific to sex and strain. The design of these Ca2+ imaging experiments uses a simple stimulus design, using two interleaved bouts of stimulation with pairs of urine (e.g. male versus female C57BL/6, male C57BL/6 versus male BALB/c) at a single dilution factor (1:100). This introduces two significant limitations: (1) the "generalist" versus "specialist" descriptors pertain only to the specific pairwise comparisons made and (2) there is no information about the sensitivity/concentration-dependence of the responses.

      Reviewer 3 points to two limitations of our VSN activity assay. He / she is correct to mention that characterizing a VSN as generalist or specialist based on a “pairwise comparison” should not be the basis of attributing such a “generalist” or “specialist” label in general (i.e., regarding the global stimulus space). We acknowledge this point, but we do not regard this as a limitation of our study since we are not investigating rather broad (i.e., multidimensional) questions of selectivity. All we are asking in the context of this study is whether VSNs - when being challenged with pairs of sex- or strain-specific urine samples - act as rather selective semiochemical detectors. Of course, one can always think of a study design that provides more information. However, we here opted for an assay that - in our hands - is robust, “low noise” (i.e., displays low intrinsic signal variability as evident from reliability index calculations), ensures recovery from VSN adaptation (Wong et al., 2018), and, importantly, answers the specific question we are asking.

      Regarding the second point (“there is no information about the sensitivity/concentration-dependence of the responses”), we would like to emphasize that this was not a focus of our study either. In fact, concentration-dependence of VSN activity has been a major focus of several previous studies referenced in our manuscript (e.g., Leinders-Zufall et al., 2000; He et al., 2008), albeit with contradictory results. In our study, we ask whether a pair of stimuli that we have shown to display, in part, strikingly different chemical composition (both absolute and relative) preferentially activates the same or different VSNs. With this question in mind, we believe that our assay (and its results) are highly informative.

      (2) The functional measurements of VSN tuning to various pairs of urine stimuli are consistently presented alongside mass spectrometry-based comparisons. Although it is clear from the manuscript text that the mass spectrometry-based analysis was separated from the VSN tuning experiments/analysis, the juxtaposition of VSN tuning measurements with independent molecular diversity measurements gives the appearance to readers that these experiments were integrated (i.e., that the diversity of ligands was underlying the diversity of physiological responses). This is a hypothesis raised by the parallel studies, not a supported conclusion of the work. This data presentation style risks confusing readers.

      As Reviewer 3 points out correctly, “it is clear from the manuscript text that the mass spectrometry-based analysis was separated from the VSN tuning experiments/analysis.” In the figures, we try to make the distinction between VSN response statistics and chemical profiling more obvious by gray shadows that link the plots depicting VSN response characteristics to the general pie charts.

      We now also made an extra effort to avoid “confusing readers” by stating in the Discussion (i) that “caution should thus be exerted to not attempt to fully explain VSN response specificity based on VOC and protein content alone”; (ii) that, for several sex- and/or strain-specific molecules, none “has previously been attributed a chemosensory function. Challenging the mouse VNO with purified recombinant protein(s) will help elucidate whether such functions exist”; (iii) that “generalist VSNs might sample information only from a select subset of urinary compounds, which, given their role as biologically relevant chemosignals, might be released at tightly controlled (and thus similar) concentrations”; and (iv) that “to some extent, different response profiles could be attributed to non-volatile small organic molecules such as steroids (Nodari et al., 2008), which were beyond the focus of our chemical analysis.” Moreover, we have deleted a sentence that read: “sex- and strain-specific chemical profiles give rise to unique VSN activity patterns”, and we have added the following statement: “In fact, in the most extreme scenario, several compounds that do display substantial strain- and/or sex-specific differences in concentration might not act as chemosignals at all.”

      However, we believe that there is value in presenting “VSN tuning measurements” next to “independent molecular diversity measurements.” While these are independent measurements, their similarity or, quite frequently, lack thereof are informative. We are sure that by taking the above “precautions” we have now mitigated the risk of “confusing readers.”

      (3) The impact of mass spectrometry findings is limited by the fact that none of these molecules (in bulk, fractions, or monomolecular candidate ligands) were tested on VSNs. It is possible that only a very small number of these ligands activate the VNO. The list of variably expressed proteins - especially several proteins that are preferentially found in female urine - is compelling, but, again, there is no evidence presented that indicates whether or not these candidate ligands drive VSN activity. It is noteworthy that the largest class of known natural ligands for VSNs are small nonvolatiles that are found at high levels in mouse urine. These molecules were almost certainly involved in driving VSN activity in the physiology assays (both "generalist" and "specialist"), but they are absent from the molecular analysis.

      Reviewer 3 is right, of course, that at this point we have not tested the identified molecules on VSNs. This is clearly beyond the scope of the present study. We believe that the data we present will be the basis of (several full-length) future studies that aim to identify specific ligands and - best case scenario - receptor-ligand pairs. We find it hard to concur that our study, which provides the necessary basis for those future endeavors, is regarded as “incomplete”. By design, all studies are somewhat incomplete, i.e., there are always remaining questions and we are not contesting that.

      It is true, of course, that a class of “known natural ligands for VSNs are small nonvolatiles.” As we replied above, our chemical profiling approach omits (sulfated) steroids. We are aware of this weakness. We deliberately decided to omit steroids as well as other non-volatile small organic molecules for three main reasons: (i) steroid composition has been the focus of analysis in several previous studies and there is ample published information available on their role as VSN stimuli; (ii) the analytical tools available to us do not allow comprehensive profiling of non-volatile small organic molecules; employing two-dimensional head-space GC-MS as well as LC-MS/MS is not suitable for steroid detection; and (iii) the relatively small sample volumes forced us to prioritize and focus on specific chemical classes (in our case, VOCs and proteins). We made an effort to use the exact same stimuli as previously employed to investigate sensory representations in the accessory olfactory bulb (AOB) (Bansal et al., 2021), a fact that we consider a key strength of our current study. However, this entailed that we had to effectively split our samples, further reducing the available sample volume.

      We acknowledge that we did not sufficiently describe our rationale for focusing on VOCs and proteins in the previous version of the manuscript (nor did we discuss the known role of (sulfated) steroids in VSN signaling in adequate detail). We have now made an effort to address these shortcomings in the revised manuscript. Specifically, we have added new text to the Introduction (“Prominent molecularly identified VSN stimuli include various sulfated steroids (Celsi et al., 2012; Fu et al., 2015; Haga-Yamanaka et al., 2015, 2014; Isogai et al., 2011; Nodari et al., 2008; Turaga and Holy, 2012), which could reflect the dynamic endocrine state of an individual.”) and the Discussion (“Notably, our chemical profiling approach omits (sulfated) steroids and other non-volatile small organic molecules, which have previously been identified in mouse urine as VSN stimuli (Nodari et al., 2008). Caution should thus be exerted to not attempt to fully explain VSN response specificity based on VOC and protein content alone.” & “In line with the notion of highly selective vomeronasal sampling is our observation that the concentration differences between compounds shared among strains, which are often substantial, are not reflected by similarly pronounced differences in response strength among generalist VSNs. There are several, not necessarily mutually exclusive explanations for this finding: First, concentration could simply not be a read-out parameter for VSNs, which would support previous ideas of concentration-invariant VSN activity (Leinders-Zufall et al., 2000). Second, the concentrations in freshly released urine could just exceed the dynamic tuning range of VSNs since, particularly for VOCs, natural signals (e.g., in scent marks) must be accessible to a recipient for a prolonged amount of time (sometimes days).
A similar rationale could explain the increased protein concentrations in male urine, since male mice use scent marking to establish and maintain their territories and urinary lipocalins serve as long-lasting reservoirs of VOCs (Hurst et al., 1998). Third, generalist VSNs might sample information only from a select subset of urinary compounds, which, given their role as biologically relevant chemosignals, might be released at tightly controlled (and thus similar) concentrations. In fact, in the most extreme scenario, several compounds that do display substantial strain- and/or sex-specific differences in concentration might not act as chemosignals at all. Fourth, to some extent, different response profiles could be attributed to non-volatile small organic molecules such as steroids (Nodari et al., 2008), which were beyond the focus of our chemical analysis.”).

      Reviewer #1 (Recommendations For The Authors):

      (1) I find that the study is highly valuable for researchers in this field. With the finding that wild mouse urines do not elicit significantly more variable responses than urines from inbred strains, researchers can now be reassured to use inbred strains to gain general insights on pheromone signaling.

      A major omission of this study is non-volatile small organic molecules such as steroids. These compounds are the only molecular class in urine that has been identified to stimulate specific vomeronasal receptors to date. It is unclear to me whether the specificity of VOCs and proteins alone can fully explain the response specificity of the VSNs that have been monitored in this study. A discussion of this topic would be highly beneficial for the readers.

      Reviewer 1 is correct to point out that our chemical profiling approach omits (sulfated) steroids. We are aware of this weakness. We deliberately decided to omit steroids as well as other non-volatile small organic molecules for three main reasons: (i) as the reviewer points out, (sulfated) steroid composition has been the focus of analysis in several previous studies and there is ample published information available on their role as VSN stimuli; (ii) the analytical tools available to us do not allow comprehensive profiling of non-volatile small organic molecules; employing two-dimensional head-space GC-MS as well as LC-MS/MS is not suitable for steroid detection; and (iii) the relatively small sample volumes forced us to prioritize and focus on specific chemical classes (in our case, VOCs and proteins). We made an effort to use the exact same stimuli as previously employed to investigate sensory representations in the accessory olfactory bulb (AOB) (Bansal et al., 2021), a fact that we consider a key strength of our current study. However, this entailed that we had to effectively split our samples, further reducing the available sample volume.

      We acknowledge that we did not sufficiently describe our rationale for focusing on VOCs and proteins in the previous version of the manuscript (nor did we discuss the known role of (sulfated) steroids in VSN signaling in adequate detail). We have now made an effort to address these shortcomings in the revised manuscript. Specifically, we have added new text to the Introduction (“Prominent molecularly identified VSN stimuli include various sulfated steroids (Celsi et al., 2012; Fu et al., 2015; Haga-Yamanaka et al., 2015, 2014; Isogai et al., 2011; Nodari et al., 2008; Turaga and Holy, 2012), which could reflect the dynamic endocrine state of an individual.”) and the Discussion (“Notably, our chemical profiling approach omits (sulfated) steroids and other non-volatile small organic molecules, which have previously been identified in mouse urine as VSN stimuli (Nodari et al., 2008). Caution should thus be exercised to not attempt to fully explain VSN response specificity based on VOC and protein content alone.” & “In line with the notion of highly selective vomeronasal sampling is our observation that the concentration differences between compounds shared among strains, which are often substantial, are not reflected by similarly pronounced differences in response strength among generalist VSNs. There are several, not necessarily mutually exclusive explanations for this finding: First, concentration could simply not be a read-out parameter for VSNs, which would support previous ideas of concentration-invariant VSN activity (Leinders-Zufall et al., 2000). Second, the concentrations in freshly released urine could just exceed the dynamic tuning range of VSNs since, particularly for VOCs, natural signals (e.g., in scent marks) must be accessible to a recipient for a prolonged amount of time (sometimes days). 
A similar rationale could explain the increased protein concentrations in male urine, since male mice use scent marking to establish and maintain their territories and urinary lipocalins serve as long-lasting reservoirs of VOCs (Hurst et al., 1998). Third, generalist VSNs might sample information only from a select subset of urinary compounds, which, given their role as biologically relevant chemosignals, might be released at tightly controlled (and thus similar) concentrations. Fourth, to some extent, different response profiles could be attributed to non-volatile small organic molecules such as steroids (Nodari et al., 2008), which were beyond the focus of our chemical analysis.”).

      (2) How many different wild mouse urines were tested in this study? Is this sufficient to capture the diversity of wild M. musculus in local (Prague) habitats?

      We thank the reviewer for pointing this out. For the present study, 20 male (M) and 27 female (F) wild mice were caught at six different sites in the broader Prague area (i.e., Bohnice (50.13415N, 14.41421E; 2M+4F), Dolni Brezany (49.96321N, 14.4585E; 3M+4F), Hodkovice (49.97227N, 14.48039E; 5M+6F), Písnice (49.98988N, 14.46625E; 3M+6F), Lhota (49.95369N, 14.43087E; 1M+2F), and Zalepy (49.9532N, 14.40829E; 6M+5F)). 18 of the 27 wild females were caught pregnant. The remaining 9 females were mated with males caught at the same site and produced offspring within a month. When selecting 10 male and 10 female individuals from first-generation offspring for urine collection, we ensured that all six capture sites were represented and that age-matched animals displayed similar weight (~17g). We believe that this capture / breeding strategy sufficiently represents “the diversity of wild M. musculus in local (Prague) habitats.” In the revised manuscript, we have now included these details in the Materials and Methods.

      (3) I found Figure 1e and figures in a similar format confusing - one panel describes the response statistics of VSNs, and other panels show the number of compounds found in different MS profiling, which is not immediately obvious from the figures. Is the y-axis legend correct (%)?

      We now try to make the distinction between VSN “response statistics” and chemical profiling more obvious by adding gray shadows that link the plots depicting VSN response characteristics to the general pie charts. Moreover, we thank the Reviewer for pointing out the mislabeling of the y-axis. Accordingly, we have deleted “%” in all corresponding figures.

      (4) For Figure 5, in order to conclude that the same urine activates a different population of VSNs in two different strains, a critical control is needed to demonstrate that this is not due to the sampling variability - as compositions of V1Rs and V2Rs could vary between different slices, one preferred control is to use VNO slices from the same strain and compare the selectivity used here across the A-P axis.

      We thank Reviewer 1 for pointing this out. Importantly, we believe that this is already controlled for (see our response to the Public Review). In fact, for each experiment, we routinely prepare VNO slices along the entire anterior-to-posterior axis (not including the most anterior tip, where the VNO lumen tapers into the vomeronasal duct, and the most posterior part, where the lumen “twists” toward the ventral aspect and its volume decreases (see Figs. 7 & S7 in Hamacher et al., 2024, Current Biology)). This usually yields ~7 slices per individual experiment / session. Therefore, we routinely sample and average across the entire VNO anterior-to-posterior axis for each experiment. In Fig. 5, individual independent experiments from each strain (C57BL/6 versus BALB/c) amounted to (a) n = 6 versus n = 8; (b) n = 10 versus n = 10; (c) n = 7 versus n = 9; (d) n = 9 versus n = 10; (e) n = 10 versus n = 9; and (f) n = 12 versus n = 10. Together, we can thus exclude that the considerably different response profiles that we measured using different recipient strains result from a “sampling error.”

      To clarify this point in the revised manuscript, we now explain our sampling routine in more detail in the Materials and Methods. Moreover, we now also mention this point in the Results.

      Reviewer #2 (Recommendations For The Authors):

      (1) Pg 5 Lines 3-16: This summary paragraph contains too much detail given that the reader has not read the paper yet, which makes it bewildering. This should be condensed.

      We agree and have substantially condensed this paragraph.

      (2) Pg 6 Line 5-8: This summary of the experimental design is obtuse and should be edited for clarity.

      We have edited the relevant passage for clarity.

      (3) Pg 6 Line 11: "VSNs were categorized..." Specialist vs generalist is defined as responding to one or both stimuli. This definition is placed right after saying that the cells were also tested with KCl. The reader might think that specialist vs generalist was defined in relation to KCl.

      We have edited this sentence, which now reads: “Dependent on their individual urine response profiles, VSNs were categorized as either specialists (selective response to one stimulus) or generalists (responsive to both stimuli).”
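
      The categorization rule quoted above can be written down compactly. The following is only an illustrative sketch: it assumes that each neuron's responses to the two urine stimuli have already been reduced to booleans, and the names `classify_vsn` and `category_fractions`, as well as the label strings, are hypothetical and not taken from the manuscript.

```python
def classify_vsn(responds_a: bool, responds_b: bool) -> str:
    """Categorize a VSN by its responses to two urine stimuli.

    Hypothetical helper: the boolean inputs stand in for whatever
    response criterion the actual analysis applies.
    """
    if responds_a and responds_b:
        return "generalist"      # responsive to both stimuli
    if responds_a or responds_b:
        return "specialist"      # selective response to one stimulus
    return "non-responsive"      # no urine response


def category_fractions(cells):
    """Return the fraction of urine-responsive cells in each category.

    `cells` is an iterable of (responds_a, responds_b) boolean pairs.
    """
    labels = [classify_vsn(a, b) for a, b in cells]
    responsive = [lab for lab in labels if lab != "non-responsive"]
    if not responsive:
        return {"generalist": 0.0, "specialist": 0.0}
    n = len(responsive)
    return {
        "generalist": responsive.count("generalist") / n,
        "specialist": responsive.count("specialist") / n,
    }
```

      Under these assumptions, fractions are computed relative to urine-responsive cells only; a different denominator (for example, all K+-sensitive neurons) would simply replace `n`.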

      (4) Pg 6 Line 13: "we recorded urine-dependent Ca2+ signals from a total of 16,715 VSNs". Is a "signal" a response? Did all 16,715 VSNs respond to urine? What was the total of KCl responsive cells recorded?

      We edited the corresponding passage for clarification. The text now reads: “Overall, we recorded >43,000 K+-sensitive neurons, of which a total of 16,715 VSNs (38.4%) responded to urine stimulation. Of these urine-sensitive neurons, 61.4% displayed generalist profiles, whereas 38.6% were categorized as specialists (Figure 1c,d).”

      (5) Pg 7 Line 6: The repeated use of the word "pooled" is confusing as it suggests a variation in the experiment. The authors should establish once in the Methods and maybe in the Results that stimuli were pooled across animals. Then they should just refer to the stimulus as male or female or BALB/c rather than "pooled" male etc.

      We acknowledge the reviewer’s argument. Accordingly, we now introduce the experimental use of pooled urine once in the Methods and in the introductory paragraph of the Results. All other references to “pooled” urine in the Results and Captions have been deleted.

      (6) Pg 7 Line 10: "...detected in >=3 out of 10 male..." For the chemical analysis, were these samples not pooled?

      Correct. We deliberately did not pool samples for chemical analysis, but instead analyzed all individual samples separately (i.e., 60 samples were subjected to both proteomic and metabolomic analyses). Thus, the criterion that a VOC or protein must be detected in at least 3 of the 10 individual samples from a given sex/strain combination for a ‘present’ call (and in at least 6 of the 10 samples to be called ‘enriched’) ensures that the molecular signatures we identify are not “contaminated” by unusual aberrations within single samples. For clarification, we now explicitly outline this procedure in the Methods (Experimental Design and Statistical Analysis – Proteomics and metabolomics).
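
      The detection criterion described above amounts to a simple threshold rule. The sketch below is a minimal illustration under stated assumptions: the thresholds (at least 3 of 10 for ‘present’, at least 6 of 10 for ‘enriched’) come from the response, whereas the function name `call_compound`, its parameters, and the ‘absent’ label are hypothetical.

```python
def call_compound(n_detected: int, n_samples: int = 10,
                  present_min: int = 3, enriched_min: int = 6) -> str:
    """Assign a detection call to a VOC or protein.

    n_detected: number of individual samples (of one sex/strain
    combination) in which the compound was detected.
    The 'absent' label is an assumption for compounds below threshold.
    """
    if not 0 <= n_detected <= n_samples:
        raise ValueError("n_detected must lie between 0 and n_samples")
    if n_detected >= enriched_min:
        return "enriched"   # detected in >= 6 of 10 samples
    if n_detected >= present_min:
        return "present"    # detected in >= 3 of 10 samples
    return "absent"         # too rare to count as a reliable signal
```

      Note that, in this sketch, every ‘enriched’ compound also satisfies the ‘present’ criterion; only the stricter label is returned.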

      (7) Pg 7 Line 23: In line 7, the specialist rate was defined as 5% in reference to the total KCl responsive cells. Here the specialist rate is defined from responsive cells. This is confusing.

      We apologize for the confusion. In both cases, the numbers (%) refer to all K+-sensitive neurons. We have added this information to both relevant sentences (l. 7 as well as ll. 23-24). Note that the rate in ll. 23-24 refers to generalists.

      (8) Pg 7 Line 25: Concentration index should be defined before its use here.

      We have revised the corresponding sentence, which now reads: “By contrast, analogously calculated concentration indices (see Materials and Methods) that can reflect potential disparities are distributed more broadly and non-normally (Figure 1h).”

      (9) Pg 7 Line 29: change "trivially" to "simply".

      Done

      (10) Pg 7 Line 30: What is meant by a "generalist" ligand? The neurons are generalists. Probably should read "common ligands"

      We have changed the text accordingly.

      (11) Pg 7 Line 31: What is meant by "global observed concentration disparities" ?

      We have changed the text to “…represented by the observed general concentration disparities.”

      (12) Pg 8 Lines 7-11: This section needs to be edited for clarity as it is very difficult to follow. For example, the definition of "enriched" is buried in a parenthetical. Also, it is very difficult to figure out what a "sample" is in this paper. Is it a pooled stimulus, or is it urine from an individual animal?

      We apologize for the confusion. Throughout the paper, a “sample” is a pooled stimulus (from all 10 individuals of a given sex/strain combination) for all physiological experiments. For chemical analysis, a “sample” refers to urine from an individual animal.

      (13) Pg 8 Line 11: "abundant proteins" Does this mean absolute concentration or enriched in one sample vs another?

      We changed the term “abundant” to “enriched” as this descriptor has been defined (present in ≥6 of 10 individual samples) in the previous sentence.

      (14) Pg 8 Line 18: "While 32.9% of all..." Please edit for clarity. What is the point?

      The main point here is that, for VOCs, the vast majority of compounds (91.3%) are either generic mouse urinary molecules or are sex/strain-specific.

      (15) Pg 10 Line 18: "Increased VSN selectivity..." This title is misleading as it suggests a change in sensitivity with animal exposure. I think the authors are trying to say "VSNs are more selective for strain than for sex". The authors should avoid the term "exposure to" when they mean "stimulation with" as the former suggests chronic exposure prior to testing.

      We thank the reviewer for the advice and have changed the title accordingly. We also edited the text to avoid the term "exposure to" throughout the manuscript.

      (16) Pg 12 Line 10: "we recorded hardly any..." Hardly any in comparison to what? BALB/c?

      We apologize for the confusion. We have edited the text for clarity, which now reads: “In fact, (i) compared to an average specialist rate of 11.2% ± 6.6% (mean ± SD) calculated over all 13 binary stimulus pairs (n = 26 specialist types), we observed only few specialist responses upon stimulation with urine from wild females (2% and 3%, respectively), and…”

      Reviewer #3 (Recommendations For The Authors):

      (1) Related to the pairwise stimulus-response experimental design and analysis: there is precedent in the field for studies that explore the same topic (sex- and strain-selectivity), but measure VSN sensitivity across many urine stimuli, not just two at a time. This has been done both in the VNO (He et al, Science, 2008; Fu, et al, Cell, 2015) and in the AOB (Tolokh, et al, Journal of Neuroscience, 2013). The current manuscript does not cite these studies.

      Reviewer 3 is correct and we apologize for this oversight. We now cite the two VSN-related studies by He et al. and Fu et al. in the Introduction.

      (2) The findings of the mass spectrometry-based profiling of mouse urine - especially for volatiles - are only accessible through repositories, making it difficult for readers to understand which molecules were found to be highly divergent between sexes/strains. There is value in the list of ligands to further investigate, but this information should be made more accessible to readers without having to comb through the repositories.

      We agree that there “is value in the list of ligands to further investigate” and, accordingly, we now provide a table (Table 1) that lists the top-5 VOCs that – according to sPLS-DA – display the most discriminative power to classify samples by sex (related to Figure 2c) or strain (related to Figure 2d). For ease of identification, all entries list internal mass spectrometry identifiers, identifiers extracted from MS analysis database, the sex or strain that drives separation, which two-dimensional component / x-variate represents the most discriminative variable, PubChem chemical formula, PubChem common or alternative names, Chemical Entities of Biological Interest or PubChem Compound Identification, and the VOC’s putative origin.

      (3) There is a long precedent for integrating molecular assessments and physiological recordings to identify specific ligands for the vomeronasal system:

      • nonvolatiles (e.g., Leinders-Zufall et al., Nature, 2000)
      • peptides (e.g., Kimoto et al., Nature, 2005; Leinders-Zufall et al., Science, 2004; Riviere et al., Nature, 2009; Liberles et al., PNAS, 2009)
      • proteins (e.g., Chamero et al., Nature, 2007; Roberts et al., BMC Biology, 2010)
      • excreted steroids and bile acids (Nodari et al., Journal of Neuroscience, 2008; Fu et al., Cell, 2015; Doyle et al., Nature Communications, 2016)

      The Leinders-Zufall (2000), Roberts, and Nodari papers are referenced, but the broader efforts by the community to find specific drivers of vomeronasal activity are not fully represented in the manuscript. The focus of this paper is fully related to this broader effort, and it would be appropriate for this work to be placed in this context in the introduction and discussion.

      We now refer to all of the studies mentioned in the Introduction (except the article published by Liberles et al. in 2009, since the authors of that study do not identify vomeronasal ligands).

      (4) Throughout the manuscript (starting in Fig. 1h) the figure panels and captions use the term "response index" whereas the methods define a "preference index." It seems to be the case that these two terms are synonymous. If so, a single term should be consistently used. If not, this needs to be clarified.

      We now consistently use the term “response index” throughout the manuscript.

      (5) It would be useful to provide a table associated with Figure 2 - figure supplement 1 that lists the common names and/or chemical formulas for the volatiles that were found to be of high importance.

      We agree and, accordingly, we now provide a table (Table 2) that lists the VOCs which, according to Random Forest classification and resulting Gini importance scores, display the most discriminative power to classify samples by sex (related to Figure 2 - figure supplement 1a) or strain (related to Figure 2 - figure supplement 1b). Notably, it is generally reassuring that several VOCs are listed in both Table 1 and Table 2, emphasizing that two different supervised machine learning algorithms (i.e., sPLS-DA (Table 1) and Random Forest (Table 2)) yield largely congruent results.

      (6) The use of the term "comprehensive" for the molecular analysis is a little bit misleading, as volatiles and proteins are just two of the many categories of molecules present in mouse urine.

      We have now deleted most mentions of the term "comprehensive" when referring to the molecular analysis.

      (7) Page 11, lines 24-27: The sentences starting "We conclude..." and ending in "semiochemical concentrations." These two sentences do not make sense. It is not known how many of the identified proteins are actual VSN ligands. Moreover, there is abundant evidence from other studies that individual VSN activity provides information about distinct semiochemical concentrations.

      We have substantially edited and rephrased this paragraph to better reflect that different scenarios / interpretations are possible. The relevant text now reads: “We conclude that VSN population response strength might not be so strongly affected by strain-dependent concentration differences among common urinary proteins. In that case, it would appear somewhat unlikely that individual VSN activity provides fine-tuned information about distinct semiochemical concentrations. Alternatively, as some (or even many) of the identified proteins could not serve as vomeronasal ligands at all, generalist VSNs might sample information from only a subset of compounds which, in fact, are secreted at roughly similar concentrations.”

      (8) The explanation of stimulus timing is mentioned several times but not defined clearly in methods. Page 19, lines 14-19 have information about the stimulus delivery device, but it would be helpful to have stimulus timing explicitly stated.

      In addition to the relevant captions, we now explicitly state stimulus timing (i.e., 10 s stimulations at 180 s inter-stimulus intervals) in the Results.

      (9) Typos: Page 10, line 7: "male biased" → "male-biased" for clarity

      Wilcoxon "signed-rank" test is often misspelled "Wilcoxon singed ranked test" or "Wilcoxon signed ranked test"

      In the Fig. 3 legend, the asterisk meaning is unspecified.

      "(im)balances" → imbalances (page 27, line 24; page 37, line 16; page 38, line 16)

      In Figure 2 - figure supplement 1 and Figure 2 - figure supplement 2, the units of the box-and-whisker plots are not specified in the graph or legend.

      We have made all required corrections.

    2. eLife assessment

      This carefully executed study provides a comparison of the chemical composition of mouse urine across strain and sex with the responses of vomeronasal sensory neurons, which are responsible for detecting chemical social cues. While the authors did not examine all molecular classes found in mouse urine or directly test whether the urinary volatile chemicals that vary with sex and strain are effective vomeronasal neuron ligands, solid data are provided that will be of significant interest to those studying chemical communication in rodents. This work should provide a valuable foundation for future research that will determine which molecules drive sex- and strain-specific vomeronasal responses.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This important study utilizes a virus-mediated short hairpin RNA (shRNA) approach to investigate in a novel way the role of the wild-type PHOX2B transcription factor in critical chemosensory neurons in the brainstem retrotrapezoid nucleus (RTN) region for maintaining normal CO2 chemoreflex control of breathing in adult rats. The solid results presented show blunted ventilation during elevated inhaled CO2 (hypercapnia) with knockdown of PHOX2B, accompanied by a reduction in expression of Gpr4 and Task2 mRNA for the proposed RTN neuron proton sensor proteins GPR4 and TASK2. These results suggest that maintained expression of wild-type PHOX2B affects respiratory control in adult animals, which complements previous studies showing that PHOX2B-expressing RTN neurons may be critical for chemosensory control throughout the lifespan and with implications for neurological disorders involving the RTN. When some methodological, data interpretation, and prior literature reference issues further highlighting novelty are adequately addressed, this study will be of interest to neuroscientists studying respiratory neurobiology as well as the neurodevelopmental control of motor behavior.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This important study investigated the role of the PHOX2B transcription factor in neurons in the key brainstem chemosensory structure, the retrotrapezoid nucleus (RTN), for maintaining proper CO2 chemoreflex responses of breathing in the adult rat in vivo. PHOX2B has an important transcriptional role in neuronal survival and/or function, and mutations of PHOX2B severely impair the development and function of the autonomic nervous system and RTN, resulting in the developmental genetic disease congenital central hypoventilation syndrome (CCHS) in neonates, where the RTN may not form and is functionally impaired. The function of the wild-type PHOX2B protein in adult RTN neurons that continue to express PHOX2B is not fully understood. By utilizing a viral PHOX2B-shRNA approach for knockdown of PHOX2B specifically in RTN neurons, the authors' solid results show impaired ventilatory responses to elevated inspired CO2, measured by whole-body plethysmography in freely behaving adult rats, that develop progressively over a four-week period in vivo, indicating effects on RTN neuron transcriptional activity and associated blunting of the CO2 ventilatory response. The RTN neuronal mRNA expression data presented suggests the impaired hypercapnic ventilatory response is possibly due to the decreased expression of key proton sensors in the RTN. This study will be of interest to neuroscientists studying respiratory neurobiology as well as the neurodevelopmental control of motor behavior.

      Strengths:

      (1) The authors used a shRNA viral approach to progressively knock down the PHOX2B protein, specifically in RTN neurons to determine whether PHOX2B is necessary for the survival and/or chemosensory function of adult RTN neurons in vivo.

      (2) To determine the extent of PHOX2B knockdown in RTN neurons, the authors combined RNAScope® and immunohistochemistry assays to quantify the subpopulation of RTN neurons expressing PHOX2B and neuromedin B (Nmb), which have been proposed to be key chemosensory neurons in the RTN.

      (3) The authors demonstrate that knockdown efficiency is time-dependent, with a progressive decrease in the number of Nmb-expressing RTN neurons that co-express PHOX2B over a four-week period.

      (4) Their results convincingly show hypoventilation particularly in 7.2% CO2 only for PHOX2B-shRNA RTN-injected rats after four weeks as compared to naïve and non-PHOX2B-shRNA targeted (NT-shRNA) RTN injected rats, suggesting a specific impairment of chemosensitive properties in RTN neurons with PHOX2B knockdown.

      (5) Analysis of the association between PHOX2B knockdown in RTN neurons and the attenuation of the hypercapnic ventilatory response (HCVR), by evaluating the correlation between the number of Nmb+/PHOX2B+ or Nmb+/PHOX2B- cells in the RTN and the resulting HCVR, showed a significant correlation between HCVR and number of Nmb+/PHOX2B+ and Nmb+/PHOX2B- cells, suggesting that the number of PHOX2B-expressing cells in the RTN is a predictor of the chemoreflex response and the reduction of PHOX2B protein impairs the CO2-chemoreflex.

      (6) The data presented indicate that PHOX2B knockdown not only causes a reduction in the HCVR but also a reduction in the expression of Gpr4 and Task2 mRNAs, suggesting that PHOX2B knockdown affects RTN neuron transcriptional activity and decreases the CO2 response, possibly by reducing the expression of key proton sensors in the RTN.

      (7) Results of this study show that independent of the role of PHOX2B during development, PHOX2B is still required to maintain proper CO2 chemoreflex responses in the adult brain, and its reduction in CCHS may contribute to the respiratory impairment in this disorder.

      Weaknesses:

      (1) The authors found a significant decrease in the total number of Nmb+ RTN neurons (i.e., Nmb+/PHOX2B+ plus Nmb+/ PHOX2B-) in NT-shRNA rats at two weeks post viral injection, and also at the four-week period where the impairment of the chemosensory function of the RTN became significant, suggesting some inherent cell death possibly due to off-target toxic effects associated with shRNA procedures that may affect the experimental results.

      (2) The tissue sampling procedures for quantifying numbers of cells expressing proteins/mRNAs throughout the extended RTN region bilaterally have not been completely validated to accurately represent the full expression patterns in the RTN under experimental conditions.

      (3) The inferences about RTN neuronal expression of NMB, GPR4, or TASK2 are based on changes in mRNA levels, so it remains speculation that the observed reduction in Gpr4 and Task2 mRNA translates to a reduction in the protein levels and associated reduction of RTN neuronal chemosensitive properties.

      Thank you for sharing the excitement for our study showing novel findings on the contribution of PHOX2B to the chemoreflex response and activity of adult RTN neurons. We believe that the results on cell death following shRNA viral injections, potentially due to some off-target effects, are important to share with the scientific community to help plan experiments of a similar kind in various fields of neuroscience.

      Thank you for pointing out your concerns about cell quantification. We have edited the methods and results sections to add clarity about our analytical procedure.

      As we discussed in the manuscript, we were only able to assess mRNA levels of Nmb, Gpr4, and Task2, as currently available antibodies for the three targets are still unreliable. Future studies will benefit from the analysis of changes at the protein level and possibly electrophysiological recordings to verify that chemosensitive properties of RTN neurons are impaired due to reduction of PHOX2B expression. We address these limitations in the discussion.

      Reviewer #2 (Public Review):

      Summary:

      The authors used a short hairpin RNA strategy to elucidate the functional activity of neurons in the retrotrapezoid nucleus (RTN), a critical brainstem region for central chemoreception. Dysfunction in this area is associated with the neuropathology of congenital central hypoventilation syndrome (CCHS). The subsequent examination of these rats aimed to shed light on the intricate aspects of RTN and its implications for central chemoreception and disorders like CCHS in adults. They found that, using the short hairpin RNA (shRNA) targeting Phox2b mRNA, a reduction of Phox2b expression was observed in Nmb neurons. In addition, Phox2b knockdown did not affect breathing in room air or under hypoxia, but the hypercapnia ventilatory response was significantly impaired. They concluded that Phox2b in the adult brain has an important role in CO2 chemoreception. They thought that their findings provided new evidence for mechanisms related to CCHS neuropathology. The conclusions of this paper are well supported by data, but careful discussion seems to be required for comparison with the results of various previous studies performed by different genetic strategies for the RTN neurons.

      Strengths:

      The most exciting aspect of this work is the modelling of the Phox2b knockdown in one element of the central neuronal circuit mediating respiratory reflexes, that is in the RTN. To date, mutations in the PHOX2B gene are commonly associated with most patients diagnosed with CCHS, a disease characterized by hypoventilation and absence of chemoreflexes, in the neonatal period, which in severe cases can lead to respiratory arrest during sleep. In the present study, the authors demonstrated that the role of Phox2b extends beyond the developmental period, and its reduction in CCHS may contribute to the respiratory impairment observed in this disorder.

      Weaknesses:

      Whereas the most exciting part of this work is the knockdown of the Phox2b in the RTN in adult rodents, the weakness of this study is the lack of a clear physiological, developmental, and anatomical distinction between this approach and similar studies already reported elsewhere (Ruffault et al., 2015, DOI: 10.7554/eLife.07051; Ramanantsoa et al., 2011, DOI: 10.1523/JNEUROSCI.1721-11.2011; Huang et al., 2017, DOI: 10.1016/j.neuron.2012.06.027; Hernandez-Miranda et al., 2018, DOI: 10.1073/pnas.1813520115; Ferreira et al., 2022 DOI: 10.7554/eLife.73130; Takakura et al., 2008 DOI: 10.1113/jphysiol.2008.153163; Basting et al., 2015 DOI: 10.1523/JNEUROSCI.2923-14.2015; Marina et al., 2010 DOI: 10.1523/JNEUROSCI.3141-10.2010). In addition, several conclusions presented in this work are not directly supported by the provided data.

      Thanks for the feedback on our manuscript. We have further highlighted in our discussion the previous developmental work aimed at determining the role of PHOX2B in embryonic development. Our study was triggered by the fascinating observation that, despite its important role in the development of the central and peripheral nervous system, PHOX2B is still present in the adult brain and its function in adult neurons is unknown; we therefore aimed to investigate its role in the adult RTN by knocking down its expression with a shRNA approach. Consequently, in our model, knockdown of PHOX2B does not affect development of the RTN. Previous studies (mentioned by the reviewer, as well as cited in the manuscript) have focused on investigating 1) the role of PHOX2B in the developmental period, 2) the physiological changes associated with the transgenic expression of mutant forms of PHOX2B in relation to CCHS, and 3) the killing or the acute silencing/excitation of neuronal activity of PHOX2B+ RTN neurons. Our study had a different aim: to test whether the transcription factor PHOX2B had a physiologically relevant role in adult RTN neurons. In this experimental approach, PHOX2B is not altered throughout embryonic or postnatal development. Knocking down PHOX2B in the Nmb+ cells of the RTN reduced the chemoreflex response and the mRNA expression of proton sensors. Hence, we conclude that PHOX2B alters the function of Nmb+ RTN neurons, possibly through transcriptional changes including the reduction in Gpr4 and Task2 mRNA expression.

      Reviewer #3 (Public Review):

      A brain region called the retrotrapezoid nucleus (RTN) regulates breathing in response to changes in CO2/H+, a process termed central chemoreception. A transcription factor called PHOX2B is important for RTN development and mutations in the PHOX2B gene result in a severe type of sleep apnea called Congenital Central Hypoventilation Syndrome. PHOX2B is also expressed throughout life, but its postmitotic functions remain unknown. This study shows that knockdown of PHOX2B in the RTN region in adult rats decreased expression of Task2 and Gpr4 in Nmb-expressing RTN chemoreceptors and this corresponded with a diminished ventilatory response to CO2 but did not impact baseline breathing or the hypoxic ventilatory response. These results provide novel insight regarding the postmitotic functions of PHOX2B in RTN neurons.

      Main issues:

      (1) The experimental approach was not targeted to Nmb+ neurons and since other cells in the area also express Phox2b, conclusions should be tempered to focus on Phox2b expressing parafacial neurons NOT specifically RTN neurons.

      (2) It is not clear whether PHOX2B is important for the transcription of pH sensing machinery, cell health, or both. If PHOX2B knockdown results in loss of RTN neurons this is also expected to decrease Task2 and Gpr4 levels, albeit by a transcription-independent mechanism.

      Although we did not specifically target Nmb+ neurons, we performed viral injections within the area where neurons expressing PHOX2B and Nmb are localized (i.e., the RTN region). We carefully quantified the impact of PHOX2B knockdown on Nmb expressing neurons, as well as the effects on the adjacent TH expressing C1 population and FN neurons (figure 5). As reported in the results section, significant changes in the numbers of PHOX2B expressing neurons were only observed at the site of injection in PHOX2B+/Nmb+ neurons. We did not observe changes in the total number of C1 cells (TH+/PHOX2B+), in the number of TH cells coexpressing PHOX2B, or in the hypoxic ventilatory response (which is dependent on the health status of C1 neurons). We have updated figure 5 to show representative expression of PHOX2B in TH+ neurons in the ventral medulla to complement our cell count analysis. To address potential effects on other cell populations we have edited our discussion as follows:

      “PHOX2B knockdown was also restricted to RTN neurons, as adjacent C1 TH+ neurons did not show any change in number of TH+/PHOX2B+ expressing cells, although we cannot exclude that some C1 cells may have been infected and their relative PHOX2B expression levels were reduced. To support the lack of significant alterations associated with the possible loss of C1 function was the absence of significant changes in the hypoxic response that has been shown to be dependent on C1 neurons (Malheiros-Lima et al., 2017).”

      Where appropriate, we have substituted “RTN” with “Nmb expressing neurons of the RTN” throughout the manuscript.

      We have clarified in the methods and results section how we quantified Task2 and Gpr4 mRNA expression. The quantification was performed on a pool of single cells (200-250/rat) expressing Nmb. Hence, the overall reduction is not a result of general fluorescence loss in the RTN region, but specifically assessed in single cells expressing Nmb. This is therefore independent of the reduction of the total number of Nmb cells.

      We propose that cell death is not a direct effect of PHOX2B knockdown, but rather it is associated with the injection of the viral constructs that have been already reported to promote some off-target effects (as reported in the manuscript). While modest cell death is observed only in the first two weeks post-infection, it does not increase further between 2 and 4 weeks post infection, when the reduction in PHOX2B (not associated with a further reduction in Nmb+ cells, hence no further cell death in RTN) is evident and the respiratory chemoreflex is impaired. These results suggest that 1) reduction of PHOX2B is not responsible for cell death; 2) it is the reduction of PHOX2B levels that promotes chemoreflex impairment. Given the observation that Nmb cells with no detectable PHOX2B protein show reduced expression of Task2 and Gpr4 mRNA, we propose that one of the possible mechanisms of chemoreflex impairment in PHOX2B shRNA rats is the reduction of Task2 and Gpr4. In the discussion we also suggest possible additional mechanisms that can be investigated in further studies.

      Recommendations for the authors:

      In revising this manuscript, the authors should carefully address the issues raised by the reviewers to substantially improve the manuscript and solidify the reviewers' general assessment of the potential importance of this work.

      Reviewer #1 (Recommendations For The Authors):

      Major concerns:

      (1) The cell counts for Nmb+/PHOX2B+ and Nmb+/PHOX2B- RTN neurons are a critical component of the study, and it is unclear how the tissue sampling procedures (eight sections per animal) for quantifying numbers of cells expressing proteins/mRNAs throughout the extended RTN region bilaterally has been validated to accurately represent the full expression patterns in the RTN under the experimental conditions. It is possible that the sampling/quantification procedures used may be adequate, but validation is important. Also, quantification of the CTCF signal for Nmb, Gpr4, and Task2 mRNA is an important component of this study, but only four sections/rats were used.

      Thank you for pointing out your concern on our quantification method. We have clarified in the methods section the procedure for cell counting and quantification of the CTCF signal. We have sampled the area of the RTN in order to identify Nmb cells of RTN.

      We have edited the methods section as follows:

      “To quantify Nmb+/PHOX2B- and Nmb+/PHOX2B+ neurons within the RTN region, we analysed one in every seven sections (210 µm interval; 8 sections/rat in total) along the rostrocaudal distribution of the RTN on the ventral surface of the brainstem and compared total bilateral cell counts of PHOX2B-shRNA rats with non-target control (NT-shRNA) and naïve rats. Cells that expressed Nmb and Phox2b mRNAs but did not show co-localization with PHOX2B protein were considered Nmb+/PHOX2B-.

      The Corrected Total Cell Fluorescence (CTCF) signal for Nmb, Gpr4 and Task2 mRNAs was quantified as previously described (Cardani et al., 2022; McCloy et al., 2014). Briefly, a Leica TCS SP5 (B-120G) Laser Scanning Confocal microscope was used to acquire images of the tissue. Exposure time and acquisition parameters were set for the naïve group and kept unchanged for the entire dataset acquisition. The collected images were then analysed by selecting a single cell at a time and measuring the area, integrated density and mean grey value (McCloy et al., 2014). For each image, three background areas were used to normalize against autofluorescence. We used 4 sections/rat (210 µm interval) to count Nmb, Gpr4 and Task2 mRNA CTCF in the core of the RTN area where several Nmb cells could be identified. For each section two images were acquired with a 20× objective, so that at least fifty cells per tissue sample were obtained for the mRNA quantification analysis. To evaluate changes in Nmb mRNA expression levels following PHOX2B knockdown at the level of the RTN, we compared the fluorescence intensity of each RTN Nmb+ cell (223.2 ± 37.1 cells/animal) with the average fluorescent signal of Nmb+ cells located dorsally in the NTS (4.3 ± 1.2 cells/animal) (Nmb CTCF ratio RTN/NTS) as we reasoned that the latter would not be affected by the shRNA infection and knockdown.

      To quantify Gpr4 and Task2 mRNA expression in Nmb cells of the RTN, we first quantified single cell CTCF for either Gpr4 (200.7 ± 13.2 cells/animal) or Task2 (169.6 ± 10.3 cells/animal) mRNA in Nmb+ RTN neurons in the 3 experimental groups (naïve, NT shRNA and PHOX2B shRNA) independent of their PHOX2B expression. We then compared CTCF values of Gpr4 and Task2 mRNA between Nmb+/PHOX2B+ and Nmb+/PHOX2B- RTN neurons in PHOX2B-shRNA rats to address changes in their mRNA expression induced by PHOX2B knockdown.”
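
      The per-cell quantification described in the quoted methods can be sketched as follows. This is a minimal illustration, assuming the standard CTCF definition (integrated density minus cell area times mean background fluorescence) associated with McCloy et al., 2014; the function names and all numbers are hypothetical and are not taken from the study.

```python
from statistics import mean


def ctcf(integrated_density, cell_area, background_means):
    """Corrected total cell fluorescence for a single cell.

    Assumed definition:
    CTCF = integrated density - (cell area * mean background fluorescence),
    where the background is averaged over the background regions of the image
    (three per image in the quoted methods).
    """
    return integrated_density - cell_area * mean(background_means)


def rtn_nts_ratios(rtn_ctcf, nts_ctcf):
    """Divide each RTN cell's CTCF by the mean CTCF of the NTS reference cells."""
    reference = mean(nts_ctcf)
    return [value / reference for value in rtn_ctcf]


# Hypothetical values for illustration only (not data from the study).
example_cell = ctcf(
    integrated_density=5000.0,
    cell_area=100.0,
    background_means=[9.0, 10.0, 11.0],
)
example_ratios = rtn_nts_ratios([4000.0, 2000.0], [1800.0, 2200.0])
```

      Normalising each RTN cell to the dorsal reference cells on the same section is what makes the ratio robust to section-to-section differences in staining intensity, which appears to be the rationale given in the response.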

      (2) Furthermore, to evaluate changes in Nmb mRNA expression following PHOX2B knockdown at the level of the RTN, it is stated in Materials and Methods "we compared, on the same tissue section, the fluorescence intensity of RTN Nmb+ cells with the signal of Nmb+ cells in the NTS (Nmb CTCF ratio RTN/NTS)". How this was accomplished is unclear, considering the non-overlapping locations of the RTN and rostral NTS. Providing images would be helpful.

      The first sections containing Nmb cells in the ventral medulla also contain a few Nmb cells in the dorsal medulla. We used those cells as a reference for fluorescence levels since they would not be affected by the viral infection. Similar cells are also present in the brains of mice and reported in the Allen Brain atlas (https://mouse.brain-map.org/experiment/show/71836874). We have clarified our procedure in the methods section (see above) and included a sample image of Nmb in both ventral and dorsal regions in Figure 5.

      (3) The staining for tyrosine hydroxylase (TH) to identify and quantify C1 cells (TH+/PHOX2B+) following shRNA injection provides important information, and it would be useful to show images of histological examples to accompany Fig. 5A.

      We included in figure 5A a sample image of C1 neurons used for our TH quantification.

      Minor:

      (1) Provide animal ns in the text of the Results section for the four weeks of PHOX2B knockdown.

      They have been included.

      (2) Please state in the legends for Figures 2 & 3, which images are superimposition images.

      We have included in the figures information about the merged images.

      Reviewer #2 (Recommendations For The Authors):

      This manuscript by Cardani and colleagues attempts to address whether a reduction in Phox2b expression in chemosensitive neuromedin-B (NMB)-expressing neurons in the RTN alters respiratory function. The authors used a short hairpin RNA technique to silence RTN chemosensor neurons. The present study is very interesting, but there are several major concerns that need to be addressed, including the main hypothesis.

      Major

      (1) Page 6, lines 119-121: I did not grasp the mechanistic property described by the authors in this passage, nor did I understand the experiments they conducted to establish a mechanistic link between Phox2b and the chemosensitive property. Could the authors provide further clarification on these points?

      We believe the reviewer refers to this paragraph: “In order to have a better understanding of the role of PHOX2B in the CO2 homeostatic processes we used a non-replicating lentivirus vector of two short-hairpin RNA (shRNA) clones targeting selectively Phox2b mRNA to knockdown the expression of PHOX2B in the RTN of adult rats and tested ventilation and chemoreflex responses. In parallel, we also determined whether knockdown of PHOX2B in adult RTN neurons negatively affected cell survival. Finally, we sought to provide a mechanistic link between PHOX2B expression and the chemosensitive properties of RTN neurons, which have been attributed to two proton sensors, the proton-activated G protein-coupled receptor (GPR4) and the proton-modulated potassium channel (TASK-2).”

      The rationale for running these experiments is based on the fact that it is well known in the literature that PHOX2B is an important transcription factor for the development of several neuronal populations. PHOX2B Knockout mice die before birth and heterozygous mice have some anatomical defects, but respiration is only impaired in the early post-natal period. While many developmental transcription factors are generally downregulated in the post-natal period, PHOX2B is still expressed in some neurons into adulthood. What is the function of PHOX2B in these fully developed neurons? We do not know as we do not yet know the entire set of target genes that PHOX2B regulates in the adult brain. Hence we decided to test what would happen if we knocked down the PHOX2B protein in the Nmb neurons of the RTN, an area that is critical for central chemoreception and involved in the presentation of CCHS. Our results show that reduction of PHOX2B blunts the CO2 chemoreflex response and reduces mRNA expression of Task2 and Gpr4, two pH sensors that have been shown to be key for RTN chemosensitive properties. We also show that the Nmb mRNA and cell survival are not affected by PHOX2B knockdown and we propose that the reduced CO2 chemoreflex may be attributed to a reduction of chemosensory function of Nmb neurons of the RTN due to partial loss of Gpr4 and Task2.

      (2) It is imperative for the authors to enhance the description of their hypothesis, as, from my perspective, the contribution of the data to the field is not clearly articulated. Numerous more selectively designed experiments were conducted to investigate the role of Phox2b-expressing neurons at the RTN level and their involvement in respiratory activity. In summary, the current study appears to lack novelty.

      We respectfully disagree with this statement. We believe we have adequately summarized previous work, although we realize we can’t reference every single publication on this topic. As described above, the developmental role of PHOX2B has been elegantly investigated in mouse embryonic studies (extensively cited in the manuscript). Furthermore, very interesting studies have shown that when the CCHS defining mutant PHOX2B protein (+7Ala PHOX2B) and other mutations linked to CCHS have been transgenically expressed in mice through development, severe anatomical defects are observed and respiratory function is impaired (extensively cited in the manuscript). We have also cited papers relevant to this study that describe the role of PHOX2B/Nmb RTN neurons and the pH protein sensors in the CO2 chemoreflex. If we missed some papers that the reviewer deems essential in the context of this study we will be happy to include them.

      We are not aware of other studies that have investigated the specific role of the PHOX2B protein in the adult RTN in the absence of confounding developmental pathogenesis (i.e. in an otherwise ‘healthy’ animal), and of no other studies that looked at the effects on the RTN proton sensors and Nmb expression following PHOX2B knockdown. Hence we believe that our results are novel and, in our opinion, very interesting.

      (3) On pages 13 and 14 (Results section), I am seeking clarity on the novelty of the findings. Doug Bayliss's prior work has already demonstrated the role of Gpr4 and Task2 on Phox2b neurons in regulating ventilation in conscious rodents.

      Bayliss’ group has elegantly demonstrated that Gpr4 and Task2 are the two proton sensors in the PHOX2B/Nmb neurons of the RTN that have a key role in chemoreception (cited in the manuscript). The novelty of our findings is that we show that a reduction in PHOX2B protein is associated with a reduction of mRNA levels of Gpr4 and Task2. This is a novel finding. Currently, we do not know what transcriptional activity PHOX2B has in adult RTN neurons (i.e., what gene targets PHOX2B has in this cell population and many others) and here we propose that Nmb is not a gene target of PHOX2B while Gpr4 and Task2 are.

      (4) The authors assert that the transcription factor Phox2b remains not fully understood. While I concur, the present study falls short of fully investigating the actual contribution of Phox2b to breathing regulation. In other words, the knockdown of Phox2b neurons did not add much to the knowledge of the field.

      We respectfully disagree with the reviewer. With the exception of very few target genes, the transcriptional role of PHOX2B beyond the embryonic development is poorly understood. No mechanistic connection has been made before between the transcriptional activity of PHOX2B with the expression of proton sensors in the RTN. Other groups have investigated the role of stimulating or depressing the neuronal activity of PHOX2B/NMB neurons in the RTN showing a key role of RTN on respiratory control, but these prior studies did not test whether changing the expression of the PHOX2B protein in these neurons had a role on respiratory control and the central chemoreflex. No other study has investigated the role of the PHOX2B protein within the RTN cells, with the exception of PHOX2B knockout mice or transgenic expression of the mutated PHOX2B that are relevant for CCHS. Again, these previous studies were done on a background of developmental impairment and to the best of our knowledge did not seek to show any association between PHOX2B expression and expression of Gpr4 or Task2.

      (5) I recommend removing the entire section entitled "The role of Phox2b in development and in the adult brain." The authors merely describe Phox2b expression without contextualizing it within the obtained data.

      Because reviewers raised the issue about not including important information about the role of PHOX2B in development and respiratory control we prefer to keep the section.

      (6) Are the authors aware of whether the shRNA in Phox2b/Nmb neurons truly induced cell death or solely depleted the expression of the transcription factor protein? Do the chemosensitive neurons persist?

      This is an excellent question that we tried to address with our study. As we report in figures 2 and 3, we propose that some cell death is occurring as an off-target effect within the first 2 weeks post-infection, likely due to off-target action of the shRNA approach and not dependent on the reduction of PHOX2B expression (discussed in the manuscript). This is further evidenced by our Fig.S1 data in which higher concentrations of shRNA led to more cell death, indicative of off-target effects. We do not believe it is a consequence of our surgical procedure as we do not see similar cell loss when injecting vehicle or other control solutions (unpublished work; Janes et al., 2024).

      During the first 2 weeks post-surgery the proportion of Nmb+/PHOX2B- cells does not change compared to control rats or non-target shRNA (knockdown is not yet visible at protein level). Four weeks post-injection, there is no further cell death (assessed by the total number of NMB cells), whereas the fraction of NMB cells that express PHOX2B is reduced (and the fraction of NMB not expressing PHOX2B is increased), suggesting that the reduction of PHOX2B protein in Nmb cells is not correlated with cell loss/survival whereas the impairment that we observe in terms of central chemoreception is possibly due to the progressive decrease of PHOX2B expression in these neurons.

      (7) In Figures 2 and 3, it is noteworthy that the authors observe peak expression at a very caudal level. In rats, the RTN initiates at the caudal end of the facial, approximately 11.6 mm, and should exhibit a rostral direction of about 2 mm.

      In our experience the Nmb cells on the ventral surface of the medulla peak in number around the caudal tip of the facial nucleus in adult SD rats (Janes et al., 2024). To add clarity to the figure we reported cell count distribution data in relation to the distance from caudal tip of the facial.

      Minor

      (1) I would like to suggest that the authors correct the recurring statement throughout the manuscript that Phox2b is essential only for the development of the autonomic nervous system. In my view, it also plays a crucial role in certain sensory and respiratory systems.

      We have addressed this in the manuscript.

      (2) Page 4, lines 59-60: Out of curiosity, do the data include information from different countries?

      These data refer to information from France and Japan. Currently it is estimated that there are 1000-2000 CCHS patients worldwide.

      (3) Page 7, lines 129-131: In my understanding, the sentence is quite clear; if we knock down the PHOX2B gene, we are expected to reduce or even eliminate the expression of Gpr4 or Task2. Am I right?

      This is what we propose from the results of this study. We would like to point out that the transcriptional activity of PHOX2B (i.e., what genes PHOX2B regulates) in adult neurons has not yet been fully investigated. With the exception of a few target genes (e.g., TH, DBH) the transcriptional activity of PHOX2B in neurons is not yet known. Here we report novel findings that suggest that Gpr4 and Task2 are potential target genes of PHOX2B in RTN neurons.

      (4) The authors mentioned that NT-shRNA also impacts CO2 chemosensitivity. Could this effect be attributed to mechanical damage of the tissue resulting from the injection?

      Just to clarify, we observed some impairment in chemosensitivity when NT-shRNA was injected in a “larger” volume (2x 200ul/side). No impairment was observed with NT-shRNA when we injected smaller volumes (2x 100ul/side). Physical damage could be a possibility, although in our experience (unpublished work; Janes et al., 2024, Acta Physiologica) injections of a similar volume of solution performed by the same investigator in the same brain area and experimental settings did not produce a physical lesion associated with respiratory impairment. Hence we attribute the unexpected results with larger volumes to toxic effects associated with the shRNA viral constructs.

      (5) In the reference section, the authors should review and correct some entries. For instance, Janes, T. A., Cardani, S., Saini, J. K., & Pagliardini, S. (2024). Title: "Etonogestrel Promotes Respiratory Recovery in an In Vivo Rat Model of Central Chemoreflex Impairment." Running title: "Chemoreflex Recovery by Etonogestrel." Some references contain the journal, pages, and volume, while others lack this information entirely.

      We have updated references. Janes et al., 2024 has now been published in Acta Physiologica.

      (6) Why does the baseline have distribution points, whereas the other boxplots do not?

      We have clarified in the figure legend that the data points shown in some of the boxplot graphs are not the entire dataset but only the values that are outliers.

      In our box-and-whisker plots, whiskers represent the 10th and 90th percentiles, showing the range of values for the middle 80% of the data. Individual data values that fall outside the 10th/90th percentile range are represented as single points (outliers).
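
      The plotting convention described above can be sketched as follows. This is a minimal illustration, assuming an inclusive percentile definition; the plotting software used for the figures may interpolate percentiles differently. The function name and the example values are hypothetical and are not data from the study.

```python
from statistics import quantiles


def whiskers_and_outliers(values):
    """Return the 10th and 90th percentiles and the points drawn individually.

    Points strictly below the 10th percentile or strictly above the 90th
    percentile are treated as outliers and would be the only dots shown.
    """
    # quantiles(..., n=10) returns the nine decile cut points.
    deciles = quantiles(values, n=10, method="inclusive")
    low, high = deciles[0], deciles[-1]
    outliers = [v for v in values if v < low or v > high]
    return low, high, outliers


# Hypothetical measurements for illustration only.
low, high, outliers = whiskers_and_outliers(
    [12, 15, 16, 17, 18, 19, 20, 21, 22, 23, 30]
)
```

      Under this convention a group whose values all lie within the 10th/90th percentile range shows no individual dots, which would explain why some boxes appear without points.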

      Reviewer #3 (Recommendations For The Authors):

      • What is the rationale behind dedicating the first paragraph of results to discussing an artifact?

      We think that it is important to report off target effects of shRNA viral constructs as concentration and volumes of viruses injected in various studies vary considerably and other investigators may attempt to use larger volumes of viruses to obtain more considerable or faster knockdown but would obtain erroneous conclusions if appropriate tests are not performed.

      Furthermore, because some readers could question whether we injected enough virus to knockdown the expression of PHOX2B, and may wonder if with a larger amount of virus we would increase knockdown efficiency, we wanted to show that, in our opinion, we used the maximum amount of virus to knockdown PHOX2B without causing toxic effects or physiological changes that are not dependent on PHOX2B knockdown.

      • All individual data points should be visible in floating bar graphs in Figures 1 and 4. For example, I don't see any dots for naïve animals in any of the panels in Figure 1.

      We have clarified in the figure legend that the data points shown in some of the boxplot graphs are not the entire dataset but only the values that are outliers.

      In our box-and-whisker plots, whiskers represent the 10th and 90th percentiles, showing the range of values for the middle 80% of the data. Individual data values that fall outside the 10th/90th percentile range are represented as single points (outliers).

      • Please include specific F and T values along with DF.

      We have included a table with all the specific values in the supplementary section as Table 1.

      • The C1 and facial partly overlap with the RTN at this level of the medulla and these cells should appear as Phox2b+/Nmb- cells so it is not clear to me why these cells are not evident in the control tissue in Figures 2B and 3B. Also, some of the bregma levels shown in Figure 5A overlap with Figures 2-3 so again it is not clear to me how this non-cell type specific viral approach was targeted to Nmb cells but not nearby TH+ cells. Please clarify.

      In our experience, C1 TH cells are located slightly medial to the Nmb cells and they spread much more caudally than Nmb cells of the RTN. We focused our small volume injection in the core of the RTN to target Nmb cells but we also assessed PHOX2B knockdown in TH C1 cells by counting the PHOX2B/TH cells across treatment groups. Although we can’t exclude subtle changes in the C1 population, we did not observe changes in the total number of C1 cells (TH+/PHOX2B+), in the number of TH cells expressing PHOX2B, or in the hypoxic ventilatory response (which is dependent on the health status of C1 neurons). We have updated figure 5 to show representative expression of PHOX2B in TH+ neurons in the ventral medulla to complement our cell count analysis. To address potential effects on other cell populations we have edited our discussion as follows:

      “PHOX2B knockdown was also restricted to RTN neurons, as adjacent C1 TH+ neurons did not show any change in number of TH+/PHOX2B+ expressing cells, although we cannot exclude that some C1 cells may have been infected and their relative PHOX2B expression levels were reduced. To support the lack of significant alterations associated with the possible loss of C1 function was the absence of significant changes in the hypoxic response that has been shown to be dependent on C1 neurons (Malheiros-Lima et al., 2017).”

      • To confirm, Nmb is not expressed in the NTS, and this region was chosen as a background, right?

      In order to systematically analyze Nmb mRNA expression we decided to measure fluorescence relative to Nmb neurons present in the dorsal brainstem. Here cells are sparse, but we used them as a fluorescence reference since they would not be affected by the ventral shRNA injection. Similar cells are also present in the brains of mice and reported by the Allen Brain atlas (https://mouse.brain-map.org/experiment/show/71836874). We have clarified our procedure in the methods section (see above) and included a sample image of Nmb in both ventral and dorsal regions in Figure 5.

      • How do you get a loss of Nmb+ neurons (Figs 2-3) with no change in Nmb fluorescence (Fig. 5B)? In the absence of representative images these results are not compelling and should be substantiated by more readily quantifiable approaches like qPCR.

      We have clarified in the methods and results section our analytical procedure to assess PHOX2B and Nmb expression. Figures 2 and 3 display the results of counting the number of Nmb+ cells in the RTN. Figure 5B reports the average of total cell fluorescence measured inside Nmb+ cells, not an average fluorescence measurement of the area of the ventral medulla. In short, our results show that we have fewer Nmb cells that express PHOX2B, but the overall Nmb mRNA fluorescence (expression) in Nmb cells relative to Nmb fluorescence in cells of the dorsal brainstem is the same.
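
      The distinction between the two measurements can be illustrated with a toy calculation. This is only a sketch with made-up numbers (not data from the study) and a hypothetical helper name: the cell count and the mean per-cell fluorescence are separate summaries, so one can change while the other does not.

```python
from statistics import mean


def summarize(per_cell_ctcf):
    """Summarise a group by its cell count and its mean per-cell fluorescence."""
    return {
        "n_cells": len(per_cell_ctcf),
        "mean_ctcf_per_cell": mean(per_cell_ctcf),
    }


# Made-up per-cell values: the second group has fewer cells,
# but each remaining cell carries the same signal.
control = summarize([2.0, 2.0, 2.0, 2.0])
knockdown = summarize([2.0, 2.0, 2.0])
```

      In this toy case the number of counted cells drops from four to three while the mean per-cell value stays at 2.0, which mirrors the logic of reporting cell counts and per-cell fluorescence in separate figures.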

      We have edited the methods as follows:

      “The Corrected Total Cell Fluorescence (CTCF) signal for Nmb, Gpr4 and Task2 mRNAs was quantified as previously described (Cardani et al., 2022; McCloy et al., 2014). Briefly, a Leica TCS SP5 (B-120G) Laser Scanning Confocal microscope was used to acquire images of the tissue. Exposure time and acquisition parameters were set for the naïve group and kept unchanged for the entire dataset acquisition. The collected images were then analysed by selecting a single cell at a time and measuring the area, integrated density and mean grey value (McCloy et al., 2014). For each image, three background areas were used to normalize against autofluorescence. We used 4 sections/rat (210 µm interval) to count Nmb, Gpr4 and Task2 mRNA CTCF in the core of the RTN area where several Nmb cells could be identified. For each section two images were acquired with a 20× objective, so that at least fifty cells per tissue sample were obtained for the mRNA quantification analysis. To evaluate changes in Nmb mRNA expression levels following PHOX2B knockdown at the level of the RTN, we compared the fluorescence intensity of each RTN Nmb+ cell (223.2 ± 37.1 cells/animal) with the average fluorescent signal of Nmb+ cells located dorsally in the NTS (4.3 ± 1.2 cells/animal) (Nmb CTCF ratio RTN/NTS) as we reasoned that the latter would not be affected by the shRNA infection and knockdown.”

      A single cell qPCR analysis would definitely be ideal, but a qPCR from dissected tissue would not help us determine whether within a cell there was a reduction in Nmb mRNA levels.

      • The boxed RTN region in these examples is all over the place. The RTN should be consistently placed along the ventral surface under the facial and at approximately equal distance from the trigeminal and pyramids.

      We have updated the figures to consistently present the areas of interest where Nmb cells are located and images are taken.

      • Fluorescent in situ typically appears as discrete puncta so it is not clear to me why that is not the case here.

      Our images are taken at low magnification (20X) where it is difficult to distinguish the single mRNA molecules. However, it is possible to appreciate the differences between the grainy fluorescent signal in the in situ hybridization assay (RNAScope) and the smoother signal of protein detection in the immunofluorescence assay.

      • Can TUNEL staining be done to confirm loss of Nmb neurons is due to death and not re-localization?

      Does the reviewer mean “cell migration” by re-localization? We do not expect that this would occur in our experiments. Although TUNEL in the first week post-infection could be useful to determine cell death in our tissue, we do not expect migration of neurons within the brain as our viral shRNA injections are performed in adult rats when developmental processes are already concluded.

    2. Reviewer #2 (Public Review):

      Summary:

      This significant research explored how the PHOX2B transcription factor functions within neurons located in the retrotrapezoid nucleus (RTN), a crucial brainstem chemosensory area, to sustain appropriate CO2 chemoreflex reactions related to breathing in adult rats when observed in a living state. By applying a viral shRNA technique to selectively suppress PHOX2B in RTN neurons, the authors present compelling evidence of deteriorating ventilatory reactions to increased CO2 levels. This impairment progresses over a four-week period in vivo, hinting at disruptions in RTN neuron transcriptional processes and a consequent dulling of CO2-induced breathing responses. The data on RTN neuronal mRNA expression indicates that the weakened hypercapnic ventilatory response may stem from reduced levels of crucial proton sensors within the RTN. This research holds relevance for neuroscientists focused on the neurobiology of respiration and the neurodevelopmental regulation of motor functions.

      Strengths:

      The authors employed a shRNA viral strategy to systematically reduce PHOX2B protein levels, targeting RTN neurons specifically, to assess the importance of PHOX2B for the survival and chemosensory capabilities of adult RTN neurons in a living organism. The findings of this research underscore that beyond its developmental role, PHOX2B remains essential for sustaining accurate CO2 chemoreflex reactions in the adult brain. Furthermore, its diminished presence in Congenital Central Hypoventilation Syndrome (CCHS) could be a factor in the respiratory deficiencies observed in the condition. This study highlights the critical ongoing function of PHOX2B in adult physiology and its potential impact on respiratory health, offering valuable insights for the scientific and medical communities involved in treating and understanding respiratory disorders.

      Weaknesses:

      N/A

    1. Author response:

      We sincerely thank the editors and reviewers for their rigorous evaluation of our work and the time invested. The positive comments resonate with our endeavor to explore the intrinsic role of astrocyte aquaporin in brain water homeostasis. We also greatly appreciate the reviewers’ constructive suggestions for consolidating this study. Below is our provisional response, which briefly outlines how we will address the reviewers’ suggestions:

      To Reviewer #1:

      • Imaging data will be examined and collected to determine whether AQP4 inhibition has differential effects on astrocyte calcium signals in terms of cellular locations.

      • New analysis will be performed for CSD swelling data to provide additional kinetic information.

      • The mentioned original papers are important, and will be included in the revision.

      To Reviewer #2:

      We agree that a careful revision will improve and better position the study.

      • Echoing Reviewer #1, the introduction and discussion will be strengthened with current scientific context, paying attention to the important advances in the glymphatic system. The limitations of the study mentioned in the reviews will be stated.

      • The use of TGN-020 was based on its validation by a wide range of ex vivo and in vivo studies. AER-270(271) was nicely introduced by Farr et al., 2019 (PMID: 30738082). Its validation in vivo in AQP4 KO mice, and the comparison to TGN-020, is reported in a very recent study (Giannetto et al., 2024 - PMID: 38363040) that provides valuable insights.

      • The description of specific methodologies, including the DW-MRI, will be reinforced. The presentation of experiments and statistical analysis will be refined.

      To Reviewer #3:

      • Solenov et al., 2004 (PMID: 14576087) used the calcein quenching assay and KO mice to show convincingly that AQP4 is a functional water channel in cultured astrocytes. AQP4 deletion reduced both astrocyte water permeability and the absolute amplitude of swelling over a comparable time, and also slowed down cell shrinking, which overall parallels our results from acute AQP4 blocking. Yet in the study by Solenov et al., the time to swelling plateau was prolonged in AQP4 KO astrocytes, differing from our data from acute blocking. This difference may be due to compensatory mechanisms in chronic AQP4 KO, or may reflect the different volume responses of cultured astrocytes compared with brain slice/in vivo results, as noted previously (e.g., Risher et al., 2009 - PMID: 18720409). As suggested, methods for volume recordings will be examined.

      • It is an important point that TGN-020 partially blocks AQP4, implying that the actual functional impact of AQP4 per se might be stronger than what we observed. TGN-020 provides a means to acutely probe AQP4 function in situ; still, we agree that its limitation needs to be acknowledged.

      • As also pointed out by Reviewer #2, the description and interpretation of the DW-MRI data will be improved.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This is a valuable computational study that applies the machine learning method of bilinear modeling to the problem of relating gene expression to connectivity. Specifically, the author attempts to use transcriptomic data from mouse retinal neurons to predict their known connectivity. The results are promising, although the reviewers felt that demonstration of the general applicability of the approach required testing it against a second data set. Hence the present results were felt to provide borderline incomplete support for a key premise of the paper.

      We thank the reviewers for their insightful and constructive feedback. In response to the reviews, we have undertaken a comprehensive revision of our manuscript, incorporating changes and improvements as outlined below:

      (1) New results have been included showcasing the application of our bilinear model to a second dataset focusing on C. elegans gap junction connectivity. This extension validates our model with a biological context other than mouse retina and facilitates a direct comparison with the spatial connectome model (SCM).

      (2) A new section titled "Previous Approaches" has been added to the background, situating our study within the broader landscape of existing modeling methodologies.

      (3) The discussion sections have been expanded to fully incorporate the suggestions and insights offered by the reviewers. This includes a deeper exploration of the implications of our findings, potential applications of our model, and a more thorough consideration of its limitations and future directions.

      (4) To streamline the main text and ensure that the core narrative remains focused and accessible, select figures and tables have been relocated to the "Supplementary Materials" section.

      Reviewer 1 (Public Review):

      Summary of what the author was trying to achieve: In this study, the author aimed to develop a method for estimating neuronal-type connectivity from transcriptomic gene expression data, specifically from mouse retinal neurons. They sought to develop an interpretable model that could be used to characterize the underlying genetic mechanisms of circuit assembly and connectivity.

      Strengths:

      The proposed bilinear model draws inspiration from commonly implemented recommendation systems in the field of machine learning. The author presents the model clearly and addresses critical statistical limitations that may weaken the validity of the model such as multicollinearity and outliers. The author presents two formulations of the model for separate scenarios in which varying levels of data resolution are available. The author effectively references key work in the field when establishing assumptions that affect the underlying model and subsequent results. For example, correspondence between gene expression cell types and connectivity cell types from different references are clearly outlined in Tables 1-3. The model training and validation are sufficient and yield a relatively high correlation with the ground truth connectivity matrix. Seemingly valid biological assumptions are made throughout, however, some assumptions may reduce resolution (such as averaging over cell types), thus missing potentially important single-cell gene expression interactions.

      Thank you for recognizing the strengths of our work, particularly the clarity of the model presentation and its foundation in recommendation systems. In the revised manuscript we have also extended the model’s capabilities to analyze gene interactions for neural connectivity at single-cell resolution, when gene expression and connectivity of each cell are known simultaneously.

      Weaknesses:

      The main results of the study could benefit from replication in another dataset beyond mouse retinal neurons, to validate the proposed method. Dimensionality reduction significantly reduces the resolution of the model and the PCA methodology employed is largely non-deterministic. This may reduce the resolution and reproducibility of the model. It may be worth exploring how the PCA methodology of the model may affect results when replicating. Figure 5, ’Gene signatures associated with the two latent dimensions’, lacks some readability and related results could be outlined more clearly in the results section. There should be more discussion on weaknesses of the results e.g. quantification of what connectivity motifs were not captured and what gene signatures might have been missed.

      We acknowledge the significance of validating our method across different datasets. In line with this, our revised manuscript now includes an expanded analysis utilizing a C. elegans gap junction connectivity dataset, which not only broadens the method’s demonstrated applicability but also underscores its versatility across varied neuronal systems.

      To address the concern of resolution and reproducibility associated with PCA preprocessing, we have conducted a comparative analysis from five replicates of the bilinear model, presenting the results in the revised manuscript (Figure S3). This analysis confirms the consistency of the solutions, as evidenced by the similarity metrics. Furthermore, we discussed alternative methodologies, such as L1 or L2 regularization, to tackle multicollinearity, offering flexibility in preprocessing choices.

      In response to feedback on the original Figure 5’s clarity, we have replaced the original Figure 5e-h with Table S4, which summarizes the gene ontology (GO) enrichment results and quantifies the number of genes associated with aspects of neural development and synaptic organization. This revision aims to improve the interpretability and accessibility of the results, ensuring a clearer presentation of the model’s insights.

      Finally, we have expanded our discussion to address the study’s limitations more comprehensively. This includes exploration of potentially missed connections and gene signatures, such as transcription factors, which might not be captured by a linear model due to its inherent preference for predictors with strong correlations to the target variable.

      The main weakness is the lack of comparison against other similar methods, e.g., the methods presented in: Barabási, Dániel L., and Albert-László Barabási. "A genetic model of the connectome." Neuron 105.3 (2020): 435-445; Kovács, István A., Dániel L. Barabási, and Albert-László Barabási. "Uncovering the genetic blueprint of the C. elegans nervous system." Proceedings of the National Academy of Sciences 117.52 (2020): 33570-33577; and Taylor, Seth R., et al. "Molecular topography of an entire nervous system." Cell 184.16 (2021): 4329-4347.

      We value your suggestion to compare our model with established methods. The revised manuscript now includes a comparative analysis with the spatial connectome model (SCM) using the same C. elegans dataset. In addition, a section reviewing previous approaches has been included in the background part, and the discussion part has been extended for the comparison.

      Appraisal of whether the author achieved their aims, and whether results support their conclusions: The author achieved their aims by recapitulating key connectivity motifs from single-cell gene expression data in the mouse retina. Furthermore, the model setup allowed for insight into gene signatures and interactions, however could have benefited from a deeper evaluation of the accuracy of these signatures. The author claims the method sets a new benchmark for single-cell transcriptomic analysis of synaptic connections. This should be more rigorously proven. (I’m not sure I can speak on the novelty of the method)

      In the revised manuscript, we emphasized the bilinear model’s innovative application in the context of neuronal connectivity analysis, inspired by collaborative filtering in recommendation systems. We present quantitative performance metrics, such as the ROC-AUC score and Pearson correlation coefficient, as well as a comparison with the SCM, to benchmark our model’s efficacy in reconstructing connectivity matrices. We also quantified the overlap of the genetic interactions revealed by the bilinear model and the SCM (using the C. elegans dataset), and reported the percentage of the top genes associated with neural development and synaptic organization (using the mouse retina dataset). These numbers set a precedent for future methodological comparisons.
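
      As a rough illustration of the two metrics named in this response, the sketch below computes a Pearson correlation and a pairwise ROC-AUC between a ground-truth and a predicted connectivity vector. It is not the authors' code: the function names, the binarized ground truth, and the toy values are assumptions made purely for illustration.

```python
# Illustrative sketch; function names and toy data are assumptions,
# not taken from the manuscript or its code release.
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy flattened connectivity: 1 = connected, 0 = not connected (made-up values).
truth = [1, 0, 1, 0]
predicted = [0.9, 0.2, 0.4, 0.6]
auc = roc_auc(truth, predicted)
r = pearson(truth, predicted)
```

      The pairwise definition of ROC-AUC is used here because it needs no threshold sweep; it is quadratic in the number of entries, so a rank-based implementation would be preferable for large connectivity matrices.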

      Discussion of the likely impact of the work on the field, and the utility of methods and data to the community: This study provides an understandable bilinear model for decoding the genetic programming of neuronal type connectivity. The proposed model leaves the door open for further testing and comparison with alternative linear and/or non-linear models, such as neural network-based models. In addition to more complex models, this model can be built on to include higher resolution data such as more gene expression dimensions, different types of connectivity measures, and additional omics data.

      We are grateful for your recognition of the study’s potential impact. The bilinear model indeed offers a foundation for future explorations, allowing for integration with more complex models, higher-resolution data, and diverse connectivity measures.

      Reviewer 1 (Recommendations For The Authors):

      The inclusion of predicted connectivity (Figure 6) of unknown BC neurons is useful as it shows that this is a strong hypothesis generation tool. This utility should potentially be showcased more as it is also brought up in the abstract, "genetic manipulation of circuit wiring", with an explanation of how the model could be leveraged as such. The discussion may benefit from a summarizing sentence regarding which key gene signatures were identified and are in line with the literature, which key gene signatures/connectivity motifs may have been missed, and which gene signatures are novel.

      Thank you for the insightful recommendation on emphasizing the model’s utility in generating hypotheses, particularly regarding predicting connectivity. In the revised manuscript, we have expanded the discussion on how our model can be leveraged to guide genetic manipulations aimed at altering circuit wiring and highlighted its potential impact in the field.

      We have discussed key gene signatures identified from our model that are in line with existing literature, such as plexins and cadherins, which have been previously recognized for their involvement in synaptic connection formation and maintenance. We have also introduced potential new candidates, such as delta-protocadherins. In the revised manuscript, we summarized potentially missed gene signatures or synaptic connections, to provide a comprehensive view of our findings.

      Reviewer 2 (Public Review):

      Summary:

      In this study, Mu Qiao employs a bilinear modeling approach, commonly utilized in recommendation systems, to explore the intricate neural connections between different pre- and post-synaptic neuronal types. This approach involves projecting single-cell transcriptomic datasets of pre- and post-synaptic neuronal types into a latent space through transformation matrices. Subsequently, the cross-correlation between these projected latent spaces is employed to estimate neuronal connectivity. To facilitate the model training, connectomic data is used to estimate the ground-truth connectivity map. This work introduces a promising model for the exploration of neuronal connectivity and its associated molecular determinants. However, it is important to note that the current model has only been tested with Bipolar Cell and Retinal Ganglion Cell data, and its applicability in more general neuronal connectivity scenarios remains to be demonstrated.
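
      The bilinear formulation summarized above can be sketched in a few lines. This is a minimal illustration rather than the author's implementation: it assumes the connectivity score is the inner product of the two latent projections, and the names `predict_connectivity`, `X`, `Y`, `A`, `B` and all toy values are hypothetical.

```python
# Illustrative sketch only; names and toy values are assumptions,
# not the author's implementation.

def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def transpose(M):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]


def predict_connectivity(X, Y, A, B):
    """Bilinear score Z = (X A)(Y B)^T.

    X: n_pre x p expression of pre-synaptic types
    Y: n_post x q expression of post-synaptic types
    A: p x d and B: q x d transformation matrices into a d-dimensional latent space
    """
    latent_pre = matmul(X, A)    # n_pre x d
    latent_post = matmul(Y, B)   # n_post x d
    return matmul(latent_pre, transpose(latent_post))  # n_pre x n_post


# Toy example: 2 pre-synaptic types, 3 post-synaptic types, 2 genes each, d = 1.
X = [[1.0, 0.0], [0.0, 1.0]]
Y = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
A = [[1.0], [-1.0]]
B = [[1.0], [-1.0]]
Z = predict_connectivity(X, Y, A, B)
```

      In this toy case, types whose latent coordinates share a sign receive a positive score and types with opposite signs a negative one; in practice `A` and `B` would be fitted so that `Z` approximates the connectomic ground truth.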

      Strengths:

      This study introduces a succinct yet promising computational model for investigating connections between neuronal types. The model, while straightforward, effectively integrates single-cell transcriptomic and connectomic data to produce a reasonably accurate connectivity map, particularly within the context of retinal connectivity. Furthermore, it successfully recapitulates connectivity patterns and helps uncover the genetic factors that underlie these connections.

      Thank you for your positive assessment of the paper.

      Weaknesses:

      (1) The study lacks experimental validation of the model’s prediction results.

      We recognize the importance of experimental validation in substantiating the predictions made by computational models. While the primary focus of this study remains computational, we have dedicated a section in the revised manuscript, titled "Experimental Validation of Candidate Genes", to outline proposed methodologies for the empirical verification of our model’s predictions. This section specifically discusses the experimental exploration of novel candidate genes, such as delta-protocadherins, within the mouse retina using AAV-mediated CRISPR/Cas9 genetic manipulation. We plan to collaborate with experimental laboratories to facilitate the validation. Given the extensive nature of experimental work, both in terms of time and resources, it is more pragmatic to present a comprehensive experimental investigation in a follow-up study.

      (2) The model’s applicability in other neuronal connectivity settings has not been thoroughly explored.

      The question of the model’s broader applicability is well-taken. In response, we have expanded our analysis to include additional neuronal data and connectivity settings. Specifically, the revised manuscript includes results where we apply the model to a dataset of C. elegans gap junction connectivity, demonstrating its potential in different neuronal systems. This extension serves to illustrate the model’s adaptability and potential applicability to a broader range of neuronal connectivity studies.

      (3) The proposed method relies on the availability of neuronal connectomic data for model training, which may be limited or absent in certain brain connectivity settings.

      We acknowledge the limitations posed by the model’s dependency on comprehensive connectomic data, which may not be readily available across all research contexts. To address this, we have discussed in the revised manuscript several alternative strategies to adapt our model to the available data. These include applying the model to other available data such as projectome data, and integrating other data modalities such as electrophysiological measurements. These strategies aim to enhance the model’s applicability and ensure its utility in a broader spectrum of brain connectivity studies, especially in scenarios where detailed connectomic data are not available.

      Reviewer 2 (Recommendations For The Authors):

      Q1. In this work, the author has mainly studied retinal neuronal-type connectivity; it would be interesting to see whether the model also works for other brain regions or other types of neuronal connectivity.

      We value your interest in the model’s applicability to other brain regions and neuronal types. To address this, we have extended our analysis in the revised manuscript to include a study on gap junction connectivity between C. elegans neurons. This extension demonstrates the model’s versatility and its potential applicability across various nervous systems and connectivity types.

      Q2. Can the author use the same transformation matrices trained on the retina data to predict neuronal connectivity in other brain regions? Or, as an easier case, the connectivity between RGC types and the neuronal types in SC, dLGN, or other post-RGC-synaptic brain regions? As neuronal connection mechanisms are conserved and widely shared between different neuronal types, one would expect the same transformation matrices to work in predicting other neuronal-type connectivity as well (at least to some extent).

      The idea to use the same transformation matrices for predicting connectivity in other brain regions is intriguing. While direct application of these matrices to different regions remains challenging, we discussed the potential scalability of our model to other brain areas. By applying the model to combined datasets from various regions, we could uncover conserved neuronal connection mechanisms. This approach is theoretically feasible and is supported by the demonstrated scalability of the bilinear model and its deep learning variants in industrial applications.

      Q3. Section 5.2 Connectivity metric generation: in this work, the author uses the stratification profiles of the neurons to estimate the connectivity metric. How reliable is this method? There will be scenarios in which two neuronal types project to a similar inner plexiform layer but do not have any connection. Has the author considered combining other experimental data (such as electrophysiology data or neuron tracing data)?

      We discussed the reliability of using stratification profiles for estimating connectivity metrics, acknowledging potential limitations. In the revised manuscript, we added discussion on how the integration of additional experimental data, such as electrophysiological and neuron tracing data, could enhance the accuracy of the connectivity metrics.

      Q4. Section 6 Model training and validation: does the author have a hypothesis as to why two dimensions is the best dimensionality for the latent feature space? One would imagine that with more dimensions, the model would give better results. Could it be that the connectivity data used to train the model only considers the two-dimensional space of the neuronal stratification?

      The selection of two dimensions for the latent feature space was informed by 5-fold cross-validation, aimed at optimizing model generalization to unseen data. While increasing dimensionality improves performance on the training set, it does not necessarily enhance generalization to the validation set. Thus, the choice of two dimensions ensures good performance without overfitting to the training data.
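
      The selection procedure described in this response can be sketched generically as follows. This is an assumption-laden illustration, not the author's pipeline: `fit_and_score` is a hypothetical user-supplied callable that trains a model with latent dimensionality `d` on the training indices and returns a validation loss, and the toy loss below is invented solely so that the example runs.

```python
# Illustrative sketch; `fit_and_score` is a hypothetical user-supplied callable,
# and the toy loss is invented for demonstration (it is not a result).

def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def select_dimension(candidates, n_samples, fit_and_score, k=5):
    """Return the candidate with the lowest mean validation loss across k folds."""
    folds = k_fold_indices(n_samples, k)
    mean_loss = {}
    for d in candidates:
        losses = []
        for val_idx in folds:
            held_out = set(val_idx)
            train_idx = [i for i in range(n_samples) if i not in held_out]
            losses.append(fit_and_score(d, train_idx, val_idx))
        mean_loss[d] = sum(losses) / k
    return min(mean_loss, key=mean_loss.get), mean_loss


# Toy stand-in for model fitting: validation loss is smallest at d = 2.
def toy_fit_and_score(d, train_idx, val_idx):
    return (d - 2) ** 2 + 0.1


best_d, losses = select_dimension([1, 2, 3, 4], n_samples=10,
                                  fit_and_score=toy_fit_and_score)
```

      The folds here are contiguous and unshuffled for simplicity; a real analysis would typically shuffle the samples with a fixed seed before splitting.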

      Q5. Could the author provide the source code for the analysis? Or could the author make it a python/R package so that non-computational biologists can easily apply the method to their own data?

      We have included a "Data and Code Availability" section in the revised manuscript. This section provides a link to the source code with pointers to datasets used in our study, facilitating the application of our methods by researchers from various backgrounds.

      Q6. I know it may be difficult for the author to do, but is it possible to design and perform some experiments to validate the model prediction results, either the connectivity partners of transcriptomically defined RGC types or the function of the key genetic molecules (which have not been discovered before)? The author may consider collaborating with some experimental labs. The author may even consider predicting the connectivity between RGCs and some of their post-synaptic neurons in brain regions such as SC or dLGN, as a lot of single-cell sequencing data and connectivity data have recently become available.

      We appreciate your suggestion regarding experimental validation. As a future direction, we have discussed potential experimental approaches to validate the model’s predictions in the "Experimental Validation of Candidate Genes" section. Specifically, we propose an experimental design involving the manipulation of delta-protocadherins using AAV-mediated CRISPR/Cas9 and subsequent examination of connectivity phenotypes. We are also open to collaborating with experimental labs to further explore the model’s predictions, particularly in predicting connectivity between RGCs and their post-synaptic neurons in other brain regions.

    2. Reviewer #2 (Public Review):

      Summary:

      In this study, Mu Qiao employs a bilinear modelling approach, commonly utilised in recommendation systems, to explore the intricate neural connections between different pre- and post-synaptic neuronal types. This approach involves projecting single-cell transcriptomic datasets of pre- and post-synaptic neuronal types into a latent space through transformation matrices. Subsequently, the cross-correlation between these projected latent spaces is employed to estimate neuronal connectivity. To facilitate the model training, connectomic data is used to estimate the ground-truth connectivity map. This work introduces a promising model for the exploration of neuronal connectivity and its associated molecular determinants. In the revised version of the manuscript, the author has applied and validated the model in both the C. elegans gap junction connectivity and the retinal neuron connectivity settings.

      Strengths:

      This study introduces a succinct yet promising computational model for investigating connections between neuronal types. The model, while straightforward, effectively integrates single-cell transcriptomic and connectomic data to produce a reasonably accurate connectivity map, particularly within the context of retinal connectivity. Furthermore, it successfully recapitulates connectivity patterns and helps uncover the genetic factors that underlie these connections.

      Weaknesses:

      (1) When compared with the previous method - SCM, the new model shows a similar performance level. This may be due to the limitation of the dataset itself, as it only has the innexin expression data. Is it possible to apply the SCM model to the more complete retina dataset and compare the performance with the proposed bilinear modelling approach?

      Minor Weakness:

      (1) The study lacks experimental validation of the model's prediction results.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews

      Reviewer #1 (Public Review):

      Comment: The fact that there are Arid1a transcripts that escape the Cre system in the Arid1a KO mouse model might complicate the interpretation of the data. The phenotype of the Arid1a knockout is probably masked by the fact that many of the sequencing techniques used here are done on a heterogeneous population of knockout and wild type spermatocytes. In relation to this, I think that the use of the term "pachytene arrest" might be overstated, since this is not the phenotype truly observed. Knockout mice produce sperm, and probably litters, although a full description of the subfertility phenotype is lacking, along with identification of the stage at which cell death is happening by detection of apoptosis.

      Response: As the reviewer indicates, we did not observe a complete arrest at Pachynema. In fact, the histology shows the presence of spermatids and sperm in seminiferous tubules and epididymides (Fig. Sup. 3). However, our data argue that the wild-type haploid gametes produced were derived from spermatocyte precursors that have likely escaped Cre mediated activity (Fig. Sup. 4). Furthermore, diplotene and metaphase-I spermatocytes lacking ARID1A protein by IF were undetectable in the Arid1acKO testes (Fig. S4B). Therefore, although we do not demonstrate a strict pachytene arrest, it is reasonable to conclude that ARID1A is necessary to progress beyond pachynema. We have revised the manuscript to reflect this point (Abstract lines 17,18; Results lines 153,154)

      Comment: It is clear from this work that ARID1a is part of the protein network that contributes to silencing of the sex chromosomes. However, it is challenging to understand the timing of the role of ARID1a in the context of the well-known DDR pathways that have been described for MSCI.

      Response: With respect to the comment on the lack of clarity as to which stage of meiosis we observe cell death, our data do suggest that it is reasonable to conclude that mutant spermatocytes (ARID1A-) undergo cell death at pachynema given their inability to execute MSCI, which is a well-established phenotype.

      Comment: Staining of chromosome spreads with Arid1a antibody showed localization at the sex chromosomes by diplonema; however, analysis of gene expression in Arid1a KO was performed on pachytene spermatocytes. Therefore, it is not very clear how the chromatin remodeling activity of Arid1a in diplonema affects gene expression at a previous stage. CUTnRUN showed that ARID1a is present at the sex chromatin in earlier stages, leading us to hypothesize that immunofluorescence with the ARID1a antibody might not reflect the real localization of ARID1a.

      Response: It is unclear what the reviewer means about not understanding how ARID1A activity at diplonema affects gene expression at earlier stages. Our interpretations were not based solely on the observation of ARID1A associations with the XY body at diplonema. In fact, mRNA expression and CUT&RUN analyses were performed on pachytene-enriched populations. ARID1A's association with the XY body is not exclusive to diplonema. Based on both CUT&RUN and IF data, ARID1A associates with XY chromatin as early as pachynema. Only at late diplonema did we observe ARID1A hyperaccumulation on the XY body by IF.

      Reviewer #2 (Public Review):

      Comment: The inefficient deletion of ARID1A in this mouse model does not allow any detailed analysis in a quantitative manner.

      Response: As explained in our response to these comments in the first revision, we respectfully disagree with this reviewer’s conclusions. We have been quantitative by co-staining for ARID1A, ensuring that we can score mutant pachytene spermatocytes from escapers. Additionally, we provide data to show the efficiency of ARID1A loss in the purified pachytene populations sampled in our genomic assays.

      Reviewer #3 (Public Review):

      Comment: The data demonstrate that the mutant cells fail to progress past pachytene, although it is unclear whether this specifically reflects pachytene arrest, as accumulation in other stages of Prophase also is suggested by the data in Table 1. The western blot showing ARID1A expression in WT vs. cKO spermatocytes (Fig. S2) is supportive of the cKO model but raises some questions. The blot shows many bands that are at lower intensity in the cKO, at MWs from 100-250kDa. The text and accompanying figure legend have limited information. Are the various bands with reduced expression different isoforms of ARID1A, or something else? What is the loading control 'NCL'? How was quantification done given the variation in signal across a large range of MWs?

      Response: The loading control is Nucleolin. With respect to the other bands in the range of 100-250 kDa, it is difficult to say whether they represent ARID1A isoforms. The Uniprot entry for Mouse ARID1A only indicates a large mol. wt sequence of ~242 kDa; therefore, the band corresponding to that size was quantified. There is no evidence to suggest that lower molecular weight isoforms may be translated. Although speculative, it is possible that the lower molecular weight bands represent proteolytic/proteasomal degradation products or products of antibody non-specificity. These points are addressed in the revised manuscript (Legend to Fig S2, lines 926-931). Blots were scanned on a LI-COR Odyssey CLx imager and viewed and quantified using Image Studio Version 5.2.5 (Methods, lines 640-642).

      Comment: An additional weakness relates to how the authors describe the relationship between ARID1A and DNA damage response (DDR) signaling. The authors don't see defects in a few DDR markers in ARID1A CKO cells (including a low-resolution assessment of ATR), suggesting that ARID1A may not be required for meiotic DDR signaling. However, as previously noted the data do not rule out the possibility that ARID1A is downstream of DDR signaling and the authors even indicate that "it is reasonable to hypothesize that DDR signaling might recruit BAF-A to the sex chromosomes (lines 509-510)." It therefore is difficult to understand why the authors continue to state that "...the mechanisms underlying ARID1A-mediated repression of the sex-linked transcription are mutually exclusive to DDR pathways regulating sex body formation" (p. 8) and that "BAF-A-mediated transcriptional repression of the sex chromosomes occurs independently of DDR signaling" (p. 16). The data provided do not justify these conclusions, as a role for DDR signaling upstream of ARID1A would mean that these mechanisms are not mutually exclusive or independent of one another.

      Response: The reviewer’s argument is reasonable, and we have made the recommended changes (Results, lines 212-215; Discussion, lines 499-500).

      Comment: A final comment relates to the impacts of ARID1A loss on DMC1 focus formation and the interesting observation of reduced sex chromosome association by DMC1. The authors additionally assess the related recombinase RAD51 and suggest that it is unaffected by ARID1A loss. However, only a single image of RAD51 staining in the cKO is provided (Fig. S11) and there are no associated quantitative data provided. The data are suggestive but it would be appropriate to add a qualifier to the conclusion regarding RAD51 in the discussion which states that "...loss of ARID1a decreases DMC1 foci on the XY chromosomes without affecting RAD51" given that the provided RAD51 data are not rigorous. In the long-term it also would be interesting to quantitatively examine DMC1 and RAD51 focus formation on autosomes as well.

      Response: We agree with the reviewer’s comment and have made the recommended changes (Discussion, lines 518-519).

      Response to non-public recommendations

      Reviewer 2:

      Comment: Meiotic arrest is usually judged based on testicular phenotypes. If mutant testes do not have any haploid spermatids, we can conclude that meiotic arrest is a phenotype. In this case, mutant testes have haploid spermatids and are fertile. The authors cannot conclude meiotic arrest. The mutant cells appear to undergo cell death in the pachytene stage, but the authors cannot say "meiotic arrest."

      Response: We disagree with this comment. By IF, we see that ~70% of the spermatocytes have deleted ARID1A. Furthermore, we never observed diplotene spermatocytes that lacked ARID1A. The conclusion that the absence of ARID1A results in a pachynema arrest and that the escapers produce the haploid spermatids is firm.

      Comment: Fig. S2 and S3 have wrong figure legends.

      Response: The figure legends for Fig. S2 and S3 are correct.

      Comment: The authors do not appear to evaluate independent mice for scoring (the result is about 74% deletion above, Table S1). Sup S2: how many independent mice did the authors examine?

      Response: These were Sta-Put purified fractions obtained from 14-15 WT and mutant mice. It is difficult to isolate pachytene spermatocytes by Sta-Put at the required purity in sufficient yields using one mouse at a time. We used three technical replicates to quantify the band intensity, and the error bars represent the standard error of the mean (S.E.M.) of the band intensity.

      Comment: "Comparison of cKO and wild-type littermate yielded nearly identical results (Avg total conc WT = 32.65 M/ml; Avg total conc cKO = 32.06 M/ml)". This sounds like a negative result (i.e., no difference between WT and cKO).

      Response: This is correct. There is no difference between Arid1aWT and Arid1aCKO sperm production. This is because wild-type haploid gametes produced were derived from spermatocyte precursors that have escaped Cre-mediated activity (Fig. S4). These data merely serve to highlight an inherent caveat of our conditional knockout model and are not intended to support the main conclusion that ARID1A is necessary for pachytene progression.

      Comment: The authors now admit ~ 70 % efficiency in deletion, and the authors did not show the purity of these samples. If the purity of pachytene spermatocytes is ~ 80%, the real proportion of mutant cells can be ~ 56%. It is very difficult to interpret the data.

      Response: The original submission did refer to inefficient Cre-induced recombination. The reviewer asked for the % efficiency, which was provided in the revised version. Also, please refer to Fig. S2, where Western blot analysis demonstrates a significant loss of ARID1A protein levels in CKO relative to WT pachytene spermatocyte populations that were used for CUT&RUN data generation.
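
      The reviewer's estimate in the comment above is a simple product of two fractions. The sketch below reproduces that arithmetic using the two values quoted in the comment; the function name is hypothetical, the 80% purity is the reviewer's own hypothetical figure, and the calculation assumes that contaminating (non-pachytene) cells contribute no mutant pachytene spermatocytes.

```python
# Illustrative only: both fractions are the values quoted in the comment above.
DELETION_EFFICIENCY = 0.70  # fraction of spermatocytes lacking ARID1A
PACHYTENE_PURITY = 0.80     # hypothetical purity proposed by the reviewer


def mutant_fraction(efficiency: float, purity: float) -> float:
    """Fraction of a purified sample made up of mutant pachytene cells.

    Assumes contaminating cells contribute no mutant pachytene cells,
    so the two fractions simply multiply.
    """
    return efficiency * purity


print(f"{mutant_fraction(DELETION_EFFICIENCY, PACHYTENE_PURITY):.0%}")  # 56%
```

      Under these assumptions the mutant fraction is about 56%, which matches the figure the reviewer cites.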

      Comment: The authors should not use the other study to justify their own data. The H3.3 ChIP-seq data in the NAR paper detected clear peaks on autosomes. However, in this study, as shown in Fig. S7A, the authors detected only 4 peaks on autosomes based on MACS2 peak calling. This must be a failed experiment. Also, S7A appears to have labeling errors.

      Response: I believe the reviewer is referring to supplementary figure 8A. Here, it is not clear which labeling errors the reviewer is referring to. In the wild type, the identified peaks were overwhelmingly sex-linked intergenic sites. This is consistent with the fact that H3.3 is hyper-accumulated on the sex chromosomes at pachynema.

      The authors of the NAR paper did not perform a peak-calling analysis using MACS2 or any other peak-calling algorithm. They merely compared the coverage of H3.3 relative to input. Therefore, it is not clear on what basis the reviewer says that the NAR paper identified autosomal peaks. Their H3.3 signal appears widely distributed over a 6 kb window centered at the TSS of autosomal genes, which, compared to input, appears enriched. Our data clearly demonstrate a less noisy and narrower window of H3.3 enrichment at autosomal TSSs in WT pachytene spermatocytes, albeit at levels lower than that seen in CKO pachytene spermatocytes (Fig S8B and see data copied below for each individual replicate). Moreover, the lack of peaks does not mean that there was an absence of H3.3 at these autosomal TSSs (Supp. Fig. S8B). Therefore, we disagree with the reviewer’s comment that the H3.3 CUT&RUN was a failed experiment.

      Author response image 1.

      H3.3 Occupancy at genes mis-regulated in the absence of ARID1A

      Comment: If the author wishes to study the function of ARID2 in spermatogenesis, they may need to try other cre-lines to have more robust phenotypes, and all analyses must be redone using a mouse model with efficient deletion of ARID2.

      Response: As noted, we chose Stra8-Cre to conditionally knockout Arid1a because ARID1A is haploinsufficient during embryonic development. The lack of Cre expression in the maternal germline allows for transmission of the floxed allele, allowing for the experiments to progress.

      Comment: The inefficient deletion of ARID1A in this mouse model does not allow any detailed analysis in a quantitative manner.

      Response: In many experiments, we have been quantitative when possible by co-staining for ARID1A, ensuring that we can score mutant pachytene spermatocytes from escapers. Additionally, we provide data to show the efficiency of ARID1A loss in the purified pachytene populations sampled in our genomic assays.

      Reviewer 3:

      Comment: The Methods section refers to antibodies as being in Supplementary Table 3, but the table is labeled as Supplementary Table 2.

      Response: This has been corrected.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Here we address the major points raised by the reviewers.

      Reviewer #1 (Public Review):

      Weaknesses:

      • The signaling pathway upstream of Maf1 remains unknown. In eukaryotes, Maf1 is a negative regulator of RNA pol III and is regulated by external signals via the TORC pathway. Since TORC components are absent in the apicomplexan lineage, one central question that remains open is how Maf1 is regulated in P. falciparum. Magnesium is probably not the sole stimulus involved, as suggested by the observation that Ile deprivation also down-regulates RNA pol III activity.

      We agree that there is still much to uncover relating to the PfMaf1 signaling pathway. While we still do not know each component, we have been able to link external factors (of course not limited to only magnesium) to the increased nuclear occupancy of PfMaf1. Other protein interactors that potentially regulate PfMaf1, while not confirmed, have been identified in plasma sample as candidates for future experiments to validate their potential involvement in RNA Pol III inhibition.

      • The study does not address why MgCl2 levels vary depending on the clinical state. It is unclear whether plasma magnesium is increased during asymptomatic malaria or decreased during symptomatic infection, as the study does not include control groups with non-infected individuals. Along the same line, MgCl2 supplementation in parasite cultures was done at 3mM, which is higher than the highest concentrations observed in clinical samples.

      This reviewer raised a valid point. The plasma magnesium levels for the wet symptomatic samples (averaging [0.79mM]) were within the normal range of a healthy individual (between [0.75-0.95mM]) while the dry asymptomatic levels were above the normal range (averaging [1.13mM]). Ideally, we would have liked to have control uninfected plasma samples from individuals from The Gambia. Unfortunately, field studies and human volunteer studies do not always have all the ideal controls that in vitro studies have. We recognize that [3mM] is higher than the normal range for magnesium levels, which is why we included a revised Supplementary Figure 3A. This figure shows that magnesium concentrations as low as [1mM] (similar to the levels found in dry asymptomatic samples) reduced the expression of RNA Pol III-transcribed genes.

      • Although the study provides biochemical evidence of Maf1 accumulation in the parasite nuclear fraction upon magnesium addition, this is not fully supported by the immunofluorescence experiments.

      We agree that the resolution of the IFA images is not sufficient to support the WB data. We believe that the importance of the IFA Supplementary Figure is to show that PfMaf1 clusters together in foci, which has not been previously reported.

      Reviewer #2 (Public Review):

      Weaknesses:

      However, most analyses are rather preliminary as only very few (3-5) candidate genes are analyzed by qPCR instead of carrying out comprehensive analyses with a large qPCR panel or RNA-seq experiments with GO term analyses. Data presentation lacks clarity, the number of biological replicates is rather low and the statistical analyses need to be largely revised. Although the in vivo data from wet (mildly symptomatic) and dry (asymptomatic) season parasites with different expression levels of Pol III-regulated genes, var genes, and MgCl2 are interesting, the link between the in vitro data and the in vivo virulence of P. falciparum, which is made in many sections of the manuscript, should be toned down. Especially since (i) the only endothelial receptor studied is CD36, which is associated with parasite binding during mild malaria, and (ii) several studies provide contradictory data on MgCl2 levels during malaria and in different disease states, which is not further discussed, but the authors mainly focused on this external stimulus in their experiments.

      We agree that, ideally, we would have liked to do full RNA-seq on The Gambia samples. However, that was out of the scope of this project. The RNA samples were limited which is why we did not use more primers. We believe that an appropriate number of replicates was done for the experiments. The wet symptomatic samples from this study were from mildly symptomatic individuals, as stated in the manuscript. Therefore, CD36 was a relevant receptor to use for our studies.

      We agree that the published studies about magnesium levels in infected individuals are not always consistent. What these studies do not consider is the time of year, that is, whether the infection occurred during the dry or wet season. These studies were also done in different regions of the world using different technologies. For this reason, we only highlight the difference observed in our field study data from The Gambia.

      Reviewer #3 (Public Review):

      Weaknesses:

      (1) The signals upstream of Maf1 remain rather a black box. 4 are tested - heat shock and low-glucose, which seem to suppress ALL transcription; low-Isoleucine and high magnesium, which suppress Pol3. Therefore the authors use Mg supplementation throughout as a 'starvation type' stimulus. They do not discuss why they didn't use amino acid limitation, which could be more easily rationalised physiologically. It may be for experimental simplicity (no need for dropout media) but this should be discussed, and ideally, sample experiments with low-IsoLeu should be done too, to see if the responses (e.g. cytoadhesion) are all the same.

      We agree that deprivation of isoleucine would have been another experimental assay for our study, but it also would not have been as novel as magnesium. While understanding the exact mechanism or involvement of magnesium as a stress condition was not the scope of this manuscript, we believe that our data will be valuable in demonstrating that external stimuli act on P. falciparum virulence gene expression via RNA Pol III inhibition. Since we also had plasma level data for magnesium, and not isoleucine, we believed it made for a better external factor to use for our in vitro studies.

      (2) The proteomics, conducted to seek partners of Maf1, is probably the weakest part. From Figure S3: the proteins highlighted in the text are clearly highly selected (as ones that might be relevant, e.g. phosphatases), but many others are more enriched. It would be good to see the whole list, and which GO terms actually came top in enrichment.

      We apologize if the reviewer did not see the attached supplementary Co-IP MS data. The file includes all proteins found in each sample as well as GO term analysis. For the purpose of this work, we highlight proteins potentially involved in the canonical role of Maf1 that have been shown in model organisms to reversibly inhibit RNA Pol III (phosphatases, RNA Pol III subunits).

      (3) Figure 3 shows the Maf1-low line has very poor growth after only 5 days but it is stated that no dead parasites are seen even after 8 cycles and the merozoites number is down only ~18 to 15... is this too small to account for such poor growth (~5-fold reduced in a single cycle, day 3-5)? It would additionally be interesting to see a cell-cycle length assessment and invasion assay, to see if Maf1-low parasites have further defects in growth.

      We agree with the reviewer that the observed reduced merozoite numbers may not be the only cause of the reduced growth rate. Other factors in the PfMaf1 knock-down line may contribute to the observed poor growth.
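
      The reviewer's question in point (3) can be checked with a back-of-the-envelope calculation. The sketch below is only an illustration: it assumes that parasite multiplication per cycle scales in direct proportion to merozoite number, with invasion efficiency and cycle length unchanged, and the function names are hypothetical. The merozoite counts (~18 and 15) and the ~5-fold growth difference are the values quoted in the comment.

```python
import math

# Approximate values quoted in the reviewer's comment.
MEROZOITES_CONTROL = 18   # merozoites per schizont, control line
MEROZOITES_MAF1_LOW = 15  # merozoites per schizont, Maf1-low line


def per_cycle_fold_reduction(control: float, mutant: float) -> float:
    """Fold reduction in multiplication per cycle, assuming growth is
    proportional to merozoite number and nothing else differs."""
    return control / mutant


def cycles_needed(target_fold: float, per_cycle_fold: float) -> float:
    """Number of cycles for a constant per-cycle deficit to compound
    into the target fold difference."""
    return math.log(target_fold) / math.log(per_cycle_fold)


fold = per_cycle_fold_reduction(MEROZOITES_CONTROL, MEROZOITES_MAF1_LOW)
print(f"per-cycle fold reduction: {fold:.2f}")                         # 1.20
print(f"cycles needed for a 5-fold gap: {cycles_needed(5, fold):.1f}")  # 8.8
```

      Under these simplifying assumptions, the merozoite deficit alone accounts for only a 1.2-fold difference per cycle and would need roughly nine cycles to compound to 5-fold, which is consistent with the view that other factors contribute to the poor growth.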

    1. eLife assessment

      This study assessed antibody levels, which are indicative of protection, resulting from both COVID-19 vaccination and natural infection in a representative sample of the Canadian population. The work provides solid evidence that individuals who received a booster vaccination and had a prior infection had the highest antibody levels, particularly when either the vaccination or natural infection had occurred within the past six months. These findings are of fundamental importance in supporting the value of booster vaccination in populations vulnerable to severe COVID-19.

    2. Reviewer #1 (Public Review):

      This study holds significant importance as it assessed antibody levels arising from both COVID-19 vaccination and natural infection in a representative population-based sample. The analysis was conducted with thoughtfulness and rigor. The sampling methodology ensured the representation of the broader Canadian population, including minorities and indigenous communities. Findings suggest that, despite a substantial number of individuals having been previously infected, especially following the first omicron wave, repeat booster vaccination is essential to ensure that individuals develop an optimal antibody response against new exposures to infection, given the waning of antibodies over time. The study findings carry global significance as they inform decisions about the relevance of booster vaccination for reducing infection incidence amid the ongoing challenge of vaccine hesitancy and the continual emergence of new variants.

      Among the weaknesses of the study, from my perspective, is the lack of explicit clarification that one objective of achieving repeat booster vaccination is to impart a robust level of protection against acquiring infections. Previous studies have demonstrated that the effectiveness of even only primary-series vaccination against COVID-19 severe disease was high, with slow waning over time. However, even when effectiveness against severity is high, infections may still present a risk for progression to severe COVID-19 among older individuals and those with comorbidities. Another limitation is that the study did not investigate whether there were variations in spike levels based on the last vaccine type administered. Furthermore, it is important to comment on the generalizability of the findings considering that individuals who participated in the research may have been different from those who did not participate and therefore residual confounding cannot be eliminated.

    3. Reviewer #2 (Public Review):

      Strengths
      (1) The study benefits from a large sample size, encompassing serial assessments of 4000-9000 adults over an extended period. This large cohort enhances the reliability and generalizability of the findings.
      (2) The study employs a rigorous methodology, including serial assessments, self-collected dried blood spots, and highly sensitive antibody assays. The use of multiple measures ensures a robust evaluation of hybrid immunity and SARS-CoV-2 incidence within the Canadian population.
      (3) The manuscript provides detailed analyses of antibody levels, vaccination history, infection rates, and demographic factors. The inclusion of stratified analyses by age, sex, and ethnicity enhances the understanding of population-level immunity dynamics.
      (4) The study's findings contribute valuable insights into the dynamics of hybrid immunity and SARS-CoV-2 incidence, particularly during the emergence of the Omicron variant. The observed decline in COVID-19 death rates amidst rising infection rates underscores the potential protective role of hybrid immunity against severe outcomes.

      Weaknesses
      (1) Sampling Limitations: While the study claims to be representative of the Canadian population, there are potential limitations in sampling methods, particularly reliance on an online polling platform. This approach may introduce selection bias and limit the generalizability of findings to certain demographic groups.
      (2) Assay Limitations: The study acknowledges limitations associated with antibody assays and the potential for assay saturation. In addition, the reliance on self-reported vaccination history and infection status may introduce recall bias and affect the accuracy of estimates.
      (3) Data Interpretation: While the study presents compelling data on hybrid immunity and SARS-CoV-2 incidence, some interpretations may be speculative. The assertion of a causal relationship between hybrid immunity and reduced COVID-19 mortality warrants cautious interpretation, given the complexity of factors influencing disease outcomes.
      (4) Lack of inclusion and exclusion criteria: The manuscript does not state specific inclusion and exclusion criteria for participants or the methods used for data analysis.
      (5) The protocol does not include disaggregated data; these are only available on page 25 as an annex.

    1. Reviewer #3 (Public Review):

      Summary:

      The article by Huang et al. presents an in-depth study on the role of DNA methylation in regulating virulence and metabolism in Pseudomonas syringae, a model phytopathogenic bacterium. This comprehensive research utilized single-molecule real-time (SMRT) sequencing to profile the DNA methylation landscape across three model pathovars of P. syringae, identifying significant epigenetic mechanisms through the Type-I restriction-modification system (HsdMSR), which includes a conserved sequence motif associated with N6-methyladenine (6mA). The study provides novel insights into the epigenetic mechanisms of P. syringae, expanding the understanding of bacterial pathogenicity and adaptation. The use of SMRT sequencing for methylome profiling, coupled with transcriptomic analysis and in vivo validation, establishes a robust evidence base for the findings.

      Strengths:

      The results are presented clearly, with well-organized figures and tables that effectively illustrate the study's findings.

      Weaknesses:

      It would be helpful to add more details, especially in the methods, to make the work easier to evaluate and to enhance the manuscript's reproducibility.

    1. eLife assessment

      This study is focused on the question of how Nrp1 contributes to the regulation of vascular permeability and whether or why there are differences between different vascular beds. The scientific concept of this paper suggests a possible role of Nrp1 on perivascular cells as a participant in the regulation of vascular permeability. This concept is interesting and potentially useful. However, the methodology and quantitative analysis are currently inadequate to fully support the claims.

    2. Reviewer #1 (Public Review):

      Summary:

      This study examines how blood vessels exposed to the cytokine VEGF respond to vascular leakage when the VEGF receptor NRP1 is targeted. This study compares results in two different body sites of the dermis and in a different organ, the trachea. The authors refer to the two different sites of the dermis as two different organs, but the dermis is one organ. The authors report that vascular leakage is differentially affected by NRP1 targeting in the ear skin compared to the trachea and back skin. They attribute these differences to NRP1 presence in cells other than the vascular endothelium, especially in the ear skin, where they observe higher perivascular NRP1 staining.

      The manuscript states that the aim was to uncover the role of NRP1 in VEGF-mediated vascular permeability. This was misleading, because a lot is already known on NRP1 in this pathway, as is evidenced by a large number of publications the authors themselves quote (and sometimes misquote). The main information they wish to add is the possibility that NRP1 may also play a role in other cells to regulate permeability, as they previously suggested for blood vessel growth. Several technical issues and experimental limitations call into question whether the above conclusion can be reached with the data provided.

      Strengths:

      It is an interesting concept that NRP1 regulates vascular permeability by acting in perivascular cells.

      Weaknesses:

      (A) Technical limitations due to assay type:

      A direct comparison of the skin in two body sites is not warranted given that the authors used different methods to study the two sites. Below is a list of differences reported in their methods section:

      (A1) Different tracers were used to visualize VEGF165-induced leakage in different sites.
      Ear skin assay: 2 kDa FITC and two different dextrans, 10 kDa TRITC dextran, and another dextran whose molecular weight is not specified. It is not explained why 3 different tracers were used. Figures 1 and 2 report data with 2 kDa TRITC dextran.
      Back skin assay: They describe the Miles assay using Evans Blue, which binds to albumin, making it a 67 kDa tracer. However, Figure 1 suggests that 2 kDa dextran was used, and perhaps Evans Blue was only used for the supplemental data. This is relevant because current knowledge suggests that small dyes use the junctional pathway, whereas larger proteins such as albumin can use vesicular transport. The former is thought to be a fast pathway (hence, the authors measured dye extravasation 3 min after VEGF165 injection). The latter pathway is a slower one (hence, measured 30 min after VEGF165 injection in the Miles assay).

      Quantification: For ear skin, the number of leakage sites and lag period is quantified, as well as leakage over time. For back skin, the amount of extravasated dye is quantified at a fixed time point. Such different measurements do not allow for direct comparison.

      (A2) Mice were prepared in different ways for the different body sites studied:
      Ear skin assay: general anesthesia with ketamine-xylazine.
      Back skin assay: No anesthesia is described for the back skin Miles assay. This would be a concern because intradermal injections are considered to be painful. For back skin histology, they do report having used isoflurane anesthesia before perfusion fixation. However, it is not advisable to use isoflurane anesthesia for perfusion fixation if this has been done via the conventional cardiac route, because opening the chest cavity to access the heart for perfusion causes lung collapse, meaning that the mice cannot breathe the anaesthetic, and there is a risk of them regaining consciousness. The authors should clarify what exactly they have done, for ethical reasons and also because the type of anesthesia can affect vascular studies, for example, see PMID 36418078.

      (A3) Differential histamine use:
      Back skin assay: uses anti-histamine, as is advised with intradermal injections to minimize vascular leakage due to histamine release after local trauma.
      Ear skin assay: no anti-histamine was used, so histamine-induced background leakage might have been present, independently of VEGF165. The authors suggest that the ear skin injection does not cause trauma, but it is unclear how this is possible, given that skin needs to be disrupted for the needle to enter the tissue.

      (A4) Different VEGF165 amounts used:
      The ear skin assay uses 10 ng VEGF per injection, and the back skin assay uses 80 ng.

      Given all these differences in experimental protocols, as well as different knockdown efficiency (see below), the results for the different sites are not directly comparable. Hence it cannot presently be concluded that the role of NRP1 in both sites is different, and further work is required to make a firm conclusion. In addition, the conflicts between the reported methods and figures need to be resolved.

      (B) It is unclear whether appropriate controls were used:

      (B1) What genotype and treatment are the control mice for NRP1 targeting? The ideal control would be wild-type mice with the same CreER, injected with tamoxifen according to the same timeline, to account for vehicle, tamoxifen, and tamoxifen-induced CreER toxicity (https://doi.org/10.1038/s44161-022-00125-6). This could be a littermate mouse or, alternatively, a separate experiment should be shown comparing wild-type mice carrying the same CreER as used for the ablation studies and injected with tamoxifen, versus wild-type mice injected with tamoxifen, to demonstrate that the induction regime does not in itself cause phenotypes.

      (B2) Has a PBS injection been performed to compare baseline leakage between genotypes, independently of VEGF165 injections? This is an essential control.

      (B3) The experimental protocol assays 4 days after 5 consecutive tamoxifen injections, which does not allow much time for drug washout. Moreover, this is a lot of tamoxifen (80 mg x 5 = 400 mg tamoxifen per kg). Due to the possibility that tamoxifen-induced effects might still be present and cause sex-differential effects, the corresponding sex for each individual data point should be indicated in all graphs.

      (B4) i.p. peanut oil is used in undefined volumes; this vehicle was shown to cause inflammation if administered i.p. (PMID 33139505). Therefore, inflammation might be present, which might affect different body sites differently.
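
      The cumulative tamoxifen dose quoted in (B3) can be reproduced with the short sketch below. The per-injection dose (80 mg/kg) and number of injections (5) come from the comment; the function names and the 25 g body weight are assumptions chosen purely for illustration and do not appear in the manuscript or the review.

```python
# Values quoted in point (B3).
DOSE_MG_PER_KG = 80.0  # tamoxifen per injection, mg per kg body weight
N_INJECTIONS = 5       # consecutive injections


def cumulative_dose_mg_per_kg(dose_mg_per_kg: float, n_injections: int) -> float:
    """Total dose over the whole induction regime, in mg per kg."""
    return dose_mg_per_kg * n_injections


def absolute_dose_mg(dose_mg_per_kg: float, body_weight_g: float) -> float:
    """Absolute amount of drug for one animal, in mg."""
    return dose_mg_per_kg * body_weight_g / 1000.0


total = cumulative_dose_mg_per_kg(DOSE_MG_PER_KG, N_INJECTIONS)
print(f"cumulative dose: {total:.0f} mg/kg")  # 400 mg/kg

# Hypothetical 25 g mouse, used only to make the scale concrete.
print(f"absolute dose for a 25 g mouse: {absolute_dose_mg(total, 25.0):.0f} mg")  # 10 mg
```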

      (C) Validation of NRP1 targeting:
      The authors have not performed an NRP1 knockout in the endothelium, as they repeatedly claim. In the lung, there is a good knockdown of around 75%; this may or may not be due to complete EC knockdown with preservation of NRP1 in other cell types. In the trachea, ear skin, and back skin, knockdown was not quantified, although qualitative comparisons by NRP1 immunostaining in Supplementary Figure 1 suggest that the back skin targeting worked better than the ear skin targeting, which would confound results, but in any case, it was neither a knockdown nor knockout. The staining for global targeting looks fainter than for the other genotypes, and the single-channel images seem to have different intensities than the overlays in Supplementary Figure 1 A.

      (D) Systemic permeability studies:
      Organs have very different baseline permeability, due to the properties of the vascular barrier, i.e. tight barriers in the brain and retina and permeable endothelium in the liver and kidney. In this assay, VEGF is not delivered from the tissue side, as would be typical during inflammation, but is delivered through the circulation, which has been shown to differentially affect the VEGF response, at least in some tissues (PMID 25175707). Nevertheless, this is a helpful readout, especially given that PBS controls appear not to have been performed above to establish baseline leakage between genotypes and tissues.

      Figure Supplement 3 shows that VEGF induces vascular leakage in all body sites examined, independently of the size of the tracer used, and agreeing with current literature. An additional set of panels should be included with data shown without calculating the fold change relative to the control, set to 1, to account for the endothelium in different organs having different baseline vascular permeability. How do the authors explain that VEGF has the same effect in the ear and back skin in this assay, when NRP1 is present, given that they claim a role for perivascular NRP1 in the ear, but not back skin, for reducing VEGF/VEGFR2 signalling?

      (E) Comparing results obtained with different tools:

      - The endothelial NRP1 knockdown yielded different results for ear and back skin.
      - Anti-NRP1 yielded similar results for ear and back skin.
      - The global NRP1 ko yielded similar results for ear and back skin.
      Because anti-NRP1 and the global NRP1 knockdown give similar results for all tissues, the authors deduce that NRP1 acts in cell types other than endothelial cells to regulate permeability. This is an interesting idea, based on the lab's prior work in angiogenesis. In their trans-interaction scenario, NRP1 would have the same role in ECs in all sites, but non-endothelial NRP1 can override the function of endothelial NRP1 depending on its expression levels.

      Confidence in this conclusion would require additional experiments:
      - Show that the endothelial knockdown works equally well in different body sites, via NRP1 staining and/or by checking recombination efficiency with a reporter.
      - Use an analogous assay to measure permeability in different body sites.
      - Perform a non-endothelial knockdown, i.e. in pericytes, which are hypothesized to be the source of NRP1 that affects vascular leakage signalling in endothelial cells in trans.

      (F) Abstract, introduction, and references:
      The authors suggest controversy with regard to NRP1's roles in permeability. However, NRP1's function in VEGF signalling has been defined as being an accessory to VEGFR2, with a role in promoting SFK activation. This function relies on the NRP1 cytoplasmic domain, which mediates VEGFR2 trafficking and signalling; the relevant literature for the NRP1 cytoplasmic domain is mentioned for arteriogenesis (PMID 23639442), but not permeability (PMID 28289053). Another paper is mentioned which describes a VEGFR2-independent pathway for a CendR ligand, but this prior study did NOT make the claim that VEGF signalling is NRP1-independent or promotes it (PMID 27117252). In the eye, NRP1 has been implicated in both SEMA3A and VEGF165-induced permeability, which was also corroborated by the Miles assay in two prior studies (PMID 18180379, PMID 28289053). The last sentence in the abstract is incorrect, because differences in ear versus back skin do not constitute organotypic difference (as the organ is the dermis), and the potential role of perivascular cells is only inferred from the global endothelial NRP1 knockdown, which gives the same result as reported for the endothelial NRP1 knockdown in the literature.

      (1) Lines 5/.53: The references for VEGF-NRP1 signalling in age-related macular degeneration are not helpful: Raimondi investigated VEGF-independent NRP1 pathways in angiogenesis, whereas Fernandez-Robredo investigated NRP1 pathways in angiogenesis and showed that fewer vessels correlated with less leakage, but did not test VEGF signaling specifically. A more suitable reference would have been PMID 28289053.

      (2) Lines 63/64 and repeated in 84-89: The references quoted all showed that NRP1 inhibition reduces vascular permeability, and therefore do not provide evidence for the idea that NRP1 inhibition promotes permeability, as the authors report here for the ear skin; the only study supporting them is one using arterial endothelial cells, which are not permeability-relevant.

      (3) Lines 106/107: The references used to underpin organ-specific barrier properties are correct, but as stated above, the dermis is the dermis, and therefore, these references would not be useful to provide support for the idea that the ear and back skin behave differently after NRP1 knockdown.

      (G) Additional comments on the figures:

      Figure 4: The authors show that VEGFR2 is essential for permeability, and VEGF164 effects are VEGFR2 dependent - this is well established for VEGF164 in the Miles assay, including the accessory role of NRP1 (e.g. PMID 28289053). As the proposed trans function of NRP1 cannot make a difference in VEGFR2 signaling when VEGFR2 is not there, this experiment is only confirmatory of prior VEGFR2 knowledge.

    3. Reviewer #2 (Public Review):

      The paper by Pal et al. examines the role of Nrp1 in the organ-specific permeability response to VEGF. The subject is certainly interesting, but there are a number of significant methodological problems that make data evaluation rather problematic. In particular, lung endothelial cells are used to assess the effectiveness of Nrp1 knockout when the experiments focus on different organs; small numbers of data points (as few as 2 or 3) are used to claim statistically significant differences; obvious data scatter is not commented on and seems to be ignored; key reagents (anti-Nrp1 Ab) are not well characterized; a proposed model is not verified in vitro; etc. Some of these issues are outlined in detail below, but the list of problems is much longer than this.

      (1) Intradermal injection of anti-Nrp1 Ab: I am puzzled by this experiment: Will Ab presence be limited locally or is there a systemic distribution? This needs to be verified.

      (2) What does anti-Nrp1 Ab actually do? Does it block VEGF binding? Does it induce Nrp1 and VEGFR2 endocytosis?

      (3) How does i.v. injection of anti-Nrp1 Ab affect permeability in different organs?

      (4) Effect of endothelial Nrp1KO: Since the authors examine organ-specific effects of Nrp1, it seems illogical to assess its expression in the lung as a measure of KO as KO efficiency may differ organ by organ. Immunocytochemistry is not particularly quantitative and prone to selection bias. I'd suggest using EC bulk RNAseq from different organs to confirm the magnitude of the knockout in different beds.

      (5) Figures 1B and 2B show profoundly different levels of Nrp1 KO in lung ECs. Were different mouse strains used in Figure 1 and Figure 2 experiments? This may well explain the differences the authors have observed.

      (6) Supplementary Figure 2: Why is there no leakage of 10kD dextran in the heart in response to VEGF when there is an increase in the 70kD dextran leakage? That does not seem possible. Further, the authors observed no significant increase in 70kD dextran leakage after VEGF in the skeletal muscle. That also seems very unlikely and runs against the experience of many labs in the field.

      (7) Since the authors think that peri-vascular cell Nrp1 expression accounts for organ-specific Nrp1 effects, this should be studied and examined in an in vitro co-culture model.

      (8) Quantification: Many of the quantifications (of Nrp1 expression level, VE-cadherin Y685 phosphorylation, etc.) are done on the basis of immunocytochemistry. This really is not a quantitative technique and is prone to numerous artifacts. The data should at least be confirmed by whole-tissue Westerns. I am also puzzled by the small numbers of samples. If each dot on a graph represents an individual data point, how do the authors get a p<0.5 value with an N of 3? (for example Figure 5B, but there are other examples). Also, in Figure 4F the data scatter is quite enormous. This is either an experimental problem or, more likely, there is a biological message here - the tissue is not uniform. In any case, I do not see how one gets a significant result here. Figures 5B and 5C have a similar problem, while Figure 5D seems to be based on only two data points?

    4. Reviewer #3 (Public Review):

      Summary:

      Pal et al. provide valuable evidence supporting a distinct, vascular bed-specific function of Neuropilin-1 (NRP1) in VEGF-A mediated vascular permeability in adult mice. Using a suite of genetic mouse models and state-of-the-art vascular permeability assays, the authors demonstrate that the ear skin vasculature of EC-specific NRP1 adult knockout mice is hypersensitive to VEGF-A mediated high-molecular weight dye leakage from venules, as opposed to back skin and tracheal vasculature, where EC-specific NRP1 loss had a more classical negative effect on permeability. Interestingly, both whole organism KO of NRP1 and a blocking antibody treatment attenuated VEGF-A mediated permeability in ear skin and had the usual attenuation of permeability phenotype in back skin and tracheal vasculature. Using a pericyte promoter-specific reporter mouse line, the authors characterize NRP1 expression in the vascular beds of the ear dermis and back skin and conclude that NRP1 expression is higher in perivascular cells in the ear dermis as opposed to back skin vasculature, thus indicating a juxtacrine NRP1-VEGFR2 signaling model in adult mice. Further, they use a homozygous Vegfr2 phosphosite mutant mouse model in the background of NRP1 iECKO to find that the hypersensitivity to VEGF-A stimulation in ear skin is abrogated and, therefore, prove the juxtacrine NRP1 control of VEGFR2 mediated downstream signaling leading to vascular permeability. Further, they successfully show distinctive vascular bed-specific results as above using a well-characterized VE-Cadherin Y685 antibody staining, which corresponds to vascular leakage downstream of VEGF-A/VEGFR2 signaling in ear dermis and back skin vascular beds.

      Strengths:

      The question of the in vivo role of NRP1 in VEGF-A-induced hyper-permeability is an unresolved one, and the elegant use of genetic mouse models to demonstrate the phenotypes is valuable to the field. The organotypic differences observed in vascular permeability upon VEGF-A treatment in ear skin versus back skin and tracheal vasculature are solid. The subsequent investigation to validate heightened VEGFR2 signaling in ear dermis downstream of VEGF-A stimulation using Vegfr2 Y949F mice, VEC Y685 antibody, and pPLCγ antibody is also very convincing.

      Weaknesses:

      The mechanism proposed by the authors, by which EC-specific loss of NRP1 caused hypersensitivity to VEGF-A in ear dermis, is elevated juxtacrine signaling: NRP1 expressed in pericytes binds VEGFR2 in trans and retains it on the cell surface of ECs to sustain downstream signaling for a longer time, in line with earlier findings in Koch et al., 2014, where NRP1 was studied in the context of tumor angiogenesis. To support their claim, the authors stain the ear dermis and back skin vasculature of Pdgfrb-GFP reporter mice with NRP1 and CD31 antibodies and find that ear skin vasculature has higher perivascular NRP1 expression than back skin vasculature. While this is a good experiment to support the above point, there are no functional experiments to support this model.

      Overall, although the paper presents very useful findings in the field of NRP1-VEGFR2 biology, and most of the conclusions are well supported by the data, there are a few points that, if addressed, could significantly substantiate the model of juxtacrine signaling proposed by the authors. They are:

      (1) It will be important to know if the perivascular to vascular NRP1 expression (such as in Figure 3B) increases further in ear skin vasculatures of NRP1 iECKO mice compared to otherwise WT mice.

      (2) Does knocking out NRP1 in pericytes attenuate the VEGF-A mediated hyperpermeability observed in ear skin of NRP1 iECKO mice (similar to experiments in 1C, 2C)?

      (3) What is the status of VEGFR2 expression in ECs of ear skin and back skin of NRP1 iECKO and NRP1 iKO mice? This experiment is a proof-of-concept and is not essential to prove the point of juxtacrine NRP1 signaling, since the downstream readouts (pPLCγ and VEC Y685 staining) have already been shown to correlate in the ear dermis.

    1. eLife assessment

      This important study uses cellular automata and evolution algorithms to offer an alternative to long-range signalling models of developmental patterning. The computational evidence that local rules suffice to produce a robust and global pattern is convincing. With some additional insights that connect the theoretical findings back to real biological examples, this work could be of interest to the broad community of developmental and systems biologists.

    2. Reviewer #1 (Public Review):

      Summary:

      In this article, Kremser et al. set out to explore how local interactions between cells can drive pattern formation by focusing on the French flag problem, whereby an initially homogeneous system breaks axial symmetry to form three distinct regions of different cell fates. The authors use a cellular automata model together with evolution searches on possible rules that determine cell state and tissue level patterning. It is assumed that three cell states are possible and that at each time iteration each cell updates its fate according to the current state of itself and its neighbours. The authors use a computational procedure based on evolution algorithms to identify "fit" update rules that can successfully drive patterning into three distinct domains, and go on to provide insights with regards to the function of these rules as well as their properties, such as robustness and patterning dynamics. The article is generally well-written, the results seem solid, and the analysis and methods are thorough and generally well-explained. A main concern is the lack of connection between the biology that motivated the analysis and the results; this could be improved in the discussion by making the methods somewhat more concise to allow space to make links back to potential biological mechanisms when the results are presented. We raise some general points and some more specific questions and suggestions for clarification below that we hope will help improve the MS and make it more accessible to a wider audience.
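
      To make the model class described above concrete, the following is a minimal sketch of a three-state, nearest-neighbour cellular automaton. It is not the authors' implementation: the boundary handling, the `fitness` measure, and all function names are assumptions introduced here for illustration only.

```python
import random
from itertools import product

N_STATES = 3  # three cell fates, e.g. 0 = blue, 1 = white, 2 = red


def random_rule(rng):
    """A rule assigns a new state to every (left, self, right) triple: 3**3 = 27 entries."""
    return {triple: rng.randrange(N_STATES)
            for triple in product(range(N_STATES), repeat=3)}


def step(cells, rule):
    """Synchronously update every cell from its own state and its two neighbours."""
    # Assumption: a missing neighbour at either end is read as a copy of the edge cell.
    padded = [cells[0]] + list(cells) + [cells[-1]]
    return [rule[(padded[i - 1], padded[i], padded[i + 1])]
            for i in range(1, len(padded) - 1)]


def french_flag(n):
    """Target pattern: three contiguous domains of (nearly) equal size."""
    third = n // 3
    return [0] * third + [1] * third + [2] * (n - 2 * third)


def fitness(cells):
    """Hypothetical fitness: fraction of cells that match the target flag."""
    target = french_flag(len(cells))
    return sum(c == t for c, t in zip(cells, target)) / len(cells)


rng = random.Random(0)
rule = random_rule(rng)
cells = [rng.randrange(N_STATES) for _ in range(30)]
for _ in range(50):
    cells = step(cells, rule)
print(len(rule), fitness(cells))
```

      An evolutionary search of the kind the reviewers describe would mutate entries of `rule` and keep the variants whose final pattern scores higher; that loop is omitted here.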

      General points:

      • Although the authors motivate their work on the premise that biological patterns at the tissue level often are driven by local cell-cell interactions, by the end of the analysis any possible connection to the underlying biology is lost. For example, it would have been useful to discuss how the rules that evolved to dominate the patterning process in the results section could be implemented by cells. Is there a connection that could be made back to Notch signalling and its multiple ligands or to morphogens that diffuse only locally? Would the large number of rules possible in the cellular automata context reflect transcriptional feedback? This is an important point to bring the work "home". At the moment, it feels like a nice computational analysis of cellular automata but the links to the systems that motivate the work are lost in the process.

      • When growth is considered (p.14-15) a discussion of timescales seems pertinent. Often patterning takes place at a timescale faster than cell division so the system could be allowed to reach a steady state before a new division event takes place. What are the time scales of updating the phenotype compared with the time scales of division in the model and in relevant biological systems? How would different limiting cases impact conclusions, e.g. new cells added and pattern allowed to reach steady state before more growth versus cells added while patterning dynamics are still updating?

      • An interesting question is whether certain elements of rules (out of the 27 possible elements for the system with 3 states) are more or less likely to appear together in an evolved final rule. This may give a mechanistic understanding of what combinations of elements are likely to drive the optimal pattern and which combinations are avoided altogether.
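
      The co-occurrence question in the last point could be approached roughly as follows. This is a sketch under stated assumptions: each rule "element" is taken to be one neighbourhood-to-output assignment, and the list `rules` is filled with random placeholder rules rather than evolved ones, so the numbers it produces carry no biological meaning.

```python
import random
from collections import Counter
from itertools import combinations, product

TRIPLES = list(product(range(3), repeat=3))  # the 27 neighbourhood configurations


def elements(rule):
    """Represent a rule as a set of (neighbourhood, output) elements."""
    return {(t, rule[t]) for t in TRIPLES}


def cooccurrence(rules):
    """Frequencies of single elements and of element pairs across a set of rules."""
    single, pair = Counter(), Counter()
    for rule in rules:
        elems = sorted(elements(rule))
        single.update(elems)
        pair.update(combinations(elems, 2))
    n = len(rules)
    return ({e: c / n for e, c in single.items()},
            {p: c / n for p, c in pair.items()})


def enrichment(single, pair):
    """Observed pair frequency divided by the frequency expected under independence."""
    return {(a, b): f / (single[a] * single[b]) for (a, b), f in pair.items()}


# Placeholder data: random rules standing in for evolved, high-fitness rules.
rng = random.Random(1)
rules = [{t: rng.randrange(3) for t in TRIPLES} for _ in range(200)]
single, pair = cooccurrence(rules)
ratios = enrichment(single, pair)
top_pairs = sorted(ratios.items(), key=lambda kv: kv[1], reverse=True)[:3]
```

      Ratios well above 1 would flag element pairs that tend to be selected together, while pairs that never appear in `pair` at all are candidates for combinations that evolution avoids.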

    3. Reviewer #2 (Public Review):

      Summary:

      In this paper, the authors seek to identify strategies that can be used to generate robust one-dimensional large-scale patterns through the sequential application of only local, unchanging, space-independent rules. This is an important general question in developmental biology.

      Strengths:

      The authors do a nice job of laying out the problem, which they explore through cellular automaton (CA) modeling. The modeling framework is well described, as are the methods used for computational identification of effective (most "fit") strategies. As many biologists are unfamiliar with CA models, the clarity of description offered by these authors is especially important, as is the attention that was paid to useful visualization of results.

      Ultimately, the authors use their approach to converge on certain generic strategies for achieving robust patterns. In the case when there are only three states (no hidden or transient states) available to cells, they interpret the consensus strategy that emerges as a combination of "sorting" and "bulldozer" modules, which are relatively easy to rationalize. In cases involving a fourth state, a more complicated set of strategies arises and is considered.

      As a pure modeling paper, I find the work to be very well done, and the conclusions are well supported by the data and analyses. In terms of the long-term importance of this approach to biologists studying pattern formation, I see this paper as primarily laying a foundation for taking the next step, which is moving into two (or three) dimensions. Clearly, the complexity of rules becomes much greater, but one may expect some big qualitative differences to show up in higher dimensions, where simple strategies like sorting and bulldozing cannot work quite as simply. It will be interesting to see where this leads.

      Weaknesses:

      Ultimately, the relevance of this work to biology rests with its ability to provide insight into important biological problems. In terms of explaining the challenging nature of generating long-range patterns using short-range rules, I think the authors do a good job. However, they could do a better job of relating the results of the work back to biology. For example, are there examples of "sorting module" and "bulldozer module" behavior in biology? Could they be involved in explaining actual biological patterns?

      It also would have been helpful for the authors to generalize more about how the way in which their CA rules achieve global patterns compares with other patterning mechanisms. For example, in a Wolpert positional information model, patterning information is distributed over space in a steady-state gradient. In the CA model, no information spreads more than one cell at any one time point, but over time information still spreads, so in a sense a stationary spatial gradient has been traded for a moving spatial discontinuity. Because the discontinuity moves without decrement, any stationary state ends up being determined by the boundaries of the system, which goes a long way to explaining the robustness they observe, as well as why the result is quite sensitive to growth (which keeps changing the boundary).
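
      The point that information travels at most one cell per time step can be checked numerically. The sketch below assumes a generic three-state, nearest-neighbour CA with a randomly drawn rule (not one of the evolved rules from the paper) and a copy-the-edge boundary; it perturbs a single cell and tracks how far the resulting difference has spread.

```python
import random
from itertools import product


def step(cells, rule):
    """One synchronous nearest-neighbour update."""
    # Assumption: edge cells see a copy of themselves as the missing neighbour.
    padded = [cells[0]] + list(cells) + [cells[-1]]
    return [rule[(padded[i - 1], padded[i], padded[i + 1])]
            for i in range(1, len(padded) - 1)]


def affected_span(rule, cells, site, steps):
    """Perturb one cell and record the leftmost and rightmost differing cell per step."""
    a = list(cells)
    b = list(cells)
    b[site] = (b[site] + 1) % 3  # change the fate of a single cell
    spans = []
    for _ in range(steps):
        a, b = step(a, rule), step(b, rule)
        diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
        spans.append((min(diff), max(diff)) if diff else None)
    return spans


rng = random.Random(2)
rule = {t: rng.randrange(3) for t in product(range(3), repeat=3)}
cells = [rng.randrange(3) for _ in range(41)]
site = 20
spans = affected_span(rule, cells, site, steps=10)
for t, span in enumerate(spans, start=1):
    # After t steps the difference cannot extend more than t cells from the site.
    assert span is None or (site - t <= span[0] and span[1] <= site + t)
```

      A boundary-anchored front therefore needs a number of steps on the order of the tissue length to cross it, which is one way to frame the sensitivity to growth noted above.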