    1. Reviewer #3 (Public Review):

      The study investigated how statistical aspects of temperature sequences, such as manipulations of stochasticity (i.e., randomness of a sequence) and volatility (i.e., speed at which a sequence unfolded) influenced pain perception. Using an innovative stimulation paradigm and computational modelling of perceptual variables, this study demonstrated that perception is weighted by expectations. Overall, the findings support the conclusion that pain perception is mediated by expectations in a Bayesian manner. The provision of additional details during the review process strengthens the reliability of this conclusion. The methods presented offer tools and frameworks for further research in pain perception and can be extended to investigations into chronic pain processes.

    2. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study presents a valuable insight into a computational mechanism of pain perception. The evidence supporting the authors’ claims is solid, although the inclusion of 1) more diverse candidate computational models, 2) more systematic analysis of the temporal regularity effects on the model fit, and 3) tests on clinical samples would have strengthened the study. The work will be of interest to pain researchers working on computational models and cognitive mechanisms of pain in a Bayesian framework.

      Thank you very much again for considering the manuscript and judging it as a valuable contribution to understanding mechanisms of pain perception. We recognise the above-mentioned points of improvement and elaborate on them in the initial response to the reviewers.

      Response to the reviewers

      Reviewer 1:

      Reviewer Comment 1.1 — Selection of candidate computational models: While the paper juxtaposes the simple model-free RL model against a Kalman Filter model in the context of pain perception, the rationale behind this choice remains ambiguous. It prompts the question: could other RL-based models, such as model-based RL or hierarchical RL, offer additional insights? A more detailed explanation of their computational model selection would provide greater clarity and depth to the study.

      Initial reply: Thank you for this point. Our models were selected a priori, following the modelling strategy of Jepma et al. (2018); we therefore considered the same set of core models, allowing a clear extension of the analysis to our non-cue paradigm. The key question for us was whether expectations were used to weight the behavioural estimates, so our main interest was to compare expectation-weighted vs non-expectation-weighted models.

      Model-based and hierarchical RL are very broad terms that can be used to refer to many different models, and we are not clear about which specific models the reviewer is referring to. Our Bayesian models are generative models, i.e. they learn the generative statistics of the environment (which is characterised by inherent stochasticity and volatility) and hence operate model-based analyses of the stimulus dynamics. In our case, this happened hierarchically and it was combined with a simple RL rule.

      Revised reply: We clarified our modelling choices in the ”Modelling strategy” subsection of the results section.
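
      The contrast between expectation-weighted and non-weighted models described in this reply can be illustrated with a minimal sketch. It assumes a delta-rule learner in which the percept is a weighted average of the noxious input and the current expectation. The function name `rl_trial`, the weight `gamma` and the learning rate `alpha` are illustrative assumptions, not the authors' implementation; setting `gamma` to zero gives a learner whose percept ignores expectation.

```python
def rl_trial(expectation, noxious_input, alpha, gamma=0.0):
    """One trial of a delta-rule learner (illustrative, not the authors' code).

    gamma is the assumed weight given to the prior expectation when forming
    the percept; gamma = 0 gives a learner without expectation weighting.
    """
    if not 0.0 <= alpha <= 1.0 or not 0.0 <= gamma <= 1.0:
        raise ValueError("alpha and gamma must lie in [0, 1]")
    # Percept: weighted average of the input and the current expectation.
    perceived = (1.0 - gamma) * noxious_input + gamma * expectation
    # Delta rule: move the expectation towards the percept.
    new_expectation = expectation + alpha * (perceived - expectation)
    return perceived, new_expectation
```

      Comparing the percepts returned with `gamma = 0` and `gamma > 0` for the same input shows how an expectation pulls the reported intensity towards the value that was anticipated.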

      Reviewer Comment 1.2 — Effects of varying levels of volatility and stochasticity: The study commendably integrates varying levels of volatility and stochasticity into its experimental design. However, the depth of analysis concerning the effects of these variables on model fit appears shallow. A looming concern is whether the superior performance of the expectation-weighted Kalman Filter model might be a natural outcome of the experimental design. While the non-significant difference between eKF and eRL for the high stochasticity condition somewhat alleviates this concern, it raises another query: Would a more granular analysis of volatility and stochasticity effects reveal fine-grained model fit patterns?

      Initial reply: We are sorry that the reviewer finds ”the depth of analysis concerning the effects of these variables on model fit” shallow. We are not sure which analysis the reviewer has in mind when suggesting a ”more granular analysis of volatility and stochasticity effects” to ”reveal fine-grained model fit patterns”, so we find it difficult to improve our manuscript in this regard. We are happy to add analyses to our paper, but we would be grateful for some specific pointers. We have already provided:

      •    Analysis of model-naive performance across different levels of stochasticity and volatility (section 2.3, figure 3, supplementary information section 1.1 and tables S1-2)

      •    Model fitting for each stochasticity/volatility condition (section 2.4.1, figure 4, supplementary table S5)

      •    Group-level and individual-level differences of each model parameter across stochasticity/volatility conditions (supplementary information section 7, figures S4-S5).

      •    Effect of confidence on scaling factor for each stochasticity/volatility condition (figure 5)

      Reviewer Comment 1.3 — Rating instruction: According to Fig. 1A, participants were prompted to rate their responses to the question, ”How much pain DID you just feel?” and to specify their confidence level regarding their pain. It is difficult for me to understand the meaning of confidence in this context, given that they were asked to report their *subjective* feelings. It might have been better to query participants about perceived stimulus intensity levels. This perspective is seemingly echoed in lines 100-101, ”the primary aim of the experiment was to determine whether the expectations participants hold about the sequence inform their perceptual beliefs about the intensity of the stimuli.”

      Initial reply: Thank you for raising this question, which allows us to clarify our paradigm. On half of the trials, participants were asked to report the perceived intensity of the previous stimulus; on the remaining trials, participants were requested to predict the intensity of the next stimulus. Therefore, we did query ”participants about perceived stimulus intensity levels”, as described at lines 49-55, 296-303, and depicted in figure 1.

      Confidence refers to how sure participants are of their rating. It is collected in addition to the perceived stimulus intensity, and such confidence ratings have been used in a large body of previous studies across sensory modalities.

      Reviewer Comment 1.4 — Relevance to clinical pain: While the authors underscore the relevance of their findings to chronic pain, they did not include data pertaining to clinical pain. Notably, their initial preprint seemed to encompass data from a clinical sample (https://www.medrxiv.org/content/10.1101/2023.03.23.23287656v1), which, for reasons unexplained, has been omitted in the current version. Clarification on this discrepancy would be instrumental in discerning the true relevance of the study’s findings to clinical pain scenarios.

      Initial reply: The preprint that the Reviewer is referring to was an older version of the manuscript in which we combined two different experiments, which were initially conceived as separate studies: the one that we submitted to eLife (done in the lab, with noxious stimuli in healthy participants) and an online study with a different statistical learning paradigm (without noxious stimuli, in chronic back pain participants). Unfortunately, the paradigms were different and not directly comparable. Indeed, following submission to a different journal, the manuscript was criticised for this reason. We therefore split the paper in two, and submitted the first study to eLife. We are now planning to perform the same lab-based experiment with noxious stimuli on chronic back pain participants. Progress on this front has been slowed down by the fact that I (Flavia Mancini) am on maternity leave, but it remains a top priority once I am back at work.

      Reviewer Comment 1.5 — Paper organization: The paper’s organization appears a little bit weird, possibly due to the removal of significant content from their initial preprint. Sections 2.1-2.2 and 2.4 seem more suitable for the Methods section, while 2.3 and 2.4.1 are the only parts that present results. In addition, enhancing clarity through graphical diagrams, especially for the experimental design and computational models, would be quite beneficial. A reference point could be Fig. 1 and Fig. 5 from Jepma et al. (2018), which similarly explored RL and KF models.

      Initial reply: Thank you for these suggestions. We will consider restructuring the paper in the revised version.

      Revised reply: We restructured the introduction, results and parts of the methods. We followed the reviewer’s suggestion regarding enhancing clarity through graphical diagrams. We have visualised the experimental design in Figure 1D. Furthermore, we have visualised the two main computational models (eRL and eKF) in Figure 2, following Jepma et al. (2018). As a result, we have updated the notation in Section 4.4 to be clearer and consistent with the graphical representation (renaming the variable referring to observed thermal input from Ot to Nt).

      Reviewer Comment 1.6 — In lines 99-100, the statement ”following the work by [23]” would be more helpful if it included a concise summary of the main concepts from the referenced work.

      - It would be helpful to have descriptions of the conditions that Figure 1C is elaborating on.

      - In line 364, the ”N_t” in the sentence ”The observation on trial t, N_t”, should be O_t.

      Initial reply: Thank you for spotting these and for providing the suggestions. We will include the correction in the revised version.

      Revised reply: We have added the following regarding the lines 99-100:

      ”We build on the work by [23], who show that pain perception is strongly influenced by expectations as defined by a cue that predicts high or low pain. In contrast to the cue-paradigm from [23], the primary aim of our experiment was to determine whether the expectations participants hold about the sequence itself inform their perceptual beliefs about the intensity of the stimuli.”

      See comment in the previous reply, regarding the notation change from Ot to Nt.

      Reviewer 2:

      Reviewer Comment 2.1 — This is a highly interesting and novel finding with potential implications for the understanding and treatment of chronic pain where pain regulation is deficient. The paradigm is clear, the analysis is state-of-the-art, the results are convincing, and the interpretation is adequate.

      Initial reply: Thank you very much for these positive comments.

      Reviewer 3:


      I am pleased to have had the opportunity to review this manuscript, which investigated the role of statistical learning in the modulation of pain perception. In short, the study showed that statistical aspects of temperature sequences, with respect to specific manipulations of stochasticity (i.e., randomness of a sequence) and volatility (i.e., speed at which a sequence unfolded) influenced pain perception. Computational modelling of perceptual variables (i.e., multi-dimensional ratings of perceived or predicted stimuli) indicated that models of perception weighted by expectations were the best explanation for the data. My comments below are not intended to undermine or question the quality of this research. Rather, they are offered with the intention of enhancing what is already a significant contribution to the pain neuroscience field. Below, I highlight the strengths and weaknesses of the manuscript and offer suggestions for incorporating additional methodological details.


      - The manuscript is articulate, coherent, and skilfully written, making it accessible and engaging.

      - The innovative stimulation paradigm enables the exploration of expectancy effects on perception without depending on external cues, lending a unique angle to the research.

      - By including participants’ ratings of both perceptual aspects and their confidence in what they perceived or predicted, the study provides an additional layer of information to the understanding of perceptual decision-making. This information was thoughtfully incorporated into the modelling, enabling the investigation of how confidence influences learning.

      - The computational modelling techniques utilised here are methodologically robust. I commend the authors for their attention to model and parameter recovery, a facet often neglected in previous computational neuroscience studies.

      - The well-chosen citations not only reflect a clear grasp of the current research landscape but also contribute thoughtfully to ongoing discussions within the field of pain neuroscience.

      Initial reply: We are really grateful for the reviewer’s insightful comments and for the useful guidance regarding our methodology. We are also thankful to the reviewer for highlighting the strengths of our manuscript. Below we respond to each weakness mentioned in the reviewer’s report.

      Reviewer Comment 3.1 — In Figure 1, panel C, the authors illustrate the stimulation intensity, perceived intensity, and prediction intensity on the same scale, facilitating a more direct comparison. It appears that the stimulation intensity has been mathematically transformed to fit a scale from 0 to 100, aligning it with the intensity ratings corresponding to either past or future stimuli. Given that the pain threshold is specifically marked at 50 on this scale, one could logically infer that all ratings falling below this value should be deemed non-painful. However, I find myself uncertain about this interpretation, especially in relation to the term ”arbitrary units” used in the figure. I would greatly appreciate clarification on how to accurately interpret these units, as well as an explanation of the relationship between these values and the definition of pain threshold in this experiment.

      Initial reply: Indeed, as detailed in the Methods section 4.3, the stimulation intensity was originally transformed from the 1-13 scale to 0-100 scale to match the scales in the participant response screens.

      Following the method used to establish the pain threshold, we set the stimulus intensity of 7 as the threshold on the original 1-13 scale. However, during the rating part of the experiment, several of the participants never or very rarely selected a value above 50 (their individually defined pain threshold), despite previously indicating, during the pain threshold procedure, the point at which a stimulus becomes painful. As a result, for these participants both the re-scaled intensity values and the perception ratings, which share the same 0-100 scale of arbitrary units, never go above the pain threshold. Please see all participant ratings and inputs in the Figure below. We see that it would be more illustrative to re-plot Figure 1 with a different exemplary participant, whose ratings go above the pain threshold, perhaps with the input intensity on the 1-13 scale shown on an additional right-hand-side y-axis. We will add this in the revised version and highlight the point above.
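
      As a minimal sketch of the re-scaling described above, the following assumes that the transformation from the 1-13 scale to the 0-100 scale is linear, which is consistent with the threshold intensity of 7 mapping to 50. The function name `rescale_intensity` and its default arguments are illustrative assumptions.

```python
def rescale_intensity(level, old_min=1.0, old_max=13.0, new_min=0.0, new_max=100.0):
    """Linearly map a stimulus level from [old_min, old_max] to [new_min, new_max].

    Assumption: the mapping is linear; the source only states that the 1-13
    scale was transformed to the 0-100 scale used on the response screens.
    """
    if not old_min <= level <= old_max:
        raise ValueError("level lies outside the original scale")
    fraction = (level - old_min) / (old_max - old_min)
    return new_min + fraction * (new_max - new_min)
```

      Under this assumption the midpoint of the original scale (7) lands exactly on the midpoint of the rating scale (50), so input and ratings can be read against the same threshold line.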

      Importantly, while values below 50 are deemed non-painful by participants, the thermal stimulation still activates C-fibres involved in nociception, and we would argue that the modelling framework and analysis still applies in this case.

      Revised reply: We re-plotted Figure 1E-F with a different exemplary participant, whose ratings go above the pain threshold. We also included all participants’ pain perception and prediction ratings, noxious input sequences and confidence ratings in the supplement in Figures S1-S3.

      Reviewer Comment 3.2 — The method of generating fluctuations in stimulation temperatures, along with the handling of perceptual uncertainty in modelling, requires further elucidation. The current models appear to presume that participants perceive each stimulus accurately, introducing noise only at the response stage. This assumption may fail to capture the inherent uncertainty in the perception of each stimulus intensity, especially when differences in consecutive temperatures are as minimal as 1°C.

      Initial reply: We agree with the reviewer that there are multiple sources of uncertainty involved in the process of rating the intensity of thermal stimuli, including perceptual uncertainty. To include an account of inaccurate perception, one would have to consider the different sources that contribute to it, of which there may be many. In our approach, we consider one, which is captured in the expectation-weighted models and most clearly exemplified in the expectation-weighted Kalman filter model (eKF). The model treats participants’ perception of the input as an imperfect indicator of the true level of pain. In this case, perception is corrupted as a result of the expectation participants hold about the upcoming stimuli. The extent of this effect is partly governed by a subjective level of noise ϵ, which may also subsume other sources of uncertainty beyond the expectation effect. Moreover, the response noise ξ could also subsume any other unexplained sources of noise.
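
      The role of the subjective noise ϵ can be illustrated with a generic scalar Kalman filter step. This is a sketch under stated assumptions, not the eKF as specified in the manuscript: the latent intensity is assumed to follow a random walk with variance `process_var`, `sensory_noise_var` stands in for ϵ squared, the percept is taken to be the posterior mean, and the response noise ξ is omitted. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class KalmanState:
    expectation: float  # prior mean of the upcoming intensity
    variance: float     # prior uncertainty about that mean


def kalman_percept_step(state, noxious_input, sensory_noise_var, process_var):
    """One update of a scalar Kalman filter (illustrative assumptions only)."""
    # The gain shrinks as sensory noise grows, so expectation weighs more.
    gain = state.variance / (state.variance + sensory_noise_var)
    # Percept = posterior mean, a compromise between input and expectation.
    perceived = state.expectation + gain * (noxious_input - state.expectation)
    posterior_var = (1.0 - gain) * state.variance
    # Random-walk assumption: uncertainty grows again before the next trial.
    next_state = KalmanState(perceived, posterior_var + process_var)
    return perceived, next_state
```

      In this sketch a larger `sensory_noise_var` lowers the gain and pulls the percept towards the expectation, which is the sense in which perception is an imperfect indicator of the input.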

      Author response image 1.

      Stimulus intensity transformation

      Revised reply: We clarified our modelling choices in the ”2.2 Modelling strategy” subsection.

      Reviewer Comment 3.3 — A key conclusion drawn is that eKF is a better model than eRL. However, a closer examination of the results reveals that the two models behave very similarly, and it is not clear that they can be readily distinguished based on model recovery and model comparison results.

      Initial reply: While the eKF appears to rank higher than the eRL in terms of LOOIC and sigma effects, we do not wish to make sweeping statements regarding the significance of differences between eRL and eKF, but merely point to the trend in the data. We shall make this clearer in the revised version of the manuscript. However, the most important result is that the models involving expectation weighting arguably capture the data better.

      Revised reply: We elaborated on the significance statements in the ”Modelling Results” subsection:

      • We considered at least a 2 sigma effect as indication of a significant difference. In each condition, the expectation weighted models (eKF and eRL) provided better fit than models without this element (KF and RL; approx. 2-4 sigma difference, as reported in Figure 5A-D). This suggests that regardless of the levels of volatility and stochasticity, participants still weigh perception of the stimuli with their expectation.

      and in the first paragraph of the Discussion:

      • When varying different levels of inherent uncertainty in the sequences of stimuli (stochasticity and volatility), the expectation and confidence weighted models fitted the data better than models weighted for confidence but not for expectations (Figure 5A-D). The expectation-weighted Bayesian (KF) model offered a better fit than the expectation-weighted, model-free RL model, although in conditions of high stochasticity this difference was short of significance. Overall, this suggests that participants’ expectations play a significant role in the perception of sequences of noxious stimuli.

      We are aware of the limitations of, and the lack of clear guidance on, using sigma effects to establish significance (see the discussion suggested by the reviewer: https://discourse.mc-stan.org/t/loo-comparison-in-reference-to-standard-error/4009). Here we decided to use the above-mentioned threshold of 2 sigma as an indication of significance, but note the potential limitations of the inferences, especially when distinguishing between the eRL and eKF models.
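
      The sigma effect referred to above can be sketched as the difference in expected log pointwise predictive density (ELPD) between two models divided by the standard error of that difference. The sketch assumes the usual pointwise definition, in which the standard error is the square root of the number of observations times the sample variance of the pointwise differences. The function name and the input lists are hypothetical.

```python
import math
from statistics import variance


def elpd_sigma_effect(pointwise_a, pointwise_b):
    """Return (elpd_diff, se_diff, sigma) for model A versus model B.

    Inputs are per-observation ELPD contributions for the same observations
    under each model (assumed to come from a LOO or similar procedure).
    """
    if len(pointwise_a) != len(pointwise_b) or len(pointwise_a) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    diffs = [a - b for a, b in zip(pointwise_a, pointwise_b)]
    n = len(diffs)
    elpd_diff = sum(diffs)
    # Standard error of a sum of n terms, using the sample variance (n - 1).
    se_diff = math.sqrt(n * variance(diffs))
    return elpd_diff, se_diff, elpd_diff / se_diff
```

      A ratio above 2 would meet the threshold adopted here; since there is no agreed-upon cut-off, the ratio itself is more informative than a binary label.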

      Reviewer Comment 3.4 — Regarding model recovery, the distinction between the eKF and eRL models seems blurred. When the simulation is based on the eKF, there is no ability to distinguish whether either eKF or eRL is better. When the simulation is based on the eRL, the eRL appears to be the best model, but the difference with eKF is small. This raises a few more questions. What is the range of the parameters used for the simulations?

      Initial reply: We agree that the distinction between eKF and eRL in the model recovery is not that clean-cut, which may in turn point to the similarity between the two models. To simulate the data for the model and parameter recovery analysis, we used the group means and variances estimated on the participant data to sample individual parameter values.
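
      The sampling of individual parameter values from group-level estimates can be sketched as follows. The sketch assumes normally distributed individual parameters on an unconstrained scale and a logistic transform for parameters bounded between 0 and 1 (such as a learning rate); the actual distributions and transforms used in the recovery analysis are not stated here, so these are labelled assumptions, as are all names.

```python
import math
import random


def sample_individual_params(group_mean, group_sd, n_subjects, bounded=False, seed=0):
    """Draw one parameter value per simulated subject (illustrative only).

    group_mean and group_sd are assumed to be group-level estimates on an
    unconstrained scale; bounded=True maps the draws into (0, 1).
    """
    rng = random.Random(seed)  # fixed seed keeps simulations reproducible
    draws = [rng.gauss(group_mean, group_sd) for _ in range(n_subjects)]
    if bounded:
        # Logistic transform (an assumption) for parameters such as alpha.
        draws = [1.0 / (1.0 + math.exp(-x)) for x in draws]
    return draws
```

      Because the draws are centred on the fitted group means, the simulated parameters cover roughly the same range as the participants' estimates rather than the full prior range.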

      Reviewer Comment 3.5 — Is it possible that either eRL or eKF are best when different parameters are simulated? Additionally, increasing the number of simulations to at least 100 could provide more convincing model recovery results.

      Initial reply: It could be a possibility, but it would require further investigation and comparison of fits for different bins/ranges of parameters to see if there is any consistent advantage of one model over another in each bin. We will consider adding this analysis, and provide an additional 50 simulations to paint a more convincing picture.

      Revised reply: We increased the number of simulations per model pair to ≈ 100 (after rejecting fits based on diagnostics criteria - E-BFMI and divergent transitions) and updated the confusion matrix (Table S4). Although the confusion between eRL and eKF remains, the model recovery shows good distinction between expectation weighted vs non-expectation weighted (and Random) models, which supports our main conclusion in the paper.
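
      A model recovery confusion matrix of the kind reported in Table S4 can be sketched as below. The sketch assumes each simulation yields a pair (model used to simulate, model that fitted best), and reports, for each simulated model, the fraction of simulations won by each candidate. The function name and the input format are hypothetical.

```python
from collections import Counter


def recovery_confusion(pairs, models):
    """Row-normalised confusion matrix: rows simulated, columns best-fitting.

    pairs is a list of (simulated_model, best_fitting_model) tuples, both
    drawn from `models` (assumed format, not the authors' data structure).
    """
    counts = {sim: Counter() for sim in models}
    for simulated, best_fit in pairs:
        counts[simulated][best_fit] += 1
    matrix = {}
    for sim in models:
        total = sum(counts[sim].values())
        matrix[sim] = {
            fit: (counts[sim][fit] / total if total else 0.0) for fit in models
        }
    return matrix
```

      Off-diagonal mass shared between two rows, as between eRL and eKF here, indicates that the two models are hard to tell apart on data of this size.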

      Reviewer Comment 3.6 — Regarding model comparison, the authors reported that ”the expectation-weighted KF model offered a better fit than the eRL, although in conditions of high stochasticity, this difference was short of significance against the eRL model.” This interpretation is based on a significance test that hinges on the ratio between the ELPD and the surrounding standard error (SE). Unfortunately, there’s no agreed-upon threshold of SEs that determines significance, but a general guideline is to consider ”several SEs,” with a higher number typically viewed as more robust. However, the text lacks clarity regarding the specific number of SEs applied in this test. At a cursory glance, it appears that the authors may have employed 2 SEs in their interpretation, while only depicting 1 SE in Figure 4.

      Initial reply: Indeed, we considered a 2-sigma effect as the threshold; however, we recognise that there is no agreed-upon threshold, and in the revision we shall make this, and our interpretation of the trend in the data, clearer.

      Revised reply: We clarify this further, as per our revised response to Comment 3.3 above. We have also added the following statement in section 4.5.1 (Methods, Model comparison): ”There’s no agreed-upon threshold of SEs that determines significance, but the higher the sigma difference, the more robust is the effect.”

      Reviewer Comment 3.7 — With respect to parameter recovery, a few additional details could be included for completeness. Specifically, while the range of the learning rate is understandably confined between 0 and 1, the range of other simulated parameters, particularly those without clear boundaries, remains ambiguous. Including scatter plots with the simulated parameters on the x-axis and the recovered parameters on the y-axis would effectively convey this missing information.

      Furthermore, it would be beneficial for the authors to clarify whether the same priors were used for both the modelling results presented in the main paper and the parameter recovery presented in the supplementary material.

      Initial reply: Thanks for this comment and for the suggestions. To simulate the data for the model and parameter recovery analysis, we used the group means and variances estimated on the participant data to sample individual parameter values. The priors on the group and individual-level parameters in the recovery analysis were the same as in the fitting procedure. We will include the requested scatter plots in the next iteration of the manuscript.

      Revised reply: We included parameter recovery scatter plots for each model and parameter in the Supplement Figures S7-S11.

      Reviewer Comment 3.8 — While the reliance on R-hat values for convergence in model fitting is standard, a more comprehensive assessment could include estimates of the effective sample size (bulk ESS and/or tail ESS) and the Estimated Bayesian Fraction of Missing Information (EBFMI), to show efficient sampling across the distribution. Consideration of divergences, if any, would further enhance the reliability of the results.

      Initial reply: Thank you very much for this suggestion, we will aim to include these measures in the revised version.

      Revised reply: We have considered the suggested diagnostics and include bulk and tail ESS values for each condition, model and parameter in the Supplement Tables S6-S9. We also report the number of chains with low E-BFMI (0), the number of divergent transitions (0) and the E-BFMI values per chain in Table S10.
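
      For reference, the per-chain E-BFMI diagnostic can be sketched from a chain's sampled energies as the sum of squared successive energy differences divided by the sum of squared deviations from the mean energy. The threshold of 0.3 used below to flag a chain is a commonly quoted warning level and is treated here as an assumption, as are the function names and the input format.

```python
def e_bfmi(energies):
    """Empirical E-BFMI for one chain, given its sequence of energies."""
    n = len(energies)
    if n < 2:
        raise ValueError("need at least two energy values")
    mean_energy = sum(energies) / n
    # Numerator: how much the energy changes between successive draws.
    numerator = sum((energies[i] - energies[i - 1]) ** 2 for i in range(1, n))
    # Denominator: how much the energy varies overall.
    denominator = sum((e - mean_energy) ** 2 for e in energies)
    if denominator == 0:
        raise ValueError("energies are constant")
    return numerator / denominator


def count_low_e_bfmi(chains, threshold=0.3):
    """Number of chains whose E-BFMI falls below the (assumed) threshold."""
    return sum(1 for chain in chains if e_bfmi(chain) < threshold)
```

      A chain whose energy drifts slowly relative to its overall spread gives a small ratio, which signals that the sampler explores the energy distribution inefficiently.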

      Reviewer Comment 3.9 — The authors write: ”Going beyond conditioning paradigms based in cuing of pain outcomes, our findings offer a more accurate description of endogenous pain regulation.” Unfortunately, this statement isn’t substantiated by the results. The authors did not engage in a direct comparison between conditioning and sequence-based paradigms. Moreover, even if such a comparison had been made, it remains unclear what would constitute the gold standard for quantifying ”endogenous pain regulation.”

      Initial reply: This is a valid point; indeed, we do not compare paradigms in our study, and we will remove this statement in the future version.

      Revised reply: We have removed this statement from the revised version.

      Reviewer Comment 3.10 — In relation to the comment on model comparison in my public review, I believe the following link may provide further insight and clarify the basis for my observation. It discusses the use of standard error in model comparison and may be useful for the authors in addressing this particular point: https://discourse.mc-stan.org/t/loo-comparison-in-reference-to-standard-error/4009

      Initial reply: Thank you for this suggestion, we will consider the forum discussion in our manuscript.

    1. eLife assessment

      This useful study reports how neuronal activity in the prefrontal cortex maps time intervals during which animals have to wait until reaching a reward and how this mapping is preserved across days. However, the evidence supporting the claims is incomplete as these sequential neuronal patterns do not necessarily represent time but instead may be correlated with stereotypical behavior and restraint from impulsive decision, which would require further controls (e.g. behavioral analysis) to clarify the main message. The study will be of interest to neuroscientists interested in decision making and motor control.

    2. Reviewer #1 (Public Review):


      This paper investigates the neural population activity patterns of the medial frontal cortex in rats performing a nose poking timing task using in vivo calcium imaging. The results showed neurons that were active at the beginning and end of the nose poking and neurons that formed sequential patterns of activation that covaried with the timed interval during nose poking on a trial-by-trial basis. The former were not stable across sessions, while the latter tended to remain stable over weeks. The analysis on incorrect trials suggests the shorter non-rewarded intervals were due to errors in the scaling of the sequential pattern of activity.


      This study measured stable signals using in vivo calcium imaging during experimental sessions that were separated by many days in animals performing a nose poking timing task. The correlation analysis on the activation profile to separate the cells in the three groups was effective and the functional dissociation between beginning and end, and duration cells was revealing. The analysis on the stability of decoding of both the nose poking state and poking time was very informative. Hence, this study dissected a neural population that formed sequential patterns of activation that encoded timed intervals.


      It is not clear whether animals had enough simultaneously recorded cells to perform the analyses of Figures 2-4. In fact, rat 3 had 18 responsive neurons, which is probably not enough to obtain robust neural sequences for the trial-by-trial analysis and the correct and incorrect trial analysis. In addition, the analysis of behavioral errors could be improved. The analysis in Figure 4A could be replaced by a detailed analysis of the speed and geometry of neural population trajectories for correct and incorrect trials. In the case of Figure 4G, it is not clear why the density of errors formed two clusters instead of having a linear relation with the produced duration. It would be advisable to compute the scaling factor on neuronal population trajectories and single-cell activity, or to compute the center of mass, to test the type III errors.
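
      As an illustration of the center-of-mass computation suggested above, the following minimal sketch treats a cell's activity over a trial as non-negative weights on the sample times and returns the activity-weighted mean time. The function name and inputs are hypothetical, and the use of a non-negative (for example deconvolved) calcium signal is an assumption.

```python
def center_of_mass(times, activity):
    """Activity-weighted mean time of a single cell within one trial."""
    if len(times) != len(activity):
        raise ValueError("times and activity must have the same length")
    if any(a < 0 for a in activity):
        raise ValueError("activity is assumed to be non-negative")
    total = sum(activity)
    if total == 0:
        raise ValueError("cell is silent in this trial")
    return sum(t * a for t, a in zip(times, activity)) / total
```

      If sequences scale with the produced interval, the ratio of a cell's center of mass on short versus long trials should track the ratio of the produced durations.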

      Due to the slow time resolution of calcium imaging, it is difficult to perform robust analysis on ramping activity. Therefore, I recommend downplaying the conclusion that: "Together, our data suggest that sequential activity might be a more relevant coding regime than the ramping activity in representing time under physiological conditions."

    3. Reviewer #2 (Public Review):

      In this manuscript, Li and collaborators set out to investigate the neuronal mechanisms underlying "subjective time estimation" in rats. For this purpose, they conducted calcium imaging in the prefrontal cortex of water-restricted rats that were required to perform an action (nosepoking) for a short duration to obtain drops of water. The authors provided evidence that animals progressively improved in performing their task. They subsequently analyzed the calcium imaging activity of neurons and identified start, duration, and stop cells associated with the nose poke. Specifically, they focused on duration cells and demonstrated that these cells served as a good proxy for timing on a trial-by-trial basis, scaling their pattern of activity in accordance with changes in behavioral performance. In summary, as stated in the title, the authors claim to provide mechanistic insights into subjective time estimation in rats, a function they deem important for various cognitive conditions.

      This study aligns with a wide range of studies in system neuroscience that presume that rodents solve timing tasks through an explicit internal estimation of duration, underpinned by neuronal representations of time. Within this framework, the authors performed complex and challenging experiments, along with advanced data analysis, which undoubtedly merits acknowledgement. However, the question of time perception is a challenging one, and caution should be exercised when applying abstract ideas derived from human cognition to animals. Studying so-called time perception in rats has significant shortcomings because, whether acknowledged or not, rats do not passively estimate time in their heads. They are constantly in motion. Moreover, rats do not perform the task for the sake of estimating time but to obtain their rewards, as they are water restricted. Their behavior will therefore reflect their motivation and urgency to obtain rewards. Unfortunately, it appears that the authors are not aware of these shortcomings. These alternative processes (motivation, sensorimotor dynamics) that occur during task performance are likely to influence neuronal activity. Consequently, my review will be rather critical. It is not, however, intended to be dismissive. I acknowledge that the authors may have been influenced by numerous published studies that already draw similar conclusions. Unfortunately, all the data presented in this study can be explained without invoking the concept of time estimation. Therefore, I hope the authors will find my comments constructive and understand that as scientists, we cannot ignore alternative interpretations, even if they conflict with our a priori philosophical stance (e.g., duration can be explicitly estimated by reading neuronal representation of time) and anthropomorphic assumptions (e.g., rats estimate time as humans do). While space is limited in a review, if the authors are interested, they can refer to a lengthy review I recently published on this topic, which demonstrates that my criticism is supported by a wide range of timing experiments across species (Robbe, 2023). In addition to this major conceptual issue, which casts doubt on most of the conclusions of the study, there are also several major statistical issues.

      Main Concerns

      (#1) The authors used a task in which rats must poke for a minimal amount of time (300 ms and then 1500 ms) to be able to obtain a drop of water delivered a few centimeters right below the nosepoke. They claim that their task is a time estimation task. However, they forget that they work with thirsty rats that are eager to get water sooner rather than later (there is a reason why they start with a short duration!). This task is mainly probing the animals' ability to wait (that is, impulse control) rather than time estimation per se. Second, the task does not require precise estimation of time because there appear to be no penalties when the nosepokes are too short or when they exceed the required duration. So it will be unclear if the variation in nosepoke duration reflects motivational changes rather than time estimation changes. The fact that this behavioral task is a poor assay for time estimation and rather reflects impulse control is shown by the tendency of animals to perform nose-pokes that are too short, the very slow improvement in their performance (Figure 1, with most of the mice making short responses), and the huge variability. Not only do the behavioral data not support the claim of the authors in terms of what the animals are actually doing (estimating time), but this also completely annihilates the interpretation of the Ca++ imaging data, which can be explained by motivational factors (changes in neuronal activity occurring while the animals nose poke may reflect a growing sense of urgency to check if water is available).

      (#2) A second issue is that the authors seem to assume that rats are perfectly immobile and perform like some kind of robots that would initiate nose pokes, maintain them, and remove them in a very discretized manner. However, in this kind of task, rats are constantly moving from the reward magazine to the nose poke. They also move while nose-poking (either their body or their mouth), and when they come out of the nose poke, they immediately move toward the reward spout. Thus, there is a continuous stream of movements, including fidgeting, that will covary with timing. Numerous studies have shown that sensorimotor dynamics influence neural activity, even in the prefrontal cortex. Therefore, the authors cannot rule out that what the records reflect are movements (and the scaling of movement) rather than underlying processes of time estimation (some kind of timer). Concretely, start cells could represent the ending of the movement going from the water spout to the nosepoke, and end cells could be neurons that initiate (if one can really isolate any initiation, which I doubt) the movement from the nosepoke to the water spout. Duration cells could reflect fidgeting or orofacial movements combined with an increasing urgency to leave the nose pokes.

      (#3) The statistics should be rethought for both the behavioral and neuronal data. They should be conducted separately for all the rats, as there is likely interindividual variability in the impulsivity of the animals.

      (#4) The fact that neuronal activity reflects an integration of movement and motivational factors rather than some abstract timing appears to be well compatible with the analysis conducted on the error trials (Figure 4), considering that the sensorimotor and motivational dynamics will rescale with the durations of the nose poke.

      (#5) The authors should mention upfront in the main text (results section) the temporal resolution allowed by their Ca++ probe and discuss whether it is fast enough with regard to the behavioral dynamics occurring in the task.

    4. Author response:

      eLife assessment

      This useful study reports how neuronal activity in the prefrontal cortex maps time intervals during which animals have to wait until reaching a reward and how this mapping is preserved across days. However, the evidence supporting the claims is incomplete as these sequential neuronal patterns do not necessarily represent time but instead may be correlated with stereotypical behavior and restraint from impulsive decision, which would require further controls (e.g. behavioral analysis) to clarify the main message. The study will be of interest to neuroscientists interested in decision making and motor control. 

      We thank the editors and reviewers for the constructive comments. In light of the questions mentioned by the reviewers, we plan to perform additional analyses in our revision, particularly aiming to address issues related to single-cell scalability, and effects of motivation and movement. We believe these additional data will greatly improve the rigor and clarity of our study. We are grateful for the review process of eLife.

      Public Reviews:

      Reviewer #1 (Public Review):


      This paper investigates the neural population activity patterns of the medial frontal cortex in rats performing a nose poking timing task using in vivo calcium imaging. The results showed neurons that were active at the beginning and end of the nose poking and neurons that formed sequential patterns of activation that covaried with the timed interval during nose poking on a trial-by-trial basis. The former were not stable across sessions, while the latter tended to remain stable over weeks. The analysis on incorrect trials suggests the shorter non-rewarded intervals were due to errors in the scaling of the sequential pattern of activity. 


      This study measured stable signals using in vivo calcium imaging during experimental sessions that were separated by many days in animals performing a nose poking timing task. The correlation analysis on the activation profile to separate the cells in the three groups was effective and the functional dissociation between beginning and end, and duration cells was revealing. The analysis on the stability of decoding of both the nose poking state and poking time was very informative. Hence, this study dissected a neural population that formed sequential patterns of activation that encoded timed intervals. 

      We thank the reviewer for the positive comments.


      It is not clear whether animals had enough simultaneously recorded cells to perform the analyses of Figures 2-4. In fact, rat 3 had 18 responsive neurons, which is probably not enough to get robust neural sequences for the trial-by-trial analysis and the correct and incorrect trial analysis.

      We thank the reviewer for the comment. We would like to mention that the 18 cells plotted in Supplementary figure 1 were only from the duration cell category. To improve the clarity of our results, we are going to provide information regarding the number of cells from each rat in our revision. In general, we imaged more than 50 cells from each rat. We would also like to point to the data from individual trials in Supplementary figure 1B showing robust sequentiality.

      In addition, the analysis of behavioral errors could be improved. The analysis in Figure 4A could be replaced by a detailed analysis on the speed, and the geometry of neural population trajectories for correct and incorrect trials.

      We thank the reviewer for the suggestions. We are going to conduct the analysis as the reviewer recommended. We agree with the reviewer that better presentation of the neural activity will be helpful for the readers.

      In the case of Figure 4G, it is not clear why the density of errors formed two clusters instead of having a linear relation with the produced duration. It would be advisable to compute the scaling factor on neuronal population trajectories and single-cell activity, or to compute the center of mass, to test the type III errors.

      We would like to mention that the prediction errors plotted in this graph were calculated from two types of trials. The correct trials tended to show positive time estimation errors while the incorrect trials showed negative time estimation errors. We believe that the polarity switch between these two types suggested a possible use of this neural mechanism to time the action of the rats.

      In addition, we are going to perform the analysis suggested by the reviewer in our revision. We agree that different ways of analyzing the data would provide better characterization of the scaling effect.
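      The center-of-mass measure mentioned by the reviewer can be illustrated with a minimal sketch. This is not the analysis used in the study: the function names `center_of_mass` and `scaling_factor` and the toy traces are hypothetical, and the sketch assumes that a cell's response profile is simply stretched in time when the poke lasts longer.

```python
def center_of_mass(activity, times):
    """Activity-weighted mean time of one cell's response within a trial."""
    total = sum(activity)
    if total == 0:
        raise ValueError("activity sums to zero; center of mass is undefined")
    return sum(a * t for a, t in zip(activity, times)) / total


def scaling_factor(com_trial, com_reference):
    """Ratio of a trial's center of mass to a reference value (assumed nonzero)."""
    return com_trial / com_reference


# Toy example (hypothetical data): the same response profile
# sampled over a short poke and over a poke that is 1.5x longer.
activity = [0.0, 1.0, 0.0]
times_short = [0.0, 0.5, 1.0]
times_long = [0.0, 0.75, 1.5]

com_short = center_of_mass(activity, times_short)
com_long = center_of_mass(activity, times_long)
factor = scaling_factor(com_long, com_short)
```

      Under the stated assumption, a factor above 1 indicates that the response peak shifted later in proportion to the longer poke; comparing such factors between correct and incorrect trials is one way the reviewer's suggestion could be approached.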

      Due to the slow time resolution of calcium imaging, it is difficult to perform robust analysis on ramping activity. Therefore, I recommend downplaying the conclusion that: "Together, our data suggest that sequential activity might be a more relevant coding regime than the ramping activity in representing time under physiological conditions." 

      We agree with the reviewer and we have mentioned this caveat in our original manuscript. We are going to rephrase the sentence as the reviewer suggested during our revision.

      Reviewer #2 (Public Review):

      In this manuscript, Li and collaborators set out to investigate the neuronal mechanisms underlying "subjective time estimation" in rats. For this purpose, they conducted calcium imaging in the prefrontal cortex of water-restricted rats that were required to perform an action (nosepoking) for a short duration to obtain drops of water. The authors provided evidence that animals progressively improved in performing their task. They subsequently analyzed the calcium imaging activity of neurons and identified start, duration, and stop cells associated with the nose poke. Specifically, they focused on duration cells and demonstrated that these cells served as a good proxy for timing on a trial-by-trial basis, scaling their pattern of activity in accordance with changes in behavioral performance. In summary, as stated in the title, the authors claim to provide mechanistic insights into subjective time estimation in rats, a function they deem important for various cognitive conditions.

      This study aligns with a wide range of studies in systems neuroscience that presume that rodents solve timing tasks through an explicit internal estimation of duration, underpinned by neuronal representations of time. Within this framework, the authors performed complex and challenging experiments, along with advanced data analysis, which undoubtedly merits acknowledgement. However, the question of time perception is a challenging one, and caution should be exercised when applying abstract ideas derived from human cognition to animals. Studying so-called time perception in rats has significant shortcomings because, whether acknowledged or not, rats do not passively estimate time in their heads. They are constantly in motion. Moreover, rats do not perform the task for the sake of estimating time but to obtain their rewards, as they are water restricted. Their behavior will therefore reflect their motivation and urgency to obtain rewards. Unfortunately, it appears that the authors are not aware of these shortcomings. These alternative processes (motivation, sensorimotor dynamics) that occur during task performance are likely to influence neuronal activity. Consequently, my review will be rather critical. It is not, however, intended to be dismissive. I acknowledge that the authors may have been influenced by numerous published studies that already draw similar conclusions. Unfortunately, all the data presented in this study can be explained without invoking the concept of time estimation. Therefore, I hope the authors will find my comments constructive and understand that as scientists, we cannot ignore alternative interpretations, even if they conflict with our a priori philosophical stance (e.g., duration can be explicitly estimated by reading neuronal representation of time) and anthropomorphic assumptions (e.g., rats estimate time as humans do). While space is limited in a review, if the authors are interested, they can refer to a lengthy review I recently published on this topic, which demonstrates that my criticism is supported by a wide range of timing experiments across species (Robbe, 2023). In addition to this major conceptual issue that casts doubt on most of the conclusions of the study, there are also several major statistical issues.

      Main Concerns 

      (1) The authors used a task in which rats must poke for a minimal amount of time (300 ms and then 1500 ms) to be able to obtain a drop of water delivered a few centimeters right below the nosepoke. They claim that their task is a time estimation task. However, they forget that they work with thirsty rats that are eager to get water sooner rather than later (there is a reason why they start with a short duration!). This task is mainly probing the animals' ability to wait (that is, impulse control) rather than time estimation per se. Second, the task does not require precise estimation of time because there appear to be no penalties when the nosepokes are too short or when they exceed the required duration. So it will be unclear if the variation in nosepoke duration reflects motivational changes rather than time estimation changes. The fact that this behavioral task is a poor assay for time estimation and rather reflects impulse control is shown by the tendency of animals to perform nose-pokes that are too short, the very slow improvement in their performance (Figure 1, with most of the mice making short responses), and the huge variability. Not only do the behavioral data not support the claim of the authors in terms of what the animals are actually doing (estimating time), but this also completely annihilates the interpretation of the Ca++ imaging data, which can be explained by motivational factors (changes in neuronal activity occurring while the animals nose poke may reflect a growing sense of urgency to check if water is available).

      We would like to respond to the reviewer’s comments 1, 2 and 4 together since they all focus on the same issue. We thank the reviewer for the very thoughtful comments and for sharing his detailed reasoning from a recently published review (Robbe, 2023). A lot of the discussion goes beyond the scope of this study and we agree that whether there is an explicit representation of time (an internal clock) in the brain is a difficult question to answer, particularly by using animal behaviors. In fact, even with fully conscious humans and elaborate task design, we think it is still questionable whether the neural substrate of “timing” can be clearly dissociated from that of “motor”. In the end, it may well be that, as the reviewer cited from Bergson’s article, the experience of time cannot be measured.

      Studying the neural representation of any internal state may suffer from the same ambiguity. With all due respect, however, we would like to limit our response to the scope of our results. According to the reviewer, two alternative interpretations of the task-related sequential activity exist: (1) duration cells may represent fidgeting or orofacial movements, and (2) duration cells may represent motivation or the motion plan of the rats. To test the first alternative interpretation, we will perform a more comprehensive analysis of the behavioral data for all the limbs and visible body parts of the rat during the nose poke and analyze its periodicity among different trials, although the orofacial movements may not be visible to us.

      Regarding the second alternative interpretation, we think our data in the original Figure 4G argue against it. In this graph, we plotted the decoding error of time using the duration cells’ activity against the actual duration of the trials. If the sequential activity of duration cells only represents motivation, then the errors should be distributed evenly across different trial times, or be linearly modulated by trial durations. The unimodal distribution we observed (Figure 4G and see Author response image 1 below for a re-plot without signs) suggests that the scaling factor of the sequential activity represents information related to time. And the fact that this unimodal distribution is centered at the time threshold of the task provides strong evidence for the active use of the scaling factor for time estimation. In order to further test the relationship to motivation, we will measure the time interval between exiting the nose poke and the start of licking the water reward as an independent measurement of motivation for each trial. We will analyze and report whether this measurement correlates with the nose poking durations in our data in the revision.

      Author response image 1.
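      The planned motivation control described above (correlating the latency from poke exit to first lick with the nose-poking duration) could be sketched as follows. This is only an illustration under stated assumptions: the helper `pearson_r`, the variable names, and the toy values are hypothetical and are not taken from the study's data or code.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-trial measurements, in seconds.
poke_duration = [1.0, 1.2, 1.4, 1.6, 1.8]
exit_to_lick_latency = [0.9, 0.8, 0.7, 0.6, 0.5]  # assumed proxy for motivation

r = pearson_r(poke_duration, exit_to_lick_latency)
```

      In this toy data set the latency falls linearly as the poke lengthens, so the coefficient is at its negative extreme. With real trials, a coefficient near zero would be the outcome that argues against a purely motivational account.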

      Furthermore, whether the scaling sequential activity we report represents behavioral timing or true time estimation, the reviewer would agree that these activities correlate with the animal’s nose poking durations, and a previous study has shown that PFC silencing led to disruption of the mouse’s timing behavior (PMID: 24367075). The main surprising finding of the paper is that these duration cells are different from the start and end cells in terms of their coding stability. Thus, future studies dissecting the anatomical microcircuit of these duration cells may provide further clues regarding whether they receive inputs from thirst or reward-related brain regions. This may help partially resolve the “time” vs. “motor” debate the reviewer mentioned.

      (2) A second issue is that the authors seem to assume that rats are perfectly immobile and perform like some kind of robots that would initiate nose pokes, maintain them, and remove them in a very discretized manner. However, in this kind of task, rats are constantly moving from the reward magazine to the nose poke. They also move while nose-poking (either their body or their mouth), and when they come out of the nose poke, they immediately move toward the reward spout. Thus, there is a continuous stream of movements, including fidgeting, that will covary with timing. Numerous studies have shown that sensorimotor dynamics influence neural activity, even in the prefrontal cortex. Therefore, the authors cannot rule out that what the records reflect are movements (and the scaling of movement) rather than underlying processes of time estimation (some kind of timer). Concretely, start cells could represent the ending of the movement going from the water spout to the nosepoke, and end cells could be neurons that initiate (if one can really isolate any initiation, which I doubt) the movement from the nosepoke to the water spout. Duration cells could reflect fidgeting or orofacial movements combined with an increasing urgency to leave the nose pokes.

      (3) The statistics should be rethought for both the behavioral and neuronal data. They should be conducted separately for all the rats, as there is likely interindividual variability in the impulsivity of the animals.

      We thank the reviewer for the comment, yet we are not quite sure what specifically the reviewer is asking for. There is undoubtedly variance among individual animals. One of the core reasons for statistical comparison is to compare the group difference with the variance due to sampling. It appears that the reviewer would like us to conduct our analysis for each rat individually. We will conduct and report the analysis for individual rats in Figure 1C, Figure 2C, G, K, and Figure 4F in our revised manuscript.
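      A per-animal breakdown of the kind requested could look like the minimal sketch below. It assumes trials are stored as (rat identifier, poke duration) pairs and uses the 1500 ms hold requirement stated in the review as the criterion for a correct trial; the function name, the data layout, and the toy values are hypothetical.

```python
from collections import defaultdict

THRESHOLD_S = 1.5  # final hold requirement stated in the review (1500 ms)


def fraction_correct_per_rat(trials, threshold=THRESHOLD_S):
    """Return, for each rat, the fraction of pokes lasting at least `threshold`."""
    outcomes = defaultdict(list)
    for rat_id, duration in trials:
        outcomes[rat_id].append(duration >= threshold)
    return {rat: sum(flags) / len(flags) for rat, flags in outcomes.items()}


# Hypothetical trials: (rat identifier, poke duration in seconds).
trials = [
    ("rat1", 1.6), ("rat1", 0.9),
    ("rat2", 1.7), ("rat2", 1.8), ("rat2", 1.2), ("rat2", 1.5),
]
per_rat = fraction_correct_per_rat(trials)
```

      Reporting such per-animal summaries next to the pooled statistic makes it visible whether a group effect is carried by all rats or only by a few.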

      (4) The fact that neuronal activity reflects an integration of movement and motivational factors rather than some abstract timing appears to be well compatible with the analysis conducted on the error trials (Figure 4), considering that the sensorimotor and motivational dynamics will rescale with the durations of the nose poke. 

      (5) The authors should mention upfront in the main text (result section) the temporal resolution allowed by their Ca+ probe and discuss whether it is fast enough in regard of behavioral dynamics occurring in the task. 

      We thank the reviewer for the suggestion. We mentioned the caveat of calcium imaging in the interpretation of our results in the original manuscript. We will add more text for this purpose during our revision. In terms of behavioral dynamics (start and end of nose poke in this case), we think calcium imaging could provide sufficient kinetics. However, the more refined dynamics related to the reproducibility of the sequential activity or the precise representation of individual cells on the scaled duration may benefit from improved time resolution.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Please refer explicitly to the three types of cells in the abstract. 

      We will modify the abstract as suggested during revision.

      (2) Please refer to the work of Betancourt et al., 2023 Cell Reports, where a trial-by-trial analysis on the correlation between neural trajectory dynamics in MPC and timing behavior is reported. In that same paper the stability of neural sequences across task parameters is reported.

      We will cite and discuss this study in our revised paper.

      (3) Please state the number of studied animals at the beginning of the results section. 

      We will provide this information as requested. The number of animals was also plotted in Figure 1D for each analysis.

      (4) Why do the middle and right panels of Figure 2E show duration cells?

      Figure 2E was intended to show examples of duration cells’ activity. We included different examples of cells that peak at different points in the scaled duration. We believe these multiple examples would give the readers a straightforward impression of these cells’ activity patterns.

      (5) Which behavioral sessions of Figure 1B were analyzed further?

      We will label the analyzed sessions in Figure 1B during our revision.

      (6) In Figure 3A-C please increase the time before the beginning of the trial in order to visualize properly the activation patterns of the start cells. 

      We thank the reviewer for the suggestion and will modify the figure accordingly during revision.

      (7) Please state what could be the behavioral and functional effect of the ablation of the cortical tissue on top of mPFC. 

      We thank the reviewer for the question. In our experience, mice with a lens implanted in mPFC did not show observable differences from mice without surgery regarding the acquisition of the task and the distribution of the nose-poke durations. Although we could not rule out effects on other cognitive processes, the mice appeared to be intact within the scope of our task. We will provide these behavioral data during our revision.

    1. eLife assessment

      This study presents a useful exploration of the complex relationship between structure and function in the developing human brain using a large-scale imaging dataset from the Human Connectome Project in Development and gene expression profiles from the Allen Brain Atlas. The evidence supporting the claims of the authors is solid, although the inclusion of more systematic analyses of structural and functional connectivity with respect to myelin measures and oligodendrocyte-related genes, and also more details regarding the imaging analyses, cognitive scores, and design and validation strategies, would have strengthened the paper. The work will be of interest to developmental biologists and neuroscientists seeking to elucidate structure-function relationships in the human brain.

    2. Reviewer #1 (Public Review):


      This work studies spatio-temporal patterns of structure-function coupling in developing brains, using a large set of imaging data acquired from children aged 5-22. Magnetic resonance imaging data of brain structure and function were obtained from a publicly available database, from which structural and functional features and measures were derived. The authors examined the spatial patterns of structure-function coupling and how they evolve with brain development. This work further sought correlations of brain structure-function coupling with behavior and explored evolutionary, microarchitectural and genetic bases that could potentially account for the observed patterns.


      The strength of this work is the use of currently available state-of-the-art analysis methods, along with a large set of high-quality imaging data, and comprehensive examinations of structure-function coupling in developing brains. The results are comprehensive and illuminating.


      As with most other studies, transcriptomic and cellular architectures of structure-function coupling were characterized only on the basis of a common atlas in this work.

      The authors have achieved their aims in this study, and the findings provide mechanistic insights into brain development, which will inspire further basic and clinical studies along this line.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) Lines 40-42: The sentence "The coupling of structural connectome (SC) and functional connectome (FC) varies greatly across different cortical regions reflecting anatomical and functional hierarchies as well as individual differences in cognitive function, and is regulated by genes" is a misstatement. Regional variations of structure-function coupling do not really reflect differences in cognitive function among individuals, but inter-subject variations do.

      Thank you for your comment. We have made revisions to the sentence to correct its misstatement. Please see lines 40-43: “The coupling of structural connectome (SC) and functional connectome (FC) varies greatly across different cortical regions reflecting anatomical and functional hierarchies[1, 6-9] and is regulated by genes[6, 8], as well as its individual differences relates to cognitive function[8, 9].”

      (2) In Figure 1, the graph showing the relation between intensity and cortical depth needs explanation.

      Thank you for your comment. We have added necessary explanation, please see lines 133-134: “The MPC was used to map similarity networks of intracortical microstructure (voxel intensity sampled in different cortical depth) for each cortical node.”

      (3) Line 167: Change "increased" to "increase".

      We have corrected it, please see lines 173-174: “…networks significantly increased with age and exhibited greater increase.”

      (4) Line 195: Remove "were".

      We have corrected it, please see line 204: “…default mode networks significantly contributed to the prediction…”

      (5) Lines 233-240, Reproducibility analyses: Comparisons of parcellation templates were not made with respect to gene weights. Is there any particular reason?

      Thank you for your comment. We have quantified the gene weights based on HCPMMP using the same procedures. We identified a correlation (r = 0.25, p < 0.001) between the gene weights in HCPMMP and BNA. Given that this is a relatively weak correlation, we need to clarify the following points.

      Based on HCPMMP, we produced an averaged gene expression profile for 10,027 genes covering 176 left cortical regions[1]. The exclusion of 4 cortical regions that had an insufficient number of assigned samples may contribute to the relatively weak correlation of gene associations between templates. Moreover, the effect of different template resolutions on the results of human connectome-transcriptome association is still unclear.

      In brain connectome analysis, the choice of parcellation templates can indeed influence the subsequent findings to some extent. A methodological study[2] reported reference correlations between two templates of about 0.4 to 0.6 for white matter connectivity and 0.2 to 0.4 for white matter nodal properties (refer to Figures 4 and 5 in [2]). Therefore, the age-related coupling changes, a downstream analysis that was calculated from the multimodal connectome and then correlated with gene expression profiles, may be influenced by the choice of templates.

      We have further supplemented the gene weight results obtained from HCPMMP to explicitly clarify the dependence on parcellation templates.

      Please see lines 251-252: “The gene weights of HCPMMP was consistent with that of BNA (r = 0.25, p < 0.001).”

      Author response image 1.

      The consistency of gene weights between HCPMMP and BNA.

      Please see lines 601-604: “Finally, we produced an averaged gene expression profile for 10,027 genes covering 176 left cortical regions based on HCPMMP and obtained the gene weights by PLS analysis. We performed Pearson's correlation analyses to assess the consistency of gene weights between HCPMMP and BNA.”

      Reviewer #2 (Recommendations For The Authors):

      Your paper is interesting to read and I found your efforts to evaluate the robustness of the results of different parcellation strategies and tractography methods very valuable. The work is globally easy to navigate and well written with informative good-quality figures, although I think some additional clarifications will be useful to improve readability. My suggestions and questions are detailed below (I aimed to group them by topic which did not always succeed so apologies if the comments are difficult to navigate, but I hope they will be useful for reflection and to incorporate in your work).

      * L34: 'developmental disorder'

      ** As far as I understand, the subjects in HCP-D are mostly healthy (L87). Thus, while your study provides interesting insights into typical brain development, I wonder if references to 'disorder' might be premature. In the future, it would be interesting to extend your approach to the atypical populations. In any case, it would be extremely helpful and appreciated if you included a figure visualising the distribution of behavioural scores within your population and in relationship to age at scan for your subjects (and to include a more detailed description of the assessment in the methods section) given that large part of your paper focuses on their prediction using coupling inputs (especially given a large drop of predictive performance after age correction). Such figures would allow the reader to better understand the cognitive variability within your data, but also potential age relationships, and generally give a better overview of your cohort.

      We agree with your comment that references to 'disorder' are premature. We have made revisions in the abstract and conclusion.

      Please see lines 33-34: “This study offers insight into the maturational principles of SC-FC coupling in typical development.”

      Please see lines 395-396: “Further investigations are needed to fully explore the clinical implications of SC-FC coupling for a range of developmental disorders.”

      In addition, we have included a more detailed description of the cognitive scores in the methods section and provided a figure to visualize the distributions of cognitive scores and in relationship to age for subjects. Please see lines 407-413: “Cognitive scores. We included 11 cognitive scores which were assessed with the National Institutes of Health (NIH) Toolbox Cognition Battery (https://www.healthmeasures.net/exploremeasurement-systems/nih-toolbox), including episodic memory, executive function/cognitive flexibility, executive function/inhibition, language/reading decoding, processing speed, language/vocabulary comprehension, working memory, fluid intelligence composite score, crystal intelligence composite score, early child intelligence composite score and total intelligence composite score. Distributions of these cognitive scores and their relationship with age are illustrated in Figure S12.”

      Author response image 2.

      Cognitive scores and age distributions of scans.

      * SC-FC coupling

      ** L162: 'Regarding functional subnetworks, SC-FC coupling increased disproportionately with age (Figure 3C)'.

      *** As far as I understand, in Figure 3C, the points are the correlation with age for a given ROI within the subnetwork. Is this correct? If yes, I am not sure how this shows a disproportionate increase in coupling. It seems that there is great variability of SC-FC correlation with age across regions within subnetworks, more so than the differences between networks. This would suggest that the coupling with age is regionally dependent rather than network-dependent? Maybe you could clarify?

      Yes, in Figure 3C each point is the correlation with age for a given ROI within the subnetwork. We have revised the description; please see lines 168-174: “Age correlation coefficients distributed within functional subnetworks were shown in Figure 3C. Regarding mean SC-FC coupling within functional subnetworks, the somatomotor (𝛽𝑎𝑔𝑒=2.39E-03, F=4.73, p=3.10E-06, r=0.25, p=1.67E-07, Figure 3E), dorsal attention (𝛽𝑎𝑔𝑒=1.40E-03, F=4.63, p=4.86E-06, r=0.24, p=2.91E-07, Figure 3F), frontoparietal (𝛽𝑎𝑔𝑒=2.11E-03, F=6.46, p=2.80E-10, r=0.33, p=1.64E-12, Figure 3I) and default mode (𝛽𝑎𝑔𝑒=9.71E-04, F=2.90, p=3.94E-03, r=0.15, p=1.19E-03, Figure 3J) networks significantly increased with age and exhibited greater increase.” In addition, we agree that the relationship between coupling and age is more likely region-dependent than network-dependent. We have added this point; please see lines 329-332: “We also found the SC-FC coupling with age across regions within subnetworks has more variability than the differences between networks, suggesting that the coupling with age is more likely region-dependent than network-dependent.” This is why our subsequent analysis focused on regional coupling.

      *** Additionally, we see from Figure 3C that regions within networks have very different changes with age. Given this variability (especially in the subnetworks where you show both positive and negative correlations with age for specific ROIs (i.e. all of them)), does it make sense then to show mean coupling over regions within the subnetworks which erases the differences in coupling with age relationships across regions (Figures 3D-J)?

      For the interpretation of SC-FC coupling, we consider it necessary to show the age correlation of mean coupling at the subnetwork scale, although this averages out variability at the regional scale. The results at both scales confirm that coupling mainly increases with age in this age group.

      *** Also, I think it would be interesting to show correlation coefficients across all regions, not only the significant ones (3B). Is there a spatially related tendency of increases/decreases (rather than a 'network' relationship)? Would it be interesting to show a similar figure to Figure S7 instead of only the significant regions?

      Following your comment, we have added a map of the correlation coefficients across all regions to Figure 3B, and made the same addition to the other figures (Figures S3-S6).

      Author response image 3.

      Age-related changes in SC-FC coupling. (A) Increases in whole-brain coupling with age. (B) Correlation of age with SC-FC coupling across all regions and significant regions (p<0.05, FDR corrected). (C) Comparisons of age-related changes in SC-FC coupling among functional networks. The boxes show the median and interquartile range (IQR; 25–75%), and the whiskers depict 1.5× IQR from the first or third quartile. (D-J) Correlation of age with SC-FC coupling across the VIS, SM, DA, VA, LIM, FP and DM. VIS, visual network; SM, somatomotor network; DA, dorsal attention network; VA, ventral attention network; LIM, limbic network; FP, frontoparietal network; DM, default mode network.

      *** For the quantification of MPC.

      **** L421: you reconstructed 14 cortical surfaces from the wm to pial surface. If we take the max thickness of the cortex to be 4.5mm (Fischl & Dale, 2000), the sampling is above the resolution of your anatomical images (0.8mm). Could you expand on what the interest is in sampling such a higher number of surfaces given that the resolution is not enough to provide additional information?

      The surface reconstruction was based on state-of-the-art equivolumetric surface construction techniques[3], which provide a simplified recapitulation of cellular changes across the putative laminar structure of the cortex. By referencing a 100-μm resolution Merker-stained 3D histological reconstruction of an entire post mortem human brain (BigBrain: https://bigbrain.loris.ca/main.php), a methodological study[4] systematically evaluated MPC stability with four to 30 intracortical surfaces at an anatomical image resolution of 0.7 mm, and selected 14 surfaces as the most stable solution. Importantly, that study showed that the in vivo approach can serve as a lower-resolution yet biologically meaningful extension of the histological work[4].

      **** L424: did you aggregate intensities over regions using mean/median or other statistics? It might be useful to specify.

      Thank you for your careful comment. We have revised the description in lines 446-447: “We averaged the intensity profiles of vertices over 210 cortical regions according to the BNA”.

      **** L426: personal curiosity, why did you decide to remove the negative correlation of the intensity profiles from the MPC? Although this is a common practice in functional analyses (where the interpretation of negatives is debated), within the context of cortical correlations, the negative values might be interesting and informative on the level of microstructural relationships across regions (if you want to remove negative signs it might be worth taking their absolute values instead).

      We agree that the interpretation of negative correlations in MPC is debated. Because MPC is a nascent approach to network modeling, we adopted a more conservative strategy and removed negative correlations, following the study [4] that proposed the approach. As you note, the negative correlations might be informative, and we will continue to explore the information they carry about microstructural relationships.
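      The two options discussed here (removing negative correlations versus taking absolute values) can be sketched as follows. This is a minimal illustration and not the manuscript's pipeline: the function name `mpc_from_profiles`, the array layout, and the use of plain Pearson correlation between regional mean profiles are all assumptions made for the example.

```python
import numpy as np

def mpc_from_profiles(vertex_profiles, labels, n_regions, negatives="zero"):
    """Sketch of a microstructural profile covariance (MPC) matrix.

    vertex_profiles: (n_surfaces, n_vertices) intensities sampled on each
                     intracortical surface (hypothetical layout).
    labels:          (n_vertices,) region index of each vertex, 0..n_regions-1.
    negatives:       "zero" sets negative correlations to 0,
                     "abs" keeps their magnitude instead.
    """
    # Average the intensity profiles of all vertices within each region.
    region_profiles = np.stack(
        [vertex_profiles[:, labels == r].mean(axis=1) for r in range(n_regions)],
        axis=1,
    )  # shape: (n_surfaces, n_regions)

    # Pearson correlation between regional profiles (an assumption here).
    mpc = np.corrcoef(region_profiles, rowvar=False)

    if negatives == "zero":
        mpc = np.where(mpc < 0, 0.0, mpc)
    elif negatives == "abs":
        mpc = np.abs(mpc)
    else:
        raise ValueError("negatives must be 'zero' or 'abs'")
    return mpc

# Toy usage: 14 surfaces, 4 regions of 10 vertices each (synthetic data).
rng = np.random.default_rng(0)
profiles = rng.normal(size=(14, 40))
labels = np.repeat(np.arange(4), 10)
mpc_zeroed = mpc_from_profiles(profiles, labels, 4, negatives="zero")
mpc_abs = mpc_from_profiles(profiles, labels, 4, negatives="abs")
```

      The two variants differ only in the entries that were negative, so comparing them shows directly how much of the matrix is affected by the choice.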

      **** L465: could you please expand on the notion of self-connections, it is not completely evident what this refers to.

      We have revised the description in lines 493-494: “𝑁𝑐 is the number of connection (𝑁𝑐 = 245 for BNA)”.

      **** Paragraph starting on L467: did you evaluate the multicollinearities between communication models? It is possibly rather high (especially for the same models with similar parameters (listed on L440-444)). Such dependence between variables might affect the estimates of feature importance (given the predictive models only care to minimize error, highly correlated features can be selected as a strong predictor while the impact of other features with similarly strong relationships with the target is minimized thus impacting the identification of reliable 'predictors').

      We agree with your comment. The covariance structure (multicollinearity) among the communication models is likely to produce unreliable predictor weights. In our study, we applied Haufe's inversion transform[5], which addresses this issue by computing the covariance between the predicted FC and each communication model in the training set. For more details on Haufe's inversion transform, please see [5]. We have clarified this in the manuscript; please see lines 497-499: “And covariance structure among the predictors may lead to unreliable predictor weights. Thus, we applied Haufe's inversion transform[38] to address these issues and identify reliable communication mechanisms.”
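      As a rough illustration of the transform described above, the sketch below computes, for a fitted linear model, the covariance between each predictor and the predicted values in the training set. The function name `haufe_transform` and the synthetic data are assumptions for the example; some implementations additionally divide by the variance of the predictions, which is omitted here.

```python
import numpy as np

def haufe_transform(X_train, w, intercept=0.0):
    """Covariance between each predictor and the model's predictions.

    X_train: (n_samples, n_predictors) training predictors.
    w:       (n_predictors,) fitted regression weights.
    Returns a (n_predictors,) vector of transformed weights.
    """
    y_hat = X_train @ w + intercept
    Xc = X_train - X_train.mean(axis=0)   # centre predictors
    yc = y_hat - y_hat.mean()             # centre predictions
    return Xc.T @ yc / (len(y_hat) - 1)

# Toy usage with two strongly collinear predictors and one independent one.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
X = np.column_stack([x1, x1 + 0.05 * rng.normal(size=200), rng.normal(size=200)])
y = x1 + 0.1 * rng.normal(size=200)
design = np.column_stack([np.ones(len(y)), X])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
pattern = haufe_transform(X, beta[1:], intercept=beta[0])
```

      For a linear model this quantity equals the predictor covariance matrix multiplied by the weight vector, so collinear predictors that carry the same signal receive similar transformed weights even when their raw regression weights differ.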

      **** L474: I am not completely familiar with spin tests but to my understanding, this is a spatial permutation test. I am not sure how this applies to the evaluation of the robustness of feature weight estimates per region (if this was performed per region), it would be useful to provide a bit more detail to make it clearer.

      Following your comment, we have added more detail; please see lines 503-507: “Next, we generated 1,000 FC permutations through a spin test[86] for each nodal prediction in each subject and obtained random distributions of model weights. These weights were averaged over the group and were investigated the enrichment of the highest weights per region to assess whether the number of highest weights across communication models was significantly larger than that in a random discovery.”
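      One possible reading of the enrichment step is sketched below. It assumes the spin permutations have already been generated and the resulting null weights are available as an array; the function names, array shapes, and the one-sided permutation p-value are illustrative assumptions rather than the manuscript's code.

```python
import numpy as np

def top_weight_counts(weights):
    """Count, per model, the regions where that model has the largest weight.

    weights: (n_regions, n_models) group-averaged model weights.
    """
    winners = np.argmax(weights, axis=1)
    return np.bincount(winners, minlength=weights.shape[1])

def enrichment_pvalues(observed, null):
    """Permutation p-value per model for the number of top-weighted regions.

    observed: (n_regions, n_models) observed group-averaged weights.
    null:     (n_perm, n_regions, n_models) weights from permuted FC
              (assumed to come from a spin test run beforehand).
    """
    obs_counts = top_weight_counts(observed)
    null_counts = np.stack([top_weight_counts(w) for w in null])
    exceed = (null_counts >= obs_counts).sum(axis=0)
    # Add-one correction so the p-value is never exactly zero.
    return (exceed + 1) / (null.shape[0] + 1)

# Toy usage: 210 regions, 27 candidate models, 1,000 permutations (synthetic).
rng = np.random.default_rng(0)
observed = rng.normal(size=(210, 27))
null = rng.normal(size=(1000, 210, 27))
p_values = enrichment_pvalues(observed, null)
```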

      **** L477: 'significant communication models were used to represent WMC...', but in L103 you mention you select 3 models: communicability, mean first passage, and flow graphs. Do you want to say that only 3 models were 'significant' and these were exactly the same across all regions (and data splits/ parcellation strategies/ tractography methods)? In the methods, you describe a lot of analysis and testing but it is not completely clear how you come to the selection of the final 3, it would be beneficial to clarify. Also, the final 3 were selected on the whole dataset first and then the pipeline of SC-FC coupling/age assessment/behaviour predictions was run for every (WD, S1, S2) for both parcellations schemes and tractography methods or did you end up with different sets each time? It would be good to make the pipeline and design choices, including the validation bit clearer (a figure detailing all the steps which extend Figure 1 would be very useful to understand the design/choices and how they relate to different runs of the validation).

      Thank you for your comment. In all reproducibility analyses, we used the same three models, which were selected in the main pipeline (probabilistic tractography and BNA parcellation). Following your comment, we have produced a figure showing the model-selection pipeline as an extension of Figure 1. For the description, please see lines 106-108: “We used these three models to represent the extracortical connectivity properties in subsequent discovery and reproducibility analyses (Figure S1).”

      Author response image 4.

      Pipeline of model selection and reproducibility analyses.

      **** Might the imbalance of features between structural connectivity and MPC affect the revealed SC-FC relationships (3 vs 1)? Why did you decide on this ratio rather than for example best WM structural descriptor + MPC?

      We understand your concern. The WMC communication models capture diverse geometric, topological, or dynamic factors. To describe the properties of WMC as fully as possible, we selected, from the 27 models and after controlling for covariance structure, the three communication models that significantly predicted FC. Compared with the single MPC predictor, this does introduce a potential feature imbalance. However, the results still support the conclusion that coupling models incorporating microarchitectural properties yield more accurate predictions of FC from SC[6, 7]. The relevant experiments are shown in Figure S2 below. Using only the best WM structural descriptor may lose some communication properties of WMC.

      **** L515: were intracranial volume and in-scanner head motion related to behavioural measures? These variables likely impact the inputs, do you expect them to influence the outcome assessments? Or is there a mistake on L518 and you actually corrected the input features rather than the behaviour measures?

      In-scanner head motion and intracranial volume are related to some age-adjusted behavioural measures, as shown in the following table. We regressed the covariates from the cognitive measures, following two cognitive prediction studies [8, 9]. Please see lines 549-554: “Prior to applying the nested fivefold cross-validation framework to each behaviour measure, we regressed out covariates including sex, intracranial volume, and in-scanner head motion from the behaviour measure[59, 69]. Specifically, we estimated the regression coefficients of the covariates using the training set and applied them to the testing set. This regression procedure was repeated for each fold.”
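      The fold-wise procedure quoted above (estimate covariate coefficients on the training set, then apply them to the testing set) can be sketched as follows. The helper names and the ordinary least-squares fit with an intercept are assumptions for illustration.

```python
import numpy as np

def fit_confounds(covariates, y):
    """Least-squares coefficients (intercept first) of y on the covariates."""
    design = np.column_stack([np.ones(len(y)), covariates])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

def remove_confounds(covariates, y, beta):
    """Subtract the covariate fit defined by `beta` from y."""
    design = np.column_stack([np.ones(len(y)), covariates])
    return y - design @ beta

def residualize_fold(cov_train, y_train, cov_test, y_test):
    """Estimate coefficients on the training set only, apply to both sets."""
    beta = fit_confounds(cov_train, y_train)
    return (remove_confounds(cov_train, y_train, beta),
            remove_confounds(cov_test, y_test, beta))

# Toy usage: three covariates standing in for sex, intracranial volume and
# head motion (synthetic values), split into one train/test fold.
rng = np.random.default_rng(0)
cov = rng.normal(size=(100, 3))
score = cov @ np.array([0.2, 0.5, -0.3]) + rng.normal(size=100)
train, test = np.arange(80), np.arange(80, 100)
y_train_res, y_test_res = residualize_fold(cov[train], score[train],
                                           cov[test], score[test])
```

      Fitting the coefficients on the training set alone keeps information from the testing set out of the preprocessing, which is the reason the regression is repeated in every fold.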

      Author response table 1.

      ** Additionally, in the paper, you propose that the incorporation of cortical microstructural (myelin-related) descriptors with white-matter connectivity to explain FC provides for 'a more comprehensive perspective for characterizing the development of SC-FC coupling' (L60). This combination of cortical and white-matter structure is indeed interesting, however the benefits of incorporating different descriptors could be studied further. For example, comparing results of using only the white matter connectivity (assessed through selected communication models) ~ FC vs (white matter + MPC) ~ FC vs MPC ~ FC. Which descriptors better explain FC? Are the 'coupling trends' similar (or the same)? If yes, what is the additional benefit of using the more complex combination? This would also add strength to your statement at L317: 'These discrepancies likely arise from differences in coupling methods, highlighting the complementarity of our methods with existing findings'. Yes, discrepancies might be explained by the use of different SC inputs. However, it is difficult to see how discrepancies highlight complementarity - does MPC (and combination with wm) provide additional information to using wm structural alone?

      Following your comment, we have added analyses based on models that use only the myelin-related predictor or only WM connectivity to predict FC, and compared the results across models. Please see lines 519-521: “In addition, we have constructed the models using only MPC or SCs to predict FC, respectively. Spearman’s correlation was used to assess the consistency between spatial patterns based on different models.”

      Please see lines 128-130: “In addition, the coupling pattern based on other models (using only MPC or only SCs to predict FC) and the comparison between the models were shown in Figure S2A-C.” Please see lines 178-179: “The age-related patterns of SC-FC coupling based other coupling models were shown in Figure S2D-F.”

      Although the coupling patterns were spatially consistent across models, incorporating MPC together with SC improved the prediction of FC relative to models based on MPC or SC alone. For age-related changes in coupling, the differences between the models were further amplified. We agree that the complementarity cannot be explicitly quantified and have revised the description; please see line 329: “These discrepancies likely arise from differences in coupling methods.”

      Author response image 5.

      Comparison results between different models. Spatial pattern of mean SC-FC coupling based on MPC ~ FC (A), SCs ~ FC (B), and MPC + SCs ~ FC (C). Correlation of age with SC-FC coupling across cortex based on MPC ~ FC (D), SCs ~ FC (E), and MPC + SCs ~ FC (F).
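      The comparison among the three model types (MPC ~ FC, SCs ~ FC, and MPC + SCs ~ FC) can be sketched as below. This is an illustration under stated assumptions: regional coupling is taken here as the Pearson correlation between the empirical FC profile of a region and its multilinear-regression fit, the matrices are synthetic, and the helper names are hypothetical. The rank-based Spearman helper assumes no tied values.

```python
import numpy as np

def regional_coupling(predictors, fc):
    """Regional SC-FC coupling from a multilinear model (sketch).

    predictors: list of (n_regions, n_regions) structural matrices.
    fc:         (n_regions, n_regions) functional connectivity matrix.
    Returns a (n_regions,) vector: correlation between empirical and
    fitted FC profile of each region.
    """
    n = fc.shape[0]
    coupling = np.empty(n)
    for i in range(n):
        others = np.arange(n) != i                      # drop the self-connection
        X = np.column_stack([np.ones(n - 1)] + [p[i, others] for p in predictors])
        y = fc[i, others]
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        coupling[i] = np.corrcoef(X @ beta, y)[0, 1]
    return coupling

def spearman(a, b):
    """Spearman correlation via ranks (valid when there are no ties)."""
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return np.corrcoef(rank_a, rank_b)[0, 1]

# Toy usage with synthetic symmetric matrices for 30 regions.
rng = np.random.default_rng(0)
def random_symmetric(n):
    m = rng.normal(size=(n, n))
    return (m + m.T) / 2

mpc = random_symmetric(30)
scs = [random_symmetric(30) for _ in range(3)]          # three communication models
fc = 0.5 * mpc + 0.3 * scs[0] + random_symmetric(30)

coupling_mpc = regional_coupling([mpc], fc)
coupling_sc = regional_coupling(scs, fc)
coupling_all = regional_coupling([mpc] + scs, fc)
consistency = spearman(coupling_mpc, coupling_all)
```

      Because the fits are in-sample and the models are nested, the combined model cannot fit worse than either reduced model in this sketch; an out-of-sample comparison would be needed to judge whether the extra predictors generalise.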

      ** For the interpretation of results: L31 'SC-FC coupling is positively associated with genes in oligodendrocyte-related pathways and negatively associated with astrocyte-related gene'; L124: positive myelin content with SC-FC coupling...and similarly on L81, L219, L299, L342, and L490:

      *** You use a T1/T2 ratio which is (in large part) a measure of myelin to estimate the coupling between SC and FC. Evaluation with SC-FC coupling with myelin described in Figure 2E is possibly biased by the choice of this feature. Similarly, it is possible that reported positive associations with oligodendrocyte-related pathways and SC-FC coupling in your work could in part result from a bias introduced by the 'myelin descriptor' (conversely, picking up the oligodendrocyte-related genes is a nice corroboration for the T1/T2 ratio being a myelin descriptor, so that's nice). However, it is possible that if you used a different descriptor of the cortical microstructure, you might find different expression patterns associated with the SC-FC coupling (for example using neurite density index might pick up neuronal-related genes?). As mentioned in my previous suggestions, I think it would be of interest to first use only the white matter structural connectivity feature to assess coupling to FC and assess the gene expression in the cortical regions to see if the same genes are related, and subsequently incorporate MPC to dissociate potential bias of using a myelin measure from genetic findings.

      Thank you for your insightful comments. In this paper, however, the core method of measuring coupling is to predict functional connections from multimodal structural connections, which may yield more information than a single modality. We agree that examining the genes associated with SCs and MPC separately could lead to interesting discoveries, and we will explore this in future work.

      ** Generally, I find it difficult to understand the interpretation of SC-FC coupling measures and would be interested to hear your thinking about this. As you mention on L290-294, how well SC predicts FC depends on which input features are used for the coupling assessment (more complex communication models, incorporating additional microstructural information etc 'yield more accurate predictions of FC' L291) - thus, calculated coupling can be interpreted as a measure of how well a particular set of input features explain FC (different sets will explain FC more or less well) ~ coupling is related to a measure of 'missing' information on the SC-FC relationship which is not contained within the particular set of structural descriptors - with this approach, the goal might be to determine the set that best, i.e. completely, explains FC to understand the link between structure and function. When you use the coupling measures for comparisons with age, cognition prediction etc, the 'status' of the SC-FC changes, it is no longer the amount of FC explained by the given SC descriptor set, but it's considered a descriptor in itself (rather than an effect of feature selection / SC-FC information overlap) - how do you interpret/argue for this shift of use?

      Thank you for your comment. In this paper, we obtain a reasonable SC-FC coupling by determining the optimal set of structural features to explain function. The coupling essentially measures the direct correspondence between structure and function. Studying the relationship of coupling with age and cognition therefore amounts to studying how this direct correspondence between structure and function varies with age and cognition.

      ** In a similar vein to the above comment, I am interested to hear what you think: on L305 you mention that 'perfect SC-FC coupling may be unlikely'. Would this reasoning suggest that functional activity takes place through other means than (and is therefore somehow independent of) biological (structural) substrates? For now, I think one can only say that we have imperfect descriptors of the structure so there is always information missing to explain function, this however does not mean the SC and FC are not perfectly coupled (only that we look at insufficient structural descriptors - limitations of what imaging can assess, what we measure etc). This is in line with L305 where you mention that 'Moreover, our results suggested that regional preferential contributions across different SCs lead to variations in the underlying communication process'. This suggests that locally different areas might use different communication models which are not reflected in the measures of SC-FC coupling that was employed, not that the 'coupling' is lower or higher (or coupling is not perfect). This is also a change in approach to L293: 'This configuration effectively releases the association cortex from strong structural constraints' - the 'release' might only be in light of the particular structural descriptors you use - is it conceivable that a different communication model would be more appropriate (and show high coupling) in these areas.

      Thank you for your insightful comments. We have changed the description; please see lines 315-317: “SC-FC coupling is dynamic and changes throughout the lifespan[7], particularly during adolescence[6,9], suggesting that perfect SC-FC coupling may require sufficient structural descriptors.”

      *Cognitive predictions:

      ** From a practical stand-point, do you think SC-FC coupling is a better (more accurate) indicator of cognitive outcomes (for example for future prediction studies) than each modality alone (which is practically easier to obtain and process)? It would be useful to check the behavioural outcome predictions for each modality separately (as suggested above for coupling estimates). In case SC-FC coupling does not outperform each modality separately, what is the benefit of using their coupling? Similarly, it would be useful to compare to using only cortical myelin for the prediction (which you showed to increase in importance for the coupling). In the case of myelin->coupling-> intelligence, if you are able to predict outcomes with the same performance from myelin without the need for coupling measures, what is the benefit of coupling?

      From the standpoint of predictive performance, we do not claim that SC-FC coupling is a better indicator than a single modality (voxel, network or other indicator). Our aim was to assess whether SC-FC coupling is related to individual differences in cognitive performance, rather than to demonstrate greater predictive power than other measures. As you suggest, separating the modalities and comparing their power to predict cognition is a very interesting perspective, and we will explore it in future work.

      ** The statement on L187 'suggesting that increased SC-FC coupling during development is associated with higher intelligence' might not be completely appropriate before age corrections (especially given the large drop in performance that suggests confounding effects of age).

      According to your comment, we have removed the statement.

      ** L188: it might be useful to report the range of R across the outer cross-validation folds as from Figure 4A it is not completely clear that the predictive performance is above the random (0) threshold. (For the sake of clarity, on L180 it might be useful for the reader if you directly report that other outcomes were not above the random threshold).

      Following your comment, we have added the range of R and revised the description; please see lines 195-198: “Furthermore, even after controlling for age, SC-FC coupling remained a significant predictor of general intelligence better than at chance (Pearson’s r=0.11±0.04, p=0.01, FDR corrected, Figure 4A). For fluid intelligence and crystal intelligence, the predictive performances of SC-FC coupling were not better than at chance (Figure 4A).”

      In a similar vein, in the text, you report Pearson's R for the predictive results but Figure 4A shows predictive accuracy - accuracy is a different (categorical) metric. It would be good to homogenise to clarify predictive results.

      We have made the corresponding changes in Figure 4.

      Author response image 6.

      Encoding individual differences in intelligence using regional SC-FC coupling. (A) Predictive accuracy of fluid, crystallized, and general intelligence composite scores. (B) Regional distribution of predictive weight. (C) Predictive contribution of functional networks. The boxes show the median and interquartile range (IQR; 25–75%), and the whiskers depict the 1.5× IQR from the first or third quartile.

      *Methods and QC:


      ** It would be useful to mention briefly how the BNA was applied to the data and if any quality checks were performed for the resulting parcellations, especially for the youngest subjects which might be most dissimilar to the population used to derive the atlas (healthy adults HCP subjects) ~ question of parcellation quality.

      We have added the description, please see lines 434-436: “The BNA[31] was projected on native space according to the official scripts (http://www.brainnetome.org/resource/) and the native BNA was checked by visual inspection.” 

      ** Additionally, the appropriateness of structurally defined regions for the functional analysis is also a topic of important debate. It might be useful to mention the above as limitations (which apply to most studies with similar focus).

      We have added your comment to the methodological issues, please see lines 378-379: “Third, the appropriateness of structurally defined regions for the functional analysis is also a topic of important debate.”

      - Tractography

      ** L432: it might be useful to name the method you used (probtrackx).

      We have added this name to the description, please see lines 455-456: “probabilistic tractography (probtrackx)[78, 79] was implemented in the FDT toolbox …”

      ** L434: 'dividing the total fibres number in source region' - dividing by what?

      We have revised the description, please see line 458: “dividing by the total fibres number in source region.”
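      The normalisation described in the revised sentence can be sketched as a row-wise division of a streamline-count matrix. The function name, the array layout, and the handling of regions with zero total streamlines are assumptions made for this example.

```python
import numpy as np

def connectivity_density(streamline_counts, total_per_source):
    """Divide each row by the total number of fibres of its source region.

    streamline_counts: (n_regions, n_regions), entry [i, j] is the number of
                       streamlines from source region i reaching region j.
    total_per_source:  (n_regions,) total fibres sampled from each source.
    Rows whose total is zero are left as zeros rather than producing NaN.
    """
    totals = np.asarray(total_per_source, dtype=float)[:, None]
    density = np.zeros(np.shape(streamline_counts), dtype=float)
    np.divide(streamline_counts, totals, out=density, where=totals > 0)
    return density

# Toy usage with a 3-region count matrix (synthetic numbers).
counts = np.array([[0, 50, 150],
                   [30, 0, 70],
                   [0, 0, 0]])
density = connectivity_density(counts, total_per_source=[1000, 500, 0])
```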

      ** L436: 'connections in subcortical areas were removed' - why did you trace connections to subcortical areas in the first place if you then removed them (to match with cortical MPC areas I suspect)? Or do you mean there were spurious streamlines through subcortical regions that you filtered?

      We removed these connections for two reasons. First, the SC matrices need to match the cortical regions of the MPC. Second, as stated in the methodological issues, accurately resolving the connections of small structures within subcortical regions using whole-brain diffusion imaging and tractography techniques remains challenging[10, 11].

      ** Following on the above, did you use any exclusion masks during the tracing? In general, more information about quality checks for the tractography would be useful. For example, L437: did you do any quality evaluations based on the removed spurious streamlines? For example, were there any trends between spurious streamlines and the age of the subject? Distance between regions/size of the regions?

      We did not use any exclusion masks. We assessed tractography quality by visual inspection and did not examine the relationship between spurious streamlines and age, inter-regional distance, or region size.

      ** L439: 'weighted probabilistic network' - this was weighted by the filtered connectivity densities or something else?

      The probabilistic network is weighted by the filtered connectivity densities.

      ** I appreciate the short description of the communication models in Text S1, it is very useful.

      Thank you for your comment.

      ** In addition to limitations mentioned in L368 - during reconstruction, have you noticed problems resolving short inter-hemispheric connections?

      We had not considered this issue and have now added it to the limitations; please see lines 383-384: “In addition, the reconstruction of short connections between hemispheres is a notable challenge.”

      - Functional analysis:

      ** There is a difference in acquisition times between participants below and above 8 years (21 vs 26 min), does the different length of acquisition affect the quality of the processed data?

      We applied relatively strict quality control to ensure the quality of the processed data.

      ** L446 'regressed out nuisance variables' - it would be informative to describe in more detail what you used to perform this.

      We have provided more detail about the regression of nuisance variables, please see lines 476-477: “The nuisance variables were removed from time series based on general linear model.”
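      Removing nuisance variables with a general linear model amounts to keeping the residuals of a least-squares fit, as in the sketch below. Which regressors were included is not stated here, so the nuisance matrix is a synthetic placeholder and the function name is an assumption.

```python
import numpy as np

def regress_out(timeseries, nuisance):
    """Residuals of the time series after a GLM fit on nuisance regressors.

    timeseries: (n_timepoints, n_regions) regional signals.
    nuisance:   (n_timepoints, n_regressors) nuisance variables.
    An intercept column is added, so the residuals are also mean-centred.
    """
    design = np.column_stack([np.ones(len(nuisance)), nuisance])
    beta, *_ = np.linalg.lstsq(design, timeseries, rcond=None)
    return timeseries - design @ beta

# Toy usage: 300 time points, 210 regions, 6 placeholder nuisance regressors.
rng = np.random.default_rng(0)
nuisance = rng.normal(size=(300, 6))
signals = rng.normal(size=(300, 210)) + nuisance @ rng.normal(size=(6, 210))
cleaned = regress_out(signals, nuisance)
```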

      ** L450-452: it would be useful to add the number of excluded participants to get an intuition for the overall quality of the functional data. Have you checked if the quality is associated with the age of the participant (which might be related to motion etc). Adding a distribution of remaining frames across participants (vs age) would be useful to see in the supplementary methods to better understand the data you are using.

      We have added information on the exclusion of subjects during data processing, together with the distributions of motion and remaining frames and their correlations with age. Please see lines 481-485: “Quality control. The exclusion of participants in the whole multimodal data processing pipeline was depicted in Figure S13. In the context of fMRI data, we computed Pearson’s correlation between motion and age, as well as between the number of remaining frames and age, for the included participants aged 5 to 22 years and 8 to 22 years, respectively. These correlations were presented in Figure S14.”

      Author response image 7.

      Exclusion of participants in the whole multimodal data processing pipeline.  

      Author response image 8.

      Figure S14. Correlations between motion and age and number of remaining frames and age.

      ** L454: 'Pearson's correlation's... ' In contrast to MPC you did not remove negative correlations in the functional matrices. Why this choice?

      Whether negative correlations in functional connectivity should be removed remains controversial. Previous studies of SC-FC coupling[12-14] have widely retained negative correlations, and we chose this strategy in order to retain more information. For MPC, which is a nascent approach to network modeling, we adopted a more conservative strategy and removed negative correlations, following the study [4] that proposed the approach.

      - Gene expression:

      ** L635, you focus on the left cortex, is this common? Do you expect the gene expression to be fully symmetric (given reported functional hemispheric asymmetries)? It might be good to expand on the reasoning.

      An important consideration regarding sample assignment is that only two of the six brains were sampled from both hemispheres, whereas four brains have samples from the left hemisphere only. This sparse sampling should be carefully considered when combining data across donors[1]. We have added to the description; please see lines 569-571: “Restricting analyses to the left hemisphere will minimize variability across regions (and hemispheres) in terms of the number of samples available[40].”

      ** Paragraph of L537: you use evolution of coupling with age (correlation) and compare to gene expression with adults (cohort of Allen Human Brain Atlas - no temporal evolution to the gene expressions) and on L369 you mention that 'relative spatial patterns of gene expressions remain stable after birth'. Of course this is not a place to question previous studies, but would you really expect the gene expression associated with the temporary processes to remain stable throughout the development? For example, myelination would follow different spatiotemporal gradient across brain regions, is it reasonable to expect that the expression patterns remain the same? How do you then interpret a changing measure of coupling (correlation with age) with a gene expression assessed statically?

      We agree that spatial expression patterns are expected to vary across developmental periods. We have revised the previous description; please see lines 383-386: “Fifth, it is important to acknowledge that changes in gene expression levels during development may introduce bias in the results.”

      - Reproducibility analyses:

      ** Paragraph L576: are we to understand that you performed the entire pipeline 3 times (WD, S1, S2) for both parcellations schemes and tractography methods (~12 times) including the selection of communication models and you always got the same best three communication models and gene expression etc? Or did you make some design choices (i.e. selection of communication models) only on a specific set-up and transfer to other settings?

      The choice of communication model is established at the beginning, which we have clarified in the article, please see lines 106-108: “We used these three models to represent the extracortical connectivity properties in subsequent discovery and reproducibility analyses (Figure S1).” For reproducibility analyses (parcellation, tractography, and split-half validation), we fixed other settings and only assessed the impact of a single factor.

      ** Paragraph of L241: I really appreciate you evaluated the robustness of your results to different tractography strategies. It is reassuring to see the similarity in results for the two approaches. Did you notice any age-related effects on tractography quality for the two methods given the wide age range (did you check?)

      In our study, tractography quality was checked by visual inspection. Using quantitative tools to assess tractography quality in future studies could answer this question objectively.

      ** Additionally, I wonder how much of that overlap is driven by the changes in MPC which is the same between the two methods... especially given its high weight in the SC-FC coupling you reported earlier in the paper. It might be informative to directly compare the connectivity matrices derived from the two tracto methods directly. Generally, as mentioned in the previous comments, I think it would be interesting to assess coupling using different input settings (with WM structural and MPC separate and then combined).

      As suggested in your previous comment, we have examined the coupling patterns, coupling differences, coupling-age correlations, and spatial correlations between the patterns based on different models, as shown in Figure S2. Please see our response to the previous comment for details.

      ** L251 - I also wonder if the random splitting is best adapted to validation in your case given you study relationships with age. Would it make more sense to make stratified splits to ensure a 'similar age coverage' across splits?

      In our study, we adopted a random splitting process, repeated 1,000 times, to minimize bias due to data partitioning. The stratification you mention is a reasonable method, and keeping the age distribution even across splits would likely yield higher similarity between splits than our validation method. However, the similarity obtained with our method is already sufficient to support the generalizability of our findings.
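      The difference between the two splitting schemes discussed here can be sketched as follows. This is a hypothetical illustration using synthetic ages and invented function names, not the authors' code. The stratified variant bins participants by age and splits each bin in half, which is one of several ways to ensure similar age coverage.

```python
import random


def random_split(ids, rng):
    """Plain split-half: shuffle all participants and cut the list in two."""
    shuffled = ids[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def age_stratified_split(ids, ages, rng, n_bins=4):
    """Split within age bins so both halves cover a similar age range."""
    order = sorted(range(len(ids)), key=lambda i: ages[i])
    bin_size = -(-len(order) // n_bins)  # ceiling division
    first, second = [], []
    for start in range(0, len(order), bin_size):
        block = order[start:start + bin_size]
        rng.shuffle(block)
        half = len(block) // 2
        first.extend(ids[i] for i in block[:half])
        second.extend(ids[i] for i in block[half:])
    return first, second


rng = random.Random(0)
ids = list(range(40))
ages = [5 + 17 * i / 39 for i in ids]  # synthetic ages spanning 5 to 22 years

r1, r2 = random_split(ids, rng)
s1, s2 = age_stratified_split(ids, ages, rng)
```

      Repeating either function many times (for example 1,000 times, as in the response) and comparing results across the two halves gives a distribution of split-half similarity.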

      Minor comments

      L42: 'is regulated by genes'

      ** Coupling (if having a functional role and being regulated at all) is possibly resulting from a complex interplay of different factors in addition to genes, for example, learning/environment, it might be more cautious to use 'regulated in part by genes' or similar.

      We have corrected it, please see line 42.

      L43 (and also L377): 'development of SC-FC coupling'

      ** I know this is very nitpicky and depends on your opinion about the nature of SC-FC coupling, but 'development of SC-FC coupling' gives an impression of something maturing that has a role 'in itself' (for example development of eye from neuroepithelium to mature organ etc.). For now, I am not sure it is fully certain that SC-FC coupling is more than a byproduct of the comparison between SC and FC, using 'changes in SC-FC coupling with development' might be more apt.

      We have corrected it, please see lines 43-44.

      L261 'SC-FC coupling was stronger ... [] ... and followed fundamental properties of cortical organization.' vs L168 'No significant correlations were found between developmental changes in SC-FC coupling and the fundamental properties of cortical organization'.

      ** Which one is it? I think in the first you refer to mean coupling over all infants and in the second about correlation with age. How do you interpret the difference?

      Between the ages of 5 and 22 years, we found that the mean SC-FC coupling pattern is already similar to that of adults and is consistent with the fundamental properties of cortical organization. However, the developmental changes in SC-FC coupling are heterogeneous and sequential, and their magnitude does not follow the mean coupling pattern.

      L277: 'temporal and spatial complexity'

      ** Additionally, communication models have different assumptions about the flow within the structural network and will have different biological plausibility (they will be more or less


      Here temporal and spatial complexity is from a computational point of view.

      L283: 'We excluded a centralized model (shortest paths), which was not biologically plausible'

      ** But in Text S1 and Table S1 you specify the shortest paths models. Does this mean you computed them but did not incorporate them in the final coupling computations even if they were predictive?

      ** Generally, I find the selection of the final 3 communication models confusing. It would be very useful if you could clarify this further, for example in the methods section.

      We used all twenty-seven communication models (including shortest paths) to predict FC at the node level for each participant, and then identified the three communication models that significantly predicted FC. The shortest-path model was excluded because it did not meet the significance criteria. We have added further methodological details to this section; please see lines 503-507.

      L332 'As we observed increasing coupling in these [frontoparietal network and default mode network] networks, this may have contributed to the improvements in general intelligence, highlighting the flexible and integrated role of these networks' vs L293 'SC-FC coupling in association areas, which have lower structural connectivity, was lower than that in sensory areas. This configuration effectively releases the association cortex from strong structural constraints imposed by early activity cascades, promoting higher cognitive functions that transcend simple sensori-motor exchanges'

      ** I am not sure I follow the reasoning. Could you expand on why it would be the decoupling promoting the cognitive function in one case (association areas generally), but on the reverse the increased coupling in frontoparietal promoting the cognition in the other (specifically frontoparietal)?

      To clarify: for general intelligence, increased coupling in the frontoparietal network could allow more effective information integration and enable efficient collaboration between different cognitive processes.

      * Formatting errors etc.

      L52: maybe rephrase?

      We have rephrased, please see lines 51-53: “The T1- to T2-weighted (T1w/T2w) ratio of MRI has been proposed as a means of quantifying microstructure profile covariance (MPC), which reflects a simplified recapitulation in cellular changes across intracortical laminar structure[6, 1215].”

      L68: specialization1,[20].

      We have corrected it.

      L167: 'networks significantly increased with age and exhibited greater increased' - needs rephrasing.

      We have corrected it.

      L194: 'networks were significantly predicted the general intelligence' - needs rephrasing.

      We have corrected it, please see lines 204-205: “we found that the weights of frontoparietal and default mode networks significantly contributed to the prediction of the general intelligence.”

      L447: 'and temporal bandpass filtering' - there is a verb missing.

      We have corrected it, please see line 471: “executed temporal bandpass filtering.”

      L448: 'greater than 0.15' - unit missing.

      We have corrected it, please see line 472: “greater than 0.15 mm”.

      L452: 'After censoring, regression of nuisance variables, and temporal bandpass filtering,' - no need to repeat the steps as you mentioned them 3 sentences earlier.

      We have removed it.

      L458-459: sorry I find this description slightly confusing. What do you mean by 'modal'? Connectional -> connectivity profile. The whole thing could be simplified, if I understand correctly your vector of independent variables is a set of wm and microstructural 'connectivity' of the given node... if this is not the case, please make it clearer.

      We have corrected it, please see line 488: “where 𝒔𝑖 is the 𝑖th SC profiles, 𝑛 is the number of SC profiles”.

      L479: 'values and system-specific of 480 coupling'.

      We have corrected it.

      L500: 'regular' - regularisation.

      We have changed it to “regularization”.

      L567: Do you mean that in contrast to probabilistic with FSL you use deterministic methods within Camino? For L570, you introduce communication models through 'such as': did you fit all models like before? If not, it might be clearer to just list the ones you estimated rather than introduce through 'such as'.

      We have changed the description to avoid ambiguity, please see lines 608-609: “We then calculated the communication properties of the WMC including communicability, mean first passage times of random walkers, and flow graphs (timescales=1).”

      Citation [12], it is unusual to include competing interests in the citation, moreover, Dr. Bullmore mentioned is not in the authors' list - this is most likely an error with citation import, it would be good to double-check.

      We have corrected it.

      L590: Python scripts used to perform PLS regression can 591 be found at https://scikitlearn.org/. The link leads to general documentation for sklearn.

      We have corrected it, please see lines 627-630: “Python scripts used to perform PLS regression can be found at https://scikit-learn.org/stable/modules/generated/sklearn.cross_decomposition.PLSRegression.html#sklearn.cross_decomposition.PLSRegression.”

      P26 and 27 - there are two related sections: Data and code availability and Code availability - it might be worth merging into one section if possible.

      We have corrected it, please see lines 623-633.


      (1) Arnatkeviciute A, Fulcher BD, Fornito A. A practical guide to linking brain-wide gene expression and neuroimaging data. Neuroimage. 2019;189:353-67. Epub 2019/01/17. doi: 10.1016/j.neuroimage.2019.01.011. PubMed PMID: 30648605.

      (2) Zhong S, He Y, Gong G. Convergence and divergence across construction methods for human brain white matter networks: an assessment based on individual differences. Hum Brain Mapp. 2015;36(5):1995-2013. Epub 2015/02/03. doi: 10.1002/hbm.22751. PubMed PMID: 25641208; PubMed Central PMCID: PMCPMC6869604.

      (3) Waehnert MD, Dinse J, Weiss M, Streicher MN, Waehnert P, Geyer S, et al. Anatomically motivated modeling of cortical laminae. Neuroimage. 2014;93 Pt 2:210-20. Epub 2013/04/23. doi: 10.1016/j.neuroimage.2013.03.078. PubMed PMID: 23603284.

      (4) Paquola C, Vos De Wael R, Wagstyl K, Bethlehem RAI, Hong SJ, Seidlitz J, et al. Microstructural and functional gradients are increasingly dissociated in transmodal cortices. PLoS Biol. 2019;17(5):e3000284. Epub 2019/05/21. doi: 10.1371/journal.pbio.3000284. PubMed PMID: 31107870.

      (5) Haufe S, Meinecke F, Gorgen K, Dahne S, Haynes JD, Blankertz B, et al. On the interpretation of weight vectors of linear models in multivariate neuroimaging. Neuroimage. 2014;87:96-110. Epub 2013/11/19. doi: 10.1016/j.neuroimage.2013.10.067. PubMed PMID: 24239590.

      (6) Demirtas M, Burt JB, Helmer M, Ji JL, Adkinson BD, Glasser MF, et al. Hierarchical Heterogeneity across Human Cortex Shapes Large-Scale Neural Dynamics. Neuron. 2019;101(6):1181-94 e13. Epub 2019/02/13. doi: 10.1016/j.neuron.2019.01.017. PubMed PMID: 30744986; PubMed Central PMCID: PMCPMC6447428.

      (7) Deco G, Kringelbach ML, Arnatkeviciute A, Oldham S, Sabaroedin K, Rogasch NC, et al. Dynamical consequences of regional heterogeneity in the brain's transcriptional landscape. Sci Adv. 2021;7(29). Epub 2021/07/16. doi: 10.1126/sciadv.abf4752. PubMed PMID: 34261652; PubMed Central PMCID: PMCPMC8279501.

      (8) Chen J, Tam A, Kebets V, Orban C, Ooi LQR, Asplund CL, et al. Shared and unique brain network features predict cognitive, personality, and mental health scores in the ABCD study. Nat Commun. 2022;13(1):2217. Epub 2022/04/27. doi: 10.1038/s41467-022-29766-8. PubMed PMID: 35468875; PubMed Central PMCID: PMCPMC9038754.

      (9) Li J, Bzdok D, Chen J, Tam A, Ooi LQR, Holmes AJ, et al. Cross-ethnicity/race generalization failure of behavioral prediction from resting-state functional connectivity. Sci Adv. 2022;8(11):eabj1812. Epub 2022/03/17. doi: 10.1126/sciadv.abj1812. PubMed PMID: 35294251; PubMed Central PMCID: PMCPMC8926333.

      (10) Thomas C, Ye FQ, Irfanoglu MO, Modi P, Saleem KS, Leopold DA, et al. Anatomical accuracy of brain connections derived from diffusion MRI tractography is inherently limited. Proc Natl Acad Sci U S A. 2014;111(46):16574-9. Epub 2014/11/05. doi: 10.1073/pnas.1405672111. PubMed PMID: 25368179; PubMed Central PMCID: PMCPMC4246325.

      (11) Reveley C, Seth AK, Pierpaoli C, Silva AC, Yu D, Saunders RC, et al. Superficial white matter fiber systems impede detection of long-range cortical connections in diffusion MR tractography. Proc Natl Acad Sci U S A. 2015;112(21):E2820-8. Epub 2015/05/13. doi: 10.1073/pnas.1418198112. PubMed PMID: 25964365; PubMed Central PMCID: PMCPMC4450402.

      (12) Gu Z, Jamison KW, Sabuncu MR, Kuceyeski A. Heritability and interindividual variability of regional structure-function coupling. Nat Commun. 2021;12(1):4894. Epub 2021/08/14. doi: 10.1038/s41467-021-25184-4. PubMed PMID: 34385454; PubMed Central PMCID: PMCPMC8361191.

      (13) Liu ZQ, Vazquez-Rodriguez B, Spreng RN, Bernhardt BC, Betzel RF, Misic B. Time-resolved structure-function coupling in brain networks. Commun Biol. 2022;5(1):532. Epub 2022/06/03. doi: 10.1038/s42003-022-03466-x. PubMed PMID: 35654886; PubMed Central PMCID: PMCPMC9163085.

      (14) Zamani Esfahlani F, Faskowitz J, Slack J, Misic B, Betzel RF. Local structure-function relationships in human brain networks across the lifespan. Nat Commun. 2022;13(1):2053. Epub 2022/04/21. doi: 10.1038/s41467-022-29770-y. PubMed PMID: 35440659; PubMed Central PMCID: PMCPMC9018911.

    1. eLife assessment

      This study addresses an important, understudied question using approaches that link molecular, circuit, and behavioral changes. The novel findings that Netrin-1 and UNC5c can guide dopaminergic innervation from the nucleus accumbens to the cortex during adolescence are solid. The data showing that the onset of Unc5 expression is sexually dimorphic in mice, and that in Siberian hamsters environmental effects on development are also sexually dimorphic are also solid. Reviewers identified some gaps in evidence for specificity of Netrin-1 expression, which, if filled, would strengthen the evidence for some of the claims. Future work would also benefit from Unc5C knockdown to corroborate the results and investigation of the cause-effect relationship. This paper will be of interest to those interested in neural development, sex differences, and/or dopamine function.

    1. eLife assessment

      The authors present a valuable computational platform, which aims to automate the workflow for coarse-grained simulations of biomolecules in the framework of the popular MARTINI model. The capability of the platform has been convincingly demonstrated by the application to a large number of proteins as well as macrocycles and polymers. On the other hand, because the developments have largely been based on the MARTINI model, some might argue that the general impact on the multi-scale simulation community is limited, leaving the support for the claimed significance incomplete.

    2. Reviewer #1 (Public Review):


      In this study, the authors provide a new computational platform called Vermouth to automate topology generation, a crucial step that any biomolecular simulation starts with. Given a wide array of chemical structures that need to be simulated, varying qualities of structural models as inputs obtained from various sources, and diverse force fields and molecular dynamics engines employed for simulations, automation of this fundamental step is challenging, especially for complex systems and when there is a need to conduct high-throughput simulations in computer-aided drug design (CADD) applications. To overcome this challenge, the authors develop a programming library composed of components that carry out various types of fundamental functionalities that are commonly encountered in topology generation. These components are intended to be general for any type of molecule and not to depend on any specific force field or MD engine. To demonstrate the applicability of this library, the authors employ those components to re-assemble a pipeline called Martinize2 used in topology generation for simulations with a widely used coarse-grained (CG) model, MARTINI. This pipeline can fully recapitulate the functionality of its original version, Martinize, but exhibits greatly enhanced generality, as confirmed by the ability of the pipeline to faithfully generate topologies for two high-complexity benchmarking sets of proteins.


      The main strength of this work is the use of concepts and algorithms associated with induced subgraph in graph theory to automate several key but non-trivial steps of topology generation such as the identification of monomer residue units (MRU), the repair of input structures with missing atoms, the mapping of topologies between different resolutions, and the generation of parameters needed for describing interactions between MRUs. In addition, the documentation website provided by the authors is very informative, allowing users to get quickly started with Vermouth.
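      The graph-theory notion of an induced subgraph mentioned above can be shown with a short sketch. It is a generic illustration on a toy atom graph with made-up atom labels; it does not use or reproduce the Vermouth API.

```python
# Generic illustration of an induced subgraph; not Vermouth code.
# Atom labels and bonds below are a toy example chosen for readability.


def induced_subgraph(edges, nodes):
    """Keep the chosen nodes and every edge whose two endpoints are both chosen."""
    keep = set(nodes)
    kept_edges = {(a, b) for a, b in edges if a in keep and b in keep}
    return keep, kept_edges


# Toy bond graph: one residue-like unit (N, CA, C, O) bonded to the next unit (N2, CA2).
edges = {("N", "CA"), ("CA", "C"), ("C", "O"), ("C", "N2"), ("N2", "CA2")}

nodes, sub_edges = induced_subgraph(edges, ["N", "CA", "C", "O"])
print(sorted(sub_edges))
```

      The bond from C to N2 is dropped because N2 lies outside the selected node set, which is how selecting one residue-like unit isolates its internal bonds from the rest of the molecule graph.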


      Although the Vermouth library is designed as a general tool for topology generation for molecular simulations, only its applications with MARTINI have been demonstrated in the current study. Thus, the claimed generality of Vermouth remains to be examined. The authors may consider pointing this out in their manuscript.

    3. Reviewer #2 (Public Review):

      This work introduces a Vermouth library framework to enhance software development within the Martini community. Specifically, it presents a Vermouth-powered program, Martinize2, for generating coarse-grained structures and topologies from atomistic structures. In addition to introducing the Vermouth library and the Martinize2 program, this paper illustrates how Martinize2 identifies atoms, maps them to the Martini model, generates topology files, and identifies protonation states or post-translational modifications. Compared with the prior version, the authors provide a new figure to show that Martinize2 can be applied to various molecules, such as proteins, cofactors, and lipids. To demonstrate the general application, Martinize2 was used for converting 73% of 87,084 protein structures from the template library, with failed cases primarily blamed on missing coordinates.

      I was hoping to see some fundamental changes in the resubmitted version. To my disappointment, the manuscript remains largely unchanged (even the typo I pointed out previously was not fixed). I do not doubt that Martinize2 and Vermouth are useful to the Martini community, and this paper will have some impact. The manuscript is very technical and limited to the Martini community. The scientific insight for the general coarse-grained modeling community is unclear. The goal of the work is ambitious (such as high-throughput simulations and whole-cell modeling), but the results show just a validation of Martinize2. This version does not reverse my previous impression that it is incremental. As I pointed out in my previous review (and no response from the authors), all the issues associated with the Martini model are still there, e.g. the need for ENM. In this shape, I feel this manuscript is suitable for a specialized journal in computational biophysics or stays as part of the GitHub repository.

    4. Reviewer #3 (Public Review):

      The manuscript by Kroon et al. describes two algorithms, which when combined achieve high-throughput automation of "martinizing" protein structures with selected protonation states and post-translational modifications. After the revisions provided by the authors, I recommend minor revision.

      The authors have addressed most of my concerns provided previously. Specifically, showcasing the capability of coarse-graining other types of molecules (Figure 7) is a useful addition, especially for the booming field of therapeutic macrocycles.

      My only additional concern is that to justify Martinize2 and Vermouth as a "high-throughput" method, the speed of these tools needs to be addressed in some form in the manuscript as a guideline to users.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):


      In this study, the authors provide a new computational platform called Vermouth to automate topology generation, a crucial step that any biomolecular simulation starts with. Given a wide array of chemical structures that need to be simulated, varying qualities of structural models as inputs obtained from various sources, and diverse force fields and molecular dynamics engines employed for simulations, automation of this fundamental step is challenging, especially for complex systems and when there is a need to conduct high-throughput simulations in computer-aided drug design (CADD) applications. To overcome this challenge, the authors develop a programming library composed of components that carry out various types of fundamental functionalities that are commonly encountered in topology generation. These components are intended to be general for any type of molecule and not to depend on any specific force field or MD engine. To demonstrate the applicability of this library, the authors employ those components to re-assemble a pipeline called Martinize2 used in topology generation for simulations with a widely used coarse-grained (CG) model, MARTINI. This pipeline can fully recapitulate the functionality of its original version, Martinize, but exhibits greatly enhanced generality, as confirmed by the ability of the pipeline to faithfully generate topologies for two high-complexity benchmarking sets of proteins.


      The main strength of this work is the use of concepts and algorithms associated with induced subgraph in graph theory to automate several key but non-trivial steps of topology generation such as the identification of monomer residue units (MRU), the repair of input structures with missing atoms, the mapping of topologies between different resolutions, and the generation of parameters needed for describing interactions between MRUs.


      Although the Vermouth library appears promising as a general tool for topology generation, the current manuscript provides insufficient information and documentation for users to easily apply this library. A more detailed explanation of the various classes that are mentioned, such as Processor, Molecule, Mapping, and ForceField, is still needed, including the inputs, outputs, and associated operations of these classes. Some simple demonstrations of the application of these classes would be of great help to users. The formats of the internal databases used to describe reference structures and force fields may also need to be clarified. This is particularly important when Vermouth needs to be adapted for other AA/CG force fields and other MD engines.

      We thank the reviewer for pointing out the strengths of the presented work and agree that one of the current limitations is the lack of documentation about the library. In the revision, we point more clearly to the documentation page of the Vermouth library, which contains more detailed information on the various processors. The format of the internal databases has also been added to the documentation page. Providing a simple demonstration of applications of these classes is a great suggestion; however, we believe that it is more convenient to provide those as code examples in the documentation or, for instance, as Jupyter notebooks rather than in the paper itself.

      The successful automation of Vermouth relies on reference structures that need to be pre-determined. In the study of 43 small ligands, the reference structures and corresponding mappings to MARTINI-compatible representations for all these ligands have already been defined in the M3 force field and added to the Vermouth library. However, the authors need to comment on the scenario where significantly more ligands need to be considered and other force fields, lacking reference structures and mapping schemes, need to be used as CG representations.

      We acknowledge that vermouth/martinize2 is not capable of automatically generating Martini mappings or parameters on the fly for unknown structures that are not part of the database. However, this capability is not the purpose of the program, which is rather to distribute and manage existing parameters. Unlike atomistic force fields, which frequently have automated topology builders, Martini parameters are usually obtained for a set of specific molecules at a time and benchmarked accordingly. As more parameters are obtained by researchers, they can be added to the vermouth library via the GitHub interface in a controlled manner. This process allows the database to grow, and in our opinion it will quickly expand beyond the currently implemented parameters. Furthermore, the API of Vermouth is set up so that it can easily interface with automated topology builders, which are currently being developed. Hence, in our view, this limitation does not diminish the applicability of vermouth to high-throughput applications with many ligands. The framework exists and works; only more parameters need to be added.

      Reviewer #2 (Public Review):


      This manuscript by Kroon, Grunewald, Marrink and coworkers presents the development of the Vermouth library for coarse-grained assignment and parameterization and an updated version of a Python script, the Martinize2 program, to build Martini coarse-grained (CG) models, primarily for protein systems.


      In contrast to many mature and widely used tools to build all-atom (AA) models, there are few well-accepted programs for CG model constructions and parameterization. The research reported in this manuscript is among the ongoing efforts to build such tools for Martini CG modeling, with a clear goal of high-throughput simulations of complex biomolecular systems and, ultimately, whole-cell simulations. Thus, this manuscript targets a practical problem in computational biophysics. The authors see such an effort to unify operations like CG mapping, parameterization, etc. as a vital step from the software engineering perspective.


      However, the manuscript in this shape is unclear in the scientific novelty and appears incremental upon existing methods and tools. The only "validation" (more like an example application) is to create Martini models with two protein structure sets (I-TASSER and AlphaFold). The success rate in building the models was only 73%, while the significant failure is due to incomplete AA coordinates. This suggests a dependence on the input AA models, which makes the results less attractive for high-throughput applications (for example, preparation/creation of the AA models can become the bottleneck). There seems to be an improvement in considering the protonation state and chemical modification, but convincing validation is still needed. Besides, limitations in the existing Martini models remain (like the restricted dynamics due to the elastic network, the electrostatic interactions or polarizability).

      We thank the reviewer for pointing out the strengths of the presented work, but respectfully disagree with the criticism that the presented work is only incremental upon existing methods and tools. All MD simulations of structured proteins regardless of the force field or resolution rely on a decent initial structure to produce valid results. Therefore, failure upon detection of malformed protein input structures is an essential feature for any high-throughput pipeline working with proteins, especially considering the computational cost of MD simulations. We note that programs such as the first version of Martinize generate reasonable-looking input parameters that lead to unphysical simulations and wasted CPU hours.

      The AlphaFold database, for which we surveyed 200,000 structures, contained only 7 problematic structures, which means that the success rate was above 99% for this database. This example simply shows that users may have to add a step for fixing atomistic protein input structures if they seek to run a high-throughput pipeline. At least they can be assured that martinize2 will check that no issues persist.

      Furthermore, we note that the manuscript does not aim to validate or improve the existing Martini (protein) models. All example cases presented in the paper are subject to the limitations of the protein models for the reason that martinize2 is only the program to generate those parameters. Future improvements in the protein model, which are currently underway, will immediately be available through the program to the broader community.  

      Reviewer #3 (Public Review):


      The manuscript by Kroon et al. describes two algorithms, which when combined achieve high-throughput automation of "martinizing" protein structures with selected protonation states and post-translational modifications.


      A large scale protein simulation was attempted, showing strong evidence that authors' algorithms work smoothly.

      The authors described the algorithms in detail and shared the open-source code under the Apache 2.0 license on GitHub. This allows both reproducibility and extended usefulness within the field. These algorithms are potentially impactful if the authors can address some of the issues listed below.

      We thank the reviewer for pointing out the strengths.  


      One major caveat of the manuscript is that the authors claim their algorithms aim to "process any type of molecule or polymer, be it linear, cyclic, branched, or dendrimeric, and mixtures thereof" and "enable researchers to prepare simulation input files for arbitrary (bio)polymers". However, the examples provided by the manuscript only support one type of biopolymer, i.e. proteins. Despite the authors' recommendation of using polyply along with martinize2/vermouth, no concrete evidence has been provided to support the authors' claim. Therefore, the manuscript must be modified to either remove these claims or include new evidence.

      We acknowledge that the current manuscript is largely protein-centric. To some extent this results from the legacy of martinize version 1, which was also only used for proteins. However, to show that martinize2 also works for cyclic as well as branched molecules, we implemented two additional test cases and updated what was formerly Figure 6 and is now Figure 7. Crown ether is used as an example of a cyclic molecule, whereas a small branched polyethylene molecule is a test case for branching. Needless to say, neither molecule is a protein or any other biomolecule.

      The method descriptions of Martinize2 and the graph algorithms in the SI should be core content of the manuscript. I argue that Figure S1 and Figure S2 are more important than Figure 3 (protonation state). I recommend that the authors make a workflow chart combining Figures S1 and S2 to explain Martinize2 and the graph algorithms in the main text.

      The reviewer's critique is fair. Given the already rather large manuscript, we tried to strike a balance between describing benchmark test cases, some practical usage information (e.g. the histidine modification), and the algorithmic library side of the program. In particular, we chose to add the figure on protonation state because how to deal with protonation states (in particular, histidines) was among the top three issues raised by users on our GitHub page. Due to this large community interest, we consider the figure equally important. However, we moved Figure S1 from the Supporting Information into the manuscript and annotated the already mentioned text with the corresponding panels to illustrate the underlying procedure more clearly.

      In Figure 3 (protonation state), the figure itself and the captions are ambiguous about whether at the end the residue is simply renamed from HIS to HIP, or if hydrogen is removed from HIP to recover HIS.

      Using either of the two routes yields the same parameters in the end, which are for the protonated Histidine. In the second route, the extra hydrogen on Histidine is detected as an additional atom and therefore a different logic flow is triggered. Atoms are never removed, but only compounded to a base block plus modification atoms. We adjusted the figure caption to point this out more clearly.  
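
      The "base block plus modification atoms" idea in this response can be illustrated with a minimal sketch. It is not the martinize2/vermouth implementation, which operates on molecular graphs; here plain sets of atom names stand in for graphs, and the atom names, the `BASE_BLOCKS` and `MODIFICATIONS` tables, and the `decompose` function are all hypothetical names introduced for illustration.

```python
# Hypothetical base block: a simplified neutral histidine (HE2 tautomer).
# Other hydrogens are left out to keep the example short.
BASE_BLOCKS = {
    "HIS": frozenset(
        {"N", "CA", "C", "O", "CB", "CG", "ND1", "CD2", "CE1", "NE2", "HE2"}
    ),
}

# Hypothetical modification table: (base block, extra atoms) -> label.
MODIFICATIONS = {
    ("HIS", frozenset({"HD1"})): "protonated",
}


def decompose(resname, atom_names):
    """Split a residue into (base block name, modification label or None)."""
    base = BASE_BLOCKS[resname]
    observed = frozenset(atom_names)

    missing = base - observed
    if missing:
        raise ValueError(f"{resname} lacks base atoms: {sorted(missing)}")

    # Extra atoms are never deleted; they are attributed to a modification.
    extra = observed - base
    if not extra:
        return resname, None
    try:
        return resname, MODIFICATIONS[(resname, extra)]
    except KeyError:
        raise ValueError(
            f"unrecognised extra atoms on {resname}: {sorted(extra)}"
        ) from None
```

      In this toy version, a histidine carrying the additional `HD1` hydrogen is reported as the `HIS` base block with a "protonated" modification, which mirrors the second route described above: the extra atom triggers a different branch rather than being removed.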

      In "Incorporating a Ligand small-molecule Database", the authors are calling for a community effort to build a small-molecule database. Some guidance on when the current database/algorithm combination does or does not work will help the community in contributing.

      Any small molecule that is not part of the database will not work. However, martinize2 will quickly identify missing components of the system and alert the user. At that point, users can decide to create their own files, guided by the new documentation pages.

      A speed comparison is needed to compare Martinize2 and Martinize.

      We respectfully disagree that a speed comparison is needed. We already noted in the Discussion of the manuscript that martinize2 is slower, since it performs more checks, is more general, and is not limited to a single protein model.

    1. eLife assessment

      This important study of artificial selection in microbial communities shows that the possibility of selecting a desired fraction of slow and fast-growing types is impacted by their initial fractions. The evidence, which relies on mathematical analysis and simulations of a stochastic model, is convincing. It highlights the tension between selection at the strain and the community level. This study should be of interest to researchers interested in ecology, both theoretical and experimental.

    2. Reviewer #1 (Public Review):


      The authors demonstrate with a simple stochastic model that the initial composition of the community is important in achieving a target frequency during the artificial selection of a community.


      To my knowledge, the intra-collective selection during artificial selection has not been seriously theoretically considered. However, in many cases, the species dynamics during the incubation of each selection cycle are important and relevant to the outcome of the artificial selection experiment. Stochasticity from birth and death (demographic stochasticity) plays a big role in these species' abundance dynamics. This work uses a simple framework to tackle this idea meticulously.

      This work may or may not be related to hysteresis (path dependency). If this is true, maybe it would be nice to have a discussion paragraph talking about how this may be the case. Then, this work would even attract the interest of people studying dynamic systems.


      (1) Connecting structure and function

      In the typical artificial selection literature, most studies select communities based on collective function. Here in this paper, the authors are selecting a target composition. Although there is a schematic cartoon illustrating the relationship between collective function (y-axis) and the community composition in the main Figure 1, there is no explicit explanation or justification of what may be the origin of this relationship. I think giving the readers a naïve idea about how this structure-function relationship arises in the introduction section would help. This is because the conclusion of this paper is that the intra-collective selection makes it hard to artificially select a community that has an intermediate frequency of f (or s). If there is really evidence or theoretical derivation from this framework that indeed the highest function comes from the intermediate frequency of f, then the impact of this paper would increase because the conclusions of this stochastic model could allude to the reasons for the prevalent failures of artificial selection in literature.

      (2) Explain intra-collective and inter-collective selection better for readers.

      The abstract, the introduction, and the results section use the terms intra-collective and inter-collective selection without much explanation. A clear definition at the beginning would help the audience grasp the importance of this paper, because these concepts are at the core of this work.

      (3) Achievable target frequency strongly depending on the degree of demographic stochasticity.

      I would expect that the experimentalists would find these results interesting and would want to consider these results during their artificial selection experiments. The main Figure 4 indicates that the Newborn size N0 is a very important factor to consider during the artificial selection experiment. This would be equivalent to how much bottleneck is imposed on the artificial selection process in every iteration step (i.e., the ratio of serial dilution experiment). However, with a low population size, all target frequencies can be achieved, and therefore in these regimes, the initial frequency now does not matter much. It would be great for the authors to provide what the N0 parameter actually means during the artificial selection experiments. Maybe relative to some other parameter in the model. I know this could be very hard. But without this, the main result of this paper (initial frequency matters) cannot be taken advantage of by the experimentalists.

      (4) Consideration of environmental stochasticity.

      The success (gold area of Figure 2d) in this framework mainly depends on the size of the demographic stochasticity (birth-only model) during the intra-collective selection. However, during experiments, a lot of environmental stochasticity appears to be occurring during artificial selection. This may be out of the scope of this study. But it would definitely be exciting to see how much environmental stochasticity relative to the demographic stochasticity (variation in the Gaussian distribution of F and S) matters in succeeding in achieving the target composition from artificial selection.

      (5) Assumption about mutation rates

      If setting the mutation rates to zero does not change the result of the simulations and the conclusion, what is the purpose of having the mutation rates \mu? Also, is the unidirectional (S -> F -> FF) mutation realistic? I didn't quite understand how the mutations could fit into the story of this paper.

      (6) Minor points

      In Figure 3b, it is not clear to me how the frequency difference for the Intra-collective and the Inter-collective selection is computed.

      In Figure 5b, the gold region (success) near the FF is not visible. Maybe increase the size of the figure or have an inset for zoom-in. Why is the region not as big as the bottom gold region?

    3. Reviewer #2 (Public Review):

      The authors provide an analytical framework to model the artificial selection of the composition of communities comprised of strains growing at different rates. Their approach takes into account the competition between the targeted selection at the level of the meta-community and the selection that automatically favors fast-growing cells within each replicate community. Their main finding is a tipping point or path-dependence effect, whereby compositions dominated by slow-growing types can only be reached by community-level selection if the community does not start and never crosses into a range of compositions dominated by fast growers during the dynamics.

      These results seem to us both technically correct and interesting. We commend the authors on their efforts to make their work reproducible even when it comes to calculations via extensive appendices, though perhaps a table of contents and a short description of these appendices at the start of SI would help navigate them.

      The main limitation in the current form of the article is that it could clarify how its assumptions and findings differ from and improve upon the rest of the literature:

      - Many studies discuss the interplay between community-level evolution and species- or strain-level evolution. But "evolution" can be a mix of various forces, including selection, drift/randomness, and mutation/innovation.

      - This work's specificity is that it focuses strictly on constant community-level selection versus constant strain-level selection, all other forces being negligible (neither stochasticity nor innovation/mutation matter at either level, as we try to clarify now).

      - Regarding constant community-level selection, it is only briefly noted that "once a target frequency is achieved, inter-collective selection is always required to maintain that frequency due to the fitness difference between the two types" [pg. 3 §2]. In other words, action from the selector is required indefinitely to maintain the community in the desired state. This assumption is found in a fraction of the literature, but is still worth clarifying from the start as it can inform the practical applicability of the results.

      - More importantly, strain-level evolution also boils down here to pure selection with a constant target, which is less usual in the relevant literature. Here, (1) drift from limited population sizes is very small, with no meaningful counterbalancing of selection, (2) pure exponential regime with constant fitness, no interactions, no density- or frequency-dependence, (3) there is no innovation in the sense that available types are unchanging through time (no evolution of traits such as growth rate or interactions) and (4) all the results presented seem unchanged when mutation rate mu = 0 (as noted in Appendix III), meaning that the conclusions are not "about" mutation in any meaningful way.

      - Furthermore, the choice of mutation mechanism is peculiar, as it happens only from slow to fast grower: more commonly, one assumes random non-directional mutations, rather than purely directional ones from less fit to fitter (which is more of a "Lamarckian" idea). Given that mutation does not seem to matter here, this choice might create unnecessary opposition from some readers or could be considered as just one possibility among others.
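
      The "pure exponential regime with constant fitness" in point (2) can be made concrete with a short worked derivation. The symbols are assumptions introduced here for illustration and are not taken from the manuscript: $r$ is the growth rate of the slow type, $r+\omega$ that of the fast type, $F_0$ and $S_0$ are the initial abundances, and $f_0 = F_0/(F_0+S_0)$ is the initial fraction of fast growers. Mutation and demographic noise are ignored.

```latex
\begin{align}
F(t) &= F_0\, e^{(r+\omega)t}, \qquad S(t) = S_0\, e^{rt}, \\
f(t) &= \frac{F(t)}{F(t)+S(t)}
      = \frac{f_0\, e^{\omega t}}{1 - f_0 + f_0\, e^{\omega t}}, \\
\frac{\mathrm{d}f}{\mathrm{d}t} &= \omega\, f\,(1-f).
\end{align}
```

      For any $\omega > 0$ and $0 < f_0 < 1$, $f(t)$ increases monotonically toward 1. In this deterministic limit, strain-level selection alone always pushes the composition toward fast growers, which is why community-level selection must act in every cycle to hold a lower target.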

      It would be helpful to have all these points stated clearly so that it becomes easy to see where this article stands in an abundant literature and contributes to our understanding of multi-level evolution, and why it may have different conclusions or focus than others tackling very similar questions.

      Finally, a microbial context is given to the study, but the assumptions and results are in no way truly tied to that context, so it should be clear that this is just for flavor.

    4. Reviewer #3 (Public Review):

      The authors address the process of community evolution under collective-level selection for a prescribed community composition. They mostly consider communities composed of two types that reproduce at different rates, and that can mutate one into the other. Due to such differences in 'fitness' and to the absence of density dependence, within-collective selection is expected to always favour the fastest grower, but the collective-level selection can oppose this tendency, to a certain extent at least. By approximating the stochastic within-generation dynamics and solving it analytically, the authors show that not only high frequencies of fast growers can be reproducibly achieved, aligned with their fitness advantage. Small target frequencies can also be maintained, provided that the initial proportion of fast growers is sufficiently small. In this regime, similar to the 'stochastic corrector' model, variation upon which selection acts is maintained by a combination of demographic stochasticity and of sampling at reproduction. These two regions of achievable target compositions are separated by a gap, encompassing intermediate frequencies that are only achievable when the bottleneck size is small enough or the number of communities is (disproportionately) larger.
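
      The selection cycle summarised in this paragraph can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' model or code: growth is a birth-only (Yule) process, mutation is omitted, the adult whose fast-grower fraction is closest to the target seeds all newborns by binomial sampling, and every name and parameter value (`n0`, `g`, `r`, `omega`, `tau`) is a hypothetical choice.

```python
import math
import random


def yule_offspring(n_cells, rate, duration, rng):
    """Population descended from n_cells under a birth-only (Yule) process.

    Each founder leaves a geometrically distributed number of cells
    with success probability exp(-rate * duration).
    """
    p = math.exp(-rate * duration)
    log_q = math.log(1.0 - p)
    total = 0
    for _ in range(n_cells):
        u = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
        total += 1 + int(math.log(u) / log_q)
    return total


def sample_newborn(fraction_fast, n0, rng):
    """Number of fast cells among n0 cells drawn at the given fraction."""
    return sum(rng.random() < fraction_fast for _ in range(n0))


def run_selection(f0, target, n0=100, g=10, r=0.5, omega=0.03,
                  tau=4.8, cycles=50, seed=0):
    """Return the selected adult's fast-grower fraction for each cycle."""
    rng = random.Random(seed)
    newborns = [sample_newborn(f0, n0, rng) for _ in range(g)]
    history = []
    for _ in range(cycles):
        adults = []
        for n_fast in newborns:
            fast = yule_offspring(n_fast, r + omega, tau, rng)
            slow = yule_offspring(n0 - n_fast, r, tau, rng)
            adults.append(fast / (fast + slow))
        # Inter-collective selection: keep the adult closest to the target.
        best = min(adults, key=lambda f: abs(f - target))
        history.append(best)
        # Reproduction: the chosen adult seeds every newborn collective.
        newborns = [sample_newborn(best, n0, rng) for _ in range(g)]
    return history
```

      Comparing, for example, `run_selection(f0=0.1, target=0.2)` with `run_selection(f0=0.8, target=0.2)`, or varying `n0` and `g`, gives a quick way to explore how the starting composition and the bottleneck size affect whether a target is held in this toy setting.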

      A similar conclusion, that stochastic fluctuations can maintain the system over evolutionary time far from the prevalence of the faster-growing type, is then confirmed by analyzing a three-species community, suggesting that the qualitative conclusions of this study are generalizable to more complex communities.

      I expect that these results will be of broad interest to the community of researchers who strive to improve community-level selection, but are often limited to numerical explorations, with prohibitive costs for a full characterization of the parameter space of such embedded populations. The realization that not all target collective functions can be as easily achieved and that they should be adapted to the initial conditions and the selection protocol is also a sobering message for designing concrete applications.

      A major strength of this work is that the qualitative behaviour of the system is captured by an analytically solvable approximation so that the extent of the 'forbidden region' can be directly and generically related to the parameters of the selection protocol.

      I however found the description of the results too succinct and I think that more could be done to unpack the mathematical results in a way that is understandable to a broader audience. Moreover, the phenomenon the authors characterize is of purely ecological nature. Here, mutations of the growth rate are, in my understanding, neither necessary (non-trivial equilibria can be maintained also when \mu =0) nor sufficient (community-level selection is necessary to keep the system far from the absorbing state) for the phenomenon described. Calling this dynamics community evolution reflects a widespread ambiguity, and is not ascribable just to this work. I find that here the authors have the opportunity to make their message clearer by focusing on the case where the 'mutation' rate \mu vanishes (Equations 39 & 40 of the SI) - which is more easily interpretable, at least in some limits - while they may leave the more general equations 3 & 4 in the SI. Combined with an analysis of the deterministic equations, that capture the possibility of maintaining high frequencies of fast growers, the authors could elucidate the dynamics that are induced by the presence of a second level of selection, and speculate on what would be the result of real open-ended evolution (not encompassed by the simple 'switch mutations' generally considered in evolutionary game theory), for instance discussing the invasibility (or not) of mutant types with slightly different growth rates.

      The single most important model hypothesis that I would have liked to be discussed further is that the two types do not interact. Species interactions are not only essential to achieve inheritance of composition in the course of evolution but are generally expected to play a key role even on ecological time scales. I hope the authors plan to look at this in future work.

    1. eLife assessment

      This important study implicates Semaphorin 4a in both mice and humans as a key suppressor of psoriatic inflammation. The data are in parts incomplete in defining the precise functionally relevant cellular source and mechanism. Nonetheless, this study brings new insight into psoriasis pathogenesis and a potential new therapeutic target.

    2. Reviewer #1 (Public Review):


      In this study, Kume et al examined the role of the protein Semaphorin 4a in steady-state skin homeostasis and how this relates to skin changes seen in human psoriasis and imiquimod-induced psoriasis-like disease in mice. The authors found that human psoriatic skin has reduced expression of Sema4a in the epidermis. While Sema4a has been shown to drive inflammatory activation in different immune populations, this finding suggested Sema4a might be important for negatively regulating Th17 inflammation in the skin. The authors go on to show that Sema4a knockout mice have skin changes in key keratinocyte genes, increased gdT cells, and increased IL-17 similar to differences seen in non-lesional psoriatic skin, and that bone marrow chimera mice with WT immune cells and Sema4a KO stromal cells develop worse IMQ-induced psoriasis-like disease, further linking expression of Sema4a in the skin to maintaining skin homeostasis. The authors next studied downstream pathways that might mediate the homeostatic effects of Sema4a, focusing on mTOR given its known role in keratinocyte function. As with the immune phenotypes, Sema4a KO mice had increased mTOR activation in the epidermis in a similar pattern to mTOR activation noted in non-lesional psoriatic skin. The authors next targeted the mTOR pathway and showed rapamycin could reverse some of the psoriasis-like skin changes in Sema4a KO mice, confirming the role of increased mTOR in contributing to the observed skin phenotype.


      The most interesting finding is the tissue-specific role of Sema4a. Whereas it has previously been considered to play a mostly pro-inflammatory role in immune cells, this study shows that, when expressed by keratinocytes, Sema4a plays a homeostatic role whose loss leads to the development of psoriasis-like skin changes. This has important implications in terms of targeting Sema4a pharmacologically. It also may yield a novel mouse model to study mechanisms of psoriasis development in mice separate from the commonly used IMQ model. The included experiments are well-controlled and executed rigorously.


      A weakness of the study is the lack of tissue-specific Sema4a knockout mice (e.g. in keratinocytes only). The authors did use bone marrow chimeras, but only in one experiment. This work implies that psoriasis may represent a Sema4a-deficient state in the epidermal cells, while the same might not be true for immune cells. Indeed, in their analysis of non-lesional psoriasis skin, Sema4a was not significantly decreased compared to control skin, possibly due to compensatory increased Sema4a from other cell types. Unbiased RNA-seq of Sema4a KO mouse skin for comparison to non-lesional skin might identify other similarities besides mTOR signaling. Indeed, targeting mTOR with rapamycin reverses some of the skin changes in Sema4a KO mice, but not skin thickness, so other pathways impacted by Sema4a may be better targets if they could be identified. Utilizing WT→KO chimeras in addition to global KO mice in the experiments in Figures 6-8 would more strongly implicate the separate role of Sema4a in skin vs immune cell populations and might more closely mimic non-lesional psoriasis skin.

    3. Reviewer #2 (Public Review):


      Kume et al. found for the first time that Semaphorin 4A (Sema4A) was downregulated at both the mRNA and protein levels in lesional (L) and non-lesional (NL) keratinocytes of psoriasis patients compared to control keratinocytes. They also found that Sema4A is not only expressed in keratinocytes but is also upregulated in hematopoietic cells, such as lymphocytes and monocytes, in the peripheral blood of psoriasis patients. They investigated how the down-regulation of Sema4A expression in psoriatic epidermal cells affects the immunological inflammation of psoriasis by using a psoriasis mouse model in which Sema4A KO mice were treated with IMQ. Kume et al. hypothesized that down-regulation of Sema4A expression in keratinocytes might be responsible for the augmentation of psoriasis inflammation. Using bone marrow chimeric mice, Kume et al. showed that KO of Sema4A in non-hematopoietic cells was responsible for the enhanced inflammation in psoriasis. The expression of CCL20, TNF, IL-17, and mTOR was upregulated in the Sema4A KO epidermis compared to the WT epidermis, and the infiltration of IL-17-producing T cells was also enhanced.


      Decreased Sema4A expression may be involved in psoriasis exacerbation through epidermal proliferation and enhanced infiltration of Th17 cells, which helps understand psoriasis immunopathogenesis.


      The mechanism by which decreased Sema4A expression may exacerbate psoriasis is unclear as yet.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):


      This study investigated the role of CD47 and TSP1 in extramedullary erythropoiesis by utilization of both global CD47-/- mice and TSP1-/- mice. 


      Flow cytometry combined with spleen bulk and single-cell transcriptomics was employed. The authors found that stress-induced erythropoiesis markers were increased in CD47-/- spleen cells, particularly genes that are required for terminal erythroid differentiation. Moreover, a CD47-dependent erythroid precursor population was identified by spleen scRNA sequencing. In contrast, the same cells were not detected in TSP1-/- spleen. These findings provide strong evidence to support the conclusion that CD47 and TSP1 play differential roles in extramedullary erythropoiesis in the mouse spleen.


      Methods and data analysis are appropriate. However, some clarifications are required. The discussion section needs to be expanded.  

      (1) The sex of mice that were used in the study is unknown.  

      (2) In the Single-cell RNA sequencing methods (page 10), it is mentioned that single-cell suspensions from mouse spleens were depleted of all mature hematopoietic cell lineages by passing through CD8a microbeads and a CD8a+ T cell isolation kit. As described, it is unclear which cell types were obtained for scRNAseq. More information is required for clarity.

      (3) The constitutive CD47 knockout mouse model is utilized in this study. The observed accumulation of erythroid precursors in the spleens of CD47-/- mice suggests a chronic effect of CD47 on spleen function. Can the current findings be extrapolated to acute scenarios involving CD47 knockdown or loss, as this may have more direct relevance to the potential side effects associated with anti-CD47-mediated cancer therapy? Please expand on this topic in the discussion section.

      (1) The missing mouse gender information is incorporated into the revised manuscript. For flow cytometry, two male and two female mice of each genotype were used. For single cell RNA sequencing, two female and one male mouse of each genotype were used. For the bulk RNA sequencing four male cd47−/− mice and four male wildtype mice were used.

      (2) We apologize for the confusing presentation, which has been corrected. The bulk RNA sequencing analysis identified elevated expression of erythropoietic genes in CD8+ spleen cells from cd47−/− versus wildtype mice that were obtained using magnetic bead depletion of all other lineages. Therefore, we used the same Miltenyi negative selection kit as the first step to prepare the cells for single cell RNA sequencing. These untouched cells were then depleted of most mature CD8 T cells using a Miltenyi CD8a(Ly2) antibody positive selection kit. An important consideration underlying this approach was recognizing that the commercial magnetic bead depletion kits used for preparing specific immune cell types are optimized to give relatively pure populations of the intended immune cells using wildtype mice. Our previous experience studying NK cell development in the cd47−/− mice taught us that NK precursors, which are rare in wildtype mouse spleens, accumulate in cd47−/− spleens and were not removed by the antibody cocktail optimized for wildtype spleen cells (Nath et al Front Immunol 2018). The present data indicate that erythroid precursors behave similarly.

      (3) The Discussion was edited as recommended. Anemia is a prevalent side effect of several CD47 therapeutic antibodies being developed for cancer therapy. This anemia would be expected to induce erythropoiesis in bone marrow and possibly at extramedullary sites. Human spleen cells are not accessible to directly evaluate extramedullary erythropoiesis in cancer patients, but analysis of circulating erythroid precursors or liquid biopsy methods could be useful to detect induction of extramedullary erythropoiesis by these therapeutics. We are currently investigating the ability of CD47 antibodies to directly induce erythropoiesis using a human in vitro model.

      Reviewer #2 (Public Review):


      The authors used existing mouse models to compare the effects of ablating the CD47 receptor and its signaling ligand Thrombospondin. The CD47-KO model used in this study was generated by Kim et al, 2018, where hemolytic anemia and splenomegaly were reported. This study analyzes the cell composition of the spleens from CD47-KO and Thsp-KO, focusing on early hematopoietic and erythroid populations. The data broadly show that splenomegaly in the CD47-KO is largely due to an increase in committed erythroid progenitors as seen by Flow Cytometry and single-cell sequencing, whereas the Thsp-KO shows a slight depletion of committed erythroid progenitors but is otherwise similar to WT in splenic cell composition.


      The techniques used are appropriate for the study and the data support the main conclusions of the study. This study provides novel insights into a putative role of Thsp-CD47 signaling in triggering definitive erythropoiesis in the mouse spleen in response to anemic stress and constitutes a good resource for researchers seeking to understand extramedullary erythropoiesis.  


      The Flow cytometry data alone support the authors' main conclusions, and single-cell sequencing confirms them but does not add information beyond what was already observed in the Flow data. The single-cell sequencing analysis and presentation could be improved by using alternate clustering methods as well as separating the data by genotype and displaying them in order for readers to fully grasp the nuanced differences in marker expression between the genotypes. Further, it is not clear from the authors' description of their results whether the increased splenic erythropoiesis is a direct consequence of CD47-KO or a response to the anemic stress in this mouse model. The enrichment of cKit+ Ter119+ Sca1- cells in CD47-KO indicates that these are likely stress erythroid progenitors. Another CD47-KO mouse model (Lindberg et al 1996) has no reported erythroid defects and was also not examined in this study.

      (1) The reviewer asked, “whether the increased splenic erythropoiesis is a direct consequence of CD47-KO or a response to the anemic stress in this mouse model.” Our data support both a direct role for CD47 and an indirect role resulting from the response to anemic stress. We cited our previous publications describing increased Sox2+ stem cells in spleens of Cd47 and Thbs1 knockout mice, but we neglected to emphasize another study where we found that bone marrow from cd47−/− mice subjected to the stress of ionizing radiation exhibited more colony-forming unit-erythroid (CFU-E) and burst-forming unit-erythroid (BFU-E) progenitors compared to bone marrow from irradiated wildtype mice (Maxhimer Sci Transl Med 2009). Taken together, our published data demonstrate that loss of CD47 results in an intrinsic protection of hematopoietic stem cells from genotoxic stress. This function of CD47 is thrombospondin-1-dependent and is consistent with the up-regulation of early erythroid precursors in the spleens of both knockout mice but cannot explain why the Thbs1−/− mice have fewer committed erythroid precursors than wildtype. We cited studies that documented increased red cell turnover in cd47−/− mice but less red cell turnover in Thbs1−/− mice compared to wildtype mice. Increased red cell clearance in cd47−/− mice is mediated by loss of the “don’t eat me” function of CD47 on red cells. In wildtype mice, clearance is augmented by thrombospondin-1 binding to the clustered CD47 on aging red cells (Wang, Aging Cell 2020). Thus, anemic stress in the mouse strains studied here decreases in the order cd47−/− > WT > Thbs1−/−. This is consistent with the increased committed erythroid progenitors reported here in cd47−/− spleens and decreased committed progenitors in the Thbs1−/− spleens.

      (2) Based on the reviewer’s question regarding alternative mechanisms and the publication of Yang et al 2022 identifying a role for CD47 in stress erythropoiesis through transfer of mitochondria to erythroblasts, we asked whether cd47-/- erythroid precursors would show decreased mRNA expression for mitochondrial chromosome genes (new Figure 4−figure supplement 3C). Some of these mRNAs were more abundant in cd47-/- and thbs1-/- erythroid cells, which is the opposite of what we expected based on Yang 2022 but consistent with our previous publications identifying thrombospondin-1 and CD47 as negative regulators of mitochondrial homeostasis in muscle cells and T cells.

      (3) The cd47−/− mice used for the current study are the same strain as those reported by Lindberg et al in 1996, with additional backcrossing onto a C57BL/6 background.

      Recommendations For The Authors:

      Reviewer #2 (Recommendations For The Authors):

      Suggestions for improved or additional experiments, data, or analyses.  

      Significant efforts went into analyzing the type of erythroid progenitors by marker expression, but typical Flow cytometry strategies using Ter119 and CD44 combined with forward scatter can be used to stage the committed erythroid progenitors precisely.  

      We appreciate this suggestion to extend the flow data. However, the upcoming retirement of the PI required closing our breeding colony, and the mice are no longer available.  

      How can the difference between the erythroid phenotypes of the Lindberg et al 1996 CD47-KO (exon2 Neo knock-in) and Kim et al 2018 CD47-KO (exon1 26bp indel) be explained?

      We are not convinced that the erythroid phenotypes of the Lindberg and Kim CD47-KO mice differ at the age used in our studies. Kim et al. focused on progressive hemolytic anemia and changes in T cells in spleen that emerge at 26 weeks of age, whereas the mice used here were younger. The Lindberg and Kim mice have similar spleen enlargement at the age we used.

      Another manuscript under review from our lab suggests that cis-regulation of an adjacent colinear gene could contribute to some phenotypes observed when perturbing the Cd47 gene. The Lindberg mouse exhibits minimal perturbation of that adjacent gene, but we have no data regarding the Kim et al mouse. The reviewer’s question brought to our attention that we neglected to state in the Methods that the mice used here are the Lindberg mice, not the Kim mice. This omission is now corrected.

      The authors used the Lindberg mouse for their 2018 study on NK cells and observed splenomegaly. Did they check for extramedullary erythropoiesis there?

      Retrospective examination of the RNAseq data for the spleen cells enriched in NK precursors used in our 2018 publication (Nath, 2018) reveals significantly elevated expression for a majority of the extramedullary erythroid markers listed in Table 1, but they were generally less abundant than observed for the lineage-depleted spleen cells used in the present manuscript.   

      Author response table 1.

      To clarify the stress erythropoiesis issue, it might be helpful to examine the sc-seq data for the expression of specific stress erythropoiesis markers in CD47-KO. Targets of BMP4 and Hedgehog signaling can also be examined. Further colony assays can help determine if stress BFU-Es are prevalent in the CD47-KO spleens and depleted in Thsp-KO  

      As noted in Table 1, twelve of the genes we studied are established markers of stress-induced extramedullary erythropoiesis, and most of these were included in the scRNA seq data presented. Our previous publication demonstrated that bone marrow from cd47−/− mice subjected to the stress of ionizing radiation exhibited more colony forming units for erythroid (CFU-E) and burst-forming unit-erythroid (BFU-E) progenitors compared to bone marrow from irradiated wildtype mice (Maxhimer Sci Transl Med 2009). We have not performed colony formation assays using spleen.

      To address the reviewer’s question regarding BMP4 and hedgehog signaling we performed gene set enrichment analysis for known BMP4 and hedgehog signaling signatures. Using GSE26351_UNSTIM_VS_BMP_PATHWAY_STIM_HEMATOPOIETIC_PROGENITORS, cd47-/- cells in cluster 12 or their CD34+ or CD34- subsets did not show significant enrichment for BMP4 targets compared to WT. Thbs1-/- cells in clusters 12 and 14 showed marginally significant depletion of the BMP4 signature (p=0.04 and p=0.023, respectively). Using the KEGG_HEDGEHOG_SIGNALING_PATHWAY, we did not find any significant enrichment. However, only a few genes in this pathway were detectable in the scRNAseq data. These data suggest that BMP4 signaling may be regulated by thrombospondin-1, but properly testing this hypothesis would require achieving greater sequencing depth combined with a cell isolation method that better enriches the early hematopoietic progenitors that are known to utilize the BMP4 pathway.

      In the reclustering of erythroid progenitors in Figure 5, inclusion of Gata1 as a selection marker may help capture more of the early erythroid progenitors from the dataset and provide a more complete picture of the erythroid populations. 

      We thank the reviewer for suggesting inclusion of Gata1. We repeated the reclustering including Gata1 and found the selected cell count increased from 876 cells to 1007 cells. However, most of the increase was not in the erythroid cluster, which increased from 413 cells to 419 cells. Most of the increase represented Gata1+ T cells (548 cells including Gata1 versus 463 cells without). The revised manuscript presents genotype-dependent differential gene expression based on including Gata1 selection, but none of the specific conclusions were changed from the initial submission. The new Table 4 and Figure 7−figure supplement 1 enabled us to compare differential expression of erythropoietic genes obtained using supervised and unsupervised clustering and show that both methods yield comparable results.

      Just out of curiosity, was there an attempt to make a CD47 Thsp double KO? Is it viable?

      Cd47 KO mice are somewhat difficult breeders, and several previous attempts to cross with other transgenics have produced viable homozygous offspring that could not be propagated.

      Recommendations for improving the writing and presentation.

      Perhaps readers would find it more intriguing if the paper led with the single-cell sequencing showing enrichment of erythroid populations in CD47-KO, and later confirmed with Flow Cytometry (even if this was not necessarily the order in which the experiments were done). 

      We considered this suggestion but believe that some of the flow cytometry data is needed to understand why we focused on CD34+ and CD34- subsets and proliferation markers when analyzing the scRNAseq data.

      The single-cell sequencing data in Figure 3 might benefit from UMAP clustering as well. In addition, it would greatly help readers if the data points were separated by genotype and displayed after clustering. A similar analysis has been done in this paper: doi:10.1038/s41556-022-00898-9 by clustering different conditions together but displaying them separately by condition. 

      We initially explored tSNE and UMAP clustering and obtained similar results. We have added violin plots separated by genotype in Figure 4-figure supplement 2. We also included improved clusters separated by genotype in the revised Figure 3 panels C and D and for the reclustering in Figure 6D. UMAP plots provided better presentation for the reclustering (revised Figure 7). All data have been updated to the latest pipeline as noted in the Methods.

      Minor corrections to the text and figures.  

      Figure 4: Labels and plot legends are illegible in general, please relabel manually and if possible, redo plots with bigger font size and legends (relatively easy using ggplot2) 

      All figure panels were relabeled using larger fonts.

      Figure 5D: Individual plots are stacked randomly atop each other and in many cases, gene names are not visible. Please restack the layers and ensure that the gene names are visible 

      Panel D was made a separate figure with enlarged labels (now Figure 7).

      Supp Fig 2: Layout can be organized a little better. Consider splitting into two figures for better organization  

      The figure was split as recommended. Now Figure 1-figure supplement 2 and Figure 2-figure supplement


      Abstract Line 10: "...mRNA expression of Kit, Ermap, and Tfrc, Induction of committed erythroid precursors is...". Replace comma after "Tfrc" with period   


      Discussion Page 9 Line 8: "...WT spleens, s. mRNAs for some markers of committed erythroid cells including Nr3c1 mRNA...". Remove ", s" after spleens.   


      This study presents a valuable finding on the cell composition in mouse spleen depleted for the CD47 receptor and its signaling ligand Thrombospondin in hematopoietic differentiation. The supporting evidence is convincing with analytical improvements on the individual contributions of the signaling components and with functional studies. This work has implications for the role of CD47/Thsp in extramedullary erythropoiesis in mouse spleen and will be of interest to medical biologists working on cell signaling, transfusion medicine, and cell therapy.

    3. Reviewer #1 (Public Review):


      This study investigated the role of CD47 and TSP1 in extramedullary erythropoiesis by utilization of both global CD47-/- mice and TSP1-/- mice.


      Flow cytometry combined with spleen bulk and single cell transcriptomics were employed. The authors found that stress-induced erythropoiesis markers were increased in CD47-/- spleen cells, particularly genes that are required for terminal erythroid differentiation. Moreover, a CD47-dependent erythroid precursor population was identified by spleen scRNA sequencing. In contrast, the same cells were not detected in TSP1-/- spleen. These findings provide strong evidence to support the conclusion that CD47 and TSP1 play differential roles in extramedullary erythropoiesis in mouse spleen. Furthermore, the relevance of the current finding to the prevalent side effect (anemia) of anti-CD47 mediated cancer therapy has been discussed in the Discussion section.

    4. Reviewer #3 (Public Review):

      The authors used existing mouse models to compare the effects of ablating the CD47 receptor and its signaling ligand Thrombospondin. They analyze the cell composition of the spleens from CD47-KO and Thsp-KO using Flow Cytometry and single cell sequencing and focus mostly on early hematopoietic and erythroid populations. The data broadly show that splenomegaly in the CD47-KO is largely due to an increase in committed erythroid progenitors, whereas the Thsp-KO shows a slight depletion of committed erythroid progenitors but is otherwise similar to WT in splenic cell composition. Thus, both of their datasets support the main conclusions of the study. One caveat of the single-cell dataset is that, insofar as the authors have explored and presented it, a clear picture of the mechanism driving extramedullary erythropoiesis in CD47-KO is lacking. This would be extremely valuable since one of the stated translational implications of this study is to assess and remedy the anemia caused by anti-CD47 therapy used in subtypes of AML. Nevertheless, this study provides novel insights into a putative role of Thsp-CD47 signaling in triggering definitive erythropoiesis in the mouse spleen in response to anemic stress and constitutes a good resource for researchers seeking to understand extramedullary erythropoiesis. This study also has generated data that will enable exploration of the possible adverse effects of using anti-CD47 therapies to treat AML.

      This valuable study describes a new type of NAD+ and Zn2+-independent protein lysine deacetylase in prokaryotes. These results extend the understanding of regulatory mechanisms related to bacterial lysine acetylation modifications; however, the experimental evidence is incomplete and does not fully support the conclusions made. The work will be of interest to microbiologists studying metabolism and post-translational modifications.

    2. Reviewer #1 (Public Review):


      This study by Wang et al. identifies a new type of deacetylase, CobQ, in Aeromonas hydrophila. Notably, the identification of this deacetylase reveals a lack of homology with eukaryotic counterparts, thus underscoring its unique evolutionary trajectory within the bacterial domain.


      The manuscript convincingly illustrates CobQ's deacetylase activity through robust in vitro experiments, establishing its distinctiveness from known prokaryotic deacetylases. Additionally, the authors elucidate CobQ's potential cooperation with other deacetylases in vivo to regulate bacterial cellular processes. Furthermore, the study highlights CobQ's significance in the regulation of acetylation within prokaryotic cells.


      While the manuscript is generally well-structured, some clarification and some minor corrections are needed.

    3. Reviewer #2 (Public Review):

      In recent years, lots of researchers have tried to explore the existence of new acetyltransferases and deacetylases by using specific antibody enrichment technologies and high-resolution mass spectrometry. This study adds to this effort. The authors studied a novel Zn2+- and NAD+-independent KDAC protein, AhCobQ, in Aeromonas hydrophila. They studied the biological function of AhCobQ by using a biochemistry method and used MS identification technology to confirm it. The results extend our understanding of the regulatory mechanism of bacterial lysine acetylation modifications. However, I find their conclusion to be a little speculative, and unfortunately, the evidence does not fully support it. In addition, regarding the figure arrangement, lots of the supplementary figures are not mentioned, and tables are not all placed in context.

      Major concerns:

      -In the opinion of this reviewer, it is a little arbitrary to come to the title "Aeromonas hydrophila CobQ is a new type of NAD+- and Zn2+-independent protein lysine deacetylase in prokaryotes." This should be modified to delete "in prokaryotes", unless the authors obtain new or additional evidence for the existence of AhCobQ in other prokaryotes.

      -I was confused about the arrangement of the supplementary results. There are no citations for Figures S9-S19.

      -No data are included for Tables S1-S6.

      -The loading controls are incomplete. All of the loading controls, with whole PAGE gel or whole membrane western blot results, should be provided. Without these complete results, the authors' conclusions are not convincing.

      -The materials & methods section should be thoroughly reviewed. It is unclear to me what exactly the authors are describing in the method. All the experimental designs and protocols should be described in detail, including growth conditions, assay conditions, purification conditions, etc.

      -Relevant information should be included about the experiments performed in the figure legends, such as experimental conditions, replicates, etc. Often it is not clear what was done based on the figure legend description.

    4. Reviewer #3 (Public Review):


      This study reports on a novel NAD+ and Zn2+-independent protein lysine deacetylase (KDAC) in Aeromonas hydrophila, termed AhCobQ (AHA_1389). This protein is annotated as a CobQ/CobB/MinD/ParA family protein and does not show similarity with known NAD+-dependent or Zn2+-dependent KDACs. The authors show that AhCobQ has NAD+ and Zn2+-independent deacetylase activity with acetylated BSA by western blot and MS analyses. They also provide evidence that the 195-245 aa region of AhCobQ is responsible for the deacetylase activity, which is conserved in some marine prokaryotes and has no similarity with eukaryotic proteins. They identified target proteins of AhCobQ deacetylase by proteomic analysis and verified the deacetylase activity using site-specific acetyllysine-incorporated target proteins. Finally, they show that AhCobQ activates isocitrate dehydrogenase by deacetylation at K388.


      The finding of a new type of KDAC has a valuable impact on the field of protein acetylation. The characteristics (NAD+ and Zn2+-independent deacetylase activity in an unknown domain) shown in this study are very unexpected.


      (1) As the characteristics of AhCobQ are very unexpected, to convince readers, MSMS data would be needed to exactly detect deacetylation at the target site in deacetylase activity assays. The authors show the MSMS data in assays with acetylated BSA, but other assays only rely on western blot.

      (2) They prepared site-specific Kac proteins and used them in deacetylase activity assays. The incorporation of acetyllysine at the target site needs to be confirmed by MSMS and shown as supplementary data.

      (3) The authors imply that the 195-245 aa region of AhCobQ may represent a new domain responsible for deacetylase activity. The feature of the region would be of interest but is not sufficiently described in Figure 5. The amino acid sequence alignments with representative proteins with conserved residues would be informative. It would be also informative if the modeled structure predicted by AlphaFold is shown and the structural similarity with known deacetylases is discussed.

      This paper reports a large drug repurposing screen based on an in vitro culture platform to identify compounds that can kill Plasmodium hypnozoites. This valuable work adds to the current repertoire of anti-hypnozoites agents and uncovers targetable epigenetic pathways to enhance our understanding of this mysterious stage of the Plasmodium life cycle. The data presented here are based on solid methodology and represent a starting point for further investigation of epigenetic inhibitors to treat P. vivax infection. This paper will be of interest to Plasmodium researchers and more broadly to readers in the fields of host-pathogen interactions and drug development.

    2. Reviewer #1 (Public Review):


      Plasmodium vivax can persist in the liver of infected individuals in the form of dormant hypnozoites, which cause malaria relapses and are resistant to most current antimalarial drugs. This highlights the need to develop new drugs active against hypnozoites that could be used for radical cure. Here, the authors capitalize on an in vitro culture system based on primary human hepatocytes infected with P. vivax sporozoites to screen libraries of repurposed molecules and compounds acting on epigenetic pathways. They identified a number of hits, including hydrazinophthalazine analogs. They propose that some of these compounds may act on epigenetic pathways potentially involved in parasite quiescence. To provide some support to this hypothesis, they document DNA methylation of parasite DNA based on 5-methylcytosine immunostaining, mass spectrometry, and bisulfite sequencing.

      Strengths:

      -The drug screen itself represents a huge amount of work and, given the complexity of the experimental model, is a tour de force.

      -The screening was performed in two different laboratories, with a third laboratory being involved in the confirmation of some of the hits, providing strong support that the results were reproducible.

      -The screening of repurposing libraries is highly relevant to accelerate the development of new radical cure strategies.


      -The manuscript is composed of two main parts, the drug screening itself and the description of DNA methylation in Plasmodium pre-erythrocytic stages. Unfortunately, these two parts are loosely connected. First, there is no evidence that the identified hits kill hypnozoites via epigenetic mechanisms. The hit compounds almost all act on schizonts in addition to hypnozoites, therefore it is unlikely that they target quiescence-specific pathways. At least one compound, colforsin, seems to selectively act on hypnozoites, but this observation still requires confirmation. Second, while the description of DNA methylation is per se interesting, its role in quiescence is not directly addressed here. Again, this is clearly not a specific feature of hypnozoites as it is also observed in P. vivax and P. cynomolgi hepatic schizonts and in P. falciparum blood stages. Therefore, the link between DNA methylation and hypnozoite formation is unclear. In addition, DNA methylation in sporozoites may not reflect epigenetic regulation occurring in the subsequent liver stages.

      -The mode of action of the hit compounds remains unknown. In particular, it is not clear whether the drugs act on the parasite or on the host cell. Merely counting host cell nuclei to evaluate the toxicity of the compounds is probably acceptable for the screen but may not be sufficient to rule out an effect on the host cell. A more thorough characterization of the toxicity of the selected hit compounds is required.

      -There is no convincing explanation for the differences observed between P. vivax and P. cynomolgi. The authors question the relevance of the simian model but the discrepancy could also be due to the P. vivax in vitro platform they used.

      -Many experiments were performed only once, not only during the screen (where most compounds were apparently tested in a single well) but also in other experiments. The quality of the data would be increased with more replication.

      -While the extended assay (12 days versus 8 days) represents an improvement of the screen, the relevance of adding inhibitors of core cytochrome activity is less clear, as under these conditions the culture system deviates from physiological conditions.

    3. Reviewer #2 (Public Review):


      In this manuscript, inhibitors of the P. vivax liver stages are identified from the Repurposing, Focused Rescue, and Accelerated Medchem (ReFRAME) library as well as a 773-member collection of epigenetic inhibitors. This study led to the discovery that epigenetics pathway inhibitors are selectively active against P. vivax and P. cynomolgi hypnozoites. Several inhibitors of histone post-translational modifications were found among the hits and genomic DNA methylation mapping revealed the modification on most genes. Experiments were completed to show that the level of methylation upstream of the gene (promoter or first exon) may impact gene expression. With the limited number of small molecules that act against hypnozoites, this work is critically important for future drug leads. Additionally, the authors gleaned biological insights from their molecules to advance the current understanding of essential molecular processes during this elusive parasite stage.

      Strengths:

      -This is a tremendously impactful study that assesses molecules for the ability to inhibit Plasmodium hypnozoites. The comparison of various species is especially relevant for probing biological processes and advancing drug leads.

      -The SI is wonderfully organized and includes relevant data/details. These results will inspire numerous studies beyond the current work.

    4. Reviewer #3 (Public Review):

      Although this work represents a massive screening effort to find new drugs targeting P. vivax hypnozoites, the authors should balance their statement that they identified targetable epigenetic pathways in hypnozoites.

      • They should emphasize the potential role of the host cell in the presentation of the results and the discussion, as it is known that other pathogens modify the epigenome of the host cell (e.g., Toxoplasma, HIV) to prevent cell division. Also, hydrazinophthalazines target multiple pathways (notably modulation of calcium flux) and have been shown to inhibit DNA methyltransferase 1, which is lacking in Plasmodium.

      • In a drug repurposing approach, the parasite target might also be different than the human target.

      • The authors state that host-cell apoptotic pathways are downregulated in P. vivax infected cells (p. 5 line 162). Maybe the HDAC inhibitors and DNA-methyltransferase inhibitors are reactivating these pathways, leading to parasite death, rather than targeting parasites directly.

      It would make the interpretation of the results easier if the authors used EC50 in µM rather than pEC50 in tables and main text. It is easy to calculate when it is a single-digit number but more complicated with multiple digits.
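
      For reference, the conversion requested here is a one-line calculation. The sketch below assumes the usual definition, pEC50 = -log10(EC50 in mol/L), so that EC50 in µM equals 10^(6 - pEC50). The function name and the example values are illustrative and are not taken from the manuscript.

```python
def pec50_to_ec50_um(pec50: float) -> float:
    """Convert pEC50 to EC50 in micromolar.

    Assumes pEC50 = -log10(EC50 expressed in mol/L).
    """
    # 1 mol/L = 1e6 µM, hence the exponent 6 - pEC50.
    return 10 ** (6 - pec50)


# Hypothetical pEC50 values, chosen only to illustrate the conversion.
for pec50 in (5.0, 6.0, 6.52):
    print(f"pEC50 {pec50:.2f} -> EC50 {pec50_to_ec50_um(pec50):.3g} µM")
```

      Reporting both values side by side in the tables would let readers use whichever scale they prefer.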

      Authors mention hypnozoite-specific effects but in most cases, compounds are as potent on hypnozoites as on schizonts. They should rather use "liver stage specific" to refer to increased activity against hypnozoites and schizonts compared to the host cell. The same comment applies to line 351 when referring to MMV019721. Following the same idea, it is a bit far-fetched to call MMV019721 "specific" when the highest concentration tested for cytotoxicity is less than twice the EC50 obtained against hypnozoites and schizonts.

      Page 5 lines 187-189, the authors state "...hydrazinophthalazines were inactive when tested against P. berghei liver schizonts and P. falciparum asexual blood stages, suggesting that hypnozoite quiescence may be biologically distinct from developing schizonts". The data provided in Figure 1B show that these hydrazinophthalazines are as potent in P. vivax schizonts as in P. vivax hypnozoites, so the distinct activity seems to be Plasmodium species specific and/or host-cell specific (primary human hepatocytes rather than cell lines for P. berghei) rather than hypnozoite vs schizont specific.

      Why choose to focus on cadralazine if abandoned due to side effects? Also, why test the pharmacokinetics in monkeys? As it was a marketed drug, were no data available in humans?

      In the counterscreen mentioned on page 6, the authors should mention that the activity of poziotinib in P. berghei and P. cynomolgi is equivalent to cell toxicity, so likely not due to parasite specificity.

      To improve the clarity and flow of the manuscript, could the authors make a recapitulative table/figure for all the data obtained for poziotinib and hydrazinophthalazines in the different assays (8-days vs 12-days) and laboratory settings rather than separate tables in main and supplementary figures? Maybe also reorder the results section, notably moving the 12-day assay before the DNA methylation part.

      The isobologram plot shows an additive effect rather than a synergistic effect between cadralazine and 5-azacytidine, please modify the paragraph title accordingly. Please put the same axis scale for both fractional EC50 in the isobologram graph (Figure 2A).
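
      The distinction between additive and synergistic effects can be made concrete with the sum of fractional EC50 values, which is the quantity an isobologram plots. The sketch below assumes the standard reading: a sum close to 1 lies on the additivity line, a sum clearly below 1 indicates synergy, and a sum clearly above 1 indicates antagonism. The function names, the `tolerance` band, and the example numbers are hypothetical; published cut-offs vary and none of these values come from the manuscript.

```python
def fractional_ec50_sum(ec50_a_combo: float, ec50_a_alone: float,
                        ec50_b_combo: float, ec50_b_alone: float) -> float:
    """Sum of fractional EC50s for drugs A and B at one fixed ratio."""
    return ec50_a_combo / ec50_a_alone + ec50_b_combo / ec50_b_alone


def classify_interaction(fic_sum: float, tolerance: float = 0.25) -> str:
    """Label an interaction from the fractional EC50 sum.

    `tolerance` is an illustrative band around the additivity line (sum = 1),
    not a published threshold.
    """
    if fic_sum < 1 - tolerance:
        return "synergistic"
    if fic_sum > 1 + tolerance:
        return "antagonistic"
    return "additive"


# Hypothetical example: each drug needs half its single-agent EC50 in combination.
total = fractional_ec50_sum(0.5, 1.0, 0.5, 1.0)
print(total, classify_interaction(total))
```

      Points falling along the diagonal of the isobologram correspond to the "additive" label in this sketch.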

      Concerning the immunofluorescence detection of 5mC and 5hmC, the authors should be careful with their conclusions. The Hoechst signal of the parasites is indistinguishable because of the high signal given by the hepatocyte nuclei. The signal obtained with the anti-5hmC in hepatocyte nuclei is higher than with the anti-5mC, thus if a low signal is obtained in hypnozoites and schizonts, it might be difficult to dissociate from the background. In blood stages (Figure S18), the best way to obtain a good signal is to lyse the red blood cell using saponin, before fixation and HCl treatment.

      To conclude that 5mC marks are the predominant DNA methylation mark in both P. falciparum and P. vivax, authors should also mention that they compare different stages of the life cycle, which might have different methylation levels.

      Also, the authors conclude that "[...] 5mC is present at low level in P. vivax and P. cynomolgi sporozoites and could control liver stage development and hypnozoite quiescence". Based on the data shown here, nothing, except the presence of 5mC marks, supports that DNA methylation could be implicated in liver stage development or hypnozoite quiescence.

      How many DNA-methyltransferase inhibitors were present in the epigenetic library? Out of those, none were identified as hits; maybe the hydrazinophthalazine effect is linked not to DNMT inhibition but to another target pathway of these molecules, like calcium transport?

      The authors state (line 344): "These results corroborate our hypothesis that epigenetic pathways regulate hypnozoites". This conclusion should be changed to "[...] that epigenetic pathways are involved in P. vivax liver stage survival" because:

      • The epigenetic inhibitors described here are as active on hypnozoites as on liver schizonts.

      • Again, we cannot rule out that the host cell plays a role in this effect and that the compound may not act directly on the parasite.

      The same comment applies to the quote in lines 394 to 396. There is no proof in the results presented here that DNA methylation plays any role in the anti-plasmodial activity of hydrazinophthalazines observed in the assay.

      The manuscript by Yang and coworkers presents valuable evidence that an in vitro blood-brain barrier model composed of endothelial cells, astrocytes, and neuroblastoma cells of human origin would better resemble the in vivo condition. The presented results constitute solid evidence that GDNF induces the expression of VE-Cadherin and Claudin-5. Further, silencing of GDNF in the brain of mice altered blood-brain barrier properties. This provides a new perspective on the interaction between neurons and endothelial cells and this model can be used to screen the permeability of the blood-brain barrier to different drugs.

    2. Reviewer #1 (Public Review):


      In this manuscript, the authors established an in vitro triple co-culture BBB model and demonstrated its advantages compared with the mono or double co-culture BBB model. Further, the authors used their established in vitro BBB model and combined it with other methodologies to investigate the specific mechanism by which co-culture with not only astrocytes but also neurons enhanced the integrity of endothelial cells.


      The results persuasively showed the established triple co-culture BBB model well mimicked several important characteristics of BBB compared with the mono-culture BBB model, including better barrier function and in vivo/in vitro correlation. The human-derived immortalized cells used made the model construction process faster and more efficient, and have a better in vivo correlation without species differences. This model is expected to be a useful high-throughput evaluation tool in the development of CNS drugs.

      Based on the previous experimental results, detailed studies investigated how co-culture with neurons and astrocytes promoted claudin-5 and VE-cadherin in endothelial cells, and the specific signaling mechanisms were also studied. Interestingly, the authors found that neurons also released GDNF to promote barrier properties of brain endothelial cells, as most current research has focused on the promoting effect of astrocyte-derived GDNF on BBB. Meanwhile, the authors also validated the functions of GDNF for BBB integrity in vivo by silencing GDNF in mouse brains. Overall, the experiments and data presented support their claim that, in addition to astrocytes, neurons also have a promoting effect on the barrier function of endothelial cells through GDNF secretion.


      Although the authors demonstrated a highly usable model for predicting BBB permeability, the recorded TEER measurements are still far from the TEER values reported for the human BBB in vivo, and expression of transporters was not promoted by co-culture, which may make the model unsuitable for studying transporter-mediated drug transport across the BBB.

    3. Reviewer #2 (Public Review):


      Yang and colleagues developed a new in vitro blood-brain barrier model that is relatively simple yet outperforms previous models. By incorporating a neuroblastoma cell line, they demonstrated increased electrical resistance and decreased permeability to small molecules.


      The authors initially elucidated the soluble mediator responsible for enhancing endothelial functionality, namely GDNF. Subsequently, they elucidated the mechanisms by which GDNF upregulates the expression of VE-cadherin and Claudin-5. They further validated these findings in vivo, and demonstrated predictive value for molecular permeability as well. The study is meticulously conducted and easily comprehensible. The conclusions are firmly supported by the data, and the objectives are successfully achieved. This research is poised to advance future investigations in BBB permeability, leakage, dysfunction, disease modeling, and drug delivery, particularly in high-throughput experiments. I anticipate an enthusiastic reception from the community interested in this area. While other studies have produced similar results with tri-cultures (PMID: 25630899), this study notably enhances electrical resistance compared to previous attempts.


      Considerable effort has been directed towards developing in vitro models that more closely resemble their in vivo counterparts, utilizing stem cell-derived NVU cells. Although these examples are currently rudimentary, they offer better BBB mimicry than Yang's study.

      Additionally, some instances might benefit from more robust statistical tests; nonetheless, I do not think this would significantly alter the experimental conclusions.

      Similar experiments with tri-cultures yielding analogous results have been reported by other authors (PMID: 25630899). TEER values are a bit higher than the aforementioned experiments; however, this study has values at least one order of magnitude lower than physiological levels.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife assessment

      This study presents a valuable finding on the mechanism to promote distant metastasis in breast cancer. The evidence supporting the claims of the authors is convincing. The work will be of interest to medical biologists working on breast cancer.

      Public Reviews:

      Reviewer #1 (Public Review):


      The paper has shown the expression of RGS10 is related to the molecular subtype, distant metastasis, and survival status of breast cancer. The study utilizes bioinformatic analyses, human tissue samples, and in vitro and in vivo experiments which strengthen the data. RGS10 was validated to inhibit EMT through a novel mechanism dependent on LCN2 and miR-539-5p, thereby reducing cancer cell proliferation, colony formation, invasion, and migration. The study elaborated the function of RGS10 in influencing the prognosis and biological behavior which could be considered as a potential drug target in breast cancer.


      The mechanism by which the miR-539-5p/RGS10/LCN2 axis may be related to the prognosis of cancer patients still needs to be elucidated. In addition, the sample size used is relatively limited. In particular, further exploration of the LCN2-related pathways and mechanisms using organoid models, together with verification of the potential of RGS10 as a biomarker and therapeutic target for clinical translation, would make the data more convincing.

      Answer: Thank you for your comments and suggestions. In future research, we will use large clinical cohorts and organoid models to further explore the relevant mechanisms.

      Reviewer #2 (Public Review):

      Liu et al., by focusing on regulator of G protein signaling 10 (RGS10), reported that RGS10 expression was significantly lower in patients with breast cancer, compared with normal adjacent tissue. Genetic inhibition of RGS10 caused epithelial-mesenchymal transition and enhanced cell proliferation, migration, and invasion. These results suggest an inhibitory role of RGS10 in tumor metastasis. Furthermore, bioinformatic analyses determined signaling cascades for RGS10-mediated breast cancer distant metastasis. More importantly, both in vitro and in vivo studies evidenced that alteration of RGS10 expression by modulating its upstream regulator miR-539-5p affects breast cancer metastasis. Altogether, these findings provide insight into the pathogenesis of breast tumors and hence identify potential therapeutic targets in breast cancer.

      The conclusions of this study are mostly well supported by data. However, there is a weakness in the study that needs to be clarified.

      In Figure 2A, although some references support that SKBR3 and MCF-7 are poorly aggressive and less invasive, it cannot be concluded from RGS10 expression in those cells alone that 'RGS10 acts as a tumor suppressor in breast cancer'. It would be better to introduce a horizontal comparison of the invasive ability of these 3 types of cells using an invasion assay.

      Answer: Thank you for your comments and suggestions. MDA-MB-231, SKBR3, and MCF-7 originate from triple-negative breast cancer (high invasiveness), Her-2-overexpressing breast cancer (relatively weak invasiveness), and luminal-type breast cancer (relatively weak invasiveness), respectively. Previous studies have demonstrated the invasive ability of these 3 types of cells (PMID: 34390568).

      Reviewer #3 (Public Review):

      Distant metastasis is the major cause of death in patients with breast cancer. In this manuscript, Liu et al. show that RGS10 deficiency elicits distant metastasis via epithelial-mesenchymal transition in breast cancer. As a prognostic indicator of breast cancer, RGS10 regulates the progress of breast cancer and affects tumor phenotypes such as epithelial-mesenchymal transformation, invasion, and migration. The conclusions of this paper are mostly well supported by data, but some analyses need to be clarified.

      (1) Because diverse biomarkers have been identified for EMT, it is recommended to declare the advantages of using RGS10 as an EMT marker.

      Answer: Thank you for your comments. The dysregulation of RGS protein expression has been observed to be associated with various types of cancer (PMID: 26293348). Previous studies have shown that RGS10 knockdown can lead to resistance of ovarian cancer cells to paclitaxel, cisplatin, and vincristine. In colorectal tumors, the transcription of RGS10 is regulated by DNA methylation and histone deacetylation. As a key regulatory factor in the G protein signaling pathway, RGS10 is involved in processes relevant to tumor development, including survival, polarization, adhesion, chemotaxis, and differentiation. These findings suggest that RGS10 might be a marker for EMT in breast cancer.

      (2) The authors utilized databases to study the upstream regulatory mechanisms of RGS10. It is recommended to clarify why the authors focused on miRNAs rather than other epigenetic modifications.

      Answer: Thank you for your comments. miRNAs are short non-coding RNA molecules that bind to the 3' untranslated region (3' UTR) of the target mRNA to cause mRNA degradation or translation inhibition, thus regulating gene expression in cells. These small molecules play a crucial role in regulating the expression of cancer-related genes and can act as tumor promoters or tumor suppressors. To further clarify the molecular mechanism by which RGS10 affects the malignant biological behavior of breast cancer cells, we verified through bioinformatics prediction and in vitro experiments that miR-539-5p might be an upstream regulator of RGS10.

      (3) The role of miR-539-5p in breast cancer has been described in previous studies. Hence, it is recommended to provide detailed elaboration on how miR-539-5p regulates the expression of RGS10.

      Answer: Thank you for your comments. To verify the effect of miR-539-5p on the expression of RGS10, we transfected miR-539-5p mimic, miR-539-5p mimic NC, miR-539-5p inhibitor, and miR-539-5p inhibitor NC into SKBR3 cells and MDA-MB-231 cells, respectively, and measured the expression of RGS10 by RT-qPCR and Western blot. The results showed that, compared with SKBR3 cells transfected with miR-539-5p mimic NC or wild-type SKBR3 cells, RGS10 mRNA and protein levels were significantly reduced in SKBR3 cells transfected with the miR-539-5p mimic. Conversely, after MDA-MB-231 cells were transfected with miR-539-5p inhibitor to inhibit the expression of miR-539-5p, RGS10 mRNA and protein levels in MDA-MB-231 cells were significantly increased (Fig. 3.4A-C, Fig. 3.5A-C). This indicates that miR-539-5p can target and regulate RGS10.

      (4) To enhance the clarity and interpretability of the Western blot results, it would be advisable to mark the specific kilodalton (kDa) values of the proteins.

      Answer: Thank you for your comments and suggestions. We have marked the specific kilodalton (kDa) values of the proteins in the Western blots.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The function of RGS10 in breast cancer was identified in the paper. However, some major issues in this paper need to be specified:

      (1) From reading the introduction section and its references, RGS proteins participate in multiple essential cellular processes and may be tumor initiators or suppressors (Li et al., 2023). Because this article focuses on the significance of RGS10 in breast cancer, it is recommended to show how the function of RGS10 exhibits therapeutic significance in other types of cancer.

      Answer: Thanks for your comments and suggestions on our findings. The dysregulation of RGS protein expression has been observed to be associated with various types of cancer, especially ovarian cancer (PMID: 26293348). RGS10 expression in ovarian cancer cells has been found to be lower than that in normal ovarian cells (PMID: 21044322). In addition, knocking down RGS10 has been found to enhance the viability of ovarian cancer cells and promote chemoresistance by activating the Rheb GTP/mTOR signaling pathway (PMID: 26319900). One study suggests that RGS10 regulates inflammatory signaling in SKOV-3 ovarian cancer cells, which show high expression of TNF and COX-2 after RGS10 knockdown. In colorectal tumors, RGS10 transcription is regulated by DNA methylation and histone deacetylation (PMID: 35810565). RGS10 expression is also associated with poor prognosis in laryngeal cancer, hepatocellular carcinoma, and pediatric acute myeloid leukemia (PMID: 32776811, PMID: 26516143, PMID: 30538250).

      (2) The authors characterize RGS10 protein expression in the breast cancer cell lines MDA-MB-231, MCF7, and SKBR3 in vitro (Figure 2A). However, more information would strengthen the data - e.g. information on the expression of RGS10 protein and the survival in public databases, as well as the correlation between RGS10 and Her-2 expression.

      Answer: Thanks for your comments. We have checked the correlation between RGS10 expression and the survival rate of Her-2 positive breast cancer patients in a public database. Although the difference is not statistically significant, patients with high RGS10 expression tend to have a more favorable prognosis than patients with low RGS10 expression after the 100th month.

      Author response image 1.

      (3) Regarding the current situation of clinical trials in the RGS family, the potential to develop RGS10 for clinical translation is a driving factor for EMT.

      Answer: Thank you for your comments. The RGS (regulator of G protein signaling) gene family provides an important "braking" function for the G protein-coupled receptor (GPCR) family of cell receptors. GPCRs control hundreds of important functions in cells throughout the body and are the largest class of drug targets, with over one-third of FDA-approved drugs treating diseases by binding to GPCRs and altering their activity. When GPCRs are activated by hormones or neurotransmitters, they initiate signaling cascades within host cells through signal-carrying proteins called G proteins. The function of RGS proteins is to inactivate the G protein, thereby shutting down this signaling cascade, which limits G protein signal transduction and allows cells to reset and receive new incoming signals. Without RGS proteins, the signals triggered by GPCRs would inappropriately remain on, and signal transduction would become dysfunctional (PMID: 33007266). The potential to develop RGS10 as a driving factor of EMT is meaningful for clinical translation.

      (4) In Figure 3A, the paper showed that differential gene expression revealed 70 genes were significantly upregulated in RGS10-depleted SKBR3 cells. The authors didn't show any data on the expression of other EMT-related proteins in pathway analysis.

      Answer: Thank you for your comments. The enrichment analysis of RNA sequencing in RGS10-depleted SKBR3 cells identified highly correlated factors that are associated with EMT, such as TAGLN, TNFSF10, NDUFA4L2, CCN5, PHGDH, ST3GAL5, ANG, and LCN2.

      (5) In Figure 3B, the paper focuses on LCN2 in pathway analysis; however, the author did not elaborate on the significance of LCN2-related pathways in EMT.

      Answer: Thank you for your comments. Some studies have demonstrated the significance of LCN2-related pathways in EMT. LCN2 upregulation triggered by PTEN insufficiency was confirmed to induce EMT and promote migration and invasion in MCF7 cells (PMID: 27466505). The activation of STAT3 contributes to an increase in LCN2 expression, which activates ERK pathway-dependent EMT, thus promoting lung metastasis of MDA-MB-231 breast cancer cells (PMID: 33473115). The silencing of LCN2 reduced the migration and invasion of SUM149 cells and the proportion of tumor stem cells, suggesting that LCN2 may mediate the invasion and metastasis of cancer cells by regulating the stemness of breast cancer cells. The biological effects of the LCN2 small molecule inhibitors ZINC00640089 and ZINC00784494 on IBC cells have been confirmed. The siRNA-mediated silencing of LCN2 in IBC cells significantly reduces cell proliferation, viability, migration, and invasion (PMID: 34445288).

      (6) Minor: the author did not conduct a semi-quantitative analysis of the immunohistochemical results of RGS10.

      Answer: Thank you for your suggestion. We would like to demonstrate the qualitative analysis of RGS10 immunohistochemistry. The semi-quantitative analysis is not required in the paper.

      Reviewer #2 (Recommendations For The Authors):

      The role of RGS10 was well-characterized in this study. However, some minor points need to be modified.

      (1) Page 15 line 296, description of cell proliferation was missing, please modify.

      Answer: Thank you for your comments. We have corrected the description of cell proliferation on Page 15 highlighted in red.

      (2) In Figure 2C, the title of the Y-axis was missing.

      Answer: Thank you for your comments. We have corrected the description of the Y-axis title in Figure 2C.

      (3) Describe the transfection reagent that was used in this study, and incorporate the description into the methods section.

      Answer: Thank you for your comments. We have added the description of the transfection reagent to the methods section.

      (4) The manuscript needs proofreading.

      Answer: Thank you for your comments. We have proofread the manuscript.

    2. Reviewer #2 (Public Review):

      Liu et al., by focusing on regulator of G protein signaling 10 (RGS10), reported that RGS10 expression was significantly lower in patients with breast cancer, compared with normal adjacent tissue. Genetic inhibition of RGS10 caused epithelial-mesenchymal transition and enhanced cell proliferation, migration, and invasion. These results suggest an inhibitory role of RGS10 in tumor metastasis. Furthermore, bioinformatic analyses determined signaling cascades for RGS10-mediated breast cancer distant metastasis. More importantly, both in vitro and in vivo studies evidenced that alteration of RGS10 expression by modulating its upstream regulator miR-539-5p affects breast cancer metastasis. Altogether, these findings provide insight into the pathogenesis of breast tumors and hence identify potential therapeutic targets in breast cancer.

      The conclusions of this study are mostly well supported by data.

    3. Reviewer #3 (Public Review):

      Distant metastasis is the major cause of death in patients with breast cancer. In this manuscript, Liu et al. show that RGS10 deficiency elicits distant metastasis via epithelial-mesenchymal transition in breast cancer. As a prognostic indicator of breast cancer, RGS10 regulates the progress of breast cancer and affects tumor phenotypes such as epithelial-mesenchymal transformation, invasion, and migration. The conclusions of this paper are mostly well supported by data.

    4. eLife assessment

      This valuable paper demonstrates for the first time that RGS10 can serve as a biomarker for evaluating the prognosis of breast cancer. Preventing the loss of RGS10 theoretically provides a new strategy for the treatment of breast cancer. The evidence supporting the claims of the authors is solid, although inclusion of a larger number of patient samples and an animal model would have strengthened the study. The work will be of interest to clinicians working on breast cancer.

    5. Reviewer #1 (Public Review):

      The paper has shown that the expression of RGS10 is related to the molecular subtype, distant metastasis, and survival status of breast cancer. The study utilizes bioinformatic analyses, human tissue samples, and in vitro and in vivo experiments, which strengthen the data. RGS10 was validated to inhibit EMT through a novel mechanism dependent on LCN2 and miR-539-5p, thereby reducing cancer cell proliferation, colony formation, invasion, and migration. The study elaborated on the function of RGS10 in influencing prognosis and biological behavior, suggesting that it could be considered a potential drug target in breast cancer.

      This study investigates the role of the Cadherin Flamingo (Fmi) in cell competition in developing tissues in Drosophila melanogaster. The findings are valuable in that they show that Fmi is required in winning cells in several competitive contexts. The evidence supporting the conclusions is solid, as the authors identify Fmi as a potential new regulator of cell competition, however, they don't delve into a mechanistic understanding of how this occurs.

    2. Reviewer #1 (Public Review):


      This paper is focused on the role of Cadherin Flamingo (Fmi) - also called Starry night (stan) - in cell competition in developing Drosophila tissues. A primary genetic tool is monitoring tissue overgrowths caused by making clones in the eye disc that express activated Ras (RasV12) and that are depleted for the polarity gene scribble (scrib). The main system that they use is ey-flp, which makes continuous clones in the developing eye-antennal disc beginning at the earliest stages of disc development. It should be noted that RasV12, scrib-i (or lgl-i) clones only lead to tumors/overgrowths when generated by continuous clones, which presumably creates a privileged environment that insulates them from competition. Discrete (hs-flp) RasV12, lgl-i clones are in fact out-competed (PMID: 20679206), which is something to bear in mind.

      The authors show that clonal loss of Fmi by an allele or by RNAi in the RasV12, scrib-i tumors suppresses their growth in both the eye disc (continuous clones) and wing disc (discrete clones). The authors attributed this result to less killing of WT neighbors when Myc over-expressing clones lack Fmi, but another interpretation (that Fmi regulates clonal growth) is equally plausible with the current results. Next, the authors show that scrib-RNAi clones that are normally out-competed by WT cells prior to adult stages are present in higher numbers when WT cells are depleted for Fmi. They then examine death in RasV12, scrib-i ey-FLP clones, or in discrete hs-FLP UAS-Myc clones. They state that they see death in WT cells neighboring RasV12, scrib-i clones in the eye disc (Figures 4A-C). Next, they write that RasV12, scrib-i cells become losers (i.e., have apoptosis markers) when Fmi is removed. Neither of these results is quantified and thus they are not compelling. They state that a similar result is observed for Myc over-expression clones that lack Fmi, but the image was not compelling, the results are not quantified and the controls are missing (Myc over-expressing clones alone and Fmi clones alone). They then want to test whether Myc over-expressing clones have more proliferation. They show an image of a wing disc that has many small Myc overexpressing clones with and without Fmi. The pHH3 results support their conclusion that Myc overexpressing clones have more pHH3, but I have reservations about the many clones in these panels (Figures 5L-N). They show that the cell competition roles of Fmi are not shared by another PCP component and are not due to the Cadherin domain of Fmi. The authors appear to interpret their results as Fmi is required for winner status. Overall, some of these results are potentially interesting and at least partially supported by the data, but others are not supported by the data.


      Fmi has been studied for its role in planar cell polarity, and its potential role in competition is interesting.


      (1) In the Myc over-expression experiments, the increased size of the Myc clones could be because they divide faster (but don't outcompete WT neighbors). If the authors want to conclude that the bigger size of the Myc clones is due to out-competition of WT neighbors, they should measure cell death across many discs with these clones. They should also assess if reducing apoptosis (like using one copy of the H99 deficiency that removes hid, rpr, and grim) suppresses winner clone size. If cell death is not addressed experimentally and quantified rigorously, then their results could be explained by faster division of Myc over-expressing clones (and not death of neighbors). This could also apply to the RasV12, scrib-i results.

      (2) This same comment about Fmi affecting clone growth should be considered in the scrib RNAi clones in Figure 3.

      (3) I don't understand why the quantifications of clone areas in Figures 2D, 2H, 6D are log values. The simple ratio of GFP/RFP should be shown. Additionally, in some of the samples (e.g., fmiE59 >> Myc, only 5 discs and fmiE59 vs >Myc only 4 discs are quantified but other samples have more than 10 discs). I suggest that the authors increase the number of discs that they count in each genotype to at least 20 and then standardize this number.

      (4) There is a typo when referring to Figures 3C-D. It should be Figure 2C-D.

      (5) Figure 4 - shows examples of cell death. Cas3 is written on the figure but Dcp-1 is written in the results. Which antibody was used? The authors need to quantify these results. They also need to show that the death of cells is part of the phenotype, like an H99 deficiency, etc (see above).

      (6) It is well established that clones overexpressing Myc have increased cell death. The authors should consider this when interpreting their results.

      (7) A better characterization of discrete Fmi clones would also be helpful. I suggest inducing hs-flp clones in the eye or wing disc and then determining clone size vs twin spot size and also examining cell death etc. If such experiments have already been done and published, the authors should include a description of such work in the preprint.

      (8) We need more information about the expression pattern of Fmi. Is it expressed in all cells in imaginal discs? Are there any patterns of expression during larval and pupal development?

      (9) Overall, the paper is written for specialists who work in cell competition and is fairly difficult to follow, and I suggest re-writing the results to make it accessible to a broader audience.

    3. Reviewer #2 (Public Review):


      In this manuscript, Bosch et al. reveal Flamingo (Fmi), a planar cell polarity (PCP) protein, is essential for maintaining 'winner' cells in cell competition, using Drosophila imaginal epithelia as a model. They argue that tumor growth induced by scrib-RNAi and RasV12 competition is slowed by Fmi depletion. This effect is unique to Fmi, not seen with other PCP proteins. Additional cell competition models are applied to further confirm Fmi's role in 'winner' cells. The authors also show that Fmi's role in cell competition is separate from its function in PCP formation.


      (1) The identification of Fmi as a potential regulator of cell competition under various conditions is interesting.

      (2) The authors demonstrate that the involvement of Fmi in cell competition is distinct from its role in planar cell polarity (PCP) development.


      (1) The authors provide a superficial description of the related phenotypes, lacking a comprehensive mechanistic understanding. Induction of apoptosis and JNK activation are general outcomes, but it is important to determine how they are specifically induced in Fmi-depleted clones. The authors should take advantage of the power of fly genetics and conduct a series of genetic epistasis analyses.

      (2) The depletion of Fmi may not have had a significant impact on cell competition; instead, it is more likely to have solely facilitated the induction of apoptosis.

      (3) To make a solid conclusion for Figure 1, the authors should investigate whether complete removal of Fmi by a mutant allele affects tumor growth induced by expressing RasV12 and scrib RNAi throughout the eye.

      (4) The authors should test whether the expression level of Fmi (both mRNA and protein) changes during tumorigenesis and cell competition.

    4. Reviewer #3 (Public Review):


      In this manuscript, Bosch and colleagues describe an unexpected function of Flamingo, a core component of the planar cell polarity pathway, in cell competition in the Drosophila wing and eye disc. While Flamingo depletion has no impact on tumour growth (upon induction of Ras and depletion of Scribble throughout the eye disc), and no impact when depleted in WT cells, it specifically tunes down winner clone expansion in various genetic contexts, including the overexpression of Myc, the combination of Scribble depletion with activation of Ras in clones or the early clonal depletion of Scribble in eye disc. Flamingo depletion reduces the proliferation rate and increases the rate of apoptosis in the winner clones, hence reducing their competitiveness up to forcing their full elimination (hence becoming now "loser"). This function of Flamingo in cell competition is specific to Flamingo as it cannot be recapitulated with other components of the PCP pathway, and does not rely on the interaction of Flamingo in trans, nor on the presence of its cadherin domain. Thus, this function is likely to rely on a non-canonical function of Flamingo which may rely on downstream GPCR signaling.

      This unexpected function of Flamingo is by itself very interesting. In the framework of cell competition, these results are also important as they describe, to my knowledge, one of the only genetic conditions that specifically affect the winner cells without any impact when depleted in the loser cells. Moreover, Flamingo does not just suppress the competitive advantage of winner clones, but even turns them into putative losers. This specificity, while not clearly understood at this stage, opens a lot of exciting mechanistic questions, but also a very interesting long-term avenue for therapeutic purposes as targeting Flamingo should then affect very specifically the putative winner/oncogenic clones without any impact in WT cells.

      The data and the demonstration are very clean and compelling, with all the appropriate controls, proper quantification, and backed-up by observations in various tissues and genetic backgrounds. I don't see any weakness in the demonstration and all the points raised and claimed by the authors are very well substantiated by the data. As such, I don't have any suggestions to reinforce the demonstration.

      While not necessary for the demonstration, documenting the subcellular localisation and levels of Flamingo in these different competition scenarios may have been relevant and provided some hints on the putative mechanism (specifically by comparing its localisation in winner and loser cells).

      Also, on a more interpretative note, the absence of the impact of Flamingo depletion on JNK activation does not exclude some interesting genetic interactions. JNK output can be very contextual (for instance depending on Hippo pathway status), and it would be interesting in the future to check if Flamingo depletion could somehow alter the effect of JNK in the winner cells and promote downstream activation of apoptosis (which might normally be suppressed). It would be interesting to check if Flamingo depletion could have an impact in other contexts involving JNK activation or upon mild activation of JNK in clones.


      - A clean and compelling demonstration of the function of Flamingo in winner cells during cell competition.

      - One of the rare genetic conditions that affects very specifically winner cells without any impact on losers, and then can completely switch the outcome of competition (which opens an interesting therapeutic perspective in the long term)


      - The mechanistic understanding obviously remains quite limited at this stage especially since the signaling does not go through the PCP pathway.

    5. Author response:

      We would like to thank the reviewers for their constructive feedback. We have thoroughly considered their concerns and comments and we aim to include some additional results in an updated version of this manuscript. In addition, we would like to address some of the comments, with which we respectfully disagree. Below is our point-by-point reply.

      Reviewer 1:


      This paper is focused on the role of Cadherin Flamingo (Fmi) - also called Starry night (stan) - in cell competition in developing Drosophila tissues. A primary genetic tool is monitoring tissue overgrowths caused by making clones in the eye disc that express activated Ras (RasV12) and that are depleted for the polarity gene scribble (scrib). The main system that they use is ey-flp, which makes continuous clones in the developing eye-antennal disc beginning at the earliest stages of disc development. It should be noted that RasV12, scrib-i (or lgl-i) clones only lead to tumors/overgrowths when generated by continuous clones, which presumably creates a privileged environment that insulates them from competition. Discrete (hs-flp) RasV12, lgl-i clones are in fact out-competed (PMID: 20679206), which is something to bear in mind. 

      We think it is unlikely that the outcome of RasV12, scrib (or lgl) competition depends on discrete vs. continuous clones or on creation of a privileged environment. As shown in the same reference mentioned by the reviewer, the outcome of RasV12, scrib (or lgl) tumors greatly depends on the clone being able to grow to a certain size. The authors show instances of discrete clones where larger RasV12, lgl clones outcompete the surrounding tissue and eliminate WT cells by apoptosis, whereas smaller clones behave more like losers. It is not clear what aspect of the environment determines the ability of some clones to grow larger than others, but in neither case are the clones prevented from competition. Other studies show that in mammalian cells, RasV12, scrib clones are capable of outcompeting the surrounding tissue, such as in Kohashi et al (2021), where cells carrying both mutations actively eliminate their neighbors.

      The authors show that clonal loss of Fmi by an allele or by RNAi in the RasV12, scrib-i tumors suppresses their growth in both the eye disc (continuous clones) and wing disc (discrete clones). The authors attributed this result to less killing of WT neighbors when Myc over-expressing clones lack Fmi, but another interpretation (that Fmi regulates clonal growth) is equally plausible with the current results.

      See point (1) for a discussion on this.

      Next, the authors show that scrib-RNAi clones that are normally out-competed by WT cells prior to adult stages are present in higher numbers when WT cells are depleted for Fmi. They then examine death in RasV12, scrib-i ey-FLP clones, or in discrete hs-FLP UAS-Myc clones. They state that they see death in WT cells neighboring RasV12, scrib-i clones in the eye disc (Figures 4A-C). Next, they write that RasV12, scrib-i cells become losers (i.e., have apoptosis markers) when Fmi is removed. Neither of these results is quantified and thus they are not compelling. They state that a similar result is observed for Myc over-expression clones that lack Fmi, but the image was not compelling, the results are not quantified and the controls are missing (Myc over-expressing clones alone and Fmi clones alone).

      We assayed apoptosis in UAS-Myc clones in eye discs but neglected to include the results in Figure 4. We will include them in the updated manuscript. Regarding Fmi clones alone, we direct the reviewer’s attention to Fig. 2 Supplement 1 where we showed that fminull clones cause no competition. Dcp-1 staining showed low levels of apoptosis unrelated to the fminull clones or twin-spots, and we will comment on this in the revised manuscript.

      Regarding the quantification of apoptosis, we did not provide a quantification, in part because we observe a very clear visual difference between groups (Fig. 4A-K), and in part because it is challenging to come up with a rigorous quantification method. For example, how far from a winner clone can an apoptotic cell be and still be considered responsive to the clone? For UAS-Myc winner clones, we observe a modest amount of cell death both inside and outside the clones, consistent with prior observations. For fminull UAS-Myc clones, we observe vastly more cell death within the fminull UAS-Myc clones and modest death in nearby wildtype cells, and consequently a much higher ratio of cell death inside vs outside the clone. Because of the somewhat arbitrary nature of quantification, and the dramatic difference, we initially chose not to provide a quantification. However, given the request, we chose an arbitrary distance from the clone boundary in which to consider dying cells and counted the numbers for each condition. We view this as a very soft quantification, but will report it in a way that captures the phenomenon in the revised manuscript.
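
      As an illustration only, and not the procedure used in the manuscript, a distance-based count of this kind could be sketched as follows. The sketch assumes a segmented clone mask and a list of positions of apoptosis-marker-positive cells; the function name `count_dying_cells`, the toy mask, the coordinates, and the `max_dist` cutoff are all hypothetical.

```python
from math import hypot


def count_dying_cells(clone_mask, dying_cells, max_dist):
    """Classify apoptotic cells as inside a clone, near it, or far from it.

    clone_mask  : list of rows of 0/1; 1 marks a clone pixel (hypothetical segmentation)
    dying_cells : list of (row, col) positions of marker-positive cells
    max_dist    : arbitrary cutoff, in pixels, to the nearest clone pixel
    """
    clone_pixels = [
        (r, c)
        for r, row in enumerate(clone_mask)
        for c, value in enumerate(row)
        if value
    ]
    counts = {"inside": 0, "near": 0, "far": 0}
    for r, c in dying_cells:
        if clone_mask[r][c]:
            counts["inside"] += 1
            continue
        if not clone_pixels:
            # No clone in this image: every dying cell is "far" by definition.
            counts["far"] += 1
            continue
        # Euclidean distance to the closest clone pixel (brute force).
        nearest = min(hypot(r - pr, c - pc) for pr, pc in clone_pixels)
        counts["near" if nearest <= max_dist else "far"] += 1
    return counts


# Toy example: one 2x2 clone and four dying cells (made-up positions).
mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
cells = [(1, 1), (2, 2), (2, 3), (4, 5)]
counts = count_dying_cells(mask, cells, max_dist=1.5)

# Ratio of death inside the clone to death in nearby cells outside it.
ratio = counts["inside"] / max(counts["near"], 1)
```

      Any such count depends directly on the chosen cutoff, so reporting the inside-to-near ratio at more than one value of `max_dist` would show how sensitive the comparison between genotypes is to that arbitrary choice.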

      They then want to test whether Myc over-expressing clones have more proliferation. They show an image of a wing disc that has many small Myc overexpressing clones with and without Fmi. The pHH3 results support their conclusion that Myc overexpressing clones have more pHH3, but I have reservations about the many clones in these panels (Figures 5L-N).

      As the reviewer’s reservations are not specified, we have no specific response.

      They show that the cell competition roles of Fmi are not shared by another PCP component and are not due to the Cadherin domain of Fmi. The authors appear to interpret their results as Fmi is required for winner status. Overall, some of these results are potentially interesting and at least partially supported by the data, but others are not supported by the data.


      Fmi has been studied for its role in planar cell polarity, and its potential role in competition is interesting.


      (1) In the Myc over-expression experiments, the increased size of the Myc clones could be because they divide faster (but don't outcompete WT neighbors). If the authors want to conclude that the bigger size of the Myc clones is due to out-competition of WT neighbors, they should measure cell death across many discs with these clones. They should also assess if reducing apoptosis (like using one copy of the H99 deficiency that removes hid, rpr, and grim) suppresses winner clone size. If cell death is not addressed experimentally and quantified rigorously, then their results could be explained by faster division of Myc over-expressing clones (and not death of neighbors). This could also apply to the RasV12, scrib-i results.

      Indeed, Myc clones have been shown to divide faster than WT neighbors, but that is not the only reason clones are bigger. As shown in (de la Cova et al, 2004), Myc-overexpressing cells induce apoptosis in WT neighbors, and blocking this apoptosis results in larger wings due to increased presence of WT cells. Also, (Moreno and Basler, 2004) showed that Myc-overexpressing clones cause a reduction in WT clone size, as WT twin spots adjacent to 4xMyc clones are significantly smaller than WT twin spots adjacent to WT clones. In the same work, they show complete elimination of WT clones generated in a tub-Myc background. Since then, multiple papers have shown these same results. It is therefore well established that increased cell proliferation transforms Myc clones into supercompetitors and that, in the absence of cell competition, Myc-overexpressing discs instead produce wings larger than usual.

      In (de la Cova et al, 2004) the authors already showed that blocking apoptosis with H99 hinders competition and causes wings with Myc clones to be larger than those where apoptosis wasn’t blocked. As these results are well established from prior literature, there is no need to repeat them here.

      (2) This same comment about Fmi affecting clone growth should be considered in the scrib RNAi clones in Figure 3.

      In later stages, scrib RNAi clones in the eye are eliminated by WT cells. While scrib RNAi clones are not substantially smaller in third instar when competing against fmi cells (Fig 3M), by adulthood we see that WT clones lacking Fmi have failed to remove scrib clones, unlike WT clones that have completely eliminated the scrib RNAi clones by this time. We therefore disagree that the only effect of Fmi could be related to rate of cell division.

      (3) I don't understand why the quantifications of clone areas in Figures 2D, 2H, 6D are log values. The simple ratio of GFP/RFP should be shown. Additionally, in some of the samples (e.g., fmiE59 >> Myc, only 5 discs and fmiE59 vs >Myc only 4 discs are quantified but other samples have more than 10 discs). I suggest that the authors increase the number of discs that they count in each genotype to at least 20 and then standardize this number.

      Log(ratio) values are easier to interpret than a linear scale. If represented linearly, 1 means equal ratios of A and B, while 2A/B is 2 and A/2B is 0.5. And the higher the ratio difference between A and B, the starker this effect becomes, making a linear scale deceiving to the eye, especially when decreased ratios are shown. Using log(ratios), a value of 0 means equal ratios, and increased and decreased ratios deviate equally from 0.
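
      The symmetry argument above can be checked with a short worked example. This is only an illustrative sketch: the response does not state which logarithm base is used in the figures, so base 2 is an assumption here, and the function name `log2_ratio` and the area values are hypothetical.

```python
import math

def log2_ratio(area_a: float, area_b: float) -> float:
    """Return log2(A/B); 0 means the two areas are equal."""
    return math.log2(area_a / area_b)

# Hypothetical clone areas (arbitrary units), chosen as reciprocal pairs.
examples = [(2.0, 1.0), (1.0, 2.0), (8.0, 1.0), (1.0, 8.0), (1.0, 1.0)]
for a, b in examples:
    print(f"A={a}, B={b}: ratio={a / b:.3f}, log2(ratio)={log2_ratio(a, b):+.1f}")
```

      On a linear scale a two-fold increase (2.0) and a two-fold decrease (0.5) sit at unequal distances from 1, and an eight-fold decrease is compressed to 0.125, whereas the log2 values for reciprocal pairs are equal in magnitude and opposite in sign.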

      Statistically, either analyzing a standardized number of discs for all conditions or a variable number not determined beforehand has no effect on the p-value, as long as the variable n number is not manipulated by p-hacking techniques, such as increasing the n of samples until a significant p-value has been obtained. While some of our groups have lower numbers, all statistical analyses were performed after all samples were collected. For all results obtained by cell counts, all samples had a minimum of 10 discs due to the inherent though modest variability of our automated cell counts, and we analyzed all the discs that we obtained from a given experiment, never “cherry-picking” examples. For the sake of transparency, all our graphs show individual values in addition to the distributions so that the reader knows the n values at a glance.

      (5) Figure 4 - shows examples of cell death. Cas3 is written on the figure but Dcp-1 is written in the results. Which antibody was used? The authors need to quantify these results. They also need to show that the death of cells is part of the phenotype, like an H99 deficiency, etc (see above).

      Thank you for flagging this error. We used cleaved Dcp-1 staining to detect cell death, not Cas3 (Drice in Drosophila). We will update all panels replacing Cas3 by Dcp-1.

      As described above, cell death is a well-established consequence of Myc overexpression-induced cell competition, and we feel there is no need to repeat that result. To what extent loss of Fmi induces excess cell death or reduces proliferation in “would-be” winners, and to what extent it reduces “would-be” winners’ ability to eliminate competitors are interesting mechanistic questions that are beyond the scope of the current manuscript.

      (6) It is well established that clones overexpressing Myc have increased cell death. The authors should consider this when interpreting their results.

      We are aware that Myc-overexpressing clones have increased cell death, but it has also been demonstrated that despite that fact, they behave as winners and eliminate WT neighboring cells. And as mentioned in comment (1), WT clones generated in a 3x and 4x Myc background are eliminated and removed from the tissue, and blocking cell death increases the size of WT “loser” clones adjacent to Myc-overexpressing clones.

      (7) A better characterization of discrete Fmi clones would also be helpful. I suggest inducing hs-flp clones in the eye or wing disc and then determining clone size vs twin spot size and also examining cell death etc. If such experiments have already been done and published, the authors should include a description of such work in the preprint.

      We have already analyzed the size of discrete Fmi clones and showed that they did not cause any competition, with fmi-null clones having the same size as WT clones in both eye and wing discs. We direct the reviewer’s attention to Figure 2 Supplement 1.

      (8) We need more information about the expression pattern of Fmi. Is it expressed in all cells in imaginal discs? Are there any patterns of expression during larval and pupal development?

      Fmi is equally expressed by all cells in all imaginal discs in Drosophila larva and pupa. We will include this information in the updated manuscript.

      (9) Overall, the paper is written for specialists who work in cell competition and is fairly difficult to follow, and I suggest re-writing the results to make it accessible to a broader audience.

      We have endeavored to both provide an accessible narrative and also describe in sufficient detail the data from multiple models of competition and complex genetic systems. We hope that most readers will be able, at a minimum, to follow our interpretations and the key takeaways, while those wishing to examine the nuts and bolts of the argument will find what they need presented as simply as possible.

      Reviewer 2:


      In this manuscript, Bosch et al. reveal Flamingo (Fmi), a planar cell polarity (PCP) protein, is essential for maintaining 'winner' cells in cell competition, using Drosophila imaginal epithelia as a model. They argue that tumor growth induced by scrib-RNAi and RasV12 competition is slowed by Fmi depletion. This effect is unique to Fmi, not seen with other PCP proteins. Additional cell competition models are applied to further confirm Fmi's role in 'winner' cells. The authors also show that Fmi's role in cell competition is separate from its function in PCP formation.

      We would like to thank the reviewer for their thoughtful and positive review.


      (1) The identification of Fmi as a potential regulator of cell competition under various conditions is interesting.

      (2) The authors demonstrate that the involvement of Fmi in cell competition is distinct from its role in planar cell polarity (PCP) development.


      (1) The authors provide a superficial description of the related phenotypes, lacking a comprehensive mechanistic understanding. Induction of apoptosis and JNK activation are general outcomes, but it is important to determine how they are specifically induced in Fmi-depleted clones. The authors should take advantage of the power of fly genetics and conduct a series of genetic epistasis analyses.

      We appreciate that this manuscript does not address the mechanism by which Fmi participates in cell competition. Our intent here is to demonstrate that Fmi is a key contributor to competition. We indeed aim to delve into the mechanism and are currently directing our efforts toward exploring how Fmi regulates competition, but the size of the project and the required experiments are outside the scope of this manuscript. We feel that our current findings are sufficiently valuable to merit sharing while we continue to investigate the mechanism linking Fmi to competition.

      (2) The depletion of Fmi may not have had a significant impact on cell competition; instead, it is more likely to have solely facilitated the induction of apoptosis.

      We respectfully disagree for several reasons. First, the effect of Fmi loss is specific to winners; loss of Fmi has no effect on its own or in losers when confronting winners in competition. Second, in the RasV12 tumor model, loss of Fmi did not perturb whole eye tumors – it only impaired tumor growth when tumors were confronted with competitors. We agree that induction of apoptosis is affected, but so too is proliferation, and only in winners in competition.

      (3) To make a solid conclusion for Figure 1, the authors should investigate whether complete removal of Fmi by a mutant allele affects tumor growth induced by expressing RasV12 and scrib RNAi throughout the eye.

      We agree with the reviewer that this is a worthwhile experiment, given that RNAi has its limitations. However, as fmi is homozygous lethal at the embryo stage, one cannot create whole disc tumors mutant for fmi. As an approximation to this condition, we have introduced the GMR-Hid, cell-lethal combination to eliminate non-tumor tissue in the eye disc. Following elimination of non-tumor cells, there remains essentially a whole disc harboring fmi-null tumor. Indeed, this shows that whole fmi-null tumors overgrow similar to control tumors, confirming that the lack of Fmi only affects clonal tumors. We will provide those results in the updated manuscript.

      (4) The authors should test whether the expression level of Fmi (both mRNA and protein) changes during tumorigenesis and cell competition.

      This is an intriguing point that we would like to validate. We are currently performing immunostaining for Fmi in clones to confirm whether its levels change during competition. We will provide these results in the updated manuscript.

      Reviewer 3:

      Summary: In this manuscript, Bosch and colleagues describe an unexpected function of Flamingo, a core component of the planar cell polarity pathway, in cell competition in the Drosophila wing and eye disc. While Flamingo depletion has no impact on tumour growth (upon induction of Ras and depletion of Scribble throughout the eye disc), and no impact when depleted in WT cells, it specifically tunes down winner clone expansion in various genetic contexts, including the overexpression of Myc, the combination of Scribble depletion with activation of Ras in clones or the early clonal depletion of Scribble in eye disc. Flamingo depletion reduces the proliferation rate and increases the rate of apoptosis in the winner clones, hence reducing their competitiveness up to forcing their full elimination (hence becoming now "loser"). This function of Flamingo in cell competition is specific to Flamingo as it cannot be recapitulated with other components of the PCP pathway, and does not rely on the interaction of Flamingo in trans, nor on the presence of its cadherin domain. Thus, this function is likely to rely on a non-canonical function of Flamingo which may rely on downstream GPCR signaling.

      This unexpected function of Flamingo is by itself very interesting. In the framework of cell competition, these results are also important as they describe, to my knowledge, one of the only genetic conditions that specifically affect the winner cells without any impact when depleted in the loser cells. Moreover, Flamingo does not just suppress the competitive advantage of winner clones, but even turns them into putative losers. This specificity, while not clearly understood at this stage, opens a lot of exciting mechanistic questions, but also a very interesting long-term avenue for therapeutic purposes as targeting Flamingo should then affect very specifically the putative winner/oncogenic clones without any impact in WT cells.

      The data and the demonstration are very clean and compelling, with all the appropriate controls, proper quantification, and backed-up by observations in various tissues and genetic backgrounds. I don't see any weakness in the demonstration and all the points raised and claimed by the authors are all very well substantiated by the data. As such, I don't have any suggestions to reinforce the demonstration.

      While not necessary for the demonstration, documenting the subcellular localisation and levels of Flamingo in these different competition scenarios may have been relevant and provided some hints on the putative mechanism (specifically by comparing its localisation in winner and loser cells). 

      Also, on a more interpretative note, the absence of the impact of Flamingo depletion on JNK activation does not exclude some interesting genetic interactions. JNK output can be very contextual (for instance depending on Hippo pathway status), and it would be interesting in the future to check if Flamingo depletion could somehow alter the effect of JNK in the winner cells and promote downstream activation of apoptosis (which might normally be suppressed). It would be interesting to check if Flamingo depletion could have an impact in other contexts involving JNK activation or upon mild activation of JNK in clones.

      We would like to thank the reviewer for their thorough and positive review.


      - A clean and compelling demonstration of the function of Flamingo in winner cells during cell competition.

      - One of the rare genetic conditions that affects very specifically winner cells without any impact on losers, and then can completely switch the outcome of competition (which opens an interesting therapeutic perspective in the long term)


      - The mechanistic understanding obviously remains quite limited at this stage especially since the signaling does not go through the PCP pathway.

      Reviewer 2 made the same comment in their weakness (1), and we refer to that response. In future work, we are excited to better understand the pathways linking Fmi and competition.

      This manuscript reports a valuable new mechanism of regulation of the glutamine synthetase in the archaeon Methanosarcina mazei and clarifies the direct activation of glutamine synthetase activity by 2-oxoglutarate, thus introducing a novel understanding of how 2-oxoglutarate serves as a central indicator of carbon and nitrogen sensing. The authors provide solid evidence using mass photometry, specific activity measurements, and single particle cryo-EM data. This study is of interest to biologists working on the regulation of metabolism.

    2. Reviewer #1 (Public Review):


      This study shows a new mechanism of GS regulation in the archaeon Methanosarcina mazei and clarifies the direct activation of GS activity by 2-oxoglutarate, thus featuring another way in which 2-oxoglutarate acts as a central status reporter of C/N sensing.

      Mass photometry and single particle cryoEM structure analysis convincingly show the direct regulation of GS activity by 2-OG-promoted formation of the dodecameric structure of GS. The previously recognized small proteins GlnK1 and Sp26 seem to play a subordinate role in GS regulation, which is in good agreement with previous data. Although these data are quite clear now, there remains one major open question: how does 2-OG further increase GS activity once the full dodecameric state is achieved (at 5 mM)? This point needs to be reconsidered.


      Mass photometry reveals a dynamic mode of the effect of 2-OG on the oligomerization state of GS. Single particle Cryo-EM reveals the mechanism of 2-OG mediated dodecamer formation.


      It is not entirely clear how very high 2-OG concentrations activate GS beyond dodecamer formation.

      The data presented in this work are in stark contrast to the previously reported structure of M. mazei GS by the Schumacher lab. This is very confusing for the scientific community and requires clarification. The discussion should consider possible reasons for the contradictory results.

      Importantly, it is puzzling how Schumacher could achieve an apo-structure of dodecameric GS. If 2-OG is necessary for dodecameric formation, this should be discussed. If GlnK1 doesn't form a complex with the dodecameric GS, how could such a complex be resolved there?

      In addition, the text is in principle clear but could be improved by professional editing. Most obviously there is insufficient comma placement.

    3. Reviewer #2 (Public Review):


      Herdering et al. introduced research on an archaeal glutamine synthetase (GS) from Methanosarcina mazei, which exhibits sensitivity to the environmental presence of 2-oxoglutarate (2-OG). While previous studies have indicated 2-OG's ability to enhance GS activity, the precise underlying mechanism remains unclear. Initially, the authors utilized biophysical characterization, primarily employing a nanomolar-scale detection method called mass photometry, to explore the molecular assembly of Methanosarcina mazei GS (M. mazei GS) in the absence or presence of 2-OG. Similar to other GS enzymes, the target M. mazei GS forms a stable dodecamer, with two hexameric rings stacked in tail-to-tail interactions. Despite approximately 40% of M. mazei GS existing as monomeric or dimeric entities in the detectable solution, the majority spontaneously assemble into a dodecameric state. Upon mixing 2-OG with M. mazei GS, the population of the dodecameric form increases proportionally with the concentration of 2-OG, indicating that 2-OG either promotes or stabilizes the assembly process. The cryo-electron microscopy (cryo-EM) structure reveals that 2-OG is positioned near the interface of two hexameric rings. At a resolution of 2.39 Å, the cryo-EM map vividly illustrates 2-OG forming hydrogen bonds with two individual GS subunits as well as with solvent water molecules. Moreover, local side-chain reorientation and conformational changes of loops in response to 2-OG further delineate the 2-OG-stabilized assembly of M. mazei GS.

      Strengths & Weaknesses:

      The investigation studies the impact of 2-oxoglutarate (2-OG) on the assembly of Methanosarcina mazei glutamine synthetase (M. mazei GS). Utilizing cutting-edge mass photometry, the authors scrutinized the population dynamics of GS assembly in response to varying concentrations of 2-OG. Notably, the findings demonstrate a promising and straightforward correlation, revealing that dodecamer formation can be stimulated by 2-OG concentrations of up to 10 mM, although GS assembly never reaches 100% dodecamerization in this study. Furthermore, catalytic activities showed a remarkable enhancement, escalating from 0.0 U/mg to 7.8 U/mg with increasing concentrations of 2-OG, peaking at 12.5 mM. However, an intriguing gap arises between the incomplete dodecameric formation observed at 10 mM 2-OG, as revealed by mass photometry, and the continued increase in activity from 5 mM to 10 mM 2-OG for M. mazei GS. This prompts questions regarding the inability of M. mazei GS to achieve complete dodecamer formation and the underlying factors that further enhance GS activity within this concentration range of 2-OG.

      Moreover, the cryo-electron microscopy (cryo-EM) analysis provides additional support for the biophysical and biochemical characterization, elucidating the precise localization of 2-OG at the interface of two GS subunits within two hexameric rings. The observed correlation between GS assembly facilitated by 2-OG and its catalytic activity is substantiated by structural reorientations at the GS-GS interface, confirming the previously reported phenomenon of "funnel activation" in GS. However, the authors did not present the cryo-EM structure of M. mazei GS in complex with ATP and glutamate in the presence of 2-OG, which could have shed light on the differences in glutamine biosynthesis between previously reported GS enzymes and the 2-OG-bound M. mazei GS.

      Furthermore, besides revealing the cryo-EM structure of 2-OG-bound GS, the study also observed the filamentous form of GS, suggesting that filament formation may be a universal stacking mechanism across archaeal and bacterial species. However, efforts to enhance resolution to investigate whether the stacked polymer is induced by 2-OG or other factors such as ions or metabolites were not undertaken by the authors, leaving room for further exploration into the mechanisms underlying filament formation in GS.

    4. Reviewer #3 (Public Review):


      The current manuscript investigates the effect of 2-oxoglutarate and the GlnK1 protein as modulators of the enzymatic reactivity of glutamine synthetase. To do this, the authors rely on mass photometry, specific activity measurements, and single-particle cryo-EM data.

      From the results obtained, the authors convey that glutamine synthetase from Methanosarcina mazei exists in a non-active monomeric/dimeric form under low concentrations of 2-oxoglutarate, and its oligomerization into a dodecameric complex is triggered by higher concentration of 2-oxoglutarate, also resulting in the enhancement of the enzyme activity.


      Glutamine synthetase is a crucial enzyme in all domains of life. The dodecameric fold of GS is recurrent amongst prokaryotic and archaeal organisms, while the enzyme activity can be regulated in distinct ways. This is a very interesting work combining protein biochemistry with structural biology.

      The role of 2-OG is here highlighted as a crucial effector for enzyme oligomerization and full reactivity.


      Various opportunities to enhance the current state-of-the-art were missed. In particular, omissions of the ligand-bound state of GlnK1 leave unexplained the lack of its interaction with GS (in contradiction with previous results from the authors). A finer dissection of the effect and role of 2-oxoglutarate is missing, and important questions remain unanswered (e.g. are dimers relevant during early stages of the interaction, or why do previous GS dodecameric structures not show 2-oxoglutarate?).

    5. Author response:

      Reviewer #1 (Public Review):

      We thank Reviewer #1 for the professional evaluation and raising important points. We will address those comments in the updated manuscript and especially improve the discussion in respect to the two points of concern.

      (1) How can GlnA1 activity be stimulated further by increasing 2-OG after the dodecamer is already fully assembled at 5 mM 2-OG?

      We assume a two-step requirement for 2-OG: the dodecameric assembly and the priming of the active sites. The assembly step is based on cooperative effects of 2-OG and does not require the presence of 2-OG in all 2-OG-binding pockets: 2-OG-binding to one binding pocket also causes a domino effect of conformational changes in the adjacent 2-OG-unbound subunit, as also described for Methanothermococcus thermolithotrophicus GS in Müller et al. 2023. Due to the introduction of these conformational changes, the dodecameric form becomes more favourable even without all 2-OG binding sites being occupied. With higher 2-OG concentrations present (> 5 mM), the activity increased further until finally all 2-OG-binding pockets were occupied, resulting in the priming of all active sites (all subunits) and thereby reaching the maximal activity.

      (2) The contradictory results with previously published data on the structure of M. mazei by Schumacher et al. 2023.

      We certainly agree that it is confusing that Schumacher et al. 2023 obtained a dodecameric structure without the addition of 2-OG, which we claim to be essential for the dodecameric form. 2-OG is a cellular metabolite that is naturally present in E. coli, the heterologous expression host both groups used. Since our main question focused on analysing the 2-OG effect on GS, we have performed thorough dialysis of the purified protein to remove all 2-OG before performing MP experiments. In the absence of 2-OG we never observed significant enzyme activity and always detected a fast disassembly after incubation on ice. We thus assume that a dodecamer without 2-OG in Schumacher et al. 2023 is an inactive oligomer of a once 2-OG-bound form, stabilized e.g. by the presence of 5 mM MgCl2.

      The GlnA1-GlnK1-structure (crystallography) by Schumacher et al. 2023 is in stark contrast to our findings that GlnK1 and GlnA1 do not interact as shown by mass photometry with purified proteins. A possible reason for this discrepancy might be that at the high protein concentrations used in the crystallization assay, complexes are formed based on hydrophobic or ionic protein interactions, which would not form under physiological concentrations.

      Reviewer #2 (Public Review):

      We thank Reviewer #2 for the detailed assessment and valuable input. We will address those comments in the updated manuscript and clarify the message.

      (1) The discrepancy of the dodecamer formation (max. at 5 mM 2-OG) and the enzyme activity (max. at 12.5 mM 2-OG).

      We assume that there are two effects caused by 2-OG: 1. cooperativity of binding (less 2-OG needed to facilitate dodecamer formation) and 2. priming of each active site (see also Reviewer #1 R.1). We assume this is the reason why the activity of dodecameric GlnA1 can be further enhanced by increased 2-OG concentration until all catalytic sites are primed.

      (2) The lack of the structure of a 2-OG and ATP-bound GlnA1.

      Although we strongly agree that this would be a highly interesting structure, it seems out of the scope of a typical revision to request new cryo-EM structures. We evaluate the findings of our present study concerning the 2-OG effects as important insights into the strongly discussed field of glutamine synthetase regulation, even without the requested additional structures.

      (3) The observed GlnA1-filaments are an interesting finding.

      We certainly agree with the referee on that point, that the stacked polymers are potentially induced by 2-OG or ions. However, it is out of the main focus of this manuscript to further explore those filaments. Nevertheless, this observation could serve as an interesting starting point for future experiments.

      Reviewer #3 (Public Review):

      We thank Reviewer #3 for the expert evaluation and inspiring criticism.

      (1) Encouragement to examine ligand-bound states of GlnK1.

      We agree and plan to perform the suggested experiments exploring the conditions under which GlnA1 and GlnK1 might interact. We will perform the MP experiments in the presence of ATP. In GlnA1 activity test assays when evaluating the presence/effects of GlnK1 on GlnA1 activity, however, ATP was always present in high concentrations and still we did not observe a significant effect of GlnK1 on the GlnA1 activity.

      (2) The exact role of 2-OG could have been dissected much better.

      We agree on that point and will improve the clarity of the manuscript. See also Reviewer #1 R.1.

      (3) The lack of studies on dimers.

      This is actually an interesting point, which we did not consider while writing the manuscript. Now, re-analysing all our MP data in this respect, GlnA1 is likely a dimer as the smallest species. Consequently, we will add more supplementary data that support this observation and change the text accordingly.

      (4) Previous studies and structures did not show the 2-OG.

      We assume that for other structures, no additional 2-OG was added, and the groups did not specifically analyse for this metabolite either. All methanoarchaea perform methanogenesis and contain the oxidative part of the TCA cycle exclusively for the generation of glutamate (anabolism) but not a closed TCA cycle, enabling them to use the internal 2-OG concentration as an internal signal for nitrogen availability. In the case of bacterial GS from organisms with a closed TCA cycle used for energy metabolism (oxidation of acetyl CoA), e.g. E. coli, the formation of an active dodecameric GS form is governed by another mechanism independent of 2-OG. In the case of the recent M. mazei GS structures published by Schumacher et al. 2023, the dodecameric structure probably results from the heterologous expression in and purification from E. coli (see also Reviewer #1 R.2). One example of a methanoarchaeal glutamine synthetase that does in fact contain 2-OG in the structure is described in Müller et al. 2023.

      This landmark work by Lewis and Hegde represents the most significant breakthrough in membrane and secretory biogenesis in recent years. Their work reveals with outstanding clarity how nascent transmembrane segments can pass through the gate of Sec61 into the ER membrane through the coordinated motions of a conformationally and compositionally dynamic machine. Among many other insights, the authors discovered how a new factor, RAMP4, contributes to the formation and function of the lateral gate for certain substrates. The technical quality of the work is exceptional, setting the bar appropriately high.

    2. Reviewer #1 (Public Review):

      The paper meticulously explores various conformations and states of the ribosome-translocon complex. Employing advanced techniques such as cryoEM structural determination and AlphaFold modeling, the study delves into the dynamic nature of the ribosome-translocon complex. The findings from these analyses unveil crucial insights, significantly advancing our understanding of the co-translational translocation process in cellular mechanisms.

      To begin with, the authors employed a construct comprising the first two transmembrane domains of rhodopsin as a model for studying protein translocation. They conducted in vitro translation, followed by the purification of the ribosome-translocon complex, and determined its cryoEM structures. An in-depth analysis of their ribosome-translocon complex structure revealed that the nascent chain can pass through the lateral gate of translocon Sec61, akin to the behavior of a Signaling Peptide. Additionally, Sec61 was found to interact with 28S rRNA helix 24 and the ribosomal protein uL24. In summary, their structural model aligns with the through-pore model of insertion, contradicting the sliding model.

      Secondly, the authors successfully identified RAMP4 in their ribosome-translocon complex structure. Notably, the transmembrane domain of RAMP4 mimics the binding of a Signaling Peptide at the lateral gate of Sec61, albeit without unplugging. Intriguingly, RAMP4 is exclusively present in the non-multipass translocon ribosome-translocon complex, not in those containing multipass translocon. This observation suggests that co-translational translocation specifically occurs in the Sec61 channel that includes bound RAMP4. Additionally, the authors discovered an interaction between the C-terminal tail of ribosomal protein uL22 and the translocon Sec61, providing valuable insights into the nascent chain's behavior.

      Moving on to the third point, the focused classification unveiled TRAP complex interactions with various components. The authors propose that the extra density observed in their novel ribosome-translocon complex can be attributed to calnexin, a major binder of TRAP according to previous studies. Furthermore, the new structure reveals a TRAP-OSTA interaction. This newly identified TRAP-OSTA interaction offers a potential explanation for why patients with TRAP delta defects exhibit congenital disorders of glycosylation.

      In conclusion, this paper presents a robust contribution to the field with its thorough structural and modeling analyses. The significance of the findings is evident, providing valuable insights into the intricate mechanisms of protein co-translational translocation. The well-crafted writing, meticulous analyses, and clear figures collectively contribute to the overall strength of the paper.

    3. Reviewer #2 (Public Review):


      In the manuscript, Lewis and Hegde present a structural study of the ribosome-bound multipass translocon (MPT) based on re-analysis of cryo-EM single particle data of ribosome-MPTs processing the multipass transmembrane substrate RhoTM2 from a previous publication (Smalinskaitė et al, Nature 2022) and AlphaFold2 multimer modeling. Detailed analysis of the laterally open Sec61 is obtained from PAT-less particles.

      The following major claims are made:

      - TMs can bind similarly to the Sec61 lateral gate as signal peptides.

      - Ribosomal H59 is in immediate proximity to basic residues of TMs and signal peptides, suggesting it may contribute to the positive-inside rule.

      - RAMP4/SERP1 binds to the Sec61 lateral gate and the ribosome near 28S rRNA's helices 47, 57, and 59 as well as eL19, eL22, and eL31.

      - uL22 C-terminal tail binds H24/47 blocking a potential escape route for nascent peptides to the cytosol.

      - TRAP and BOS compete for binding to Sec61 hinge.

      - Calnexin TM binds to TRAPg.

      - NOMO wedges between TRAP and MPT.


      The manuscript contains numerous novel structural analyses and their potential functional implications. While all findings are exciting, the highlight is the discovery of RAMP4/SERP1 near the Sec61 lateral gate. Overall, the strength is the thorough and extensive structural analysis of the different high-resolution RTC classes as well as the expert bioinformatic evolutionary analysis.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (public review and recommendations for the authors):

      Major points:

      (1) The identification of RAMP4 is a pivotal discovery in this paper. The sophisticated AlphaFold prediction, de novo model building of RAMP4's RBD domain, and sequence analyses provide strong evidence supporting the inclusion of RAMP4 in the ribosome-translocon complex structure.

      However, it is crucial to ensure the presence of RAMP4 in the purified sample. Particularly, a validation step such as western blotting for RAMP4 in the purified samples would strengthen the assertion that the ribosome-translocon complex indeed contains RAMP4. This is especially important given the purification steps involving stringent membrane solubilization and affinity column pull-down.

      As suggested, we have added Western blots showing that RAMP4 is retained at secretory translocons (and not multipass translocons) after solubilisation, affinity purification, and recovery of ribosome-translocon complexes (Fig. 3F). This data supports both our assignment of RAMP4 in ribosome-translocon complexes, and also the structure-based proposition that its occupancy is mutually exclusive with the multipass translocon (in particular, the PAT complex).  

      (2) Despite the comprehensive analyses conducted by the authors, it is challenging to accept the assertion that the extra density observed in TRAP class 1 corresponds to calnexin. The additional density in TRAP class 1 appears to be less well-resolved, and the evidence for assigning it as calnexin is insufficient. The extra density there can be any proteins that bind to TRAP. It is recommended that the authors examine the density on the ER lumen side. An investigation into whether calnexin's N-globular domain and P-domain are present in the ER lumen in TRAP class 1 would provide a clearer understanding.

      We agree that the Calnexin assignment is less confident than the other assignments in this manuscript, and that further support would be ideal. We have exhaustively searched our maps for any unexplained density connected with the putative Calnexin TMD, and have found none. This is consistent with Calnexin's lumenal domain being flexibly linked to its TMD, and thus would not be resolved in a ribosome-aligned reconstruction.

      Our assignment of this TMD to Calnexin was based on existing biochemical data (referenced in the paper) favouring this as the best working hypothesis by far: Calnexin is TRAP’s only abundant co-purifying factor, and their interaction is sensitive to point mutations in the Calnexin TMD. Recognising that this is not conclusive, we have ensured that the text and figures consistently describe this assignment as provisional or putative.

      (3) In the section titled 'TRAP competes and cooperates with different translocon subunits,' the authors present a compelling explanation for why TRAP delta defects can lead to congenital disorders of glycosylation. To enhance this explanation, it would be valuable if the authors could provide additional analyses based on mutations mentioned in the references. Specifically, examining whether these mutations align with the TRAP delta-OSTA structure models would strengthen the link between TRAP delta defects and the observed congenital disorders of glycosylation.

      We agree that mapping disease-causing point mutants to the TRAP delta structure could be potentially informative. Unfortunately, the referenced TRAP delta disease mutants act by simply impairing TRAP delta expression, and thus admit no such fine-grained analyses. However, sequence conservation is our next best guide to mutant function. We note in the text that the contact site charges on TRAP delta and RPN2 are conserved, and that the closest-juxtaposed interaction pair (K117 on TRAPδ and D386 on RPN2) is also the most conserved.

      Here are some minor points:

      (1) In the introduction, when the EMC, PAT, and BOS complexes were initially mentioned, it would be beneficial for the authors to provide more context or cite relevant references. This additional information will aid readers in better understanding these complexes, ensuring a smoother comprehension of their significance in the context of the study.

      The Introduction has been edited to provide more context with relevant references. 

      (2) In Figure 7, it would be valuable for the authors to include details on how they sampled the sequence alignments. 

      To clarify this methodological point, we have revised the Figure 7 caption to include these sentences: “The logo plots in panels A and D represent an HMM generated by jackHMMER upon convergence after querying UniProtKB’s metazoan sequences with the human TRAPα sequence. Only signal above background is shown, as rendered by Skylign.org.”

      Reviewer #2 (public review and recommendations for the authors):


      The manuscript contains numerous novel structural analyses and their potential functional implications. While all findings are exciting, the highlight is the discovery of RAMP4/SERP1 near the Sec61 lateral gate. Overall, the strength is the thorough and extensive structural analysis of the different high-resolution RTC classes as well as the expert bioinformatic evolutionary analysis.


      A minor downside of the manuscript is the sheer volume of analyses and mechanistic hypotheses, which makes it sometimes difficult to follow. The authors might consider offloading some analyses based on weaker evidence to the supplement to maximize impact.

      We agree that the manuscript is long, but we have retained what we feel are the most important findings in the main text because the supplement is often undiscoverable via literature searches. Indeed, we chose eLife for its flexibility regarding article length and suitability for extended and detailed analyses. 


      - Figure S1 does not capture the fact that a PAT-free subset of particles is analyzed. The PAT classification step should be added.

      We apologise for having caused some confusion on this point: we do not show a PAT classification step because there was none. Instead we reanalysed the whole dataset with a focus on Sec61 and TRAP. The very little PAT present (9% of particles, per Smalinskaitė et al. 2022) appeared as a very weak density in some of the closed-Sec and weak-TRAP classes.

      - The assignment of calnexin appears highly speculative. As the authors acknowledge, the EM density is clearly of insufficient resolution for identification, and AF2 also does not render orthogonal support for the interpretation. The binding to TRAPg also does not explain complex formation in lower eukaryotes that do not have TRAPg. The authors may consider moving the calnexin assignment and interpretation to the supplement. In any case, it should be referred to as a hypothesis and not a structure.

      We agree that the Calnexin assignment is less confident than the other assignments in this manuscript, and that further support would be ideal. Our assignment of this TMD to Calnexin was based on existing biochemical data (referenced in the paper) favouring this as the best working hypothesis by far: Calnexin is TRAP’s only abundant co-purifying factor, and their interaction is sensitive to point mutations in the Calnexin TMD. Recognising that this is not conclusive, we have ensured that the text and figures consistently describe this assignment as provisional or putative.

      - P. 8: "This extensive competition explains why prior studies found TRAP in only 40% of MPT complexes, but at high occupancy at all other RTCs29". The interpretation is at odds with a recent re-analysis of the same dataset (preprint: Gemmer et al 2023, https://doi.org/10.1101/2023.11.28.569136), which finds TRAP occupancy to negatively correlate with PAT, not BOS.

      The reviewer is correct that the Gemmer study demonstrates a negative correlation between PAT and TRAP occupancy, but it does not, as the reviewer claims, argue against a negative correlation between BOS and TRAP. In fact it agrees that Sec61•BOS•PAT complex would clash with TRAP, and that therefore “BOS could trigger release of TRAP from the multipass translocon.” Thus, there is no conflict between the two studies. The revised text in this passage now cites the Gemmer et al. preprint and clarifies that TRAP is partially displaced by competition with BOS, but retained at the translocon via its ribosome-binding domain.  

      - P. 7/8: the authors suggest that TRAPd may be important for OSTA recruitment and hence TRAPd deletion may cause glycosylation defects in patients by failure to recruit OSTA. However, cryo-ET studies (Pfeffer et al, Nat. Comms 2017) showed that OSTA still binds in patient-derived microsomes (and the OSTA-TRAPd interaction). The author should discuss their model in the light of these data.

      As explained in the text, our hypothesis predicts that TRAPδ is more important for OSTA’s recruitment to the RTC than for its RTC affinity: “OSTA’s attraction to TRAPδ is weak compared to its binding to the ribosome, but TRAPδ may nonetheless help recruit OSTA, since TRAPδ would attract OSTA from most possible angles of approach, whereas OSTA’s ribosome contacts are stereospecific.” Therefore the fact that Pfeffer et al. 2017 found OSTA at some TRAPδ-negative RTCs is not surprising. For confirmation we would look for TRAPδ-dependent glycosylation sites in fast-folding domains or otherwise kinetically sensitive loci, and indeed TRAP-dependence screens return complex profiles that could be consistent with such a mechanism (Phoomak et al. 2021).

      - Some confidence measure for the assignment of SERP1/RAMP4 should be provided, adding support for the claim "The resolution of the RBD density was sufficient for de novo modelling". Indeed, the N-terminal ribosome-bound segment appears well resolved and programs like Modelangelo or FindMySequence should provide a confidence measure for the assignment of the density to SERP1. The TM part appears less well resolved, but the connectivity to the N-terminus may justify the assignment, which should be elaborated on.

      Although we appreciate the value of tools like Modelangelo or FindMySequence, and would have used them if we were resting our assignment of RAMP4 on its RBD alone, we feel that such analyses would be superfluous here. They would quantify only the buildability of RAMP4’s RBD, whereas the real question of RAMP4’s assignability is independently supported by AlphaFold’s confirmation of RAMP4’s TMD as the Sec61-binding density, and further biochemical data provided or cited in the paper.

      - P. 3: "Because PAT complex recruitment and MPT assembly are just beginning, ..." the implicit kinetic model seems to be that the MPT subcomplexes assemble on ribosome and Sec61. What is the evidence for this model and later recruitment of PAT (as opposed to GEL, BOS, and PAT binding pre-assembled)?

      The work of Sundaram et al. (PMID 36261522) established that PAT, GEL and BOS do not coassociate appreciably in the absence of the ribosome-Sec61 complex. This is consistent with the structural data in Smalinskaitė et al. (PMID 36261528), which show that PAT, GEL, and BOS each contact the ribosome (and Sec61 in the case of PAT and BOS), but have few if any specific contacts among themselves. Finally, data in both of these studies show that recruitment of each complex to the RNC is not lost when any of them is missing, arguing that each is capable of independent recruitment to ribosome-Sec61 complexes.

      - p. 4: the meaning of the sentence "Stabilising interactions with this widely conserved motif may help Sec61 respond to its diverse substrates with a consistent open state." is not entirely clear. Published single-particle cryo-EM structures of RTC appear to have resulted in various degrees of openness.

      Here we were referring not to RTC structures in general, but to substrate-engaged RTCs in particular. The two substrate-engaged RTC structures under discussion in this paragraph are nearly identical (Figure 2c) despite large differences in substrate sequence (RhoTM2 vs preprolactin’s SP). We were surprised to find that this engaged structure creates noncovalent bonds between the Sec61 N-half and the ribosome. This bonding would tend to stabilise this particular engaged structure, and this stabilisation helps explain why the newly observed TM-engaged structure is so similar to the previously observed SP-engaged structure. Without this stabilising N-half interaction, one might instead expect to see more variability, such as the reviewer suggests.

      - A recent analysis of heimdallarchaea already hypothesized TRAP in these organisms and should be cited: Eme et al, Nature 618:992-999 (2023). The novel findings of this manuscript compared to Eme et al should be discussed.

      We thank the reviewer for bringing this relevant contemporaneous work to our attention. Reviewing the putative TRAP homologs identified by Eme et al, we find that most do not in fact appear to be TRAP homologs at all, judged by the measures used in our work (reciprocal HHpred queries against the human proteome and predicted structural similarity). This is not surprising since Eme et al. relied on low-threshold sequence similarity searches rather than structural measures. To acknowledge this work, we have added a sentence as follows (italics): “To test whether these candidates are also similar to TRAPαβγ in sequence, we used them to perform reciprocal HHpred queries of the human proteome, and in each case the corresponding human TRAP protein was the top hit (E = 0.031 for TRAPα, 9.4×10⁻¹⁴ for TRAPβ, and 110 for TRAPγ). A contemporaneous study has also claimed to find TRAP homologs in Heimdallarchaeota (Eme et al. 2023), although some caution is warranted in these assignments because they do not seem to share predicted structural similarity to TRAP subunits and do not find human homologs in reciprocal HHpred queries.”

      - Given that the authors expand the evolutionary analysis of TRAP to archaea it would be helpful if sampling for RAMP4 were consistent (i.e., is TRAP present in the early eukaryotes that do not feature RAMP4? Is RAMP4 absent from heimdallarchaea?).

      As stated in the text, RAMP4’s absence from early-branching eukaryotic taxa indicates that it was also absent from their archaeal ancestors. We did of course run such queries for completeness and indeed find no archaeal RAMP4. TRAP, for its part, is generally present in early-branching eukaryotic taxa, as stated in the text, and this necessarily includes those from which RAMP4 is absent.

      - The authors may consider discussing Gemmer et al 2023 (https://doi.org/10.1101/2023.11.28.569136), which comes to similar conclusions for NOMO integration into the MPT.

      We thank the reviewer for bringing this relevant work to our attention. We have added the following sentence to the section on NOMO: “Contemporaneous work has arrived at a similar model for PLD10-12 but did not model PLD1 (Gemmer et al. 2023).”

      - The abundance approximation of RAMP4 in the native translocon by OccuPy should probably be taken with a grain of salt. The '80%' mentioned in the conclusion may stick around and could eventually turn out to be closer to 100%.

      It is certainly possible that the occupancy of RAMP4 is higher than OccuPy estimates. Unfortunately no available method can provide occupancy estimates with confidence intervals. The Western blots we have added to the revised manuscript are consistent with high occupancy, but cannot discriminate between 80% and 100%.


      - p. 5: The following sentence is incomplete: "Together, these factors explain why RAMP4's occupancy in prior cryo-EM maps was low enough to be overlooked, although in hindsight seems to be visible in several7,68,69"

      Thank you for catching this typo. We have revised the sentence as follows: “Together, these factors explain why RAMP4's occupancy in prior cryo-EM maps was low enough to be overlooked, although in hindsight it is visible in several of them.”

    1. eLife assessment

      The manuscript describes a valuable method to boost WNT signaling in a tissue-specific manner. The work extends previous data from the authors based on fusing an RSPO2 mutant protein to an antibody that binds ASGR1/2. In the current manuscript, two new antibodies with similar effects are described, that expand this solid approach and provide alternatives for potential future clinical applications. This manuscript will be of interest to all scientists studying protein engineering and cellular targeting.

    2. Reviewer #1 (Public Review):


      The authors have previously described a way to boost WNT/CTNNB1 signaling in a tissue-specific manner, by directing an RSPO2 mutant protein (RSPO2RA) to a liver-specific receptor (ASGR1/2). This is done by fusing the RSPO2RA to an antibody that binds ASGR1/2.

      Here the authors describe two new antibodies, 8M24 and 8G8, with similar effects. 8M24 shows specificity for ASGR1, while 8G8 has broader affinity for mouse/human ASGR1/2.

      The authors resolve and describe the crystal structure of the hASGR1CRD:8M24 complex and the hASGR2CRD:8G8 complex in great detail, which help explain the specificities of the 8M24 and 8G8 antibodies. Their epitopes are non-overlapping.

      Upon fusion of the antibodies to an RSPO2RA (an RSPO mutant), these antibodies are able to enhance WNT signaling by promoting the ASGR1-mediated clearance of ZNRF3/RNF43, thereby increasing cell surface expression of FZD. This has previously also been shown to be the case for RSPO2RA fused to an anti-ASGR1 antibody 4F3 - and the paper also tests how the antibodies compare to the 4F3 fusion.


      (1) One challenge in treating diseases is that one would like therapeutics to be highly specific - not just in terms of their target (e.g. aimed at a specific protein of interest) but also in terms of tissue specificity (i.e. affecting only tissue X but leaving all others unaffected). This study broadens the collection of antibodies that can be used for this purpose and thus expands a potential future clinical toolbox.

      (2) The authors have addressed questions raised after a first round of review, e.g. by showing that ASGR1 is itself indeed ubiquitinated.


      (1) Some questions remain as to how 8M24 and 8G8 compare to 4F3.

      (2) Some questions remain as to the specificity of the approach: the initial goal was not to also downregulate ASGR1 per se, so this targeting to a specific receptor/membrane protein is not trivial and/or neutral.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):


      The authors demonstrate that ASGR1 is degraded in response to RSPO2RA-antibody treatment through both the proteasomal and the lysosomal pathway, suggesting that this is due to the RSPO2RA-mediated recruitment of ZNRF3/RNF43, which have E3 ubiquitin ligase activity. The paper doesn't show, however, if ASGR1 is indeed ubiquitinated.

      We thank the reviewer for this comment. We have now conducted ASGR1 ubiquitination assays by immunoprecipitation (IP) of ubiquitin in the membrane protein extract, and immunoblotting (IB) ASGR1 after treating HepG2 cells with our SWEETS molecules or controls. The new data demonstrated ubiquitination of ASGR1 with SWEETS treatment (new Fig. S3A and S3B). Additionally, we blocked the potential ubiquitination of ASGR1 by mutating the two lysine residues in the cytoplasmic domain and compared the ASGR1 degradation after SWEETS treatment. The new data show that removing the potential ubiquitylation Lys sites prevented ASGR1 degradation post SWEETS treatment (new Fig. S3C). These new results provide direct evidence that ASGR1 is ubiquitinated to undergo lysosome or proteasome degradation.

      The authors conclude that the RSPO2RA-Ab fusions can act as a targeted protein degradation platform, because they can degrade ASGR. While I agree with this statement, I would argue that the goal of these Abs would not be to degrade ASGR per se. The argumentation is a bit confusing here. This holds for both the results and the discussion section: The authors focus on the dual role of their agents, i.e. on promoting both WNT signaling AND on degrading ASGR1. They might want to reconsider how they present their data (e.g. it may be interesting to target ASGR1, but one would presumably then like to do this without also increasing WNT responsiveness?).

      We thank the reviewer for this comment. As the reviewer states, the initial goal of the RSPO2RA-ab fusions was to generate tissue-specific RSPO mimetics that focus on elimination of E3. As an unintended consequence, we observed enhanced elimination of ASGR as well. While this was unintended, the results did provide POC that when an E3 ligase is brought into proximity of another protein, ubiquitination and degradation of this protein may occur. Additionally, our results highlight that one needs to be careful in fully assessing the impact of bispecific molecules on the intended target as well as unintended targets to understand the potential side effects of such bispecific molecules. We have revised the manuscript to make this more clear, both in the Results and Discussion sections.

      Lines 326-331: The authors use a lot of abbreviations for all of the different protein targeting technologies, but since they are hinting at specific mechanisms, it would be better to actually describe the biological activity of LYTAC versus AbTAC/PROTAB/REULR so non-experts can follow.

      We thank the reviewer for this suggestion. We have added more details in the Discussion to highlight the different mechanisms of the various systems described.

      Can the authors comment on how 8M24 and 8G8 compare to 4F3? The latter seems a bit more specific (ie. lower background activity in the absence of ASGR1 in 5C)? Are there any differences/advances between 8M24 and 8G8 over 4F3? This remains unclear.

      These three antibodies bind different regions/epitopes on ASGR. 8M24 and 8G8 bind non-overlapping epitopes on the carbohydrate recognition domain (CRD), while 4F3 binds the stalk region outside of the CRD. This information is in the Results section of the manuscript. We do not believe that the difference in the ASGR binding epitopes contributes to the slight differences in the background activity. The slight differences may be due to differences in the conformation of the antibodies resulting from the differences in their primary sequences, and these differences may not be significant. We have now repeated the experiments in Fig. 5C and 5D to address the reviewer’s next comment on the axis. These new data (new Fig. 5C and 5D) show less background differences between the molecules.

      Can the authors ensure that the axes are labelled/numbered similarly for Fig 5B-D? This will make it easier to compare 5C and 5D.

      We thank the reviewer for this suggestion. The y-axes in Fig. 5B–D now have the same scale and number format. For Figs. 5C and 5D, we focus on the potency increases of the SWEETS molecules post ASGR1 overexpression.

      Reviewer #2 (Public Review):


      The authors show crystal structures for binding of these antibodies to ASGR1/2, and hypothesize about why specificity is mediated through specific residues. They do not test these hypotheses.

      We thank the reviewer for this comment. We did not further test the residue contributions to binding and specificity as this is not the main focus of the current manuscript. We have revised the section and tuned down the claims for specificity.

      The authors demonstrate in hepatocyte cell lines that these function as mimetics, and that they do not function in HEK cells, which do not express ASGR1. They do not perform an exhaustive screen of all non-hepatocyte cells, nor do they test these molecules in vivo.

      We agree with the reviewer. For the 4F3-based SWEETS molecule, additional in vitro and in vivo specificity characterizations were performed and described in Zhang et al., Sci Rep, 2020. Since 8M24 is human specific and 8G8 only weakly interacts with mouse receptors, in vivo experiments in mouse were not performed. While we did not extensively test the 8M24- and 8G8-based SWEETS on additional cell lines or in vivo, we do believe the data presented strongly support the hepatocyte-specific effects of these molecules.

      Surprisingly, these molecules also induced loss of ASGR1, which the authors hypothesize is due to ubiquitination and degradation, initiated by the E3 ligases recruited to ASGR1. They demonstrate that inhibition of either the proteasome or lysosome abrogates this effect and that it is dependent on E1 ubiquitin ligases. They do not demonstrate direct ubiquitination of ASGR1 by ZNRF3/RNF43.

      We thank the reviewer for this comment. We have now conducted ASGR1 ubiquitination assays by immunoprecipitation (IP) of ubiquitin in the membrane protein extract, and immunoblotting (IB) ASGR1 after treating HepG2 cells with our SWEETS molecules or controls. The new data demonstrate ubiquitination of ASGR1 with SWEETS treatment (new Figs. S3A and S3B). Additionally, we blocked the potential ubiquitination of ASGR1 by mutating the two lysine residues in the cytoplasmic domain and compared the ASGR1 degradation after SWEETS treatment. The new data show that removing the potential ubiquitylation Lys sites prevented ASGR1 degradation post SWEETS treatment (new Fig. S3C). These new results provide direct evidence that ASGR1 is ubiquitinated to undergo lysosome or proteasome degradation.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      There are multiple instances where articles (i.e. the use of "the") are missing.

      We thank the reviewer for this comment. Following the suggestion, the manuscript has gone through a detailed review by an editorial service, and these and other grammatical errors have been corrected.

      Reviewer #2 (Recommendations For The Authors):

      The best I can think of is to inject these into Wnt reporter mice (or maybe humanized mice) and see if the liver lights up while other tissues do not.

      We thank the reviewer for this suggestion. The liver specificity was demonstrated in vivo in our earlier publication (SciRep, 10:13951, 2020) with the 4F3-RSPO2RA molecule. Unfortunately, as the results in this manuscript show, the new ASGR binders 8M24 and 8G8 either do not bind or only weakly interact with mouse receptors. Therefore, the in vivo experiments were not performed here.

      You could also consider addressing some of the statements in the manuscript that are currently hypothetical experimentally.

      We thank the reviewer for this comment. We did not further test the residues’ contribution to binding and specificity as this is not the main focus of the current manuscript. We have revised the section and tuned down the claims for specificity.

      It would be easier to compare the graphs in 5B-D if all Y-axes were the same scale, with the same scientific notation.

      We thank the reviewer for this suggestion. The y-axes in Fig. 5B-D now have the same scale and number format. For Figs. 5C and 5D, we focus on the potency increases of the SWEETS molecules post ASGR1 overexpression.

      Some of the western blots in Figure 6 do not have antibody/target labels, making them harder to interpret.

      All the Western blot antibody/target labels are on the right side of the blots for each panel. We have now made the text bold and thus easier to identify.

      Figure 6 and Supplementary Figure 2 are the same I think.

      Figure 6 and Supplementary Figure 2 show the same experimental set-up performed on two different cell lines: Fig. 6 is on Huh7 cells and Supplementary Fig. 2 is on HepG2 cells. The results from these two cell lines are quite consistent, making their appearance very similar.

      This important research article provides a novel approach to measure imaginal disc growth and uses this approach to explore the roles of Fat and Dachsous, two conserved protocadherins, in late larval development. The authors have addressed all referee concerns and the evidence supporting the authors' findings is, overall, compelling.

    2. Reviewer #1 (Public Review):

      The manuscript presents novel results on the regulation of Drosophila wing growth by the protocadherins Ds and Fat. The manuscript performs a more careful analysis of disc volume, larval size, and the relationship between the two, in normal and mutant larvae, and after localized knockdown or overexpression of Fat and Ds. Not all of the results are equally surprising given the previous work on Fat, Ds, and their regulation of disc growth, pupariation, and the Hippo pathway, but the presentation and detail of the presented data is new. The most novel results concern the scaling of gradients of Fat and Ds protein during development, a largely unstudied gradient of Fat protein, and using overexpression of Ds to argue that changes in the Ds gradient do not underlie the slowing and halting of cell divisions during development.

    3. Reviewer #2 (Public Review):

      This manuscript from Liu et al. examines the role of Fat and Dachsous, two transmembrane proto-cadherins that function both in planar cell polarity and in tissue growth control mediated by the Hippo pathway. The authors developed a new method for measuring growth of the wing imaginal disc during late larval development and then used this approach to examine the effects of disruption of Fat/Dachsous function on disc growth. The authors show that during mid to late third instar the wing imaginal disc normally grows in a linear rather than exponential fashion and that this occurs due to slowing of the mitotic cell cycle as the disc grows during this period. Consistent with their known role in regulating Hippo pathway activity, this slowing of growth is disrupted by loss of Fat/Dachsous function. The authors also observed a previously unreported gradient of Fat protein across the wing blade. However, graded expression of Fat or Dachsous is not necessary for proper growth regulation in the late third instar because ectopic Dachsous expression, which affects gradients of both Dachsous and Fat, has no growth phenotype.

    4. Author response:

      The following is the authors’ response to the original reviews.

      Response to Reviews

      All reviewers were positive about the rigor and impact of our work and offered a number of very helpful suggestions. We have done a number of suggested experiments, whose results have been added to the revision. We have also used their suggestions to improve the clarity and precision with which we describe and interpret our results.

      Reviewer 1 found the paper to be clearly written, with novel results, and the conclusions relevant and solid. This review offered many insights and thoughtful suggestions, which we have adopted to greatly improve the manuscript. The referee’s points are listed below with our responses.

      The study chooses to examine growth only in the prospective wing blade (the "pouch") rather than the wing disc as a whole. This can create biases, as fat and ds manipulations often cause stronger effects on growth, and on Hippo signaling targets, in the adjacent hinge regions of the disc. So I am curious about this choice. 

      Actually, several experiments described in the manuscript measured growth in regions of the wing disc that did not include the pouch (Fig 1 supplement 4). We found that in the second phase of allometric growth, growth of the pouch was greater than growth of the hinge-notum (Fig.1G and Fig 1 supplement 4).  We also looked at the effect of Ds and Fat on growth of the hinge-notum (Fig 4 supplement 1 and Fig 5 supplement 2). Loss of Ds or Fat also affected allometric growth of the pouch differently from their effects on allometric growth of the hinge-notum. We therefore treated analysis of each region independently. Greater focus was given to wing pouch growth because it was in this region that we detected the interesting gradient properties in Fat and Ds expression.

      The limitation to the wing region also creates some problems for the measurements themselves. The division between wing and pouch is not a strict lineage boundary, and thus cells can join or leave this region, creating two different reasons for changes in wing pouch size; growth of cells already in the region, or recruitment of cells into or out of the region. The authors do not discuss the second mechanism.

      We agree with this assessment that pouch growth can occur via lineage-restricted growth or by recruitment of cells into the region. This has now been clarified in the Introduction and the Discussion with discussion of the second mechanism.

      It is not at all clear that the markers for the pouch used by the authors are stable during development. One of these is Vg expression, or the Vg quadrant enhancer. But the Vg-expressing region is thought to increase by recruitment over late second and third instar through a feed-forward mechanism by which Vg-expressing cells induce Vg expression in adjacent cells. In fact, this process is thought to be driven in part by Fat and Ds (Zecca et al 2010). So when the authors manipulate Fat and Ds are they increasing growth or simply increasing Vg recruitment? I would prefer that this limitation be addressed.

      There is the possibility that the feedforward recruitment of disc cells to express Vg leads to some expansion of the measured pouch domain. However, we argue that the recruitment mechanism may not be contributing significantly to the phenomena we measured in this study. 1) We limited our analysis of pouch growth to the third instar stage. In Fig.2, Zecca and Struhl (2007 doi 10.1242/dev.006411) found that recruitment was much stronger in clones induced at first instar rather than third instar, and so they limited their clonal analysis throughout the paper to first instar induced clones. Thus, it is unclear how much the feedforward recruitment mechanism contributes to pouch growth in the mid-to-late third instar. 2) We detected an effect of Ds and Fat on how rapidly the cell cycle slows down over time in pouch cells. The effect is entirely consistent with it having a causal effect on wing pouch growth. For example, nub>Ds(RNAi) causes the average third instar pouch cell to divide ~25% more rapidly than normal, when comparing the slopes in Figure 6. Note that at the beginning of the third instar, the average pouch cell has a similar doubling time whether lacking Ds or not (Figure 6). When we measured the final size of the wing pouch at the end of the third instar, nub>Ds(RNAi) caused the pouch to be ~30% larger than normal (Figure 5). This effect is quite comparable to the effect of Ds RNAi on cell doubling.

      To provide more rigorous evidence that the effect of Fat and Ds on cell cycle dynamics is primarily responsible for the effects on wing growth that we measured, we adapted the simple growth modeling framework from Wartlick et al (2011) and fit it to our cell cycle measurements for the different genotypes. These fits give us estimates for instantaneous cell growth rates over time, and using these estimates, we simulated the theoretical growth trajectory of the entire wing pouch for wildtype and ds / fat RNAi animals. When we compare these model predictions of wing growth to our pouch volume measurements over time, they agree very well with one another. These analyses and results are now discussed in the Results and presented in Fig. 6 supplement 2. Overall, they support a model in which Fat and Ds regulate cell cycle dynamics in the wing pouch during third instar and this effect is primarily responsible for Fat and Ds’s effect on overall wing pouch growth in that timeframe. This does not rule out that Fat and Ds might also affect Vg recruitment at third instar, but such effects must be small relative to the primary effect on the cell cycle. It is feasible that Fat and Ds work via the feedforward mechanism at earlier larval stages. We have now discussed all this in detail in the Discussion, including the limitation posed by recruitment.
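      The reasoning above can be illustrated with a minimal Python sketch. It is not the fitted model presented in Fig. 6 supplement 2, nor the exact Wartlick et al (2011) framework; it only assumes that the doubling time T increases linearly with pouch volume V and integrates dV/dt = (ln 2 / T(V)) V. The function names, starting doubling time, slopes, and time span are hypothetical values chosen for illustration.

```python
import math


def make_doubling_time(t_start, slope, v_start):
    """Return T(V) = t_start + slope * (V - v_start), in hours (assumed linear coupling)."""
    def doubling_time(v):
        return t_start + slope * (v - v_start)
    return doubling_time


def simulate_pouch_growth(v_start, hours, doubling_time, dt=0.01):
    """Forward-Euler integration of dV/dt = (ln 2 / T(V)) * V."""
    v = v_start
    trajectory = [v]
    for _ in range(int(round(hours / dt))):
        growth_rate = math.log(2) / doubling_time(v)  # instantaneous rate, 1/h
        v += growth_rate * v * dt
        trajectory.append(v)
    return trajectory


# Hypothetical parameters: both genotypes start with the same doubling time,
# but the knockdown has a shallower slope (cell cycle slows less as the pouch grows).
V_START = 1.0   # arbitrary volume units
HOURS = 36.0    # assumed duration of the simulated window
wildtype = simulate_pouch_growth(V_START, HOURS, make_doubling_time(8.0, 4.0, V_START))
knockdown = simulate_pouch_growth(V_START, HOURS, make_doubling_time(8.0, 3.0, V_START))

overgrowth_percent = 100.0 * (knockdown[-1] / wildtype[-1] - 1.0)
print(f"final volume, wildtype:  {wildtype[-1]:.2f}")
print(f"final volume, knockdown: {knockdown[-1]:.2f}")
print(f"relative overgrowth:     {overgrowth_percent:.0f}%")
```

      With these assumed parameters, the shallower slope alone is enough to produce a larger final pouch even though both genotypes begin with the same doubling time, which is the qualitative behaviour described in the response.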

      The second pouch marker the authors use is epithelial folding, but this also has problems, as Fat and Ds manipulations change folding. Even in wild type, the folding patterns are complex. For instance, to make folding fit the Vg-QE pattern at late third the authors appear to be jumping in the dorsal pouch between two different sets of folds (Fig 1S2A). The authors also do not show how they use folding patterns in younger, less folded discs, nor provide evidence that the location of the folds are the same and do not shift relative to the cells. They also do not explain how they use folds and measure at later wpp and bpp stages, as the discs unfold and evert, exposing cells that were previously hidden in the folds.

      The primary marker we used for the pouch boundary were the folds. We agree with the reviewer that our original description of how we defined the pouch boundary using the folds was inadequate. We now have substantially expanded the Methods section describing how we defined the boundary at all stages using the folds, including a supplementary figure (Fig 1 supplement 2). Importantly, in our measurements, we did not exclude the pouch regions within the folds but included them (see also the next point). Our microscopy detected fluorescence in the folds, and surface rendering allowed us to visualize fold structure and its contents. In younger discs with less folding, we defined the boundary by the location of the Wg inner ring. The folds were more prominent in older L3 larval discs and in the WPP and later stages since the wings had not fully everted yet. Therefore, we used accepted morphological definitions of the pouch boundary from the literature to define the boundaries. We were able to do so even though, as the reviewer notes, the fold architecture evolves as the larvae age. We agree with the reviewer that defining a boundary based on morphology could be error prone, especially prone to systematic error based on age. It is the main reason we directly compared the morphologically defined boundaries to boundaries defined by the Vg quadrant expression domain for many wing discs across all ages. As seen in Fig 1 supplement 3C, the two methods are in strong agreement with one another for discs of all ages. There is a slight overestimate of the pouch boundary using the morphological method, but the error is small (2.5%) and independent of disc size.  
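      The comparison between the two boundary definitions can be sketched as follows. This is a hypothetical illustration, not the analysis behind Fig 1 supplement 3C: the paired measurements are invented, and the two summary statistics (mean relative error, and the least-squares slope of that error against disc size) are simply one reasonable way to ask whether an overestimate is small and independent of size.

```python
def relative_errors(morphological, reference):
    """Per-disc relative difference between the two boundary definitions."""
    return [(m - r) / r for m, r in zip(morphological, reference)]


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys against xs."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    covariance = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    variance = sum((x - x_mean) ** 2 for x in xs)
    return covariance / variance


# Invented paired pouch sizes (arbitrary units) for five discs of increasing age.
vg_domain_size = [1.0, 2.0, 3.0, 4.0, 5.0]            # reference (Vg quadrant domain)
morphological_size = [1.03, 2.04, 3.08, 4.11, 5.12]   # fold-based boundary

errors = relative_errors(morphological_size, vg_domain_size)
mean_error = sum(errors) / len(errors)
error_vs_size_slope = least_squares_slope(vg_domain_size, errors)

print(f"mean relative overestimate: {100 * mean_error:.1f}%")
print(f"slope of error against size: {error_vs_size_slope:.5f} per unit size")
```

      A slope close to zero indicates that the bias does not grow or shrink with disc size, so it would not distort comparisons across ages.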

      Finally, the authors limit their measurements to cells with exposed apical faces and thus a measurable area but apparently ignore the cells inside the folds. At late third, however, a substantial amount of the prospective wing blade is found within the folds, especially where they are deepest near the A/P compartment boundary. Using the third vein sensory organ precursors as markers, the L3-2 sensillum is found just distal to the fold, the L3-1 and the ACV sensilla are within the fold, and the GSR of the distal hinge is found just proximal to the fold. That puts the proximal half of the central wing blade in the fold, and apparently uncounted in their assays. These cells will however be exposed at wpp and especially bpp stages. How are the authors adjusting for this? 

      We apologize for not describing the methods of measurement thoroughly in the original submission. In fact, we did make measurements of cells located within the folds of the wing pouch at all stages. Z stacks of optical sections were collected that traversed the disc, including the folds. Using surface detection algorithms, we could make spatial measurements (xyz distances and areas) of the material within the folds enveloping the apical pouch. Therefore, we could measure the surface area and volume of the wing pouch that included the folds. This was indeed what we did and reported in the original submission. A much more complete description of the process has now been added to the Methods.

      On the other hand, we could not reliably measure Fat-GFP or Ds-GFP fluorescence intensity in cells deep in the folds due to light scattering. Therefore, we did not assay the entire gradient across the pouch. Of the cells we did measure, we know their relative distance to the center of the pouch, defined as the intersection of the AP and DV boundaries. Therefore, fluorescence intensities could be directly compared across stages since they were calibrated by the centerpoint of the pouch. We have added text to the Methods to clarify this.
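      The idea of referencing every measured cell to the pouch centre can be sketched as below. This is an assumed, simplified scheme for illustration only: the function name, the `(x, y, intensity)` tuples, and the bin width are hypothetical, and the actual procedure is the one described in the Methods.

```python
import math
from collections import defaultdict


def intensity_profile(cells, center, bin_width):
    """Average intensity of cells grouped by distance from the pouch centre.

    cells:  iterable of (x, y, intensity) tuples
    center: (x, y) of the AP/DV boundary intersection
    Returns {bin_start_distance: mean_intensity}, ordered by distance.
    """
    cx, cy = center
    bins = defaultdict(list)
    for x, y, intensity in cells:
        distance = math.hypot(x - cx, y - cy)
        bin_start = int(distance // bin_width) * bin_width
        bins[bin_start].append(intensity)
    return {start: sum(values) / len(values) for start, values in sorted(bins.items())}


# Invented example: four cells, centre at the origin, 5-unit distance bins.
example_cells = [(0, 0, 10.0), (0, 1, 14.0), (3, 4, 20.0), (6, 8, 30.0)]
profile = intensity_profile(example_cells, center=(0, 0), bin_width=5)
print(profile)
```

      Because every profile is expressed relative to the same anatomical landmark, profiles from discs of different ages can be overlaid without first matching their absolute image coordinates.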

      Stabilizing and destabilizing interactions between Fat and Ds- The authors describe a distal accumulation of Fat protein in the wing, and show that this is unlikely to be through Fat transcription. They further try to test whether the distal accumulation depends on destabilization of proximal Fat by proximal Ds by looking at Fat in ds mutant discs. However, the authors do not describe how they take into account the stabilizing effects of heterophilic binding between the extracellular domains (ECDs) of Fat and Ds; without one, the junctional levels and stability of the other is reduced (Ma et al., 2003; Hale et al. 2015). So when they show that the A-P gradient of Fat is reduced in a ds mutant, is this because of the loss of a destabilizing effect of Ds on Fat, as they assume, or is it because all junctional Fat has been destabilized by loss of extracellular binding to Ds? The description of the Fat gradient in Ds mutants is also confusing (see note 6 below), making this section difficult for the reader to follow.

      We did not intend to imply that Ds actively inhibits Fat. We now describe the implications of the result more clearly in the Results and Discussion with reference to the prior Hale and Ma study of heterophilic stabilization. It is worth noting that Ma et al 2003 saw elevated junctional Fat in ds mutant cells if they were surrounded by other ds mutant cells. This is consistent with our results. We also apologize for the confusion in describing the Fat gradient and have reworded the section in the Results to make it more clear.

      The authors do not propose or test a mechanism for the proposed destabilization. Fat and Ds bind not only through their ECDs, but binding has now also been demonstrated through their ICDs (Fulford et al. 2023)

      We now discuss possible mechanisms in the Discussion and include the Fulford reference in the Results.

      Ds gradient scales by volume, rather than cell number - This is an intriguing result, but the authors do not discuss possible mechanisms.

      We have now added discussion of possible mechanisms in the Discussion.

      Fat and Ds are already known to have autonomous effects on growth and Hippo signaling from clonal analyses and localized knockdowns. One novelty here is showing that localized knockdown does not delay pupariation in the way that whole animal knockdown does, although the mechanism is not investigated. Another novelty is that the authors find stronger wing pouch overgrowth after localized ds RNAi or whole disc loss of fat than after localized fat RNAi, the latter being only 11% larger. The fat RNAi result would have been strengthened by testing different fat RNAi stocks, which vary in their strength and are commonly weaker than null mutations, or stronger drivers such as the ap-gal4 they used for some of their ds-RNAi experiments or use of UAS-dcr2. Another reason for caution is that Garoia (2005) found much stronger overgrowth in fat mutant clones, which were about 75% larger than control clones.

      We thank the reviewer for this suggestion. Indeed, the weak effect of Fat RNAi had been due to the specific RNAi driver. We followed the reviewer’s suggestion and tested other RNAi stocks. We had in hand an RNAi driver against GFP that we had found in unrelated studies to be a very potent repressor of GFP expression. Since we had been using a knock-in allele of GFP inserted in frame to Fat throughout this study, we applied nub>Gal4 UAS-GFP RNAi to knock down homozygous Fat-GFP. The effect of the knockdown was very strong, as measured by residual 488nm fluorescence above background autofluorescence after knockdown. Correcting for background autofluorescence, we estimate that only 4.5% of Fat-GFP remained under RNAi conditions (Figure 5 - figure supplement 3). 

      Using the more potent RNAi reagent, we repeated the various experiments related to Fat. We observed a 42% increase in wing pouch growth, which is similar to that of Ds RNAi. We also observed an effect of Fat RNAi on the average cell cycle time of wing pouch cells. There was still a linear coupling between the cell cycle duration and wing pouch size, but the slope of the coupling was smaller with Fat RNAi. This was very similar to what Ds RNAi does to the cell cycle. Therefore, we have replaced the data from the original Fat RNAi experiments with the new data and modified the text throughout the manuscript to describe the new results.
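      The background correction used to estimate knockdown efficiency amounts to a simple ratio, sketched here. The function name and the intensity values are hypothetical and chosen only to make the arithmetic easy to follow; they are not measurements from the study.

```python
def residual_fraction(knockdown_intensity, control_intensity, background_intensity):
    """Fraction of tagged protein remaining after knockdown, after subtracting
    background autofluorescence from both measurements."""
    control_signal = control_intensity - background_intensity
    if control_signal <= 0:
        raise ValueError("control intensity must exceed background")
    knockdown_signal = max(knockdown_intensity - background_intensity, 0.0)
    return knockdown_signal / control_signal


# Invented intensities (arbitrary units): untagged tissue gives the background.
remaining = residual_fraction(
    knockdown_intensity=200.0,
    control_intensity=2100.0,
    background_intensity=100.0,
)
print(f"remaining tagged protein: {100 * remaining:.1f}%")
```

      Without the subtraction, the same invented numbers would suggest 200/2100, or roughly 9.5%, remaining, which shows why uncorrected fluorescence overstates the residual protein when the knockdown is strong.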

      Flattening of Ds gradient does not slow growth. One model suggests that the flattening of the Ds gradient, and thus polarized Ds-Fat binding, account for slowed growth in older discs. The difficulty in the past has been that two ways of flattening the Ds gradient, either removing Ds or overexpressing Ds uniformly, give opposite results; the first increases growth, while the latter slows it. Both experiments have the problem of not just flattening the gradient, but also altering overall levels of Ds-Fat binding, which will likely alter growth independent of the gradients. Here, the authors instead use overexpression to create a strong Ds gradient (albeit a reversely oriented one) that does not flatten, and show that this does not prevent growth from slowing and arresting.

      To make sure that this is not some effect caused by using a reverse gradient, one might instead induce a more permanent normally oriented Ds gradient and see if this also does not alter growth; there is a ds Trojan gal4 line available that might work for this, and several other proximal drivers.

      Again, we thank the reviewer for this suggestion. We followed the reviewer’s suggestion and generated Trojan-Gal4 mediated overexpression of Ds. The Ds protein gradient was strongly amplified by Trojan-Gal4 but remained normally oriented. However, it only caused a modest (12%) increase in wing pouch volume. It did not significantly alter Fat expression dynamics nor the dynamics of cell cycle duration. This new data has been added to the Results (Fig. 7 and Fig 7 supplement 2) and discussed at length in the text.

      Another possible problem is that, unlike previous studies, the authors have not blocked the Four-jointed gradient; Fj alters Fat-Ds binding and might regulate polarity independently of Ds expression. A definitive test would be to perform the tests above in four-jointed mutant discs.

      We examined a fj null mutant (fjp1/d1) and found that it did not alter final wing pouch size (Fig. 2 - figure supplement 3E). Moreover, neither Fat nor Ds expression were altered in the fj mutant (Fig. 2- figure supplement 3C,D). 

      The Discussion of these data should be improved. The authors state in the Discussion "The significance of these dynamics is unclear, but the flattening of the Fat gradient is not a trigger for growth cessation." While the Discussion mentions the effects of Ds on Fat distribution in some detail, this is the only phrase that discusses growth, which is surprising given how often the gradient model of growth control is mentioned elsewhere. The reader would be helped if details are given about what experiment supports this conclusion, the effect on not only growth cessation but cell cycle time, and why the result differs from those of Rogulja 2008 and Willecke 2008 using Ds and Fj overexpression.

      We have rewritten the Discussion to better reflect the results and incorporate the reviewer’s criticisms.

      The authors spend much of the discussion speculating on the possibility that Fat and Ds control growth by changing the wing's sensitivity to the BMP Dpp. As the manuscript contains no new data on Dpp, this is somewhat surprising. The discussion also ignores Schwank (2011), who argues that Fat and Dpp are relatively independent. There have also been studies showing genetic interactions between Fat and signaling pathways such as Wg (Cho and Irvine 2004) and EGF (Garoia 2005).

      We have modified the discussion to be more inclusive of mechanisms connecting Fat and other signaling pathways, and we deleted some of the speculation about Dpp. However, since Dpp is the only known growth factor whose local concentration linearly scales with average cell doubling time (the process we found Ds/Fat regulates), there is a logical connection that readers deserve to know about. Therefore, we have retained some discussion of the hypothesis that the two might be linked through cell cycle duration. It is for future studies to test that hypothesis as it is beyond the scope of this paper.

      That said, there are studies that discount the work of Wartlick’s Dpp model, eg. Schwank et al 2012, arguing that Dpp regulates growth permissively by limiting an antigrowth factor, Brinker. We have added this reference and the others in the Discussion to discuss alternative models where Fat/Ds act in parallel to Dpp. 

      Wpp and Bpp- First, the charts treat wpp as if it is a fixed number of hours after 5 day larvae, but this will not be true in fat and ds mutants with extended larval life. This should be mentioned.

      We have clarified this distinction in the figure legends.

      How are the authors limiting bpp to 1 hr from wpp? Prepupa are brown and lack air bubbles, but that spans 5 hours of disc changes from barely everted to fully wing-like.

      We deliberately chose 1 hour post WPP because we wanted to measure final wing volume with minimal eversion. We agree with the reviewer’s concerns about calling this BPP, and we now call it WPP+1.

      "However, growth of the wing pouch ceased at the larva-pupa molt and its size remained constant".

      The transition from late third to wpp shown in the figure is not the pupal molt. Unlike in most insects, in Drosophila the larval cuticle is not molted away, it is remodeled during pupariation into the prepupal case. The pupal cuticle is not formed until 6 hr APF, which is why the initial stages are termed pre-pupal. Also, there is at least one more set of cell divisions that occur in later pupal stages (for instance, see recent work from the Buttitta lab).

      We have changed the reference of pupal molt to larva-prepupal transition throughout the manuscript.

      "In contrast, the notum-hinge exhibited simpler linear-like positive allometric growth (Fig. 1 - figure supplement 3C)"

      This oversimplifies, as there is still a strong inflection after the third time point, albeit not as large as with the wing because there is less notal growth.

      We have reworded the text as suggested. 

      "whereas at the WPP stage, dividing cells were only found in a narrow zone where sensory organ precursor cells undergo two divisions to generate future sensory organs (Fig. 1 - figure supplement 4C-E)."

      While there are more dividing cells at the anterior D/V, which will form sensory bristles, there are also dividing cells elsewhere, including in the posterior and scattered through the pouch, where there are no sensory precursors. Sensory organs are limited to the wing margin and the very few campaniform sensilla found on the prospective third vein. The Sens-GFP shown here, meant to identify sensory precursors, does not look much like the Sens expression in Nolo et al 2000. Anterior is on the left in 1S4A-D, but on the right in E.

      We thank the reviewer for this observation. Indeed, the Sens-GFP signal in the figure is too broad. This was owing to bleed-through of the PHH3 signal. Since the pattern of dividing cells at the WPP stage has been so well characterized in the literature, as has the pattern of Sens+ cells at that stage (ie, Nolo et al 2000), we have removed these panels and now simply cite the relevant literature.  

      "The gradient was asymmetric along the AP axis, being lower at the A margin than the P margin."

      The use of "margin" here is a bit confusing, as the term is usually used to describe the wing margin; that is, the D/V compartment boundary in the disc that forms the edge of the wing. Can the authors use a different term? It would also be helpful to point out that the A and P extremes are also, because of the geometry of the disc, the prospective proximal portions of the wing margin, and the hinge, especially since the authors are including the regions proximal to the most distal fold.

      We have reworded it as suggested.   

      The graphed loss of the Fat A-P gradient between day 5 third and wpp is dramatic. Given that the changes in folding at wpp might alter which cells are being graphed, can the authors show a photo?

      We have now included a photo of Fat-GFP at WPP in Fig 2 - figure supplement 2E.

      "Since Ds levels are highest and most steep near the margins, perhaps Ds inhibits Fat expression in a dose- or gradient-dependent manner. We also followed Fat-GFP dynamics in the ds mutant. We did not observe the progressive flattening of the Fat-GFP profile to the WPP wing (Fig. 2 - figure supplement 3A). Instead, the Fat-GFP profile was graded at the WPP stage and flattened somewhat more by the BPP stage (Fig. 2 - figure supplement 3B)."

      This description does not tell the reader if there is any less grading of Fat in the ds mutant compared with wild type; instead, it sounds like it is more graded, as gradation continues at wpp. This would then contradict the hypothesis that proximal Ds is required to create the distal Fat gradient.

      The Fat signals for the two genotypes are directly comparable as the samples were imaged together with the same microscope settings. Fig 2M shows that the Fat gradient in the ds mutant is less graded than in the wildtype. We have reworded the text to make this clearer. What differs at WPP is that this graded expression persists longer, not that the level of gradation is greater. The reason for this is not understood.

      The figure, on the other hand, looks like Fat is less graded, although as noted above this could instead be caused by loss of the stable Ds-bound Fat normally found at junctions. 

      Fig 2M shows an increase in Fat levels at the proximal regions of the ds mutant pouch, where Ds is normally most concentrated. This makes the overall profile look less graded. 

      Confusingly, in the Discussion the authors state: "Loss of Ds affects the Fat gradient such that distribution of Fat is uniformly upregulated to peak levels." There is no mention of "peak levels" in the Results, and no mention of "graded" expression in the Discussion. I am unclear on how the absolute levels are being determined and would be surprised if there were peak levels after loss of Ds-bound Fat from junctions.

      The absolute levels between the genotypes were determined by carefully calibrated fluorescence of Fat-GFP from samples imaged at the same time with the same settings. We used the word peak to refer to the highest level of Fat-GFP within a given gradient profile. Clearly, the description is confusing and so we have deleted the word and modified the text to clarify the meaning.

      "Interestingly, the reversed Ds gradient caused a change in the Fat gradient (Fig. 7E). Its peak also became skewed to the anterior and did not normally flatten at the WPP stage."

      This result contradicts the authors' earlier model that proximal Ds destabilizes Fat. Instead, the result fits the stabilization of Fat caused by binding to endogenous or overexpressed Ds or Ds ECD (Ma et al. 2003; Matakatsu and Blair, 2004; 2006; Hale et al. 2015).

      We agree that the reversed Ds affects Fat differently than the loss-of-function ds phenotype. We were not intending to propose a model based on the ds mutant, but a simple interpretation of the result. The reversed Ds experiment generates on its own a simple interpretation that is not consistent with the other. This speaks to the complexity of the system. We have changed the text in the Results to make this less confusing.

      Reviewer 2 found the paper to provide insights into normal growth of the wing and useful tools for measurement of growth features. This review offered many insights and thoughtful suggestions, which we have adopted to greatly improve the manuscript. The referee’s points are listed below with our responses.

      Although the approach used to measure volume is new to this study, the basic finding that imaginal disc growth slows at the mid-third instar stage has been known for some time from studies that counted disc cell number during larval development (Fain and Stevens, 1982; Graves and Schubiger, 1982). Although these studies did not directly measure disc volume, because cell size in the disc is not known to change during larval development, cell number is an accurate measure of tissue volume. However, it is worth noting that the approach used here does potentially allow for differential growth of different regions of the disc.

      We had cited the older literature in reference to our results. We have now noted the approach’s usefulness in measuring different disc regions such as the pouch.

      Related to point 1, a main conclusion of this study, that cell cycle length scales with growth of the wing, is based on a developmentally limited analysis that is restricted to the mid-third instar larval stage and later (early third instar begins at 72 hr - the authors' analysis started at 84 hr). The previous studies cited above made measurements from the beginning of the 3rd instar and combined them with previous histological analyses of cell numbers starting at the beginning of the 2nd instar. Interestingly, both studies found that cell number increases exponentially from the start of the 2nd instar until mid-third instar, and only after that point does the cell cycle slow resulting in the linear growth reported here. The current study states that growth is linear due to scaling of cell cycle with disc size as though this is a general principle, but from the earlier studies, this is not the case earlier in disc development and instead applies only to the last day of larval life.

      We apologize for not making this distinction clearer in the original manuscript. Indeed, growth is initially exponential and shifts to a more linear-like regime in the mid third instar. Our focus in the manuscript is primarily this latter phase. We have now rewritten the text in the Introduction, Results and Discussion to make this very clear. 

      While cell number and pouch volume increase exponentially from the start of the 2nd instar, the cell cycle already begins to slow down during the 2nd instar, as found with mitotic index measurements done by Wartlick et al 2011. Using their data to model cell cycle duration as a function of pouch area, we find that during the 2nd instar, cell cycle duration also increases as the size of the wing pouch increases. This is shown in the figure (panel C) below. Note that this relationship appears nonlinear and is quantitatively distinct from the relationship for third instar wing growth.

      Author response image 1.

      The analysis of the roles of Fat and Dachsous presented here has weaknesses that should be addressed. It is very curious that the authors found that depletion of Fat by RNAi in the wing blade had essentially no effect on growth while depletion of Dachsous did, given that the loss of function overgrowth phenotype of null mutations in fat is more severe than that of null mutations in dachsous (Matakatsu and Blair, 2006). An obvious possibility is that the Fat RNAi transgene employed in these experiments is not very efficient. The authors tried to address this by doubling the dose of the transgene, but it is not clear to me that this approach is known to be effective. The authors should test other RNAi transgenes and additionally include an analysis of growth of discs from animals homozygous for null alleles, which as they note survive to the late larval stages.

      We thank the reviewer for this suggestion. Indeed, the weak effect of Fat RNAi had been due to the specific RNAi driver. We followed the reviewer’s suggestion and tested other RNAi stocks. We had in hand an RNAi driver against GFP that we had found in unrelated studies to be a very potent repressor of GFP expression. Since we had been using a knock-in allele of GFP inserted in frame to Fat throughout this study, we applied nub>Gal4 UAS-GFP RNAi to knock down homozygous Fat-GFP. The effect of the knockdown was very strong, as measured by remaining 488nm fluorescence above background fluorescence after knockdown. Correcting for background fluorescence, we estimated that only 4.5% of Fat-GFP remained under RNAi conditions (Figure 5 - figure supplement 3). 

      Using the more potent RNAi reagent, we repeated the various experiments related to Fat. We observed a 42% increase in wing pouch growth, which is similar to that of Ds RNAi. We also observed an effect of Fat RNAi on the average cell cycle time of wing pouch cells. There was still a linear coupling between the cell cycle duration and wing pouch size, but the slope of the coupling was smaller with Fat RNAi. This was very similar to what Ds RNAi does to the cell cycle. Therefore, we have replaced the data from the original Fat RNAi experiments with the new data and modified the text throughout the manuscript to describe the new results.

      It is surprising that the authors detect a gradient of Fat expression that has not been seen previously given that this protein has been extensively studied. It is also surprising that they find that expression of Nubbin Gal4 is graded across the wing blade given that previous studies indicate that it is uniform (ie. Martín et al. 2004). These two surprising findings raise the possibility that the quantification of fluorescence could be inaccurate. The curvature of the wing blade makes it a challenging tissue to image, particularly for quantitative measurements.

      Fat protein expression not being uniform has been observed before but not carefully quantified (see Mao et al., 2009; Strutt and Strutt, 2002). Martin et al. 2004 (doi 10.1242/dev.013) claimed that Nub-Gal4 is uniform without actually measuring it. Please consult Fig 1A and 2A in their paper, which clearly show stronger expression in the center/distal region of the pouch.

      Regarding systematic errors in quantification, we took great pains to minimize them. We carefully divided the complex folded disc’s z stack into an apical region of interest (ROI) that included the distal domain of the wing pouch and a basal ROI that included the folds encompassing the pouch. We then used a published and widely used surface detection algorithm (ImSAnE) that captures a 3D ROI that can be curved and complex in shape (in z space) because the user creates a surface spline of the ROI. The resulting output treats the ROI as a virtual 2D object. This obviates the need to perform max projections of confocal stacks, which often create the artifacts that the reviewer speaks of. ImSAnE eliminates such artifacts, and it is the gold standard for image processing of ROIs with 3D curvature.

      Moreover, our pipeline does detect uniform expression if it is there. We used a da-Gal4 driver in Fig. 2K,L - this driver is widely acknowledged to be uniformly expressed in tissues of the fly. When it drives a control fluorescent marker (Bazooka-mCherry), our analysis pipeline detects a uniform expression pattern across the wing pouch (Fig. 2L). When the same Gal4 transgene drives Fat-HA in the same tissue, our pipeline detects a graded expression pattern of Fat-HA (Fig. 2L). In fact, this experiment co-expressed both Fat-HA and the control marker in the same disc. Thus, we feel confident that our analysis is not inaccurate.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      This is a valuable study that develops a new model of the way muscle responds to perturbations, synthesizing models of how it responds to small and large perturbations, both of which are used to predict how muscles function for stability but also how they can be injured, and which tend to be predicted poorly by classic Hill-type models. The evidence presented to support the model is solid, since it outperforms Hill-type models in a variety of conditions. Although the combination of phenomenological and mechanistic aspects of the model may sometimes make it challenging to interpret the output, the work will be of interest to those developing realistic models of the stability and control of movement in humans or other animals.

      Reviewer #1 (Public Review):

      Muscle models are important tools in the fields of biomechanics and physiology. Muscle models serve a wide variety of functions, including validating existing theories, testing new hypotheses, and predicting forces produced by humans and animals in health and disease. This paper attempts to provide an alternative to Hill-type muscle models that includes contributions of titin to force enhancement over multiple time scales. Due to the significant limitations of Hill-type models, alternative models are needed and therefore the work is important and timely.

      The effort to include a role for titin in muscle models is a major strength of the methods and results. The results clearly demonstrate the weaknesses of Hill models and the advantages of incorporating titin into theoretical treatments of muscle mechanics. Another strength is to address muscle mechanics over a large range of time scales.

      The authors succeed in demonstrating the need to incorporate titin in muscle models, and further show that the model accurately predicts in situ force of cat soleus (Kirsch et al. 1994; Herzog & Leonard, 2002) and rabbit psoas myofibrils (Leonard et al. 2010). However, it remains unclear whether the model will be practical for use with data from different muscles or preparations. Several ad hoc modifications were described in the paper, and the degree to which the model requires parameter optimization for different muscles, preparations and experiment types remains unclear.

      I think the authors should state how many parameters require fitting to the data vs the total number of model parameters. It would also be interesting for the authors to discuss challenges associated with modeling ex vivo and in vivo data sets, due to differences in means of stimulation vs. model inputs.

      (1) I think the authors should state how many parameters require fitting to the data vs the total number of model parameters.

      The model parameters are listed in Table 1. Each parameter also lists references for the source of its data (if one exists), along with how the data were used ('C' for calculated, 'F' for fit, 'E' for estimated, or 'S' for scaled) in the specific simulations that appear in this paper. While this is a daunting number of parameters, only a few of them must be updated when modeling a new musculotendon.

      Similar to a Hill-type muscle model, at least 5 parameters are needed to fit the VEXAT model to a specific musculotendon: maximum isometric force (fiso), optimal contractile element (CE) length, pennation angle, maximum shortening velocity, and tendon slack length. However, similar to a Hill model, it is only possible to use this minimal set of parameters by making use of default values for the remaining set of parameters. The defaults we have used have been extracted from mammalian muscle (see Table 1) and may not be appropriate for modeling muscle tissue that differs widely in terms of the ratio of fast/slow twitch fibers, titin isoform, temperature, and scale.

      Even when these defaults are appropriate, variation is the rule for biological data rather than the exception. It will always be the case that the best fit can only be obtained by fitting more of the model’s parameters to additional data. Standard measurements of the active force-length relation, passive force-length relation, and force-velocity relations are quite helpful to improve the accuracy of the model to a specific muscle. It is challenging to improve the fit of the model’s cross-bridge (XE) and titin models because the data required are so rare. The experiments of Kirsch et al., Prado et al., and Trombitas et al. are unique to our knowledge. However, if more data become available, it is relatively straightforward to update the model’s parameters using the methods described in Appendix B or the code that appears online (https://github.com/mjhmilla/Millard2023VexatMuscle).

      We have modified the manuscript to make it clear that, in some circumstances, the burden of parameter identification for the VEXAT model can be as low as a Hill model:

      - Section 3: last two sentences of the 2nd paragraph, found at: Page 10, column 2, lines 1-12 of MillardFranklinHerzog v3.pdf and 05 MillardFranklinHerzog v2 v3 diff.pdf

      - Table 1: last two sentences of the caption, found at: Page 11 of MillardFranklinHerzog v3.pdf and 05 MillardFranklinHerzog v2 v3 diff.pdf

      (2) It would also be interesting for the authors to discuss challenges associated with modeling ex vivo and in vivo data sets, due to differences in means of stimulation vs. model inputs.

      All of the experiments simulated in this work are in-situ or ex-vivo. So far the main challenges of simulating any experiment have been quite consistent across both in-situ and ex-vivo datasets: there are insufficient data to fit most model parameters to a specific specimen and, instead, defaults from the literature must be used. In an ideal case, a specimen would have roughly ten extra trials collected so that the maximum isometric force, optimal fiber length, active force-length relation, passive force-length relation (up to ≈ $0.6 f_o^M$), and the force-velocity relations could be identified from measurements rather than relying on literature values. Since most lab specimens are viable for a small number of trials (with the exception of cat soleus), we don’t expect this situation to change in future.

      However, if data are available, the fitting process is fairly straightforward for either in-situ or ex-vivo data: use a standard numerical method (for example non-linear least squares, or the bisection method) to adjust the model parameters to reduce the errors between simulation and experiment. The main difficulty, as described in the previous paragraph, is the availability of data to fit as many parameters as possible for a specific specimen. As such, the fitting process really varies from experiment to experiment and depends mainly on the richness of measurements taken from a specific specimen, and from the literature in general.
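
      As an illustration of the bisection approach mentioned above, the sketch below fits a single parameter of a toy passive force-length curve to synthetic measurements. Everything in it is an assumption made for illustration: the exponential curve, the parameter names, and the data are hypothetical and are not the VEXAT curves or the fitting code in the repository.

```python
import math


def passive_force(norm_length, shift, stiffness=5.0):
    """Toy exponential passive force-length curve (hypothetical, not the VEXAT curve)."""
    stretch = norm_length - shift
    return math.exp(stiffness * stretch) - 1.0 if stretch > 0.0 else 0.0


def mean_error(shift, lengths, measured):
    """Mean signed error between the toy model and the measurements."""
    errors = [passive_force(l, shift) - f for l, f in zip(lengths, measured)]
    return sum(errors) / len(errors)


def fit_shift_bisection(lengths, measured, lo, hi, tol=1e-9, max_iter=200):
    """Find the curve shift at which the mean signed error crosses zero."""
    err_lo = mean_error(lo, lengths, measured)
    err_hi = mean_error(hi, lengths, measured)
    if err_lo * err_hi > 0.0:
        raise ValueError("bracket [lo, hi] does not contain a sign change")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        err_mid = mean_error(mid, lengths, measured)
        if err_lo * err_mid <= 0.0:
            hi = mid                      # root lies in the lower half
        else:
            lo, err_lo = mid, err_mid     # root lies in the upper half
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Synthetic "measurements" generated from a known shift, for illustration only.
lengths = [1.1, 1.2, 1.3, 1.4, 1.5]
true_shift = 1.0
measured = [passive_force(l, true_shift) for l in lengths]

fitted_shift = fit_shift_bisection(lengths, measured, lo=0.8, hi=1.2)
print(f"fitted shift: {fitted_shift:.6f}")
```

      Bisection only handles one parameter with a monotonic error measure, which is why it suits a single curve shift or scale. Fitting several parameters at once would call for a least-squares routine instead.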

      Working from in-vivo data presents an entirely different set of challenges. When working with human data, for example, it’s just not possible to directly measure muscle force with tendon buckles, and so it is never completely clear how force is distributed across the many muscles that typically actuate a joint. Further, there is also uncertainty in the boundary condition of the muscle because optical motion capture markers will move with respect to the skeleton. Video fluoroscopy offers a method of improving the accuracy of measured boundary conditions, though only for a few labs due to its great expense. A final boundary condition remains impossible to measure in any case: the geometry and forces that act at the boundaries as muscle wraps over other muscles and bones. Fitting to in-vivo data is very difficult.

      While this is an interesting topic, it is tangential to our already lengthy manuscript. Since these reviews are public, we’ll leave it to the motivated reader to find this text here.

      Reviewer #2 (Public Review):

      This model of skeletal muscle includes springs and dampers which aim to capture the effect of crossbridge and titin stiffness during the stretch of active muscle. While both crossbridge and titin stiffness have previously been incorporated, in some form, into models, this model is the first to simultaneously include both. The authors suggest that this will allow for the prediction of muscle force in response to short-, mid- and long-range stretches. All these types of stretch are likely to be experienced by muscle during in vivo perturbations, and are known to elicit different muscle responses. Hence, it is valuable to have a single model which can predict muscle force under all these physiologically relevant conditions. In addition, this model dramatically simplifies sarcomere structure to enable this muscle model to be used in multi-muscle simulations of whole-body movement.

      In order to test this model, its force predictions are compared to 3 sets of experimental data which focus on short-, mid- and long-range perturbations, and to the predictions of a Hill-type muscle model. The choice of data sets is excellent and provides a robust test of the model’s ability to predict forces over a range of length perturbations. However, I find the comparison to a Hill-type muscle model to be somewhat limiting. It is well established that Hill-type models do not have any mechanism by which they can predict the effect of active muscle stretch. Hence, that the model proposed here represents an improvement over such a model is not a surprise. Many other models, some of which are also simple enough to be incorporated into whole-body simulations, have incorporated mechanistic elements which allow for the prediction of force responses to muscle stretch. And it is not clear from the results presented here that this model would outperform such models.

      The paper begins by outlining the phenomenological vs mechanistic approaches taken to muscle modelling, historically. It appears, although is not directly specified, that this model combines these approaches. A somewhat mechanistic model of the response of the crossbridges and titin to active stretch is combined with a phenomenological implementation of force-length and force-velocity relationships. This combination of approaches may be useful for improving the accuracy of predictions of muscle models and whole-body simulations, which is certainly a worthy goal. However, it also may limit the insight that can be gained. For example, it does not seem that this model could reflect any effect of active titin properties on muscle shortening. In addition, it is not clear to me, either physiologically or in the model, what drives the shift from the high stiffness in short-range perturbations to the somewhat lower stiffness in mid-range perturbations.

      (1) It is well established that Hill-type models do not have any mechanism by which they can predict the effect of active muscle stretch.

      While many muscle physiologists are aware of the limitations of the Hill model, these limitations are not so well known among computational biomechanists. There are at least two reasons for this gap: there are few comprehensive evaluations of Hill models against several experiments, and some of the differences are quite nuanced. For example, active lengthening experiments can be replicated reasonably well using a Hill model if the lengthening is done on the ascending limb of the force length curve. Clearly the story is quite different on the descending limb as shown in Figure 9. Similarly, as Figure 8 shows, by choosing the right combination of tendon model and perturbation bandwidth it is possible to get reasonably accurate responses from the Hill model to stochastic length changes. Yet when a wide variety of perturbation bandwidths, magnitudes, and tendon models are tested it is clear that the Hill model cannot, in general, replicate the response of muscle to stochastic perturbations. For these reasons we think many of the Hill model’s drawbacks have not been clearly understood by computational biomechanists for many years now.

      (2) Many other models, some of which are also simple enough to be incorporated into whole-body simulations, have incorporated mechanistic elements which allow for the prediction of force responses to muscle stretch. And it is not clear from the results presented here that this model would outperform such models.

      We agree that it will be valuable to benchmark other models in the literature using the same set of experiments. Hopefully we, or perhaps others, will have the good fortune to secure research funding to continue this benchmarking work. This will, however, be quite challenging: few muscle models are accompanied by a professional-quality open-source implementation. Without such an implementation it is often impossible to reproduce published results let alone provide a fair and objective evaluation of a model.

      (3) For example, it does not seem that this model could reflect any effect of active titin properties on muscle shortening.

      The titin model described in the paper will provide an enhancement of force during a stretch-shortening cycle. This certainly would be an interesting next experiment to simulate in a future paper.

      (4) In addition, it is not clear to me, either physiologically or in the model, what drives the shift from the high stiffness in short-range perturbations to the somewhat lower stiffness in mid-range perturbations.

      We can only respond to what drives the frequency dependent stiffness in the model, though we’re quite interested in what happens physiologically. Hopefully some new experiments will be done to examine this phenomenon in the future. In the case of the model, the reasons are fairly straightforward: the formulation of Eqn. 16 is responsible for this shift.

      Equation 16 has been formulated so that the acceleration of the attachment point of the XE is driven by the force difference between the XE and a reference Hill model (numerator of the first term in Eqn. 16) which is then low pass filtered (denominator of the first term in Eqn. 16). Due to this formulation the attachment point moves less when the numerator is small, or when the differences in the numerator change rapidly and effectively become filtered out. When the attachment point moves less, more of the CE’s force output is determined by variations in the length of the XE and its stiffness.

      On the other hand, the attachment point will move when the numerator of the first term in Eqn. 16 is large, or when those differences are not short lived. When the attachment point moves to reduce the strain in the XE, the force produced by the XE’s spring-damper is reduced. As a result, the CE’s force output is less influenced by variations of the length of the XE and its stiffness.
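
      The filtering behaviour described in the two paragraphs above can be illustrated with a generic first-order low-pass filter. This sketch is not Eqn. 16 and does not reproduce the model's dynamics. The filter form, time constant, and input signals are all hypothetical, and it only shows why a low-pass-filtered driving signal responds weakly to short-lived inputs and strongly to sustained ones.

```python
def low_pass(signal, dt, time_constant):
    """First-order low-pass filter, tau * dy/dt = u - y, integrated with explicit Euler."""
    alpha = dt / time_constant
    y = 0.0
    filtered = []
    for u in signal:
        y += alpha * (u - y)
        filtered.append(y)
    return filtered


dt = 0.001             # s, hypothetical time step
time_constant = 0.05   # s, hypothetical filter time constant
n_steps = 500

# A force difference that lasts 5 ms versus one that persists for 500 ms.
brief_difference = [1.0 if i < 5 else 0.0 for i in range(n_steps)]
sustained_difference = [1.0] * n_steps

peak_brief = max(low_pass(brief_difference, dt, time_constant))
peak_sustained = max(low_pass(sustained_difference, dt, time_constant))

print(f"peak response to brief input:     {peak_brief:.3f}")
print(f"peak response to sustained input: {peak_sustained:.3f}")
```

      In this toy setting the brief input is largely filtered out, which is analogous to the attachment point barely moving during short-range perturbations. The sustained input passes through almost unattenuated, which is analogous to the attachment point moving to relieve strain during longer perturbations.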

      Reviewer #2 (Recommendations for the Authors):

      I find the clarity of the manuscript to be much improved following revision. While I still find the combination of phenomenological and mechanistic approaches to be a little limiting with regards to our understanding of muscle contraction, the revised description of small length changes makes the interpretation much less confusing.

      Similarly, while I agree that Hill-type models are widely used their limitations have been addressed extensively and are very well established. Hence, moving forward I think it would be much more valuable to start to compare these newer models to one another rather than just showing an improvement over a Hill model under (very biologically important) conditions which that model has no capacity to predict forces.

      (1) While I still find the combination of phenomenological and mechanistic approaches to be a little limiting with regards to our understanding of muscle contraction ...

      We have had to abstract some of the details of reality to have a model that can be used to simulate hundreds of muscles. In contrast, FiberSim produced by Kenneth Campbell’s group uses much less abstraction and might be of greater interest to you. FiberSim’s models include individual cross-bridges, titin molecules, and an explicit representation of the spatial geometry of a sarcomere. While this model is a great tool for testing muscle physiology questions through simulation, it is computationally expensive to use this model to simulate hundreds of muscles simultaneously.

      Kosta S, Colli D, Ye Q, Campbell KS. FiberSim: A flexible open-source model of myofilament-level contraction. Biophysical journal. 2022 Jan 18;121(2):175-82. https://campbell-muscle-lab.github.io/FiberSim/

      (2) Similarly, while I agree that Hill-type models are widely used their limitations have been addressed extensively and are very well established.

      Please see our response 1 to Reviewer #1.

      (3) Hence, moving forward I think it would be much more valuable to start to compare these newer models to one another rather than just showing an improvement over a Hill model under (very biologically important) conditions which that model has no capacity to predict forces.

      Please see our response 2 to Reviewer #1.

    2. eLife assessment

      This is a valuable study that develops a new model of the way muscle responds to perturbations, synthesizing models of how it responds to small and large perturbations, both of which are used to predict how muscles function for stability but also how they can be injured, and which tend to be predicted poorly by classic Hill-type models. The evidence presented to support the model is solid, since it outperforms Hill-type models in a variety of conditions. Although the combination of phenomenological and mechanistic aspects of the model may sometimes make it challenging to interpret the output, the work will be of interest to those developing realistic models of the stability and control of movement in humans or other animals.

    3. Reviewer #1 (Public Review):

      Muscle models are important tools in the fields of biomechanics and physiology. Muscle models serve a wide variety of functions, including validating existing theories, testing new hypotheses, and predicting forces produced by humans and animals in health and disease. This paper attempts to provide an alternative to Hill-type muscle models that includes contributions of titin to force enhancement over multiple time scales. Due to the significant limitations of Hill-type models, alternative models are needed and therefore the work is important and timely.

      The effort to include a role for titin in muscle models is a major strength of the methods and results. The results clearly demonstrate the weaknesses of Hill models and the advantages of incorporating titin into theoretical treatments of muscle mechanics. Another strength is to address muscle mechanics over a large range of time scales.

      The authors succeed in demonstrating the need to incorporate titin in muscle models, and further show that the model accurately predicts in situ force of cat soleus (Kirsch et al. 1994; Herzog & Leonard, 2002) and rabbit psoas myofibrils (Leonard et al. 2010). However, it remains unclear whether the model will be practical for use with data from different muscles or preparations. Several ad hoc modifications were described in the paper, and the degree to which the model requires parameter optimization for different muscles, preparations and experiment types remains unclear.

    4. Reviewer #2 (Public Review):

      This model of skeletal muscle includes springs and dampers which aim to capture the effect of crossbridge and titin stiffness during the stretch of active muscle. While both crossbridge and titin stiffness have previously been incorporated, in some form, into models, this model is the first to simultaneously include both. The authors suggest that this will allow for the prediction of muscle force in response to short-, mid- and long-range stretches. All these types of stretch are likely to be experienced by muscle during in vivo perturbations, and are known to elicit different muscle responses. Hence, it is valuable to have a single model which can predict muscle force under all these physiologically relevant conditions. In addition, this model dramatically simplifies sarcomere structure to enable this muscle model to be used in multi-muscle simulations of whole-body movement.

      In order to test this model, its force predictions are compared to 3 sets of experimental data which focus on short-, mid- and long-range perturbations, and to the predictions of a Hill-type muscle model. The choice of data sets is excellent and provides a robust test of the model's ability to predict forces over a range of length perturbations. However, I find the comparison to a Hill-type muscle model to be somewhat limiting. It is well established that Hill-type models do not have any mechanism by which they can predict the effect of active muscle stretch. Hence, that the model proposed here represents an improvement over such a model is not a surprise. Many other models, some of which are also simple enough to be incorporated into whole-body simulations, have incorporated mechanistic elements which allow for the prediction of force responses to muscle stretch. And it is not clear from the results presented here that this model would outperform such models.

      The paper begins by outlining the phenomenological vs mechanistic approaches taken to muscle modelling, historically. It appears, although is not directly specified, that this model combines these approaches. A somewhat mechanistic model of the response of the crossbridges and titin to active stretch is combined with a phenomenological implementation of force-length and force-velocity relationships. This combination of approaches may be useful for improving the accuracy of predictions of muscle models and whole-body simulations, which is certainly a worthy goal. However, it also may limit the insight that can be gained. For example, it does not seem that this model could reflect any effect of active titin properties on muscle shortening. In addition, it is not clear to me, either physiologically or in the model, what drives the shift from the high stiffness in short-range perturbations to the somewhat lower stiffness in mid-range perturbations.

    1. eLife assessment

      This proof-of-concept study focuses on an A->G DNA base editing strategy that converts CAG repeats to CAA repeats in the human HTT gene, which causes Huntington's disease (HD). These studies are conducted in human HEK293 cells engineered with a 51 CAG canonical repeat and in HD knock-in mice harboring 105+ CAG repeats. The findings of this study are valuable for the HD field, applying state-of-the-art techniques; however, the key experiments have yet to be performed in neuronal systems or brains of these mice: actual disease-rectifying effects relevant to patients have yet to be observed.

    2. Reviewer #1 (Public Review):

      Summary:
      In the paper by Choi et al., the authors aimed to develop base editing strategies to convert CAG repeats to CAA repeats in the huntingtin gene (HTT), which causes Huntington's disease (HD). They hypothesized that this conversion would delay disease onset by shortening the uninterrupted CAG repeat. Using HEK-293T cells as a model, the researchers employed cytosine base editors and guide RNAs (gRNAs) to efficiently convert CAG to CAA at various sites within the CAG repeat. No significant indels, off-target edits, transcriptome alterations, or changes in HTT protein levels were detected. Interestingly, somatic CAG repeat expansion was completely abolished in HD knock-in mice carrying CAA-interrupted repeats.

      Strengths:
      This study represents the first proof-of-concept exploration of the cytosine base editing technique as a potential treatment for HD and other repeat expansion disorders with similar mechanisms.

      Weaknesses:
      Given that HD is a neurodegenerative disorder, it is crucial to determine the efficiency of the base editing strategies tested in this manuscript and their feasibility in relevant cells affected by HD and the brain, which needed to be improved in this manuscript.

    3. Reviewer #2 (Public Review):

      Summary:
      In a proof-of-concept study with the aspiration of developing a treatment to delay HD onset, Choi et al. design and test an A>G DNA base editing strategy to exploit the recently established inverse relationship between the number of uninterrupted CAG repeats in polyglutamine repeat expansions and the age-of-onset of Huntington's Disease (HD). Most of the study is devoted to optimizing a base editing strategy typified by BE4max and gRNA2. The base editing is performed in human HEK293 cells engineered with a 51 CAG canonical repeat and in HD knock-in mice harboring 105+ CAG repeats.

      Weaknesses:
      Genotypic data on DNA editing are not portrayed in a clear manner consistent with the study's goal, namely reducing the number of uninterrupted CAG repeats by a clinically relevant amount according to the authors' least square approximated mean age-at-onset. No phenotypic data are presented to show that editing performed in either model would lead to reduced hallmarks of HD onset.

      More evidence is needed to support the central claims and therapeutic potential needs to be more adequate.

    4. Reviewer #3 (Public Review):

      Summary:
      In human patients with Huntington's disease (HD), caused by a CAG repeat expansion mutation, the number of uninterrupted CAG repeats at the genomic level influences age-at-onset of clinical signs independent of the number of polyglutamine repeats at the protein level. In most patients, the CAG repeat terminates with a CAA-CAG doublet. However, CAG repeat variants exist that either do not have that doublet or have two doublets. These variants consequently differ in their number of uninterrupted CAG repeats, while the number of glutamine repeats is the same, as both CAA and CAG code for glutamine. The authors first confirm that a shorter uninterrupted CAG repeat number in human HD patients is associated with developing the first clinical signs of HD later. They predict that introducing a further CAA-CAG doublet will result in years of delay of clinical onset. Based on this observation, the authors tested the hypothesis that turning CAG to CAA within a CAG repeat sequence using base editing techniques will benefit HD biology. They show that, indeed, in HD cell models (HEK293 cells expressing 16/17 CAG repeats; a single human stem cell line carrying a CAG repeat expansion in the fully penetrant range with 42 CAG repeats), their base editing strategies do induce the desired CAG-CAA conversion. The efficiency of conversion differed depending on the strategy used. In stem cells, delivery posed a problem, so to test allele specificity, the authors then used a HEK 293 cell line with 51 CAG repeats on the expanded allele. Conversion occurred in both alleles, while huntingtin protein levels, mRNA levels, and transcriptomics data were unchanged. In knock-in mice carrying 110 CAG repeats, however, base editing did not work as well for different, mainly technical, reasons.
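
      The distinction drawn above between uninterrupted CAG length and polyglutamine length can be made concrete with a short sketch. The function names and the three example alleles are hypothetical, chosen only to mirror the canonical, doublet-lacking, and duplicated-doublet variants described in the summary. They are not sequences from the study.

```python
def split_codons(sequence):
    """Split an in-frame DNA string into complete codons."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]


def repeat_summary(sequence):
    """Return (longest uninterrupted CAG run, number of glutamine codons)."""
    longest_cag_run = 0
    current_run = 0
    glutamine_codons = 0
    for codon in split_codons(sequence.upper()):
        if codon in ("CAG", "CAA"):      # both codons encode glutamine
            glutamine_codons += 1
        if codon == "CAG":
            current_run += 1
            longest_cag_run = max(longest_cag_run, current_run)
        else:
            current_run = 0              # a CAA (or any other codon) interrupts the run
    return longest_cag_run, glutamine_codons


# Hypothetical alleles that all encode 44 glutamines.
canonical = "CAG" * 42 + "CAACAG"            # ends with one CAA-CAG doublet
no_doublet = "CAG" * 44                      # doublet absent
two_doublets = "CAG" * 40 + "CAACAG" * 2     # doublet duplicated

for name, allele in [("canonical", canonical),
                     ("no doublet", no_doublet),
                     ("two doublets", two_doublets)]:
    cag_run, glutamines = repeat_summary(allele)
    print(f"{name}: uninterrupted CAG = {cag_run}, glutamines = {glutamines}")
```

      All three hypothetical alleles encode the same polyglutamine length while differing in uninterrupted CAG length (42, 44, and 40 respectively). A CAG-to-CAA edit shifts an allele along this axis in the same way, without changing the encoded protein.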

      Strengths:
      The authors use state-of-the-art methods and carefully and thoroughly designed experiments. The data support the conclusions drawn. This work is a very valuable translation from the insight gained from large GWAS studies into HD pathogenesis. It rightly emphasises the potential this has as a causal treatment in HD, while the authors also acknowledge important limitations.

      Weaknesses:
      They could dedicate a little more space to discussing several of the mentioned challenges. The reader will better understand where base editing is in HD currently and what needs to be done before it can be considered a treatment option. For instance,

      - It is important to clarify what can be gained by examining again the relationship between uninterrupted CAG repeat length and age-at-onset. Could the authors clarify why they do this and what it adds to their already published GWAS findings? What is the n of datasets?
      - What do they think an ideal conversion rate would be, and how that could be achieved?
      - Is there a dose-effect relationship for base editing, and would it be realistic to achieve the ideal conversion rate in target cells, given the difficulties described by the authors in differentiated neurons from stem cells?
      - The liver is a good tool for in-vivo experiments examining repeat instability in mouse models. However, the authors could comment on why they did not examine the brain.
      - Is there a limit to judging the effects of base editing on somatic instability with longer repeats, given the difficulties in measuring long CAG repeat expansions?
      - Given the methodological challenges for assessing HTT fragments, are there other ways to measure the downstream effects of base editing rather than extrapolate what it will likely be?
      - Sequencing errors could mask low-level, but biologically still relevant, off-target effects (such as gRNA-dependent and gRNA-independent DNA off-targets, RNA off-targets, bystander editing). How likely is that?
      - How worried are the authors about immune responses following base editing? How could this be assessed?

    5. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):


      In the paper by Choi et al., the authors aimed to develop base editing strategies to convert CAG repeats to CAA repeats in the huntingtin gene (HTT), which causes Huntington's disease (HD). They hypothesized that this conversion would delay disease onset by shortening the uninterrupted CAG repeat. Using HEK-293T cells as a model, the researchers employed cytosine base editors and guide RNAs (gRNAs) to efficiently convert CAG to CAA at various sites within the CAG repeat. No significant indels, off-target edits, transcriptome alterations, or changes in HTT protein levels were detected. Interestingly, somatic CAG repeat expansion was completely abolished in HD knock-in mice carrying CAA-interrupted repeats. 

      Correction of factual errors

      We analyzed HEK293 cells, not "HEK-293T".


      This study represents the first proof-of-concept exploration of the cytosine base editing technique as a potential treatment for HD and other repeat expansion disorders with similar mechanisms. 


      Given that HD is a neurodegenerative disorder, it is crucial to determine the efficiency of the base editing strategies tested in this manuscript and their feasibility in relevant cells affected by HD and the brain, which needed to be improved in this manuscript. 

      We appreciate the reviewer's constructive recommendations. Our genetic investigation focused on understanding observations in HD patients to develop genetic-based treatment strategies and test their feasibility. We agree with the reviewer regarding the importance of data from relevant cell types. Unfortunately, the levels of CAG-to-CAA conversion in the patient-derived neurons were modest, as described in our manuscript (approximately 2%). In addition, AAV did not produce detectable conversions in the brain of HD knock-in mice (data not shown), which was somewhat expected from the literature (PMID: 31937940). We believe some technical hurdles can be overcome by developing efficient delivery methods. Nonetheless, it will be an important follow-up study to perform preclinical studies employing optimized base editing strategies and efficient brain delivery methods to fully demonstrate the therapeutic potential of BE strategies. 

      Reviewer #2 (Public Review):


      In a proof-of-concept study with the aspiration of developing a treatment to delay HD onset, Choi et al. design and test an A>G DNA base editing strategy to exploit the recently established inverse relationship between the number of uninterrupted CAG repeats in polyglutamine repeat expansions and the age-of-onset of Huntington's Disease (HD). Most of the study is devoted to optimizing a base editing strategy typified by BE4max and gRNA2. The base editing is performed in human HEK293 cells engineered with a 51 CAG canonical repeat and in HD knock-in mice harboring 105+ CAG repeats. 

      Correction of factual errors

      We tested base editing strategies aimed at C > T conversion, not A > G DNA base editing. In addition to HEK293 and knock-in mice, we tested base editing strategies in patient-derived iPSC and neurons.


      Genotypic data on DNA editing are not portrayed in a clear manner consistent with the study's goal, namely reducing the number of uninterrupted CAG repeats by a clinically relevant amount according to the authors' least square approximated mean age-at-onset. No phenotypic data are presented to show that editing performed in either model would lead to reduced hallmarks of HD onset. 

      More evidence is needed to support the central claims and therapeutic potential needs to be more adequate. 

      Our strategies for converting CAG to CAA in model systems resulted in quantitative DNA modification in a population of cells. Consequently, individual cells may carry different genotypes, some harboring CAA and others CAG at the same genomic location. Therefore, using a standard genotype format for DNA to present base editing outcomes may not be ideal. Instead, we presented the resulting genotype data in a quantitative fashion to provide the percentage of conversion at each site. This approach allows for an intuitive interpretation of both the extent of repeat length reduction and the proportion of such modifications.
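
      As a minimal sketch of this kind of quantitative presentation, the following Python snippet computes the percentage of reads carrying CAA at each codon position of the repeat. The toy reads, the assumption that all reads are in frame and start at the same codon, and the function names are hypothetical illustrations, not the authors' analysis pipeline.

```python
from collections import Counter


def split_codons(read):
    """Split an in-frame repeat read into 3-nucleotide codons."""
    usable = len(read) - len(read) % 3
    return [read[i:i + 3] for i in range(0, usable, 3)]


def conversion_percent_per_site(reads):
    """Return the percentage of reads with CAA at each codon position.

    Only CAG and CAA codons are counted as informative at a site.
    """
    codon_rows = [split_codons(r) for r in reads]
    n_sites = min(len(row) for row in codon_rows)
    percents = []
    for site in range(n_sites):
        counts = Counter(row[site] for row in codon_rows)
        informative = counts["CAG"] + counts["CAA"]
        percents.append(100.0 * counts["CAA"] / informative if informative else 0.0)
    return percents


# Hypothetical population of four reads over a five-codon repeat.
reads = [
    "CAGCAGCAGCAGCAG",
    "CAGCAGCAACAGCAG",
    "CAGCAGCAACAGCAG",
    "CAGCAACAACAGCAG",
]
per_site = conversion_percent_per_site(reads)
print(per_site)
```

      In this toy population, individual reads carry different genotypes at the same position, which is why a per-site percentage summarizes the outcome more naturally than a single genotype call.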

      Currently, genetically precise HD mouse models with robust motor and behavioral phenotypes are unavailable. While some HD mouse models, such as the BAC and YAC models, feature pronounced behavioral phenotypes, they carry interrupted CAG repeat sequences, making them unsuitable for base conversion studies due to their inherently short uninterrupted repeats. Although genetically precise HD knock-in mouse models exist, they do not manifest motor symptom-like phenotypes. Given that CAG repeat expansion is the primary driver of the disease and knock-in mice recapitulate this phenomenon, our genetic investigation focused on assessing the effects of base conversion on CAG repeat instability in knock-in mice. However, as emphasized by the reviewer, subsequent preclinical studies to evaluate the therapeutic efficacy of CAG-to-CAA conversion strategies using mouse models harboring uninterrupted adult-onset CAG repeats and robust HD-like phenotypes remain crucial.

      Reviewer #3 (Public Review):


      In human patients with Huntington's disease (HD), caused by a CAG repeat expansion mutation, the number of uninterrupted CAG repeats at the genomic level influences age-at-onset of clinical signs independent of the number of polyglutamine repeats at the protein level. In most patients, the CAG repeat terminates with a CAACAG doublet. However, CAG repeat variants exist that either do not have that doublet or have two doublets. These variants consequently differ in their number of uninterrupted CAG repeats, while the number of glutamine repeats is the same as both CAA and CAG codes for glutamine. The authors first confirm that a shorter uninterrupted CAG repeat number in human HD patients is associated with developing the first clinical signs of HD later. They predict that introducing a further CAA-CAG doublet will result in years of delay of clinical onset. Based on this observation, the authors tested the hypothesis that turning CAG to CAA within a CAG repeat sequence using base editing techniques will benefit HD biology. They show that, indeed, in HD cell models (HEK293 cells expressing 16/17 CAG repeats; a single human stem cell line carrying a CAG repeat expansion in the fully penetrant range with 42 CAG repeats), their base editing strategies do induce the desired CAG-CAA conversion. The efficiency of conversion differed depending on the strategy used. In stem cells, delivery posed a problem, so to test allele specificity, the authors then used a HEK 293 cell line with 51 CAG repeats on the expanded allele. Conversion occurred in both alleles with huntingtin protein and mRNA levels; transcriptomics data was unchanged. In knock-in mice carrying 110 CAG repeats, however, base editing did not work as well for different, mainly technical, reasons. 

      Correction of factual errors

      "HD cell models (HEK293 cells expressing 16/17 CAG repeats" is an incorrect description. It should be "HD cell models (HEK293 cells expressing 51/17 CAG repeats".


      The authors use state-of-the-art methods and carefully and thoroughly designed experiments. The data support the conclusions drawn. This work is a very valuable translation from the insight gained from large GWAS studies into HD pathogenesis. It rightly emphasises the potential this has as a causal treatment in HD, while the authors also acknowledge important limitations. 


      They could dedicate a little more to discussing several of the mentioned challenges. The reader will better understand where base editing is in HD currently and what needs to be done before it can be considered a treatment option. For instance, 

      - It is important to clarify what can be gained by examining again the relationship between uninterrupted CAG repeat length and age-at-onset. Could the authors clarify why they do this and what it adds to their already published GWAS findings? What is the n of datasets? 

      Published HD GWAS (PMID: 31398342) compared the onset age of duplicated interruption and loss of interruption to that of canonical repeats to determine whether uninterrupted CAG repeat or polyglutamine determines age at onset. However, GWAS findings did not quantify the magnitude of the unexplained remaining variance in age at onset in duplicated interruption and loss of interruption. Our study investigated this further, quantifying the additional impact of duplicated interruption in order to estimate the maximum clinical benefit of base editing strategies for CAG-to-CAA conversion. Since the purpose of this genetic analysis is already described in the result section, we added the following sentence in the introduction section to highlight what remains unknown.

      "Still, age at onset of loss of interruption and duplicated interruption was not fully accounted for by uninterrupted CAG repeat, suggesting additional effects of non-canonical repeats."

      We added sample size for the least square approximation analysis in the text and corresponding figure legend. Sample sizes for molecular and animal experiments can be found in the corresponding figure legend.
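
      The distinction between uninterrupted CAG repeat length and polyglutamine length that underlies this analysis can be illustrated with a short sketch. The allele sequences below are toy examples built from the repeat structures described in the public review (a canonical repeat ending in one CAACAG doublet, loss of interruption, and duplicated interruption); the function name and the repeat sizes are assumptions chosen for illustration only.

```python
def repeat_metrics(seq):
    """Return (longest uninterrupted CAG run, glutamine codon count).

    Both CAG and CAA encode glutamine, so they count equally toward the
    polyglutamine length but only CAG extends the uninterrupted run.
    """
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    longest = current = 0
    for codon in codons:
        current = current + 1 if codon == "CAG" else 0
        longest = max(longest, current)
    glutamines = sum(codon in ("CAG", "CAA") for codon in codons)
    return longest, glutamines


# Three hypothetical alleles that all encode 44 glutamines.
canonical = "CAG" * 42 + "CAACAG"
loss_of_interruption = "CAG" * 44
duplicated_interruption = "CAG" * 40 + "CAACAG" * 2

for name, allele in [
    ("canonical", canonical),
    ("loss of interruption", loss_of_interruption),
    ("duplicated interruption", duplicated_interruption),
]:
    print(name, repeat_metrics(allele))
```

      The three toy alleles share the same polyglutamine length but differ in uninterrupted CAG length, which is the property that CAG-to-CAA conversion aims to shorten.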

      - What do they think an ideal conversion rate would be, and how that could be achieved? 

      It is a very important question. However, speculating on the ideal conversion levels is outside the scope of this genetic investigation. A series of preclinical studies using relevant models may generate data that shed light on the conversion rates required to produce meaningful clinical benefits. In the discussion section, we added the following sentence.

      "Currently, the ideal levels of CAG-to-CAA conversion that produce significant clinical benefits are unknown. A series of preclinical studies using relevant model systems may generate data that may shed light on the optimal conversion rate levels that are required to produce significant clinical benefits."

      - Is there a dose-effect relationship for base editing, and would it be realistic to achieve the ideal conversion rate in target cells, given the difficulties described by the authors in differentiated neurons from stem cells? 

      We observed a clear dose-response relationship between the amount of BE reagents and the levels of conversion in non-neuronal cells. Unfortunately, the conversion rate was low in neuronal cells, potentially due to limited delivery, as speculated in the result section. As described in the discussion sections, we predict that efficient delivery methods will be crucial to produce significant CAG-to-CAA conversion to achieve therapeutic benefits.

      - The liver is a good tool for in-vivo experiments examining repeat instability in mouse models. However, the authors could comment on why they did not examine the brain.

      We focused on liver instability because of 1) the expectation that delivery/targeting efficiency is significantly lower in the brain (PMID: 31937940) and 2) shared underlying mechanisms between the brain and liver (described in the result section). The following sentence was added in the method section to provide a rationale for liver analysis. 

      "Since significantly lower delivery/targeting efficiency was expected in the brain 34, we focused on analyzing liver instability."

      - Is there a limit to judging the effects of base editing on somatic instability with longer repeats, given the difficulties in measuring long CAG repeat expansions? 

      Determining the levels of base conversion using sequencing technologies gets harder as repeats become longer. Fragment analysis can overcome such technical difficulty if conversion efficiency is high. As pointed out, the repeat expansion measure is also challenging because amplification is biased toward shorter alleles. However, if repeat sizes are relatively similar, the levels of repeat expansion as a function of base conversion can be determined relatively precisely without a significant bias by a standard fragment analysis approach. 

      - Given the methodological challenges for assessing HTT fragments, are there other ways to measure the downstream effects of base editing rather than extrapolate what it will likely be?

      Our CAG-to-CAA conversion strategies are not expected to directly generate fragments of huntingtin DNA, RNA, or protein. In contrast, immediate downstream effects of CAG-to-CAA conversion include sequence changes (DNA and RNA) and alteration of repeat instability, which are presented in the manuscript. If repeat instability is associated with HTT exon 1A fragment, base conversion strategies may indirectly alter the levels of such putative toxic species, which remains to be determined.  

      - Sequencing errors could mask low-level, but biologically still relevant, off-target effects (such as gRNA-dependent and gRNA-independent DNA off-targets, RNA off-targets, bystander editing). How likely is that?

      We agree with the reviewer that increased editing efficiency is expected to increase the levels of off-target editing. However, the field is actively developing base editors with minimal off-target effect (PMID: 35941130), which will increase the safety aspects of this technology for clinical use. We added the following sentence.  "In addition, developing base editors with high level on-target gene specificity and minimal off-target effects is a critical aspect to address 100."

      - How worried are the authors about immune responses following base editing? How could this be assessed? 

      We added the following sentence in the discussion section as the reviewer raised an important safety issue.  

      "Thorough assessments of immune responses against base editing strategies (e.g., development of antibody, B cell, and T cell-specific immune responses) and subsequent modification (e.g., immunosilencing) 101 will be critical to address immune response-associated safety issues of BE strategies."

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      The following points could be considered to improve the overall quality of the manuscript: 

      (1) The authors mentioned that the reason for checking repeat instability in the nonneuronal cells was due to the availability of specific types of AAV; there are other subtypes of AAVs available to infect neurons and iPSCs. 

      Our pilot experiments testing several AAV serotypes in patient-derived iPSC and HD knock-in mice showed that only AAV9 converted CAG to CAA at detectable levels in the liver, not in the brain or neurons. We also speculate that difficulties in targeting the CAG repeat region due to GC-rich sequence contributed to low conversion efficiency. Therefore, subsequent optimization of base editor and delivery may improve BE strategies for HD, permitting robust conversion at the challenging locus. 

      (2) Despite its bold nature, minimal data in the manuscript demonstrate that this gene editing strategy is disease-modifying.

      Resources required to demonstrate the therapeutic benefits of CAG-to-CAA conversion strategies are not fully available. In particular, relevant HD mouse models that carry uninterrupted adult-onset CAG repeats and that permit measurement of disease modification are lacking, as described in our response to the second reviewer. Given that CAG repeat expansion is the primary driver of the disease, this genetic investigation focused on determining the impacts of base editing strategies on CAG repeat expansion. Still, as indicated by the reviewer, follow-up preclinical studies to evaluate the disease-modifying effects of CAG-to-CAA conversion strategies using relevant mouse models represent important next steps.

      (3) Off-target analysis at the DNA level was limited to "predicted" off-target sites. What about possible translocations that can result from co-nicking on different chromosomes, as a large number of potential targets exist? 

      Among the gRNAs we tested, we focused on gRNAs 1 and 2, for which only small numbers of off-target sites were predicted. Therefore, our off-target analysis at the DNA level was focused on validating those predicted off-targets. As pointed out, thoroughly evaluating off-target effects will be necessary when candidate BE strategies take the next steps for therapeutic development.

      Genomic translocation caused by double-strand breaks can produce negative consequences, such as cancer. Importantly, although paired nicks efficiently induced translocations, translocations were not detected when a single nick was introduced on each chromosome (PMID: 25201414). Therefore, it is predicted that BE strategies using nickase confer little risk of translocation.

      (4) For in vivo work, somatic repeat expansion was analyzed only in peripheral tissue samples. Since the main affected cellular population in HD is the brain, the outcome of this treatment on a disease-relevant organ still needs to be determined. 

      Challenges in delivery to the brain led us to assess instability in the liver, since many mechanistic components of somatic CAG repeat instability are shared between the liver and striatum, as rationalized in the manuscript. However, we agree with the reviewer regarding the importance of determining the effects of base conversion on brain instability. We added the following sentence in the method section to provide a rationale. "Since significantly lower delivery/targeting efficiency was expected in brain 34, we focused on analyzing liver instability."

      Reviewer #2 (Recommendations For The Authors):

      Throughout the manuscript, the authors apologize for techniques that do not work when workarounds seem readily apparent to an expert in the field. In its current form, the manuscript reads verbose, speculative, apologetic, and preliminary. 

      Drug development programs that are supported by human genetics data show increased success rates in clinical trials (PMID: 26121088, 31827124, 31830040). This is why this genetic study focused on 1) investigating observations in HD subjects and 2) subsequently developing treatment strategies that are supported by patient genetics. As the first illustration of base editing in HD, the main scope of our manuscript is to justify the genetic rationale of CAG-to-CAA conversion and demonstrate the feasibility of therapeutic strategies rooted in patient genetics. As our study was not aimed at entirely demonstrating the clinical benefits of base editing strategies in HD, some of our data were based on tools and approaches that were not fully optimal. We agree with the reviewer that it will be an important next step to employ optimized approaches to evaluate the efficacy of base editing strategies in model systems. Nevertheless, our novel base conversion strategies derived from HD patient genetics represent a significant advancement as they may contribute to developing effective treatments for this devastating disorder. 

      Reviewer #3 (Recommendations For The Authors):

      It would make for an easier read if abbreviations were kept to a minimum. 

      As recommended, we decreased the use of abbreviations. The following has been spelled out throughout the manuscript: CR (canonical repeat), LI (loss of interruption), DI (duplicated interruption), and CBE (cytosine base editor). Other abbreviations with infrequent usage (e.g., ABE, SS, QC) were also spelled out in the text.

    1. eLife assessment

      This study provides a valuable contribution to our understanding of the mechanisms underlying the limited capacity to process rapid sequences of visual stimuli by reporting convincing evidence that the attentional blink affects neurally separable processes of visual detection and discrimination. The motivation for some of the analyses and the connection to previous empirical and theoretical work can be improved. The study will be of interest to neuroscientists and psychologists investigating perception and attention.

    2. Reviewer #1 (Public Review):


      In this study, the authors used a multi-alternative decision task and a multidimensional signal-detection model to gain further insight into the cause of perceptual impairments during the attentional blink. The model-based analyses of behavioural and EEG data show that such perceptual failures can be unpacked into distinct deficits in visual detection and discrimination, with visual detection being linked to the amplitude of late ERP components (N2p and P3) and discrimination being linked to the coherence of fronto-parietal brain activity.


      The main strength of this paper lies in the fact that it presents a novel perspective on the cause of perceptual failures during the attentional blink. The multidimensional signal-detection modelling approach is explained clearly, and the results of the study show that this approach offers a powerful method to unpack behavioural and EEG data into distinct processes of detection and discrimination.


      While the model-based analyses are compelling, the paper also features some analyses that seem misguided, or, at least, insufficiently motivated and explained. Specifically, in the introduction, the authors raise the suggestion that the attentional blink could be due to a reduction in sensitivity or a response bias. The suggestion that a response bias could play a role seems misguided, as any response bias would be expected to be constant across lags, while the attentional blink effect is only observed at short lags. Thus, it is difficult to understand why the authors would think that a response bias could explain the attentional blink.

      A second point of concern regards the way in which the measures for detection and discrimination accuracy were computed. If I understand the paper correctly, a correct detection was defined as either correctly identifying T2 (i.e., reporting CW or CCW if T2 was CW or CCW, respectively, see Figure 2B), or correctly reporting T2's absence (a correct rejection). Here, it seems that one should also count a misidentification (i.e., incorrect choice of CW or CCW when T2 was present) as a correct detection, because participants apparently did detect T2, but failed to judge/remember its orientation properly in case of a misidentification. Conversely, the manner in which discrimination performance is computed also raises questions. Here, the authors appear to compute accuracy as the average proportion of T2-present trials on which participants selected the correct response option for T2, thus including trials in which participants missed T2 entirely. Thus, a failure to detect T2 is now counted as a failure to discriminate T2. Wouldn't a more proper measure of discrimination accuracy be to compute the proportion of correct discriminations for trials in which participants detected T2?
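
      The difference between these scoring schemes can be sketched with a toy example. The trial list, the response labels (CW, CCW, absent) and the function name below are hypothetical and are not taken from the authors' data or code; the sketch only contrasts the accuracy definitions discussed above.

```python
def score_trials(trials):
    """Compute four accuracy measures from (stimulus, response) pairs.

    Stimulus and response are each one of "CW", "CCW" or "absent".
    """
    present = [(s, r) for s, r in trials if s != "absent"]
    correct_id = sum(s == r for s, r in present)
    reported_present = sum(r != "absent" for _, r in present)
    correct_rejections = sum(s == "absent" and r == "absent" for s, r in trials)

    return {
        # Detection as described in the review: correct identification
        # or correct rejection.
        "detection_strict": (correct_id + correct_rejections) / len(trials),
        # Alternative: any CW/CCW report on a T2-present trial counts as
        # a detection, even if the orientation is wrong.
        "detection_lenient": (reported_present + correct_rejections) / len(trials),
        # Discrimination over all T2-present trials (misses count as errors).
        "discrimination_all": correct_id / len(present),
        # Discrimination conditional on T2 having been reported as present.
        "discrimination_given_detection": correct_id / reported_present,
    }


# Hypothetical trials: 6 T2-present (3 correct, 1 misidentified, 2 missed)
# and 2 T2-absent (1 correct rejection, 1 false alarm).
trials = [
    ("CW", "CW"), ("CW", "CCW"), ("CW", "absent"),
    ("CCW", "CCW"), ("CCW", "CCW"), ("CCW", "absent"),
    ("absent", "absent"), ("absent", "CW"),
]
scores = score_trials(trials)
print(scores)
```

      In this toy data set, the misidentified trial lowers the strict detection score, and the two missed trials lower the unconditional discrimination score, which is the confound raised above.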

      My last point of critique is that the paper offers little if any guidance on how the inferred distinction between detection and discrimination can be linked to existing theories of the attentional blink. The discussion mostly focuses on comparisons to previous EEG studies, but it would be interesting to know how the authors connect their findings to extant, mechanistic accounts of the attentional blink. A key question here is whether the finding of dissociable processes of detection and discrimination would also hold with more meaningful stimuli in an identification task (e.g., the canonical AB task of identifying two letters shown amongst digits). There is evidence to suggest that meaningful stimuli are categorized just as quickly as they are detected (Grill-Spector & Kanwisher, 2005; Grill-Spector K, Kanwisher N. Visual recognition: as soon as you know it is there, you know what it is. Psychol Sci. 2005 Feb;16(2):152-60. doi: 10.1111/j.0956-7976.2005.00796.x. PMID: 15686582.). Does that mean that the observed distinction between detection and discrimination would only apply to tasks in which the targets consist of otherwise meaningless visual elements, such as lines of different orientations?

    3. Reviewer #2 (Public Review):


      The authors had two aims: First, to decompose the attentional blink (AB) deficit into the two components of signal detection theory; sensitivity and bias. Second, the authors aimed to assess the two subcomponents of sensitivity; detection and discrimination. They observed that the AB is only expressed in sensitivity. Furthermore, detection and discrimination were doubly dissociated. Detection modulated N2p and P3 ERP amplitude, but not frontoparietal beta-band coherence, whereas this pattern was reversed for discrimination.


      The experiment is elegantly designed, and the data - both behavioral and electrophysiological - are aptly analyzed. The outcomes, in particular the dissociation between detection and discrimination blinks, are consistently and clearly supported by the results. The discussion of the results is also appropriately balanced.


      The lack of an effect of stimulus contrast does not seem very surprising from what we know of the nature of AB already. Low-level perceptual factors are not thought to cause AB. This is fine, as there are also other, novel findings reported, but perhaps the authors could bolster the importance of these (null) findings by referring to AB-specific papers, if there are indeed any, that would have predicted different outcomes in this regard.

      On an analytical note, the ERP analysis could be finetuned a little more. The task design does not allow measurement of the N2pc or N400 components, which are also relevant to the AB, but the N1 component could additionally be analyzed. In doing so, I would furthermore recommend selecting more lateral electrode sites for both the N1, as well as the P1. Both P1 and N1 are likely not maximal near the midline, where the authors currently focused their P1 analysis.

      Impact & Context:

      The results of this study will likely influence how we think about selective attention in the context of the AB phenomenon. However, I think its impact could be further improved by extending its theoretical framing. In particular, there has been some recent work on the nature of the AB deficit, showing that it can be discrete (all-or-none) and gradual (Sy et al., 2021; Karabay et al., 2022, both in JEP: General). These different faces of target awareness in the AB may be linked directly to the detection and discrimination subcomponents that are analyzed in the present paper. I would encourage the authors to discuss this potential link and comment on the bearing of the present work on these previous behavioral findings.

    4. Reviewer #3 (Public Review):


      In the present study, the authors aimed to achieve a better understanding of the mechanisms underlying the attentional blink, that is, a deficit in processing the second of two target stimuli when they appear in rapid succession. Specifically, they used a concurrent detection and identification task in- and outside of the attentional blink and decoupled effects of perceptual sensitivity and response bias using a novel signal detection model. They conclude that the attentional blink selectively impairs perceptual sensitivity but not response bias, and link established EEG markers of the attentional blink to deficits in stimulus detection (N2p, P3) and discrimination (fronto-parietal high-beta coherence), respectively. Taken together, their study suggests distinct mechanisms mediating detection and discrimination deficits in the attentional blink.


      Major strengths of the present study include its innovative approach to investigating the mechanisms underlying the attentional blink, an elegant, carefully calibrated experimental paradigm, a novel signal detection model, and multifaceted data analyses using state-of-the-art model comparisons and robust statistical tests. The study appears to have been carefully conducted and the overall conclusions seem warranted given the results. In my opinion, the manuscript is a valuable contribution to the current literature on the attentional blink. Moreover, the novel paradigm and signal detection model are likely to stimulate future research.


      Weaknesses of the present manuscript mainly concern the neglect of some relevant literature, unclear hypotheses, potentially data-driven analyses, relatively low statistical power, potential flaws in the EEG methods, and the absence of a discussion of limitations. In the following, I will list some major and minor concerns in detail.

      Major points

      Hypotheses:

      I appreciate the multifaceted, in-depth analysis of the given dataset including its large number of different statistical tests. However, neither the Introduction nor the Methods contain specific statistical hypotheses. Moreover, many of the tests (e.g., correlations) rely on selected results of previous tests. It is unclear how many of the tests were planned a priori, how many more were performed, and how exactly corrections for multiple tests were implemented. Thus, I find it difficult to assess the robustness of the results.

      Power:

      Some important null findings may result from the rather small sample sizes of N = 24 for behavioral and N = 18 for ERP analyses. For example, the correlation between detection and discrimination d' deficits across participants (r=0.39, p=0.059) (p. 12, l. 263) and the attentional blink effect on the P1 component (p=0.050, no test statistic) (p. 14, l. 301) could each have been significant with one more participant. In my opinion, such results should not be interpreted as evidence for the absence of effects.
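
      For reference, the p-value of the reported correlation can be approximately reproduced from r and N alone. The sketch below assumes a two-tailed t-test of a Pearson correlation with N - 2 degrees of freedom (the type of test is an assumption, as it is not stated here) and uses only the Python standard library, integrating the t density numerically. The function names are illustrative.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson integration of the density on [0, |t|]."""
    t = abs(t)
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1 - 2 * area)


def correlation_p(r, n):
    """Return (t statistic, two-tailed p) for a Pearson r from n observations."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r * r)
    return t, two_tailed_p(t, df)


# Values quoted above: r = 0.39 with N = 24 participants.
t_stat, p_value = correlation_p(0.39, 24)
print(round(t_stat, 3), round(p_value, 3))
```

      Because r is reported to two decimals, the recomputed p-value can only be expected to match the reported one approximately.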

      Neural basis of the attentional blink:

      The introduction (e.g., p. 4, l. 56-76) and discussion (e.g., p. 19, l. 427-447) do not incorporate the insights from the highly relevant recent review by Zivony & Lamy (2022), which is only cited once (p. 19, l. 428). Moreover, the sections do not mention some relevant ERP studies of the attentional blink (e.g., Batterink et al., 2012; Craston et al., 2009; Dell'Acqua et al., 2015; Dellert et al., 2022; Eiserbeck et al., 2022; Meijs et al., 2018).

      Detection versus discrimination:

      Concerning the neural basis of detection versus discrimination (e.g., p. 6, l. 98-110; p. 18, l. 399-412), relevant existing literature (e.g., Broadbent & Broadbent, 1987; Hillis & Brainard, 2007; Koivisto et al., 2017; Straube & Fahle, 2011; Wiens et al., 2023) is not included.

      Pooling of lags and lag 1 sparing:

      I wonder why the authors chose to include 5 different lags when they later pooled early (100, 300 ms) and late (700, 900 ms) lags, and whether this pooling is justified. This is important because T2 at lag 1 (100 ms) is typically "spared" (high accuracy) while T2 at lag 3 (300 ms) shows the maximum AB (for reviews, see, e.g., Dux & Marois, 2009; Martens & Wyble, 2010). Interestingly, this sparing was not observed here (p. 43, Figure 2). Nevertheless, considering the literature and the research questions at hand, it is questionable whether lag 1 and 3 should be pooled.

      Discrimination in the attentional blink:

      Concerning the claims that previous attentional blink studies conflated detection and discrimination (p. 6, l. 111-114; p. 18, l. 416), there is a recent ERP study (Dellert et al., 2022) in which participants did not perform a discrimination task for the T2 stimuli. Moreover, since the relevance of all stimuli except T1 was uncertain in this study, irrelevant distractors could not be filtered out (cf. p. 19, l. 437). Under these conditions, the attentional blink was still associated with reduced negativities in the N2 range (cf. p. 19, l. 427-437) but not with a reduced P3 (cf. p. 19, l. 439-447).

      General EEG methods:

      While most of the description of the EEG preprocessing and analysis (p. 31/32) is appropriate, it also lacks some important information (see, e.g., Keil et al., 2014). For example, it does not include the length of the segments, the type and proportion of artifacts rejected, the number of trials used for averaging in each condition, specific hypotheses, and the test statistics (in addition to p-values).

      EEG filters:

      P. 31, l. 728: "The data were (...) bandpass filtered between 0.5 to 18 Hz (...). Next, a bandstop filter from 9-11 Hz was applied to remove the 10 Hz oscillations evoked by the RSVP presentation." These filter settings do not follow common recommendations and could potentially induce filter distortions (e.g., Luck, 2014; Zhang et al., 2024). For example, the 0.5 high-pass filter could distort the slow P3 wave. Mostly, I am concerned about the bandstop filter. Since the authors commendably corrected for RSVP-evoked responses by subtracting T2-absent from T2-present ERPs (p. 31, l. 746), I wonder why the additional filter was necessary, and whether it might have removed relevant peaks in the ERPs of interest.

      Coherence analysis:

      P. 33, l. 786: "For subsequent, partial correlation analyses of coherence with behavioral metrics and neural distances (...), we focused on a 300 ms time period (0-300 ms following T2 onset) and high-beta frequency band (20-30 Hz) identified by the cluster-based permutation test (Fig. 5A-C)." I wonder whether there were any a priori criteria for the definition and selection of such successive analyses. Given the many factors (frequency bands, hemispheres) in the analyses and the particular shape of the cluster (p. 49, Fig 5C), this focus seems largely data-driven. It remains unclear how many such tests were performed and whether the results (e.g., the resulting weak correlation of r = 0.22 in one frequency band and one hemisphere in one part of a complexly shaped cluster; p. 15, l. 327) can be considered robust.

      References

      Batterink, L., Karns, C. M., & Neville, H. (2012). Dissociable mechanisms supporting awareness: The P300 and gamma in a linguistic attentional blink task. Cerebral Cortex, 22(12), 2733-2744. https://doi.org/10.1093/cercor/bhr346
      Broadbent, D. E., & Broadbent, M. H. P. (1987). From detection to identification: Response to multiple targets in rapid serial visual presentation. Perception & Psychophysics, 42(2), 105-113. https://doi.org/10.3758/BF03210498
      Craston, P., Wyble, B., Chennu, S., & Bowman, H. (2009). The attentional blink reveals serial working memory encoding: Evidence from virtual and human event-related potentials. Journal of Cognitive Neuroscience, 21(3), 550-566. https://doi.org/10.1162/jocn.2009.21036
      Dell'Acqua, R., Dux, P. E., Wyble, B., Doro, M., Sessa, P., Meconi, F., & Jolicœur, P. (2015). The attentional blink impairs detection and delays encoding of visual information: Evidence from human electrophysiology. Journal of Cognitive Neuroscience, 27(4), 720-735. https://doi.org/10.1162/jocn_a_00752
      Dellert, T., Krebs, S., Bruchmann, M., Schindler, S., Peters, A., & Straube, T. (2022). Neural correlates of consciousness in an attentional blink paradigm with uncertain target relevance. NeuroImage, 264, 119679. https://doi.org/10.1016/j.neuroimage.2022.119679
      Dux, P. E., & Marois, R. (2009). The attentional blink: A review of data and theory. Attention, Perception, & Psychophysics, 71(8), 1683-1700. https://doi.org/10.3758/APP.71.8.1683
      Hillis, J. M., & Brainard, D. H. (2007). Distinct mechanisms mediate visual detection and identification. Current Biology, 17(19), 1714-1719. https://doi.org/10.1016/j.cub.2007.09.012
      Keil, A., Debener, S., Gratton, G., Junghöfer, M., Kappenman, E. S., Luck, S. J., Luu, P., Miller, G. A., & Yee, C. M. (2014). Committee report: Publication guidelines and recommendations for studies using electroencephalography and magnetoencephalography. Psychophysiology, 51(1), 1-21. https://doi.org/10.1111/psyp.12147
      Koivisto, M., Grassini, S., Salminen-Vaparanta, N., & Revonsuo, A. (2017). Different electrophysiological correlates of visual awareness for detection and identification. Journal of Cognitive Neuroscience, 29(9), 1621-1631. https://doi.org/10.1162/jocn_a_01149
      Luck, S. J. (2014). An introduction to the event-related potential technique. MIT Press.
      Martens, S., & Wyble, B. (2010). The attentional blink: Past, present, and future of a blind spot in perceptual awareness. Neuroscience & Biobehavioral Reviews, 34(6), 947-957. https://doi.org/10.1016/j.neubiorev.2009.12.005
      Meijs, E. L., Slagter, H. A., de Lange, F. P., & van Gaal, S. (2018). Dynamic interactions between top-down expectations and conscious awareness. Journal of Neuroscience, 38(9), 2318-2327. https://doi.org/10.1523/JNEUROSCI.1952-17.2017
      Straube, S., & Fahle, M. (2011). Visual detection and identification are not the same: Evidence from psychophysics and fMRI. Brain and Cognition, 75(1), 29-38. https://doi.org/10.1016/j.bandc.2010.10.004
      Wiens, S., Andersson, A., & Gravenfors, J. (2023). Neural electrophysiological correlates of detection and identification awareness. Cognitive, Affective, & Behavioral Neuroscience. https://doi.org/10.3758/s13415-023-01120-5
      Zhang, G., Garrett, D. R., & Luck, S. J. (2024). Optimal filters for ERP research II: Recommended settings for seven common ERP components. Psychophysiology, e14530. https://doi.org/10.1111/psyp.14530

    5. Author response:

      Reviewer #1: 


      In this study, the authors used a multi-alternative decision task and a multidimensional signal-detection model to gain further insight into the cause of perceptual impairments during the attentional blink. The model-based analyses of behavioural and EEG data show that such perceptual failures can be unpacked into distinct deficits in visual detection and discrimination, with visual detection being linked to the amplitude of late ERP components (N2p and P3) and discrimination being linked to the coherence of fronto-parietal brain activity.


      The main strength of this paper lies in the fact that it presents a novel perspective on the cause of perceptual failures during the attentional blink. The multidimensional signal-detection modelling approach is explained clearly, and the results of the study show that this approach offers a powerful method to unpack behavioural and EEG data into distinct processes of detection and discrimination.


      (1.1) While the model-based analyses are compelling, the paper also features some analyses that seem misguided, or, at least, insufficiently motivated and explained. Specifically, in the introduction, the authors raise the suggestion that the attentional blink could be due to a reduction in sensitivity or a response bias. The suggestion that a response bias could play a role seems misguided, as any response bias would be expected to be constant across lags, while the attentional blink effect is only observed at short lags. Thus, it is difficult to understand why the authors would think that a response bias could explain the attentional blink.

      A deficit in T2 identification accuracy could arise from either sensitivity or criterion effects; the criterion effect may manifest as a choice bias. For example, in short T1-T2 lag trials, when T2 closely follows T1, participants may adopt a more conservative choice criterion for reporting the presence of T2. Moreover, criterion effects need not be uniform across lags: A participant could infer the T1-T2 lag interval based on various factors, including trial length, thereby permitting them to adjust their choice criterion variably across different lags. We will provide a more detailed illustration of this claim in the revision.
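
      The distinction drawn here between sensitivity and criterion effects can be sketched numerically. The example below is a deliberately simplified, one-dimensional, equal-variance Gaussian signal detection model, not the multidimensional model used in the study; the function names and all parameter values are hypothetical. It shows two scenarios that produce the same drop in hit rate relative to a baseline, one through reduced sensitivity and one through a more conservative criterion, which differ only in their false-alarm rates.

```python
from statistics import NormalDist

_NORM = NormalDist()  # standard normal distribution


def rates_from_sdt(d_prime, criterion):
    """Hit and false-alarm rates under an equal-variance Gaussian model."""
    hit_rate = _NORM.cdf(d_prime / 2.0 - criterion)
    fa_rate = _NORM.cdf(-d_prime / 2.0 - criterion)
    return hit_rate, fa_rate


def sdt_from_rates(hit_rate, fa_rate):
    """Recover sensitivity (d') and criterion (c) from the two rates."""
    z_hit = _NORM.inv_cdf(hit_rate)
    z_fa = _NORM.inv_cdf(fa_rate)
    return z_hit - z_fa, -0.5 * (z_hit + z_fa)


# Hypothetical long-lag baseline: d' = 2 with an unbiased criterion.
baseline = rates_from_sdt(2.0, 0.0)
# Scenario A: sensitivity drops at short lags, criterion unchanged.
sensitivity_loss = rates_from_sdt(1.0, 0.0)
# Scenario B: sensitivity unchanged, criterion becomes more conservative.
criterion_shift = rates_from_sdt(2.0, 0.5)

for label, (hit, fa) in [
    ("baseline", baseline),
    ("sensitivity loss", sensitivity_loss),
    ("criterion shift", criterion_shift),
]:
    d, c = sdt_from_rates(hit, fa)
    print(f"{label:17s} hit={hit:.3f} fa={fa:.3f} d'={d:.2f} c={c:.2f}")
```

      In this toy setting the hit rate alone cannot separate the two scenarios; the false-alarm rate (here, reports of T2 on T2-absent trials) is what disambiguates them.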

      (1.2) A second point of concern regards the way in which the measures for detection and discrimination accuracy were computed. If I understand the paper correctly, a correct detection was defined as either correctly identifying T2 (i.e., reporting CW or CCW if T2 was CW or CCW, respectively, see Figure 2B), or correctly reporting T2's absence (a correct rejection). Here, it seems that one should also count a misidentification (i.e., incorrect choice of CW or CCW when T2 was present) as a correct detection, because participants apparently did detect T2, but failed to judge/remember its orientation properly in case of a misidentification. Conversely, the manner in which discrimination performance is computed also raises questions. Here, the authors appear to compute accuracy as the average proportion of T2-present trials on which participants selected the correct response option for T2, thus including trials in which participants missed T2 entirely. Thus, a failure to detect T2 is now counted as a failure to discriminate T2. Wouldn't a more proper measure of discrimination accuracy be to compute the proportion of correct discriminations for trials in which participants detected T2?

      Detection and discrimination accuracies were computed with precisely the same procedure, and under the same conditions, as described by the Reviewer (underlined text, above). We regret our poor description; we will improve upon it in the revised manuscript.
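
      For concreteness, the two measures as the Reviewer describes them in comment 1.2 can be sketched as follows. This is an illustration of those definitions only, not the authors' analysis code; the trial coding (`"CW"`, `"CCW"`, `"absent"`) and the function names are assumptions made for the example.

```python
def detection_accuracy(trials):
    """Proportion of trials on which presence/absence of T2 was reported
    correctly. A misidentified orientation still counts as a detection."""
    correct = sum(
        (stimulus == "absent") == (response == "absent")
        for stimulus, response in trials
    )
    return correct / len(trials)


def discrimination_accuracy(trials):
    """Proportion of correct orientation reports, restricted to T2-present
    trials on which the participant reported seeing a target."""
    detected = [
        (stimulus, response)
        for stimulus, response in trials
        if stimulus != "absent" and response != "absent"
    ]
    if not detected:
        return float("nan")
    return sum(s == r for s, r in detected) / len(detected)


# Hypothetical (stimulus, response) pairs.
trials = [
    ("CW", "CW"),          # detected, correctly discriminated
    ("CW", "CCW"),         # detected, misidentified
    ("CCW", "absent"),     # miss
    ("absent", "absent"),  # correct rejection
    ("absent", "CW"),      # false alarm
    ("CCW", "CCW"),        # detected, correctly discriminated
]

print(detection_accuracy(trials))       # 4 of 6 trials
print(discrimination_accuracy(trials))  # 2 of 3 detected T2-present trials
```

      Conditioning discrimination on detection keeps a miss from being counted a second time as a discrimination failure, which is the concern raised in the comment.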

      (1.3) My last point of critique is that the paper offers little if any guidance on how the inferred distinction between detection and discrimination can be linked to existing theories of the attentional blink. The discussion mostly focuses on comparisons to previous EEG studies, but it would be interesting to know how the authors connect their findings to extant, mechanistic accounts of the attentional blink. A key question here is whether the finding of dissociable processes of detection and discrimination would also hold with more meaningful stimuli in an identification task (e.g., the canonical AB task of identifying two letters shown amongst digits). There is evidence to suggest that meaningful stimuli are categorized just as quickly as they are detected (Grill-Spector & Kanwisher, 2005; Grill-Spector K, Kanwisher N. Visual recognition: as soon as you know it is there, you know what it is. Psychol Sci. 2005 Feb;16(2):152-60. doi: 10.1111/j.0956-7976.2005.00796.x. PMID: 15686582.). Does that mean that the observed distinction between detection and discrimination would only apply to tasks in which the targets consist of otherwise meaningless visual elements, such as lines of different orientations?

      Our results are consistent with previous literature suggested by the Reviewer. Specifically, we do not claim that detection and discrimination are sequential processes; in fact, we modeled them as concurrent computations (Figs. 3A-B). Yet, our results suggest that these processes possess distinct neural bases. We have discussed this idea briefly in the Discussion section (e.g., “Yet, we found no evidence for these two computations being sequential…”). We will discuss this further in the revised manuscript in the context of previous literature.

      Reviewer #2:


      The authors had two aims: First, to decompose the attentional blink (AB) deficit into the two components of signal detection theory: sensitivity and bias. Second, the authors aimed to assess the two subcomponents of sensitivity: detection and discrimination. They observed that the AB is only expressed in sensitivity. Furthermore, detection and discrimination were doubly dissociated. Detection modulated N2p and P3 ERP amplitude, but not frontoparietal beta-band coherence, whereas this pattern was reversed for discrimination.


      The experiment is elegantly designed, and the data - both behavioral and electrophysiological - are aptly analyzed. The outcomes, in particular the dissociation between detection and discrimination blinks, are consistently and clearly supported by the results. The discussion of the results is also appropriately balanced.


      (2.1) The lack of an effect of stimulus contrast does not seem very surprising from what we know of the nature of AB already. Low-level perceptual factors are not thought to cause AB. This is fine, as there are also other, novel findings reported, but perhaps the authors could bolster the importance of these (null) findings by referring to AB-specific papers, if there are indeed any, that would have predicted different outcomes in this regard.

      While there is consensus that low-level perceptual factors are not affected by the attentional blink, other studies may suggest evidence to the contrary (e.g., Chua et al, Percept. Psychophys., 2005). We will highlight the significance of our findings in the context of such conflicting evidence in the literature, in the revised manuscript.

      (2.2) On an analytical note, the ERP analysis could be fine-tuned a little more. The task design does not allow measurement of the N2pc or N400 components, which are also relevant to the AB, but the N1 component could additionally be analyzed. In doing so, I would furthermore recommend selecting more lateral electrode sites for both the N1 and the P1. Both P1 and N1 are likely not maximal near the midline, where the authors currently focused their P1 analysis.

      We will incorporate these additional analyses in the revised manuscript.

      (2.3) Impact & Context:

      The results of this study will likely influence how we think about selective attention in the context of the AB phenomenon. However, I think its impact could be further improved by extending its theoretical framing. In particular, there has been some recent work on the nature of the AB deficit, showing that it can be discrete (all-or-none) and gradual (Sy et al., 2021; Karabay et al., 2022, both in JEP: General). These different faces of target awareness in the AB may be linked directly to the detection and discrimination subcomponents that are analyzed in the present paper. I would encourage the authors to discuss this potential link and comment on the bearing of the present work on these behavioural findings.

      Thank you. We will discuss our findings in the context of these recent studies.

      Reviewer #3:


      In the present study, the authors aimed to achieve a better understanding of the mechanisms underlying the attentional blink, that is, a deficit in processing the second of two target stimuli when they appear in rapid succession. Specifically, they used a concurrent detection and identification task in- and outside of the attentional blink and decoupled effects of perceptual sensitivity and response bias using a novel signal detection model. They conclude that the attentional blink selectively impairs perceptual sensitivity but not response bias, and link established EEG markers of the attentional blink to deficits in stimulus detection (N2p, P3) and discrimination (fronto-parietal high-beta coherence), respectively. Taken together, their study suggests distinct mechanisms mediating detection and discrimination deficits in the attentional blink.


      Major strengths of the present study include its innovative approach to investigating the mechanisms underlying the attentional blink, an elegant, carefully calibrated experimental paradigm, a novel signal detection model, and multifaceted data analyses using state-of-the-art model comparisons and robust statistical tests. The study appears to have been carefully conducted and the overall conclusions seem warranted given the results. In my opinion, the manuscript is a valuable contribution to the current literature on the attentional blink. Moreover, the novel paradigm and signal detection model are likely to stimulate future research.


      Weaknesses of the present manuscript mainly concern the negligence of some relevant literature, unclear hypotheses, potentially data-driven analyses, relatively low statistical power, potential flaws in the EEG methods, and the absence of a discussion of limitations. In the following, I will list some major and minor concerns in detail.

      Major points

      (3.1) Hypotheses:

      I appreciate the multifaceted, in-depth analysis of the given dataset including its large number of different statistical tests. However, neither the Introduction nor the Methods contain specific statistical hypotheses. Moreover, many of the tests (e.g., correlations) rely on selected results of previous tests. It is unclear how many of the tests were planned a priori, how many more were performed, and how exactly corrections for multiple tests were implemented. Thus, I find it difficult to assess the robustness of the results.

      As outlined in the Introduction, we hypothesized that neural computations associated with target detection would be characterized by regional neuronal markers (e.g., parietal or occipital ERPs), whereas computations linked to feature discrimination may involve neural coordination across multiple brain regions (e.g. fronto-parietal coherence). We planned and conducted our statistical tests based on this hypothesis. All multiple comparison corrections (e.g., Bonferroni-Holm correction, see Methods) were performed separately for each class of analyses. We will clarify these hypotheses and provide further details in the revised manuscript.
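
      As a reminder of how the Bonferroni-Holm procedure mentioned above operates, a minimal sketch is given below. It implements the standard step-down rule; the p-values are hypothetical and are not taken from the manuscript. Because the correction depends on the number of tests in the family, applying it separately for each class of analyses means each class uses its own, smaller family size.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return one reject/retain decision per p-value (Holm step-down rule)."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is compared with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test is retained, all larger p-values are too
    return reject


# Hypothetical family of four tests within one class of analyses.
p_family = [0.010, 0.040, 0.030, 0.005]
decisions = holm_bonferroni(p_family)
print(decisions)
```

      With these values, the two smallest p-values pass their thresholds (0.05/4 and 0.05/3), while 0.030 exceeds 0.05/2, so it and the larger 0.040 are retained.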

      (3.2) Power:

      Some important null findings may result from the rather small sample sizes of N = 24 for behavioral and N = 18 for ERP analyses. For example, the correlation between detection and discrimination d' deficits across participants (r=0.39, p=0.059) (p. 12, l. 263) and the attentional blink effect on the P1 component (p=0.050, no test statistic) (p. 14, 301) could each have been significant with one more participant. In my opinion, such results should not be interpreted as evidence for the absence of effects.

      We agree and will revise the manuscript accordingly. We will also report Bayes factor (BF) values, where relevant, to further evaluate these claims.
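
      The Reviewer's point about the near-threshold correlation can be illustrated with a rough calculation. The sketch below uses the Fisher z-transformation with a normal approximation, which is an assumption made for the example and not necessarily the test used in the manuscript; only r = 0.39 and N = 24 come from the review.

```python
import math
from statistics import NormalDist


def fisher_z_summary(r, n, confidence=0.95):
    """Approximate two-sided p-value and confidence interval for a
    Pearson correlation, using the Fisher z-transformation."""
    norm = NormalDist()
    z = math.atanh(r)                 # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)       # approximate standard error of z
    p_value = 2.0 * (1.0 - norm.cdf(abs(z) / se))
    crit = norm.inv_cdf(0.5 + confidence / 2.0)
    lower = math.tanh(z - crit * se)  # back-transform the interval to r
    upper = math.tanh(z + crit * se)
    return p_value, lower, upper


p_value, lower, upper = fisher_z_summary(r=0.39, n=24)
print(f"p ~ {p_value:.3f}, 95% CI for r ~ [{lower:.2f}, {upper:.2f}]")
```

      Under this approximation the interval includes zero but also extends to large positive correlations, so a non-significant result with this sample size says little about whether the effect is absent.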

      (3.3) Neural basis of the attentional blink:

      The introduction (e.g., p. 4, l. 56-76) and discussion (e.g., p. 19, 427-447) do not incorporate the insights from the highly relevant recent review by Zivony & Lamy (2022), which is only cited once (p. 19, l. 428). Moreover, the sections do not mention some relevant ERP studies of the attentional blink (e.g., Batterink et al., 2012; Craston et al., 2009; Dell'Acqua et al., 2015; Dellert et al., 2022; Eiserbeck et al., 2022; Meijs et al., 2018).

      We will motivate and discuss our study in the context of these previous studies. 

      (3.4) Detection versus discrimination:

      Concerning the neural basis of detection versus discrimination (e.g., p. 6, l. 98-110; p. 18, l. 399-412), relevant existing literature (e.g., Broadbent & Broadbent, 1987; Hillis & Brainard, 2007; Koivisto et al., 2017; Straube & Fahle, 2011; Wiens et al., 2023) is not included.

      Thank you for these suggestions. We will include these important studies in our discussion.

      (3.5) Pooling of lags and lag 1 sparing:

      I wonder why the authors chose to include 5 different lags when they later pooled early (100, 300 ms) and late (700, 900 ms) lags, and whether this pooling is justified. This is important because T2 at lag 1 (100 ms) is typically "spared" (high accuracy) while T2 at lag 3 (300 ms) shows the maximum AB (for reviews, see, e.g., Dux & Marois, 2009; Martens & Wyble, 2010). Interestingly, this sparing was not observed here (p. 43, Figure 2). Nevertheless, considering the literature and the research questions at hand, it is questionable whether lag 1 and 3 should be pooled.

      Lag-1 sparing is not always observed in attentional blink studies; there are notable exceptions that do not report such sparing (Hommel et al., Q. J. Exp. Psychol., 2005; Livesay et al., Attention, Percept. Psychophys., 2011). Our statistical tests revealed no significant difference in accuracies between short lag (100 and 300 ms) trials or between long lag (700 and 900 ms) trials but did reveal significant differences between the short and long lag trials (ANOVA, followed by post-hoc tests). To simplify the presentation of the findings, we pooled together the short lag (100 and 300 ms) and, separately, the long lag (700 and 900 ms) trials. We will present these analyses, and clarify the motivation for pooling in the revised manuscript. 

      (3.6) Discrimination in the attentional blink

      Concerning the claims that previous attentional blink studies conflated detection and discrimination (p. 6, l. 111-114; p. 18, l. 416), there is a recent ERP study (Dellert et al., 2022) in which participants did not perform a discrimination task for the T2 stimuli. Moreover, since the relevance of all stimuli except T1 was uncertain in this study, irrelevant distractors could not be filtered out (cf. p. 19, l. 437). Under these conditions, the attentional blink was still associated with reduced negativities in the N2 range (cf. p. 19, l. 427-437) but not with a reduced P3 (cf. p. 19, l 439-447).

      We will address the difference between our findings and those of Dellert et al (2022) in the revised manuscript.

      (3.7) General EEG methods:

      While most of the description of the EEG preprocessing and analysis (p. 31/32) is appropriate, it also lacks some important information (see, e.g., Keil et al., 2014). For example, it does not include the length of the segments, the type and proportion of artifacts rejected, the number of trials used for averaging in each condition, specific hypotheses, and the test statistics (in addition to p-values).

      We regret the oversight. We will include these details in the revised Methods.

      (3.8) EEG filters:

      P. 31, l. 728: "The data were (...) bandpass filtered between 0.5 to 18 Hz (...). Next, a bandstop filter from 9-11 Hz was applied to remove the 10 Hz oscillations evoked by the RSVP presentation." These filter settings do not follow common recommendations and could potentially induce filter distortions (e.g., Luck, 2014; Zhang et al., 2024). For example, the 0.5 high-pass filter could distort the slow P3 wave. Mostly, I am concerned about the bandstop filter. Since the authors commendably corrected for RSVP-evoked responses by subtracting T2-absent from T2-present ERPs (p. 31, l. 746), I wonder why the additional filter was necessary, and whether it might have removed relevant peaks in the ERPs of interest.

      Thank you for this suggestion. We will repeat this analysis by removing these additional filters.
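
      The reasoning behind the Reviewer's question about the bandstop filter can be sketched with synthetic signals. The example assumes, as an idealisation, that the 10 Hz RSVP-evoked response is identical on T2-present and T2-absent trials; the sampling rate, amplitudes, and waveform shapes are all hypothetical. Under that assumption the subtraction alone removes the 10 Hz component.

```python
import math

FS = 500         # sampling rate in Hz (hypothetical)
N_SAMPLES = 500  # one second of data
times = [i / FS for i in range(N_SAMPLES)]


def rsvp_component(t):
    """Steady-state 10 Hz response driven by the RSVP stream."""
    return 2.0 * math.sin(2.0 * math.pi * 10.0 * t)


def t2_evoked(t):
    """Toy T2-evoked deflection: a Gaussian bump peaking at 400 ms."""
    return 5.0 * math.exp(-((t - 0.4) ** 2) / (2.0 * 0.05 ** 2))


t2_absent = [rsvp_component(t) for t in times]
t2_present = [rsvp_component(t) + t2_evoked(t) for t in times]
difference = [p - a for p, a in zip(t2_present, t2_absent)]


def amplitude_at(signal, freq):
    """Amplitude of the sinusoidal component at `freq` (single-bin DFT)."""
    re = sum(x * math.cos(2.0 * math.pi * freq * t) for x, t in zip(signal, times))
    im = sum(x * math.sin(2.0 * math.pi * freq * t) for x, t in zip(signal, times))
    return 2.0 * math.hypot(re, im) / len(signal)


print(amplitude_at(t2_absent, 10.0))   # 10 Hz amplitude before subtraction
print(amplitude_at(difference, 10.0))  # 10 Hz amplitude after subtraction
print(max(difference))                 # peak of the recovered T2 response
```

      In real data the RSVP-evoked response will not cancel perfectly, so some residual 10 Hz activity may remain; the sketch only shows why a 9-11 Hz bandstop filter may add little after subtraction while risking removal of genuine ERP content in that band.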

      (3.9) Coherence analysis:

      P. 33, l. 786: "For subsequent, partial correlation analyses of coherence with behavioral metrics and neural distances (...), we focused on a 300 ms time period (0-300 ms following T2 onset) and high-beta frequency band (20-30 Hz) identified by the cluster-based permutation test (Fig. 5A-C)." I wonder whether there were any a priori criteria for the definition and selection of such successive analyses. Given the many factors (frequency bands, hemispheres) in the analyses and the particular shape of the cluster (p. 49, Fig 5C), this focus seems largely data-driven. It remains unclear how many such tests were performed and whether the results (e.g., the resulting weak correlation of r = 0.22 in one frequency band and one hemisphere in one part of a complexly shaped cluster; p. 15, l. 327) can be considered robust.

      Please see responses to comments #3.1 and #3.2 (above). In addition to reporting further details regarding statistical tests and multiple comparisons corrections, we will compute and report Bayes factors to quantify the strength of the evidence for correlations, as appropriate.

    1. eLife assessment

      This study employs a modified protocol for single-nuclei RNA sequencing of adipose tissue that preserves RNA quality and nuclei integrity. Using this protocol, the study provides valuable insights into the cellular heterogeneity and molecular landscape of murine adipose tissue from lean mice and mice with diet-induced obesity. The study is solid in its approach and analysis, providing a comprehensive description of a dysfunctional hypertrophic adipocyte subpopulation that emerges in association with obesity.

    2. Reviewer #1 (Public Review):


      This manuscript from So et al. describes what is suggested to be an improved protocol for single-nuclei RNA sequencing (snRNA-seq) of adipose tissue. The authors provide evidence that modifications to the existing protocols result in better RNA quality and nuclei integrity than previously observed, with ultimately greater coverage of the transcriptome upon sequencing. Using the modified protocol, the authors compare the cellular landscape of murine inguinal and perigonadal white adipose tissue (WAT) depots harvested from animals fed a standard chow diet (lean mice) or those fed a high-fat diet (mice with obesity).


      Overall, the manuscript is well-written, and the data are clearly presented. The strengths of the manuscript rest in the description of an improved protocol for snRNA-seq analysis. This should be valuable for the growing number of investigators in the field of adipose tissue biology that are utilizing snRNA-seq technology, as well as those other fields attempting similar experiments with tissues possessing high levels of RNAse activity.

      Moreover, the study makes some notable observations that provide the foundation for future investigation. One observation is the correlation between nuclei size and cell size, allowing for the transcriptomes of relatively hypertrophic adipocytes in perigonadal WAT to be examined. Another notable observation is the identification of an adipocyte subcluster (Ad6) that appears "stressed" or dysfunctional and likely localizes to crown-like inflammatory structures where pro-inflammatory immune cells reside.


      Analogous studies have been reported in the literature, including a notable study from Savari et al. (Cell Metabolism). This somewhat diminishes the novelty of some of the biological findings presented here. Moreover, a direct comparison of the transcriptomic data derived from the new vs. existing protocols (i.e. fully executed side by side) was not presented. As such, the true benefit of the protocol modifications cannot be fully understood.

    3. Reviewer #2 (Public Review):


      In the present manuscript So et al utilize single-nucleus RNA sequencing to characterize cell populations in lean and obese adipose tissues.


      The authors utilize a modified nuclear isolation protocol incorporating VRC that results in higher-quality sequencing reads compared with previous studies.


      The use of VRC to enhance snRNA-seq has been previously published in other tissues. The snRNA-seq data sets presented in this manuscript, when compared with numerous previously published single-cell analyses of adipose tissue, do not represent a significant scientific advance.

      Figure 1-3: The snRNA-seq data obtained by the authors using their enhanced protocol does not represent a significant improvement in cell profiling for the majority of the highlighted cell types including APCs, macrophages, and lymphocytes. These cell populations have been extensively characterized by cytoplasmic scRNA-seq which can achieve sufficient sequencing depth, and thus this study does not contribute meaningful additional insight into these cell types. The authors note an increase in the number of rare endothelial cell types recovered, however this is not translated into any kind of functional analysis of these populations.

      Figure 4: The authors did not provide any evidence that the relative fluorescent brightness of GFP and mCherry is a direct measure of nuclear size, and nuclear size correlates only moderately with cell size. Thus, sorting the nuclei based on GFP/mCherry brightness is not a great proxy for adipocyte diameter. Furthermore, no meaningful insights are provided about the functional significance of the reported transcriptional differences between small and large adipocyte nuclei.

      Figure 5-6: The Ad6 population is highly transcriptionally analogous to the mAd3 population from Emont et al, and is thus not a novel finding. Furthermore, in the present data set, the authors conclude that Ad6 are likely stressed/dying hypertrophic adipocytes with a global loss of gene expression, which is a well-documented finding in eWAT > iWAT, for which the snRNA-seq reported in the present manuscript does not provide any novel scientific insight.

    4. Reviewer #3 (Public Review):


      The authors aimed to improve single-nucleus RNA sequencing (snRNA-seq) to address current limitations and challenges with nuclei and RNA isolation quality. They successfully developed a protocol that enhances RNA preservation and yields high-quality snRNA-seq data from multiple tissues, including a challenging model of adipose tissue. They then applied this method to eWAT and iWAT from mice fed either a normal or high-fat diet, exploring depot-specific cellular dynamics and gene expression changes during obesity. Their analysis included subclustering of SVF cells and revealed that obesity promotes a transition in APCs from an early to a committed state and induces a pro-inflammatory phenotype in immune cells, particularly in eWAT. In addition to SVF cells, they discovered six adipocyte subpopulations characterized by a gradient of unique gene expression signatures. Interestingly, a novel subpopulation, termed Ad6, comprised stressed and dying adipocytes with reduced transcriptional activity, primarily found in eWAT of mice on a high-fat diet. Overall, the methodology is sound, the writing is clear, and the conclusions drawn are supported by the data presented. Further research based on these findings could pave the way for potential novel interventions in obesity and metabolic disorders, or for similar studies in other tissues or conditions.


      • The authors developed a robust snRNA-seq technique that preserves the integrity of the nucleus and RNA across various tissue types, overcoming the challenges of existing methods.

      • They identified adipocyte subpopulations that follow adaptive or pathological trajectories during obesity.

      • The study reveals depot-specific differences in adipose tissues, which could have implications for targeted therapies.


      • The adipose tissues were collected after 10 weeks of high-fat diet treatment, lacking the intermediate time points for identifying early markers or cell populations during the transition from healthy to pathological adipose tissue.

      • The expansion of the Ad6 subpopulation in obese iWAT and gWAT is interesting. The authors claim that Ad6 exhibited a substantial increase in eWAT and a moderate rise in iWAT (Figure 4C). However, this adipocyte subpopulation remains the most altered in iWAT upon obesity. Could the authors elaborate on why there is a scarcity of adipocytes with ROS reporter and B2M in obese iWAT?

      • While the study provides extensive data on mouse models, the potential translation of these findings to human obesity remains uncertain.

    1. eLife assessment

      This valuable study shows how an intersecting network of regulators acting on genes with differences in their RNA metabolism explains why the loss of some regulators of RNAi in C. elegans can selectively impair the silencing of some target genes. The evidence presented is convincing, as the authors use a combination of computational modeling and RNAi assays to support their conclusions.

    2. Reviewer #1 (Public Review):

      The goal of Knudsen-Palmer et al. was to define a biological set of rules that dictate the differential RNAi-mediated silencing of distinct target genes, motivated by facilitating the long-term development of effective RNAi-based drugs/therapeutics. To achieve this, the authors use a combination of computational modeling and RNAi function assays to reveal several criteria for effective RNAi-mediated silencing. This work provides three insights: (1) cis-regulatory elements influence the RNAi-mediated regulation of genes; (2) genes can "recover" from RNAi-silencing signals in an animal; and (3) pUGylation occurs exclusively downstream of the dsRNA trigger sequence, suggesting that 3º siRNAs are not produced. In addition, the authors show that the speed at which RNAi-silencing is triggered does not correlate with the longevity of the silencing. These insights are significant because they suggest that if we understand the rules by which RNAi pathways effectively silence genes with different transcription/processing levels, then we can design more effective synthetic RNAi-based therapeutics targeting endogenous genes. The conclusions of this study are mostly supported by the data, but there are some aspects that need to be clarified.

      (1) The methods do not describe the "aged RNAi plates feeding assay" in Figure 2E. The figure legend states that "aged RNAi plates" were used to trigger weaker RNAi, but the detail explaining the experiment is insufficient. How aged is aged? If the goal was to effectively reduce the dsRNA load available to the animals, why not quantitatively titrate the dsRNA provided? Were worms previously fed on the plates, or was simply a lawn of bacteria grown until presumably the IPTG on the plate was exhausted?

      (2) Were the data presented in Figure 2F obtained using the "aged RNAi plates" to achieve the partial silencing of dpy-7 observed? Clarification of this point would be helpful.

      (3) Throughout the manuscript the authors refer to "non-dividing cells" when discussing animals' ability to recover from RNA silencing. It is not clear what the authors specifically mean by the phrase "non-dividing cells", but as this is referred to in one of their major findings, it should be clarified. Do they mean the cells are somatic cells in aged animals, thus if they are "non-dividing" the siRNA pools within the cells cannot be diluted by cell division? Based on the methods, the animals used in RNAi assays were L4/young adults that were scored over 8 days after the initial pulse of dsRNA feeding. If this is the case, wouldn't these animals be growing into gravid adults after the feeding, and thus have dividing cells as they grew?

      (4) What are the typical expression levels/turnover of unc-22 and bli-1? Based on the results from the altered cis-regulatory regions of bli-1 and unc-22 in Figure 5, it seems like the transcription/turnover rates of each of these genes could also be used as a proof of principle for testing the model proposed in Figure 4. The strength of the model would be further increased if the RNAi sensitivity of unc-22 reflects differences in its transcription/turnover rates compared to bli-1.

    3. Reviewer #2 (Public Review):


      This manuscript by Knudsen-Palmer et al. describes and models the contribution of MUT-16 and RDE-10 in the silencing through RNAi by the Argonaute protein NRDE-3 or others. The authors show that MUT-16 and RDE-10 constitute an intersecting network that can be redundant or not depending on the gene being targeted by RNAi. In addition, the authors provide evidence that increasing dsRNA processing can compensate for NRDE-3 mutants. Overall, the authors provide convincing evidence to understand the factors involved in RNAi in C. elegans by using a genetic approach.

      Major Strengths:

      The authors' work presents a compelling case for understanding the intricacies of RNA interference (RNAi) within the model organism Caenorhabditis elegans through a meticulous genetic approach. By harnessing genetic manipulation, they delve into the role of MUT-16 and RDE-10 in RNAi, offering a nuanced understanding of the molecular mechanisms at play in two independent case study targets (unc-22 and bli-1).

      Major Weaknesses:

      (1) It is unclear how the molecular mechanisms of amplification are different under the MUT-16 and RDE-10 branches of the regulatory pathway, since they are clearly distinct proteins structurally. It would be interesting to do some small-RNA-seq of products generated from unc-22 and bli-1, under wild-type conditions and in some of the mutants studied (e.g., mut-16, rde-10 and mut-16 + rde-10). That would provide some insights into whether the products of the two amplifications are the same in all conditions, just changing in abundance, or whether they are distinct in sequence patterns.

      (2) In the same line, Figure 5 aims to provide insights into the sequence determinants that influence the RNAi of bli-1. It is unclear whether the changes in transcript stability dictated by the 3'UTR are the sole factor governing the preference for the MUT-16 and RDE-10 branches of the regulatory pathway. In line with the mutant jam297, it might be interesting to test whether factors like codon optimality, splicing, ... of the ORF region upstream from bli-1-dsRNA can affect its sensitivity to the MUT-16 and RDE-10 branches of the regulatory pathway.

    1. eLife assessment

      This work investigated the mechanisms by which sperm DNA is excluded from the meiotic spindle after fertilization. The finding that kinesin-13, katanin and Ataxin-2 proteins are involved in this process is useful in uncovering the mechanisms underlying healthy embryo formation. The overall conclusions of the work are supported by solid evidence obtained by microscopy and RNAi experiments, though more robust data analyses and rescue experiments would have strengthened the study.

    2. Reviewer #1 (Public Review):


      This paper by Beath et al. identifies a potential regulatory role for proteins involved in cytoplasmic streaming and maintaining the grouping of paternal organelles: holding sperm contents in the fertilized embryos away from the oocyte meiotic spindle so that they don't get ejected into the polar body during meiotic chromosome segregation. The authors show by time-lapse video that paternal mitochondria (used as a readout for sperm and its genome) are excluded from yolk granules and maternal mitochondria, even when moving long distances by cytoplasmic streaming. To understand how this exclusion is accomplished, they first show that it is independent of both internal packing and the engulfment of the paternal chromosomes by maternal endoplasmic reticulum creating an impermeable barrier. They then test whether the control of cytoplasmic streaming affects this exclusion by knocking down two microtubule motors, katanin and kinesin-13. They find that the ER ring, which is used as a proxy for paternal chromosomes, undergoes extensive displacement with these treatments during anaphase I and interacts with the meiotic spindle, supporting their hypothesis that the exclusion of paternal chromosomes is regulated by cytoplasmic streaming. Next, they test whether a regulator of maternal ER organization, ATX-2, disrupts sperm organization so that they can combine the double depletion of ATX-2 and KLP-7, presumably because klp-7 RNAi (unlike mei-1 RNAi) does not affect polar body extrusion and they can report on what happens to paternal chromosomes. They find that the knockdown of both ATX-2 and KLP-7 produces a higher incidence of what appears to be the capture of paternal chromosomes by the meiotic spindle (5/24 vs 1/25). However, this capture event appears to halt the cell cycle, preventing the authors from directly observing whether this would result in the paternal chromosomes being ejected into the polar body.


      This is a useful, descriptive paper that highlights a potential challenge for embryos during fertilization: when fertilization results in the resumption of meiotic divisions, how are the paternal and maternal genomes kept apart so that the maternal genome can undergo chromosome segregation and polar body extrusion without endangering the paternal genome? In general, the experiments are well-executed and analyzed. In particular, the authors' use of multiple ways to knock down ATX-2 shows rigor.


      The paper makes a case that this regulation may be important but the authors should do some additional work to make this case more convincing and accessible for those outside the field. In particular, some of the figures could include greater detail to support their conclusions, they could explain the rationale for some experiments better and they could perform some additional control experiments with their double depletion experiments to better support their interpretations. Also, the authors' inability to assess the functional biological consequences of the capture of the sperm genome by the oocyte spindle should be discussed, particularly in light of the cell cycle arrest that they observe.

    3. Reviewer #2 (Public Review):


      In this manuscript, Beath et al. use primarily C. elegans zygotes to test the overarching hypothesis that cytoplasmic mechanisms exist to prevent interaction between paternal chromosomes and the meiotic spindle, which are present in a shared zygotic cytoplasm after fertilization. Previous work, much of it by this group, had characterized cytoplasmic streaming in the zygote and the behavior of paternal components shortly after fertilization, primarily the clustering of paternal mitochondria and membranous organelles around the paternal chromosomes. This work set out to identify the molecular mechanisms responsible for that clustering and test the specific hypothesis that the "paternal cloud" helps prevent the association of paternal chromosomes with the meiotic spindle.


      This work is a collection of technical achievements. The data are primarily 3- and 4-channel time-lapse images of zygotes shortly after fertilization, which were performed inside intact animals. There are many instances in which the experiments show extreme technical skill, such as tracking the paternal chromosomes over large displacements throughout the volume of the embryo. The authors employ a wide variety of fluorescent reporters to provide a remarkably clear picture of what is going on in the zygote. These reagents and the novel characterization of these stages that they provide will be widely beneficial to the community.

      The data provide direct visualization of what had previously been a mostly hypothetical structure, the "paternal cloud," using simultaneous labeling of paternal DNA and mitochondria in combination with a variety of maternal proteins including maternal mitochondria, yolk granules, tubulin, and plasma membrane. Together, these images provided convincing evidence of the existence of this specified cytoplasmic domain. They go on to show that the knockdown of the ataxin-2 homolog ATX-2, a protein previously shown to affect ER dynamics, disrupted the paternal cloud, identifying a role for ER organization in this structure.

      The authors then used the system to test the functional consequences of perturbing the cytoplasmic organization. Consistent with the paternal cloud being a stable structure, it stayed intact during large movements the authors generated using previously published knockdowns (of mei-1/katanin and kinesin-13/klp-7) that increased cytoplasmic streaming. They used this data to document instances in which the paternal chromosomes were likely to have been attached to the spindle. They concluded with direct evidence of spindle fibers connecting to the paternal chromatin upon knockdown of ATX-2 in combination with increased cytoplasmic streaming, providing strong, direct support for their overarching hypothesis.


      While the data is convincing, the narrative of the paper could be streamlined to highlight the novelty of the experiments and better articulate the aims. For example, the cloud of paternal mitochondria and membranous organelles was previously shown, but Figures 1-2 largely reiterate that observation. The innovation seems to be that the combination of ER, yolk, and maternal mitochondrial markers makes the existence of a specified domain more concrete. There are also some instances where more description is needed to make the conclusions from the images clear.

      The manuscript intersperses what read like basic characterizations of fluorescent markers that, as written, can distract from the main story. The authors characterized the dynamics of ER organization throughout the substages of meiosis and the permeability of the envelope of ER that surrounds the paternal chromatin, but it could be more clearly established how the ability to visualize these structures allowed them to address their aims. More background on what was previously known about ER organization in M-phase and the role of ataxin proteins specifically may help provide more continuity.

    4. Reviewer #3 (Public Review):


      This study by Beath et al. investigated the mechanisms by which sperm DNA is excluded from the meiotic spindle after fertilization. Time-lapse imaging revealed that sperm DNA is surrounded by paternal mitochondria and maternal ER that is permeable to proteins. By increasing cytoplasmic streaming using kinesin-13 or katanin RNAi, the authors demonstrated that limiting cytoplasmic streaming in the embryo is an important step that prevents the capture of sperm DNA by the oocyte meiotic spindle. Further experiments showed that the Ataxin-2 protein is required to hold paternal mitochondria together and close to the sperm DNA. Finally, double depletion of kinesin-13 and Ataxin-2 suggested an increased risk of meiotic spindle capture of sperm DNA.

      Overall, this is an interesting finding that could provide a new understanding of how meiotic spindle capture of sperm DNA and its accidental expulsion into the polar body is prevented. However, some conceptual gaps need to be addressed and further experiments and improved data analyses would strengthen the paper.

      • It would be helpful if the authors could discuss in detail how they think maternal ER surrounds the sperm DNA and why it is not disrupted following Ataxin disruption.

      • Since important phenotypes revealed in RNAi experiments (e.g. kinesin-13 and ataxin-2 double depletion) are not very robust, the authors should consider toning down their conclusions and revising some of their section headings. I appreciate that they are upfront about some limitations, but they do nonetheless make strong concluding sentences.

      • The discussion section could be improved further to present the authors' findings in the larger context of current knowledge in the field.

      • The authors previously demonstrated that F-actin prevents meiotic spindle capture of sperm DNA in this system. However, the current manuscript does not discuss how the katanin, kinesin-13 and Ataxin-2 mechanisms could work together with previously established functions of F-actin in this process.

      • How can the authors exclude off-target effects in their RNAi depletion experiments? Can kinesin-13, katanin, and Ataxin phenotypes be rescued for instance?

      • How are the authors able to determine if the paternal genome was actually captured by the spindle? Does lack of movement definitively suggest capture without using a spindle marker?

    1. eLife assessment

      This important study identifies biallelic variants of DNAH3 in four unrelated infertile men. In addition, it reports that DNAH3 knockout (KO) mice are infertile, and that compromised DNAH3 activity decreases the expression of IDA-associated proteins in the spermatozoa of human patients and the KO mice. Of note, the infertility of both can be rescued by intracytoplasmic sperm injection (ICSI). In aggregate, the work provides solid evidence to demonstrate that DNAH3 is a novel pathogenic gene for asthenoteratozoospermia and male infertility. It will be of substantial interest to clinicians, reproductive counselors, embryologists, and basic researchers working on infertility and assisted reproductive technology.

    2. Joint Public Review:


      The study identified biallelic variants of DNAH3 in four unrelated Han Chinese infertile men through whole-exome sequencing, which contribute to abnormal sperm flagellar morphology and ultrastructure. To investigate the importance of DNAH3 in male infertility, the authors generated crispant DNAH3 knockout (KO) male mice. They observed that KO mice are also infertile, showing a severe reduction in sperm movement with abnormal IDA (inner dynein arms) and mitochondrion structure. Moreover, nonfunctional DNAH3 expression decreased the expression of IDA-associated proteins in the spermatozoa of patients and KO mice, which are involved in the disruption of sperm motility. Interestingly, the infertility of patients and KO mice was rescued by intracytoplasmic sperm injection (ICSI). Taken together, the authors propose that DNAH3 is a novel pathogenic gene for asthenoteratozoospermia and male infertility.


      This work investigates the role of DNAH3 in sperm mobility and male infertility and utilised gold-standard molecular biology techniques, showing strong evidence of its role in male infertility. All aspects of the study design and methods are well described and appropriate to address the main question of the manuscript. The conclusions drawn are consistent with the analyses conducted and supported by the data.


      (1) The manuscript lacks a comparison with previous studies on DNAH3 in the Discussion section.

      (2) The variants of DNAH3 in four infertile men were identified through whole-exome sequencing. Providing an overview of the WES data would be beneficial to offer additional insights into whether other variants may contribute to the infertility. This could also help explain why ICSI only works for two out of four patients with DNAH3 variants.

      (3) Quantification of images would help substantiate the conclusions, particularly in Figures 2, 3, 4, and 6. Improved images in Figures 3A, 4B, and 4C, would help increase confidence in the claims made.

    1. eLife assessment

      This work presents valuable information on the structure of the spirosome's native extended conformation as the active form of the enzyme aldehyde-alcohol dehydrogenase (AdhE). However, the data supporting this claim are incomplete.

    2. Reviewer #1 (Public Review):


      Clostridium thermocellum serves as a model for consolidated bioprocessing (CBP) in lignocellulosic ethanol production, yet faces limitations in solid contents and ethanol titers achieved by engineered strains thus far. The primary ethanol production pathway involves the enzyme aldehyde-alcohol dehydrogenase (AdhE), which forms long oligomeric structures known as spirosomes, previously characterized via the 3.5 Å resolution E. coli AdhE structure using single-particle cryo-EM. The present study describes the cryo-EM structure of the C. thermocellum ortholog, sharing 62% sequence identity with E. coli AdhE, resolved at 3.28 Å resolution. Detailed comparative structural analysis, including the Vibrio cholerae AdhE structure, was conducted. Integrating cryo-EM data with molecular dynamics simulations indicated that the aldehyde intermediate resides longer in the channel of the extended form, supporting the hypothesis that the extended spirosome represents the active form of AdhE.


      The study conducts a comprehensive structural comparative analysis of oligomerization interfaces and the acetaldehyde channel across compact and extended conformations. Structural and computational results suggest the extended spirosome as the most likely active state of AdhE.


      The overall resolution of the C. thermocellum structure is similar to the E. coli ortholog, which shares 62% sequence identity, and the oligomerization interfaces and the acetaldehyde channel were previously described.

    3. Reviewer #2 (Public Review):


      The manuscript by Ziegler et al., entitled "Structural characterization and dynamics of AdhE ultrastructure from Clostridium thermocellum: A containment strategy for toxic intermediates?", presents the atomic-resolution cryo-EM structure of C. thermocellum AdhE, showing that it predominantly adopts an extended form while E. coli AdhE predominantly adopts a compact form. Through comparative analysis of their C. thermocellum structure and the previous E. coli AdhE structure, they tried to reveal the mechanism by which C. thermocellum and E. coli show different dominant conformations. In addition, they analyzed the substrate channel by comparative and computational approaches. Lastly, their computational analysis using CryoDRGN reveals conformational heterogeneity in the sample. Although this manuscript suggests a potential mechanism for the different features of AdhEs, it is very descriptive and does not provide sufficient data to support the authors' conclusions, which may be due to the lack of experimental data to support their findings from the computational analysis.


      This manuscript provides the first C. thermocellum (Ct) AdhE structure and comparatively analyzes this structure with E. coli AdhE.


      Their main conclusions obtained mostly by computational and comparative analysis are not supported by experimental data.

    4. Reviewer #3 (Public Review):

      This study describes the first structure of Gram-positive bacterial AdhE spirosomes that are in a native extended conformation. All the previous structures of AdhE spirosomes obtained come from Gram-negative bacterial species with native compact spirosomes (E. coli, V. cholerae). In E. coli, AdhE spirosomes can be found in two different conformational states, compact and extended, depending on the substrates and cofactors they are bound to.

      The high-resolution cryoEM structure of the extended C. thermocellum AdhE spirosomes produced in E. coli in an apo state (without any substrate or cofactors) is compared to the E. coli extended and compact AdhE spirosomes structures previously published. The authors have modeled (in Swiss-Model) the structure of compact C. thermocellum AdhE spirosomes, using E. coli compact AdhE spirosome conformation as a template, and performed molecular dynamics simulations. They have identified a channel in which the toxic reaction intermediate aldehyde could transit from the aldehyde dehydrogenase active site to the alcohol dehydrogenase active site, in an analogous manner to E. coli spirosomes. These findings are in line with the hypothesis that the extended spirosomes could correspond to the active form of the enzyme.

      In this work, the authors speculate that the C. thermocellum AdhE spirosomes could switch from the native extended conformation to a compact conformation, in a way that is the inverse of E. coli spirosomes. Although attractive, this hypothesis is not supported by the literature. Amazingly, in some Gram-positive bacterial species (S. pneumoniae, S. sanguinis or C. difficile...), AdhE spirosomes are natively extended and have never been observed in a compact conformation. In contrast, E. coli (and other Gram-negative bacteria) native AdhE spirosomes are compact and are able to switch to an extended conformation in the presence of the cofactors (NAD+, CoA, and iron). The data as currently presented do not convincingly confirm the existence of C. thermocellum AdhE spirosomes in a compact conformation.

    1. eLife assessment

      This study presents an important finding on durotaxis in various amoeboid cells that is independent of focal adhesions. The evidence supporting the authors' claims is compelling. The work will be of interest to cell biologists and biophysicists working on rigidity sensing, the cytoskeleton, and cell migration.

    2. Reviewer #1 (Public Review):

      In their paper, Kang et al. investigate rigidity sensing in amoeboid cells, showing that, despite their lack of proper focal adhesions, amoeboid migration of single cells is impacted by substrate rigidity. In fact, many different amoeboid cell types can durotax, meaning that they preferentially move towards the stiffer side of a rigidity gradient.

      The authors observed that NMIIA is required for durotaxis and, building on this observation, they generated a model to explain how durotaxis could be achieved in the absence of strong adhesions. According to the model, substrate stiffness alters the diffusion rate of NMIIA, with softer substrates allowing for faster diffusion. This allows for NMIIA accumulation at the back, which, in turn, results in durotaxis.

      The experiments support the main message of the paper regarding durotaxis by amoeboid cells. In my opinion, a few clarifications on the mechanism proposed to explain this phenomenon could strengthen this research:

      (1) According to your model, the rear end of the cell, which is in contact with softer substrates, will have slower diffusion rates of NMIIA. Does this mean that bigger cells will durotax better than smaller cells because the stiffness difference between front and rear is higher? Is it conceivable to attenuate the slope of the durotactic gradient to a degree where smaller cells lose their ability to durotax, while longer cells retain their capacity for directional movement?

      (2) Where did you place the threshold for soft, middle, and stiff regions (Figure 6)? Is it possible that you only have a linear rigidity gradient in the center of your gel and the more you approach the borders, the flatter the gradient gets? In this case, cells would migrate randomly on uniform substrates. Did you perform AFM over the whole length of the gel or just in the central part?

      (3) In which region (soft, middle, stiff) did you perform all the cell tracking of the previous figures?

      (4) What is the level of confinement experienced by the cells? Is it possible that cells on the soft side of the gels experience less confinement due to a "spring effect" whereby the coverslips descending onto the cells might exert diminished pressure because the soft hydrogels act as buffers, akin to springs? If this were the case, cells could migrate following a confinement gradient.

    3. Reviewer #2 (Public Review):

      Summary:

      The authors developed an imaging-based device that provides both spatial confinement and stiffness gradient to investigate if and how amoeboid cells, including T cells, neutrophils, and Dictyostelium, can durotax. Furthermore, the authors showed that the mechanism for the directional migration of T cells and neutrophils depends on non-muscle myosin IIA (NMIIA) polarized towards the soft-matrix-side. Finally, they developed a mathematical model of an active gel that captures the behavior of the cells described in vitro.


      The topic is intriguing as durotaxis is essentially thought to be a direct consequence of mechanosensing at focal adhesions. To the best of my knowledge, this is the first report on amoeboid cells that do not depend on FAs to exert durotaxis. The authors developed an imaging-based durotaxis device that provides both spatial confinement and stiffness gradient and they also utilized several techniques such as quantitative fluorescent speckle microscopy and expansion microscopy. The results of this study have well-designed control experiments and are therefore convincing.


      Overall this study is well performed but there are still some minor issues I recommend the authors address:

      (1) When using NMIIA/NMIIB knockdown cell lines to distinguish the role of NMIIA and NMIIB in amoeboid durotaxis, it would be better if the authors took compensatory effects into account.

      (2) The expansion microscopy assay is not clearly described and some details are missing, such as how the assay is performed on cells under confinement.

      (3) In this study, an active gel model was employed to capture experimental observations. Previously, some active nematic models were also considered to describe cell migration, which is controlled by filament contraction. I suggest the authors provide a short discussion on the comparison between the present theory and those prior models.

      (4) In the present model, actin flow contributes to cell migration while myosin distribution determines cell polarity. How does this model couple actin and myosin together?

    1. eLife assessment

      This manuscript presents important observations on the early changes in calcium signaling, TMEM16a activation, and mitochondrial dysfunction in salivary gland cells in an inflammation murine model of autoimmune Sjögren's disease. Convincing changes are shown in saliva release, calcium signaling, TMEM16a activation, mitochondrial function, and sub-cellular morphology of the endoplasmic reticulum following DMXAA treatment. The work will be of strong interest to physiologists working on secretion, calcium signaling, and mitochondria.

    2. Reviewer #1 (Public Review):


      The authors address cellular mechanisms underlying the early stages of Sjogren's syndrome, using a mouse model in which 5,6-Dimethyl-9-oxo-9H-xanthene-4-acetic acid (DMXAA) is applied to stimulate the interferon gene (STING) pathway. They show that, in this model, salivary secretion in response to neural stimulation is greatly reduced, even though individual secretory cell calcium responses were enhanced. They attribute the secretion defect to reduced activation of Ca2+ -activated Cl- channels (TMEM16a), due to an increased distance between Ca2+ release channels (IP3 receptors) and TMEM16a which is expected to reduce the [Ca2+] sensed by TMEM16a. A variety of disruptions in mitochondria were also observed after DMXAA treatment, including reduced abundance, altered morphology, depolarization, and reduced oxygen consumption rate. The results of this study shed new light on some of the early events leading to the loss of secretory function in Sjogren's syndrome, at a time before inflammatory responses cause the death of secretory cells.


      Two-photon microscopy enabled Ca2+ measurements in the salivary glands of intact animals in response to physiological stimuli (nerve stimulation). This approach has been shown previously by the authors as necessary to preserve the normal spatiotemporal organization of calcium signals that lead to secretion under physiological conditions.

      Superresolution (STED) microscopy allowed precise measurements of the spacing of IP3R and TMEM16a and the cell membranes that would otherwise be prevented by the diffraction limit. The measured increase of distance (from 84 to 155 nm) would be expected to reduce [Ca2+] at the TMEM16a channel.
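
      As a rough illustration of why this spacing matters, the sketch below estimates the relative drop in local [Ca2+] when the source-to-sensor distance grows from 84 to 155 nm. It is not taken from the manuscript: it assumes an idealized steady-state point source in which [Ca2+] falls off as exp(-r/λ)/r, and the buffer length constant λ (`buffer_length_nm`) and the function name `relative_ca` are hypothetical choices made only for this example.

```python
import math


def relative_ca(r_near_nm, r_far_nm, buffer_length_nm=math.inf):
    """Ratio [Ca2+](r_far) / [Ca2+](r_near) for a steady-state point source.

    Assumes [Ca2+](r) is proportional to exp(-r / lambda) / r, where lambda is a
    hypothetical buffer length constant; lambda = inf means no buffering.
    """
    if r_near_nm <= 0 or r_far_nm <= 0 or buffer_length_nm <= 0:
        raise ValueError("distances and length constant must be positive")
    geometric = r_near_nm / r_far_nm  # pure 1/r dilution with distance
    buffering = math.exp(-(r_far_nm - r_near_nm) / buffer_length_nm)
    return geometric * buffering


# Distances reported in the review (84 nm control, 155 nm after DMXAA).
unbuffered = relative_ca(84.0, 155.0)
# 100 nm is an illustrative value only, not a measured buffer length.
buffered = relative_ca(84.0, 155.0, buffer_length_nm=100.0)

print(f"unbuffered ratio: {unbuffered:.2f}")
print(f"buffered ratio (lambda = 100 nm): {buffered:.2f}")
```

      Under these assumptions, geometry alone roughly halves the local concentration (84/155 ≈ 0.54), and any buffering reduces it further; whether the higher global Ca2+ signal offsets this would need a model constrained by the authors' data.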

      The authors effectively ruled out a variety of alternative explanations for reduced secretion, including changes in AQP5 expression, TMEM16a expression, localization, and Ca2+ sensitivity as indicated by Cl- current in response to defined levels of Ca2+.


      While the Ca2+ distribution in the cells was less restricted to the apical region in DMXAA-treated cells, it is not clear that this is relevant to the reduced activation of TMEM16a. The way in which the change in Ca2+ distribution is quantified (apical/basal ratio) is not informative, as this is not what activates TMEM16a, but rather the local [Ca2+] at the channel.

      Despite the decreased level of secretion, Ca2+ signal amplitudes were higher in the treated cells, raising the question of how much this might compensate for the increased distance between IP3R and TMEM16a. The authors assume that the increased separation of IP3R and TMEM16a (and the resulting decrease in local [Ca2+]) outweighed the effect of higher global [Ca2+], but this important point was not addressed.

      The description of mitochondrial changes in abundance, morphology, membrane potential, and oxygen consumption rate were not well integrated into the rest of the paper. While they may be a facet of the multiple effects of STING activation and may occur during Sjogren's syndrome, their possible role in reducing secretion was not examined. As it stands, the mitochondrial results are largely descriptive and there is no evidence here that they contribute to the secretory phenotype.

    3. Reviewer #2 (Public Review):


      This manuscript describes a very eloquent study of disrupted stimulus-secretion coupling in salivary acinar cells in the early stages of an animal model (DMXAA) of Sjogren's syndrome (SS). The study utilizes a range of technically innovative in vivo imaging of Ca signaling, in vivo salivary secretion, patch clamp electrophysiology to assess TMEM16a activity, immunofluorescence and electron microscopy, and a range of morphological and functional assays of mitochondrial function. Results show that in mice with DMXAA-induced Sjogren's syndrome, there was a reduced nerve-stimulation-induced salivary secretion, yet surprisingly the nerve-stimulation-induced Ca signaling was enhanced. There was also a reduced carbachol (CCh)-induced activation of TMEM16a currents in acinar cells from DMXAA-induced SS mice, whereas the intrinsic Ca-activated TMEM16a currents were unaltered, further supporting that stimulus-secretion coupling was impaired. Consistent with this, high-resolution STED microscopy revealed that there was a loss of close physical spatial coupling between IP3Rs and TMEM16a, which may contribute to the impaired stimulus-secretion coupling. Furthermore, the authors show that the mitochondria were both morphologically and functionally impaired, suggesting that bioenergetics may be impaired in salivary acinar cells of DMXAA-induced SS mice.


      Overall, this is an outstanding manuscript that will have a huge impact on the field. The manuscript is beautifully well-written with a very clear narrative. The experiments are technically innovative, very well executed, and logically designed. The data are very well presented and appropriately analyzed and interpreted.

    4. Reviewer #3 (Public Review):


      The pathomechanism underlying Sjögren's syndrome (SS) remains elusive. The authors have studied whether altered calcium signaling might be a factor in SS development in a commonly used mouse model. They provide a thorough and straightforward characterization of the salivary gland fluid secretion, cytoplasmic calcium signaling, mitochondrial morphology, and respiration. A special strength of the study is the spectacular in vivo imaging; very few, if any, groups could have succeeded with these studies. The authors show that the cytoplasmic calcium signaling is upregulated in the SS model and that the Ca2+-regulated Cl- channels are normally localized and functional, but fluid secretion is nevertheless suppressed. They also find altered localization of the IP3R and speculate about lesser exposure of Cl- channels to high local [Ca2+]. In addition, they describe changes in mitochondrial morphology and function that might also contribute to the attenuated secretory response. Although the exact contribution of calcium and mitochondria to secretory dysfunction remains to be determined, the results seem to be useful for a range of scientists.

      Specific points to consider:

      (1) Are all the effects of DMXAA mediated through STING? DMXAA has been reported to inhibit NAD(P)H quinone oxidoreductase (NQO1) PMID: 10423172, which might be relevant both for the calcium and mitochondrial phenotypes. I would recommend that the authors either test the dependency of the DMXAA effects on STING or avoid attributing all effects of DMXAA to STING.

      (2) "mitochondrial membrane potential (ΔΨm), the driving force of ATP production" the driving force is the electrochemical H+ gradient.

      (3) ΔΨm is assessed as decreased in the DMXAA model without a change in TMRE steady state. Higher post-uncoupler fluorescence caused a lesser uncoupler-sensitive pool. This is not a very common observation. Was the autofluorescence of the DMXAA-treated cells higher in the red channel?

      (4) The EM study indicated ER structure disruption. Are there any clues to the contribution of this to the augmented agonist/electrical stimulation-induced calcium signaling and decreased fluid secretion?

    1. eLife assessment

      Gain-of-function mutations and amplifications of PPM1D are found across several human cancers and are associated with advanced tumor stage and worse prognosis. Thus far, clinical translation has not been possible due to the lack of PPM1D inhibitors with favorable pharmacokinetic properties. This useful study leverages CRISPR/Cas9 screening to determine that loss of SOD1 is synthetic lethal with PPM1D mutation in leukemia. The mechanistic analyses are still incomplete.

    1. eLife assessment

      This important study expands our understanding of the role of two axon guidance factors in a specific axon guidance decision. The strength of the study is the compelling axonal labeling and quantification, which allows the authors to establish precise consequences of the loss of each guidance factor or receptor.

    2. Reviewer #1 (Public Review):


      The current manuscript provides an extensive in vivo analysis of two guidance pathways identifying multiple mechanisms that shape the bifurcation of DRG axons when forming the dorsal funiculus in the DREZ.


      Multiple mouse mutant lines were used, together with complementary techniques; the results are very clear and compelling. The findings are very significant and clearly move forward our understanding of the regulation of axonal development at the DREZ.


      No major weaknesses were found. As it is I have no recommendations that would increase the clarity or quality of the manuscript.

    3. Reviewer #2 (Public Review):


      In this manuscript, the authors conduct a detailed analysis of the molecular cues that control guidance of bifurcated dorsal root ganglion axons in a key region of the spinal cord called the dorsal funiculus. This is a specific case of axon guidance that occurs in a precise way. The authors knew that Slit was important but many axons still target correctly in Slit knockouts, suggesting a role for other guidance factors. Netrin1 is also expressed in this region, so they looked at netrin mutants. The authors found axons outside the DREZ in the Ntn1 mutants, and they show by single neuron genetic labeling that many of these come from DRG neurons. Quantified axonal tracing studies in Slit1/2, Ntn1, or triple mutant embryos support the idea that Slit and Ntn1 have distinct functions in guidance and that the effect of their loss is additive. Interestingly, none of these knockouts affect bifurcation itself but rather the guidance of one or both of the bifurcated axon terminals. Knockout of the Slit receptors (Robo1/2) or the Netrin 1 receptor (DCC) in embryos causes similar guidance defects to loss of the ligands, providing an additional confirmation of the requirement for both guidance pathways. This study expands understanding of the role of the axon guidance factors Ntn1/DCC and Slit/Robo in a specific axon guidance decision. The strength of the study is the careful axonal labeling and quantification, which allows the authors to establish precise consequences of the loss of each guidance factor or receptor.

    4. Reviewer #3 (Public Review):


      In this paper, Curran et al investigate the role of Ntn, Slit1 and Slit2 in axon patterning of DRG neurons. The paper uses mouse genetics to perturb each guidance molecule and its corresponding receptor. Cre-based approaches and immunostaining of DRG neurons are used to assess the phenotypes. Overall, the study uses the strength of mouse genetics and imaging to reveal new genetic modifiers of DRG axons. The conclusions of the experiments match the presented results. The paper is an important contribution to the field, providing evidence that dorsal funiculus formation is impacted by Ntn and Slit signaling. The paper clearly demonstrates molecules that impact the patterning of the dorsal funiculus formation, which can provide a foundation for future studies on the specific steps in that patterning that require the studied molecules.


      The manuscript uses the advantage of mouse genetics to investigate axon patterning of DRG neurons. The work does a great job of assessing individual phenotypes in single and double mutants. This reveals an intriguing cooperative and independent function of Ntn, Slit1 and Slit2 in DRG axon patterning. The sophisticated triple mutant analysis is lauded and provides important insight.


      Overall, the manuscript is sound in technique and analysis. While not a weakness, the paper provides the foundation for future studies that investigate the specific molecular mechanisms of each step in the patterning of the dorsal funiculus.

    5. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):


      The current manuscript provides an extensive in vivo analysis of two guidance pathways identifying multiple mechanisms that shape the bifurcation of DRG axons when forming the dorsal funiculus in the DREZ. 


      Multiple mouse mutant lines were used, together with complementary techniques; the results are very clear and compelling. 

      The findings are very significant and clearly move forward our understanding of the regulation of axonal development at the DREZ. 


      No major weaknesses were found. As it is I have no recommendations that would increase the clarity or quality of the manuscript. 

      Reviewer #2 (Public Review):


      In this manuscript, the authors conduct a detailed analysis of the molecular cues that control the guidance of bifurcated dorsal root ganglion axons in a key region of the spinal cord called the dorsal funiculus. This is a specific case of axon guidance that occurs in a precise way. The authors knew that Slit was important but many axons still target correctly in Slit knockouts, suggesting a role for other guidance factors. Netrin1 is also expressed in this region, so they looked at netrin mutants. The authors found axons outside the DREZ in the Ntn1 mutants, and they show by single-neuron genetic labeling that many of these come from DRG neurons. Quantified axonal tracing studies in Slit1/2, Ntn1, or triple mutant embryos support the idea that Slit and Ntn1 have distinct functions in guidance and that the effect of their loss is additive. Interestingly, none of these knockouts affect bifurcation itself but rather the guidance of one or both of the bifurcated axon terminals. Knockout of the Slit receptors (Robo1/2) or the Netrin 1 receptor (DCC) in embryos causes similar guidance defects to loss of the ligands, providing additional confirmation of the requirement for both guidance pathways.


      This study expands understanding of the role of the axon guidance factors Ntn1/DCC and Slit/Robo in a specific axon guidance decision. The strength of the study is the careful axonal labeling and quantification, which allows the authors to establish precise consequences of the loss of each guidance factor or receptor.


      There are some places in the text where the discussion of these data is compared with other studies and models, but additional details would help clarify the arguments. 

      The details were added to the first section of the Discussion in the revision to address this weakness. Also see the response to the recommendations below.

      Reviewer #3 (Public Review):


      In this paper, Curran et al investigate the role of Ntn, Slit1, and Slit2 in the axon patterning of DRG neurons. The paper uses mouse genetics to perturb each guidance molecule and its corresponding receptor. Cre-based approaches and immunostaining of DRG neurons are used to assess the phenotypes. Overall, the study uses the strength of mouse genetics and imaging to reveal new genetic modifiers of DRG axons. The conclusions of the experiments match the presented results. The paper is an important contribution to the field, providing evidence that dorsal funiculus formation is impacted by Ntn and Slit signaling. However, there are some potential areas of the manuscript that should be edited to better match the results with the conclusions of the work.


      The manuscript uses the advantage of mouse genetics to investigate the axon patterning of DRG neurons. The work does a great job of assessing individual phenotypes in single and double mutants. This reveals an intriguing cooperative and independent function of Ntn, Slit1, and Slit2 in DRG axon patterning. The sophisticated triple mutant analysis is lauded and provides important insight. 


      Overall, the manuscript is sound in technique and analysis. However, the majority of the manuscript is about the dorsal funiculus and not the bifurcation of the axons, as the title would make a reader believe. Further, the manuscript should provide a more scholarly discussion of the current knowledge of DRG axon patterning and how the work fits into that knowledge.

      We revised the title as suggested. Additional discussion of DRG axon growth at the DREZ has been added to the last section of the Discussion in the revision. Also see the response to the recommendations below.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Given the reasons stated above, I have no specific recommendations for the authors. 

      There is a typo in the Abstract (... mice with triple deletion of Ntn1, Slit2, and Slit2....). 

      Corrected in the revision.

      Reviewer #2 (Recommendations For The Authors):

      (1) The authors twice repeated that their data on DRG guidance defects in the Ntn1 mutants differ from studies previously published in references 19 and 26. However it is unclear to me, without having read those other studies, what is actually different between this study and those, and why there would be differences between the results from two groups. If the authors think this is an important point to make they need to more clearly say what the other group saw and offer an explanation of why the data may be different. 

      We added a detailed comparison of the defects from different studies to the first section of the Discussion and suggested multiple roles of Ntn1 in controlling sensory axon growth at the DREZ in the revision.

      (2) In the final section of the discussion it says, "The guidance regulation of DRG axon bifurcation by Slit and Ntn1 may be similar to but overshadowed by their function in midline guidance [43]." The meaning of this sentence was unclear to me. I had been thinking that since there are total knockout embryos (not conditional) there could be patterning effects that happen before the DRG branching that influence the formation of the DREZ. Is this what the authors mean to say here? How can the authors show that the guidance factors they have knocked out are actually functioning in the DRG neurons? 

      We agree with the reviewer that the first sentence is vague, so we edited the paragraph and included the discussion of the regulation of DRG axons at the DREZ, which was the main theme of this last section.  In addition, we agree with the reviewer’s suggestion of the possible indirect role of Ntn1 on DRG axons via the control of interneuron migration.  This possibility was included in the last paragraph of the Discussion.

      (3) In several of the figures (3T, 5I, 5J) there are distance measurements that are presumably averages of multiple axons in 3 or 4 embryos because 3-4 points are shown per graph. However, the figure and methods do not say how many axons were measured per embryo and I could not find if it says these numbers are averages. Clarifying the details of these panels would be useful. 

      The n is the number of animals analyzed and is now added to the figure legends. From each animal, multiple sections (2-4) were analyzed for various parameters in Fig. 3 and 5. This information was added to the Methods section of the revision.

      Reviewer #3 (Recommendations For The Authors):

      Overall the data matches the conclusions in the paper. However, to this reviewer, the title suggests that Ntn and Slit will have defects in bifurcation. This is not the presented phenotype. I recommend the authors change the title to better reflect the findings of the work. 

      We edited the title of the revised manuscript to reflect the control of growth direction in the context of bifurcation.  

      The introduction of the work clearly outlines what is known about DREZ formation in mice but could extend its discussion to other systems like chick and zebrafish (Jaeda Coutinho-Budd et al. 2008, Wang and Scott 2000, Golding et al 1997, Nichols and Smith 2019, Kikel-Coury et al 2021). These studies are particularly important given that pioneer events, including bifurcation, can be visualized. Acknowledging the contribution of other model systems to the understanding of DRG axon patterning is important to improve the scholarly discussion of the paper. 

      We added a more detailed discussion of the current knowledge of DRG axon growth at the DREZ from several relevant studies of the rodent and zebrafish models in the last section of the Discussion.

      In the data presented, the authors see defects in the axon patterning of DRG neurons and conclude it is a defect in the dorsal funiculus formation. Another interpretation is that a subset of axons cannot invade the spinal cord boundary properly. This phenotype was observed in zebrafish with timelapse imaging (Kikel-Coury et al 2021). It may not be necessary to specifically test the axons' ability to enter the spinal cord in this paper, but the possibility that this could drive the presented phenotypes should be more clearly stated in the results. Entry is not thoroughly addressed in this paper and would need to be confirmed by labeling the edge of the spinal cord with a second reporter. No entry would obviously impact axon targeting. However, delayed entry could place the axon in a navigation environment that is atypical, causing it to navigate aberrantly and present as a funiculus phenotype. 

      We thank the reviewer for raising this very interesting point. In our present view, dorsal funiculus formation is related to DRG axon patterning, which involves growth, guidance, and bifurcation of the incoming afferents at the dorsal spinal cord. We believe that these events are highly coordinated by various environmental cues to generate the DREZ and the dorsal funiculus. The defects we observed could result from the disruption of such coordination that leads to misregulation of DRG axon entry at the dorsal spinal cord, as suggested by the reviewer. We propose that further analysis by time-lapse imaging, as done in zebrafish, would provide a better understanding of such coordination. This discussion was included in the last section of the Discussion.

      The authors should clarify that their approach does not knock out molecules in a cell-specific way. This would specifically impact the interpretation of the Dcc phenotypes. It is possible that UNC-40/DCC is guiding cells that are not labeled. The non-autonomous role of UNC-40/DCC should be clearly stated as a possibility. 

      This discussion was added to the last paragraph of the Discussion section.

    1. eLife assessment

      This study presents an important finding on the structural role of glycosylation at position N343 of the SARS-CoV-2 spike protein's receptor-binding domain in maintaining its stability, with implications across different variants of concern. The evidence supporting the claims of the authors is convincing, since appropriate and validated methodology in line with the current state of the art has been used. The work will be of interest to evolutionary virologists.

    2. Reviewer #2 (Public Review):

      The authors sought to establish the role played by N343 glycosylation on the SARS-CoV-2 S receptor binding domain structure and binding affinity to the human host receptor ACE2 across several variants of concern. The work includes both computational analysis in the form of molecular dynamics simulations and experimental binding assays between the RBD and ganglioside receptors.

      The work extensively samples the conformational space of the RBD beginning with atomic coordinates representing both the bound and unbound states and computes molecular dynamics trajectories until equilibrium is achieved with and without removing N343 glycosylation. Through comparison of these simulated structures, the authors are able to demonstrate that N343 glycosylation stabilizes the RBD. Prior work had demonstrated that glycosylation at this site plays an important role in shielding the RBD core and in this work the authors demonstrate that removal of this glycan can trigger a conformational change to reduce water access to the core without it. This response is variant dependent and variants containing interface substitutions which increase RBD stability, including Delta substitution L452R, do not experience the same conformational change when the glycan is removed. The authors also explore structures corresponding to Alpha and Beta in which no structure-reinforcing substitutions were identified and two Omicron variants in which other substitutions with an analogous effect to L452R are present.

      The authors experimentally assessed these inferred structural changes by measuring the binding affinity of the RBD for the oligosaccharides of the monosialylated gangliosides GM1os and GM2os with and without the glycan at N343. While GM1os and GM2os binding is influenced by additional factors in the Beta and Omicron variants, the comparison between Delta and Wuhan-hu-1 is clear: removal of the glycan abrogated binding for Wuhan-hu-1 and minimally affected Delta as predicted by structural simulations.

      In summary, these findings suggest, in the words of the authors, that SARS-CoV-2 has evolved to render the N-glycosylation site at N343 "structurally dispensable". This study emphasizes how glycosylation impacts both viral immune evasion and structural stability which may in turn impact receptor binding affinity and infectivity. Mutations which stabilize the antigen may relax the structural constraints on glycosylation opening up avenues for subsequent mutations which remove glycans and improve immune evasion. This interplay between immune evasion and receptor stability may support complex epistatic interactions which may in turn substantially expand the predicted mutational repertoire of the virus relative to expectations which do not take into account glycosylation.

    3. Reviewer #3 (Public Review):


      The receptor binding domain of the SARS-CoV-2 spike protein contains two N-glycans which have been conserved across the variants observed in the last 4 years. Through the use of extensive molecular dynamics, the authors demonstrate that even if glycosylation is conserved, the stabilizing role of glycans at N343 differs among the strains. They also investigate the effect of this glycosylation on the binding of the RBD to sialylated gangliosides, also as a function of evolution.


      The molecular dynamics characterization is well performed and demonstrates differences in the effect of glycosylation as a function of evolution. The binding of different strains to human gangliosides shows variations of strong interest. Analyzing the structure and function of glycans on the SARS-CoV-2 surface as a function of evolution is important for the surveillance of novel variants, since it can influence their virulence.


      The revised article has no significant weaknesses.

    4. Author response:

      The following is the authors’ response to the original reviews.

      We are thankful to all reviewers and to you for your careful analysis of our work and for the feedback you all provided. The reviews were fundamentally positive with very minor modifications suggested, which we have addressed in this new version as follows.

      (1) We changed Figure 1 to include a high resolution image of the 3D structure of the low affinity complex between the RBD and the GM1 tetrasaccharide (GM1os), see panel d. We predicted this structure through extensive sampling through MD simulations as part of earlier work aimed at guiding the resolution of a crystal structure. Due to insurmountable difficulties in the crystallization of such a complex, the work was only published as an extended abstract (Garozzo, Nicotra, and Sonnino 2022). Following one of the reviewer’s suggestions we added all the details on the computational approach we used as Supplementary Material.

      (2) We added the comment and corresponding references to the Discussion section in relation to earlier work flagged by one of the Reviewers (Rochman et al. 2022): “Further to this, our results show that taking into consideration the effects on _N_-glycosylation on protein structural stability and dynamics in the context of specific protein sequences may be key to understanding epistatic interactions among RBD residues, which would be otherwise very difficult, where not impossible, to decipher.”


      Garozzo, Domenico, Francesco Nicotra, and Sandro Sonnino. 2022. “‘Glycans and Glycosylation in SARS-COV2 Infection’ Session at the XVII Advanced School in Carbohydrate Chemistry, Italian Chemical Society. July 4th -7th 2021, Pontignano (Si), Italy.” Glycoconjugate Journal 39 (3): 327–34.

      Rochman, Nash D., Guilhem Faure, Yuri I. Wolf, Peter L. Freddolino, Feng Zhang, and Eugene V. Koonin. 2022. “Epistasis at the SARS-CoV-2 Receptor-Binding Domain Interface and the Propitiously Boring Implications for Vaccine Escape.” MBio 13 (2): e0013522.

    1. Author response:

      eLife assessment

      This study presents potentially valuable insights into the role of climbing fibers in cerebellar learning. The main claim is that climbing fiber activity is necessary for optokinetic reflex adaptation, but is dispensable for its long-term consolidation. There is evidence to support the first part of this claim, though it requires a clearer demonstration of the penetrance and selectivity of the manipulation. However, support for the latter part of the claim is incomplete owing to methodological concerns, including unclear efficacy of longer-duration climbing fiber activity suppression.

      We sincerely appreciate the thoughtful feedback provided by the reviewer regarding our study on the role of climbing fibers in cerebellar learning. Each point raised has been carefully considered, and we are committed to addressing them comprehensively. We acknowledge the importance of addressing methodological concerns, particularly regarding the efficacy of long-term suppression of CF activity, as well as ensuring clarity regarding penetrance and selectivity of our manipulation. To this end, we have outlined plans for substantial revisions to the manuscript to adequately address these issues.

      Public Reviews:

      Reviewer #1 (Public Review):


      The study by Seo et al highlights knowledge gaps regarding the role of cerebellar complex spike (CS) activity during different phases of learning related to optokinetic reflex (OKR) in mice. The novelty of the approach is twofold: first, specifically perturbing the activity of climbing fibers (CFs) in the flocculus (as opposed to disrupting communication between the inferior olive (IO) and its cerebellar targets globally); and second, examining whether disruption of the CS activity during the putative "consolidation phase" following training affects OKR performance.

      The first part of the results provides adequate evidence supporting the notion that optogenetic disruption of normal CF-Purkinje neuron (PN) signaling results in the degradation of OKR performance. As no effects are seen in OKR performance in animals subjected to optogenetic irradiation during the memory consolidation or retrieval phases, the authors conclude that CF function is not essential beyond memory acquisition. However, the manuscript does not provide a sufficiently solid demonstration that their long-term activity manipulation of CF activity is effective, thus undermining the confidence of the conclusions.


      The main strength of the work is the aim to examine the specific involvement of the CF activity in the flocculus during distinct phases of learning. This is a challenging goal, due to the technical challenges related to the anatomical location of the flocculus as well as the IO. These obstacles are counterbalanced by the use of a well-established and easy-to-analyse behavioral model (OKR), that can lead to fundamental insights regarding the long-term cerebellar learning process.


      The impact of the work is diminished by several methodological shortcomings.

      Most importantly, the key finding regarding prolonged optogenetic inhibition of CFs (for 30 min to 6 hours after the training period) must be complemented by the demonstration that the manipulation maintains its efficacy. In its current form, the manuscript only shows inhibition by short-term optogenetic irradiation in the context of electrical-stimulation-evoked CSs in an ex vivo preparation. As the inhibitory effect of even the eNpHR3.0 is greatly diminished during seconds-long stimulations (especially when using the yellow laser, as is done in this work; see Zhang, Chuanqiang, et al. "Optimized photo-stimulation of halorhodopsin for long-term neuronal inhibition." BMC biology 17.1 (2019): 1-17), we remain skeptical of the extent of inhibition during the long manipulations. In short, without a demonstration of effective inhibition throughout the putative consolidation phase (for example by showing a significant decrease in CS frequency throughout the irradiation period), the main claim of the manuscript of phase-specific involvement of CF activity in OKR learning cannot be considered to be based on evidence.

      Second, the choice of viral targeting strategy leaves gaps in the argument for CF-specific mechanisms. CaMKII promoters are not selective for the IO neurons, and even the most precise viral injections always lead to the transfection of neurons in the surrounding brainstem, many of which project to the cerebellar cortex in the form of mossy fibers (MF). Figure 1Bii shows sparsely-labelled CFs in the flocculus, but possibly also MFs. While obtaining homogenous and strong labeling in all floccular CFs might be impossible, at the very least the authors should demonstrate that their optogenetic manipulation does not affect simple spiking in PNs.

      Finally, while the paper explicitly focuses on the effects of CF-evoked complex spikes in the PNs and not, for example, on those mediated by molecular layer interneurons or via direct interaction of the CF with vestibular nuclear neurons, it would be best if these other dimensions of CF involvement in cerebellar learning were candidly discussed.

      We appreciate the thorough review and recognize both the strengths and weaknesses highlighted.

      We concur with the reviewer’s assessment of the novelty of our approach, particularly in specifically perturbing the activity of CFs in the flocculus and examining the effects during different phases of learning. Also, the use of the OKR behavioral paradigm adds strength to our study by providing a well-established model for investigating cerebellar learning processes.

      Regarding concerns about the efficacy of long-term optogenetic inhibition and the specificity of viral targeting, we are committed to addressing these issues through additional experiments. Specifically, we aim to demonstrate sustained inhibition of CF transmission by verifying the maintenance of inhibition throughout the putative consolidation phase. This may involve monitoring CF activity during the irradiation period in vivo. Furthermore, we plan to provide further characterization of viral targeting to ensure specificity of our approach.  

      Additionally, we recognize the importance of discussing alternative mechanisms of CF involvement in cerebellar learning. Hence, we will expand the manuscript with a more comprehensive discussion of these dimensions of CF function to give a clearer understanding of the broader implications of our findings.

      Reviewer #2 (Public Review):


      The authors aimed to explore the role of climbing fibers (CFs) in cerebellar learning, with a focus on optokinetic reflex (OKR) adaptation. Their goal was to understand how CF activity influences memory acquisition, memory consolidation, and memory retrieval by optogenetically suppressing CF inputs at various stages of the learning process.


      The study addresses a significant question in the cerebellar field by focusing on the specific role of CFs in adaptive learning. The authors use optogenetic tools to manipulate CF activity. This provides a direct method to test the causal relationship between CF activity and learning outcomes.


      Despite shedding light on the potential role of CFs in cerebellar learning, the study is hampered by significant methodological issues that question the validity of its conclusions. The absence of detailed evidence on the effectiveness of CF suppression and concerns over tissue damage from optogenetic stimulation weakens the argument that CFs are not essential for memory consolidation. These challenges make it difficult to confirm whether the study's objectives were fully met or if the findings conclusively support the authors' claims. The research commendably attempts to unravel the temporal involvement of CFs in learning but also underscores the difficulties in pinpointing specific neural mechanisms that underlie the phases of learning. Addressing these methodological issues, investigating other signals that might instruct consolidation, and understanding CFs' broader impact on various learning behaviors are crucial steps for future studies.

      We appreciate the reviewer’s recognition of the significance of our study in addressing the fundamental question of the role of CF in adaptive learning within the cerebellar field. The use of optogenetic tools indeed provides a direct means to investigate the causal relationship between CF activity and learning outcomes.

      To address concerns regarding the effectiveness of CF suppression during consolidation, we plan to conduct further in-vivo recordings. These will demonstrate how reliably CF transmission can be suppressed through optogenetic manipulation over an extended period.

      In response to the concern about potential tissue damage from laser stimulation, we believe that our optogenetic manipulation was not strong enough to induce significant heat-induced tissue damage in the flocculus. According to Cardin et al. (2010), light applied through an optic fiber may cause critical damage if the intensity exceeds 100 mW, which is eight times stronger than the intensity we used in our OKR experiment. Furthermore, if there had been tissue damage from chronic laser stimulation, we would expect to see impaired long-term memory reflected in abnormal gain retrieval results tested the following day. However, as shown in Figures 2 and 3, there were no significant abnormalities in consolidation percentages even after the optogenetic manipulation.

      Finally, we appreciate the reviewer’s recognition of the challenges involved in pinpointing specific neural mechanisms. We plan to expand the discussion to address these complexities and outline future research directions.

    3. Reviewer #1 (Public Review):


      The study by Seo et al highlights knowledge gaps regarding the role of cerebellar complex spike (CS) activity during different phases of learning related to optokinetic reflex (OKR) in mice. The novelty of the approach is twofold: first, specifically perturbing the activity of climbing fibers (CFs) in the flocculus (as opposed to disrupting communication between the inferior olive (IO) and its cerebellar targets globally); and second, examining whether disruption of the CS activity during the putative "consolidation phase" following training affects OKR performance.

      The first part of the results provides adequate evidence supporting the notion that optogenetic disruption of normal CF-Purkinje neuron (PN) signaling results in the degradation of OKR performance. As no effects are seen in OKR performance in animals subjected to optogenetic irradiation during the memory consolidation or retrieval phases, the authors conclude that CF function is not essential beyond memory acquisition. However, the manuscript does not provide a sufficiently solid demonstration that the long-term manipulation of CF activity is effective, thus undermining confidence in the conclusions.


      The main strength of the work is the aim to examine the specific involvement of the CF activity in the flocculus during distinct phases of learning. This is a challenging goal, due to the technical challenges related to the anatomical location of the flocculus as well as the IO. These obstacles are counterbalanced by the use of a well-established and easy-to-analyse behavioral model (OKR), that can lead to fundamental insights regarding the long-term cerebellar learning process.


      The impact of the work is diminished by several methodological shortcomings.

      Most importantly, the key finding concerning prolonged optogenetic inhibition of CFs (for 30 min to 6 hours after the training period) must be complemented by the demonstration that the manipulation maintains its efficacy. In its current form, the authors only show inhibition by short-term optogenetic irradiation in the context of electrical-stimulation-evoked CSs in an ex vivo preparation. As the inhibitory effect of even eNpHR3.0 is greatly diminished during seconds-long stimulations (especially when using the yellow laser, as is done in this work; see Zhang, Chuanqiang, et al. "Optimized photo-stimulation of halorhodopsin for long-term neuronal inhibition." BMC Biology 17.1 (2019): 1-17), we remain skeptical of the extent of inhibition during the long manipulations. In short, without a demonstration of effective inhibition throughout the putative consolidation phase (for example by showing a significant decrease in CS frequency throughout the irradiation period), the main claim of the manuscript of phase-specific involvement of CF activity in OKR learning cannot be considered to be based on evidence.

      Second, the choice of viral targeting strategy leaves gaps in the argument for CF-specific mechanisms. CaMKII promoters are not selective for the IO neurons, and even the most precise viral injections always lead to the transfection of neurons in the surrounding brainstem, many of which project to the cerebellar cortex in the form of mossy fibers (MF). Figure 1Bii shows sparsely-labelled CFs in the flocculus, but possibly also MFs. While obtaining homogenous and strong labeling in all floccular CFs might be impossible, at the very least the authors should demonstrate that their optogenetic manipulation does not affect simple spiking in PNs.

      Finally, while the paper explicitly focuses on the effects of CF-evoked complex spikes in the PNs and not, for example, on those mediated by molecular layer interneurons or via direct interaction of the CF with vestibular nuclear neurons, it would be best if these other dimensions of CF involvement in cerebellar learning were candidly discussed.

    4. Reviewer #2 (Public Review):


      The authors aimed to explore the role of climbing fibers (CFs) in cerebellar learning, with a focus on optokinetic reflex (OKR) adaptation. Their goal was to understand how CF activity influences memory acquisition, memory consolidation, and memory retrieval by optogenetically suppressing CF inputs at various stages of the learning process.


      The study addresses a significant question in the cerebellar field by focusing on the specific role of CFs in adaptive learning. The authors use optogenetic tools to manipulate CF activity. This provides a direct method to test the causal relationship between CF activity and learning outcomes.


      Despite shedding light on the potential role of CFs in cerebellar learning, the study is hampered by significant methodological issues that question the validity of its conclusions. The absence of detailed evidence on the effectiveness of CF suppression and concerns over tissue damage from optogenetic stimulation weakens the argument that CFs are not essential for memory consolidation. These challenges make it difficult to confirm whether the study's objectives were fully met or if the findings conclusively support the authors' claims. The research commendably attempts to unravel the temporal involvement of CFs in learning but also underscores the difficulties in pinpointing specific neural mechanisms that underlie the phases of learning. Addressing these methodological issues, investigating other signals that might instruct consolidation, and understanding CFs' broader impact on various learning behaviors are crucial steps for future studies.

      This important study combines experiments that rely on the use of target-agnostic memory B cell sorting and screening approaches and thorough characterization of antibodies with specificities to the sexual stages of Plasmodium falciparum. The authors present solid findings that one antibody, B1E11K, is cross-reactive with multiple proteins containing glutamate-rich repeats. B1E11K binds to the repeats through homotypic interactions, similar to what has been observed for Plasmodium circumsporozoite protein repeat-directed antibodies. Despite the importance of the findings beyond the field of malaria, the writing, in several places, lacks clarity.

    2. Reviewer #1 (Public Review):


      In this paper, the authors used target-agnostic MBC sorting and activation methods to identify B cells and antibodies against sexual stages of Plasmodium falciparum. While they isolated some mAbs against Pfs48/45 and Pfs230, two well-known candidates for "transmission blocking" vaccines, these antibodies did not perform as well as other known antibodies, as measured by TRA. They also isolated one cross-reactive mAb to proteins containing glutamic acid-rich repetitive elements that are expressed at different stages of the parasite life cycle. They then determined the structure of the Fab bound to the strongest binder they could identify through protein microarray, RESA, and observed homotypic interactions.


      - Target agnostic B cell isolation (although not a novel methodology).
      - New cross-reactive antibody and mechanism (homotypic interactions) as demonstrated by structural data and other biophysical data.


      The paper lacks clarity at times and could benefit from more transparency (showing all the data) and explanations.
      In particular:
      - define SIFA
      - define TRAbs
      - it is not possible to read the Supplementary Figure 6B and C panels.

    3. Reviewer #2 (Public Review):

      This manuscript by Amen, Yoo, Fabra-Garcia et al describes a human monoclonal antibody, B1E11K, targeting EENV repeats which are present in parasite antigens such as Pfs230, RESAs, and 11.1. The authors isolated B1E11K using an initial target agnostic approach for antibodies that would bind gamete/gametocyte lysate, from which they made 14 mAbs. Following a suite of highly appropriate characterization methods from Western blotting of recombinant proteins to native parasite material, use of knockout lines to validate specificity, ITC, peptide mapping, SEC-MALS, negative stain EM, and crystallography, the authors have built a compelling case that B1E11K does indeed bind EENV repeats. In addition, using X-ray crystallography they show that two B1E11K Fabs bind to a 16 aa RESA repeat in a head-to-head conformation using homotypic interactions and provide a separate example from CSP, of affinity-matured homotypic interactions.

      There are some minor comments and considerations identified by this reviewer. These include that one of the main conclusions in the paper is the binding of B1E11K to RESAs, which are blood stage antigens that are exported to the infected parasite surface. It would have been interesting if immunofluorescence assays with B1E11K mAb were performed with blood-stage parasites to understand its cellular localization in those stages.

    4. Reviewer #3 (Public Review):

      The manuscript from Amen et al reports the isolation and characterization of human antibodies that recognize proteins expressed at different sexual stages of Plasmodium falciparum. The isolation approach was antigen agnostic and based on the sorting, activation, and screening of memory B cells from a donor whose serum displays high transmission-reducing activity. From this effort, 14 antibodies were produced and further characterized. The antibodies displayed a range of transmission-reducing activities and recognized different Pf sexual stage proteins. However, none of these antibodies had higher TRA than previously described antibodies.

      The authors then performed further characterization of antibody B1E11K, which was unique in that it recognized multiple proteins expressed during sexual and asexual stages. Using protein microarrays, B1E11K was shown to recognize glutamate-rich repeats, following an EE-XX-EE pattern. An impressive set of biophysical experiments was performed to extensively characterize the interactions of B1E11K with various repeat motifs and lengths. Ultimately, the authors succeeded in determining a 2.6 Å resolution crystal structure of B1E11K bound to a 16AA repeat-containing peptide. Excitingly, the structure revealed that two Fabs bound simultaneously to the peptide and made homotypic antibody-antibody contacts. This had only previously been observed with antibodies directed against CSP repeats.

      Overall I found the manuscript to be very well written, although there are some sections that are heavy on field-specific jargon and abbreviations that make reading unnecessarily difficult. For instance, 'SIFA' is never defined. Strengths of the manuscript include the target-agnostic screening approach and the thorough characterization of antibodies. The demonstration that B1E11K is cross-reactive to multiple proteins containing glutamate-rich repeats, and that the antibody recognizes the repeats via homotypic interactions, similar to what has been observed for CSP repeat-directed antibodies, should be of interest to many in the field.

      This work presents an important study of the relationship between morphogen signaling and cell fate choices in the forming zebrafish neural tube, addressing a topical question in developmental biology. The authors provide a solid characterization of the precision limit for gene regulatory networks interpreting Shh, with single-cell resolution and state-of-the-art in vivo approaches. However, the analyses are at times incomplete and would benefit from a higher number of cell traces. With the analyses strengthened, this work will be of interest to developmental biologists interested in cellular decision-making.

    2. Reviewer #1 (Public Review):

      Throughout the paper, the authors do a fantastic job of highlighting caveats in their approach, from image acquisition to analysis. Despite this, some conclusions and viewpoints portrayed in this study do not appear well-supported by the provided data. Furthermore, there are a few technical points regarding the analysis that should be addressed.

      (1) Analysis of signaling traces

      - Relevance of "modeled signaling level": It is not clear whether this added complexity and potential for error (below) provides benefits over a more simple analysis such as taking the derivative (shown in Figure 3C). Could the authors provide evidence for the benefits? For example, does the "maximal response" given a simpler metric correlate less well with cell fate than that calculated from the fitted response?

      - Assumptions for "modeled signaling level": According to equation (1) Kaede levels are monotonically increasing. This is assumed given the stability of the fluorescent protein. However, this only holds for the "totally produced Kaede/fluorescence". Other metrics such as mean fluorescence can very well decrease over time due to growth and division. Does "intensity" mean total fluorescence? Visual inspection of the traces shown in Figure 2 suggests that "fluorescence intensity" can decrease. What does this mean for the inferred traces?

      - Estimation of Kaede reporter half-life: It is not clear how the mRNA stability of Kaede is estimated. It sounds like it was just assessed visually, which seems not entirely appropriate given the quantitative aspects of the rest of the study. Also, given that Shh signaling was inhibited on the level of Smoothened, it is not obvious how the dynamics of signaling shutdown affect the estimate. Most results in Figure 7 seem to be quite robust to the estimate of the half-life. That they are might suggest that the whole analysis is unnecessary in the first place. However, not all are. Thus, it would be important to make this estimate more quantitative.

      (2) Assignment of fates and correlations

      - Error estimate for cell-type assignment: Trying to correlate signaling traces to cell fate decisions requires accurate cell fate assignment post-tracking. The provided protocol suggests a rather manual, expert-directed process of making those decisions. Can the authors provide any error-bound on those decisions, for example comparing the results obtained by two experts or something comparable? I am particularly concerned about the results regarding the higher degree of variability in the correlation between signaling dynamics and cell fate in the posterior neural tube. Here, the expression of Olig2 does not seem to segregate between different assigned fates, while it does so nicely in the anterior neural tube. This would suggest to me that cells in the posterior neural tube might not yet be fully committed to a fate or that there could be a relatively high error rate in assigning fates. Thus, the results could emerge from technical errors or differences in pure timing. Could the authors please comment on these possibilities?

      - Clustering and fates: One approach the authors use to analyze the correlation between signaling and fate is clustering of cell traces and comparison of the fate distributions in those clusters. There is a large number of clusters with only single traces, suggesting that the data (number of traces) might not be sufficient for this analysis. Furthermore, I am skeptical about clustering cells of different anterior-posterior identities together, given potential differences in the timing of signal reception and signaling. I am not convinced that this analysis reveals enough about how signaling maps to fate given the heterogeneity in traces in large clusters and the prevalence of extremely small clusters.

      - Signaling vector and hand-picked metrics: As an alternative approach that might be better suited for their data, the authors then pick three metrics (based on their model-predicted signaling dynamics) and show that the maximal response is a very good predictor of fate for different anterior-posterior identities. Previous information-theoretic analysis of signaling dynamics has found that a whole time-vector of signaling can carry much more information than individual metrics (Selimkhanov et al, 2014, PMID: 25504722). Have the authors tried to use approaches that make use of the whole trace (such as simple classifiers; Granados et al, 2018, PMID: 29784812), or can they comment on why this is not feasible for their data? The authors should at least make clear that their results present a lower bound to how accurately cells can make cell-fate decisions based on signaling dynamics.
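
      The error bound on fate assignment requested above (comparing the calls of two experts) could, for example, be reported as a chance-corrected agreement score. The following is a minimal sketch using Cohen's kappa; the function name `cohens_kappa` and the example fate calls are hypothetical and are not taken from the manuscript.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same set of cells."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Fraction of cells on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical fate calls from two experts for the same six tracked cells.
expert_1 = ["pMN", "pMN", "LFP", "LFP", "pMN", "LFP"]
expert_2 = ["pMN", "LFP", "LFP", "LFP", "pMN", "LFP"]
kappa = cohens_kappa(expert_1, expert_2)
```

      Computing such a score separately for anterior and posterior cells would show whether the lower signaling-fate correlation in the posterior neural tube coincides with lower annotator agreement.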

      (3) Consequences of signaling heterogeneity

      The authors focus heavily on portraying that signaling dynamics are highly variable, which seems visually true at first glance. However, there is no metric used or a description given of what this actually means. Mainly, the variability seems to relate to the correlation between signaling and fate. However, given the data and analysis, I would argue that the decoding of signaling dynamics into fate is surprisingly accurate. So signaling dynamics that seem quite noisy and variable by visual inspection can actually be very well discriminated by cells, which to me appears very exciting.

      Indeed, simple features of signaling traces can predict cell fate as well as position (for anterior progenitors). Given that signaling should be a function of position, it naively seems as if signaling read-out could be almost perfect. It might be interesting to plot dorsal-ventral position vs the signaling metrics, to also investigate how Shh concentration/position maps to signaling dynamics, this would give an even more comprehensive view of signal transmission.

      There remains the discrepancy between signaling traces and fate in the posterior neural tube. The authors point towards differences in tissue architecture and difficulties in interpreting a "small" Shh gradient. However, the data seems consistent with differences in timing of cell-fate decisions between anterior and posterior cells. The authors show that fate initially does not correlate well with position in the posterior neural tube. So, signaling dynamics likely should not either, as they should rather be a function of position, given they are downstream of the Shh gradient. As mentioned above, not even Olig2 expression segregates the assigned fates well. All this points towards a difference in the time of fate assignment between the anterior and posterior. Given likely delays in reporter protein production and maturation, it can thus not be expected that signaling dynamics correlate better with cell fate than the reporter "83%". Can the authors please discuss this possibility in the paper?

      Thus, while this paper represents an example of what the community needs to do to gain a better understanding of robust patterning under variability, the provided data is not always sufficient to make clear conclusions regarding the functional consequences of signaling dynamics.

    3. Reviewer #2 (Public Review):


      In this work, Xiong and colleagues examine the relationship between the profile of the morphogen Shh and the resulting cell fate decisions in the zebrafish neural tube. For this, the authors combine high-resolution live imaging of an established Shh reporter with reporter lines for the different progenitor types arising in the forming neural tube. One of the key observations in this manuscript is that, while, on average, cells respond to differences in Shh activity to adopt distinct progenitor fates, at the single cell level there is strong heterogeneity between Shh response and fate choices. Further, the authors showed that this heterogeneity was particularly prominent for the pMN fate, with similar Shh response dynamics to those observed in neighboring LFP progenitors.


      It is important to directly correlate Shh activity with the downstream TFs marking distinct progenitor types in vivo and with single cell resolution. This additional analysis is in line with previous observations from these authors, namely in Xiong, 2013. Further, the authors show that cells in different anterior-posterior positions within the neural tube show distinct levels of heterogeneity in their response to Shh, which is a very interesting observation and merits further investigation.


      This is a convincing work, however, adding a few more analyses and clarifications would, in my view, strengthen the key finding of heterogeneity between Shh response and the resulting cell fate choices.

      The authors address key assumptions underlying current models of the formation of value-based decisions. They provide solid evidence that the subjective values human participants assign to choice options change across sequences of multiple decisions and establish valuable methods to detect these changes in frequently used behavioral task designs. That said, the description of the fMRI results requires further elaboration in order to support the claim that the authors' algorithm reveals neural valuation processes better than the current standard approach.

    2. Reviewer #1 (Public Review):


      There is a long-standing idea that choices influence evaluation: options we choose are re-evaluated to be better than they were before the choice. There has been some debate about this finding, and the authors developed several novel methods for detecting these re-evaluations in task designs where options are repeatedly presented against several alternatives. Using these novel methods the authors clearly demonstrate this re-evaluation phenomenon in several existing datasets.


      The paper is well-written and the figures are clear. The authors provided evidence for the behaviour effect using several techniques and generated surrogate data (where the ground truth is known) to demonstrate the robustness of their methods.


      The description of the results of the fMRI analysis in the text is not complete, weakening the claim that their re-evaluation algorithm better reveals neural valuation processes.

    3. Reviewer #2 (Public Review):


      Zylberberg and colleagues show that food choice outcomes and BOLD signal in the vmPFC are better explained by algorithms that update subjective values during the sequence of choices compared to algorithms based on static values acquired before the decision phase. This study presents a valuable means of reducing the apparent stochasticity of choices in common laboratory experiment designs. The evidence supporting the claims of the authors is solid, although currently limited to choices between food items because no other goods were examined. The work will be of interest to researchers examining decision-making across various social and biological sciences.


      The paper analyses multiple food choice datasets to check the robustness of its findings in that domain.

      The paper presents simulations and robustness checks to back up its core claims.


      To avoid potential misunderstandings of their work, I think it would be useful for the authors to clarify their statements and implications regarding the utility of item ratings/bids (e-values) in explaining choice behavior. Currently, the paper emphasizes that e-values have limited power to predict choices without explicitly stating the likely reason for this limitation given its own results or pointing out that this limitation is not unique to e-values and would apply to choice outcomes or any other preference elicitation measure too. The core of the paper rests on the argument that the subjective values of the food items are not stored as a relatively constant value, but instead are constructed at the time of choice based on the individual's current state. That is, a food's subjective value is a dynamic creation, and any measure of subjective value will become less accurate with time or new inputs (see Figure 3 regarding choice outcomes, for example). The e-values will change with time, choice deliberation, or other experiences to reflect the change in subjective value. Indeed, most previous studies of choice-induced preference change, including those cited in this manuscript, use multiple elicitations of e-values to detect these changes. It is important to clearly state that this paper provides no data on whether e-values are more or less limited than any other measure of eliciting subjective value. Rather, the paper shows that a static estimate of a food's subjective value at a single point in time has limited power to predict future choices. Thus, a more accurate label for the e-values would be static values because stationarity is the key assumption rather than the means by which the values are elicited or inferred.

      There is a puzzling discrepancy between the fits of a DDM using e-values in Figure 1 versus Figure 5. In Figure 1, the DDM using e-values provides a rather good fit to the empirical data, while in Figure 5 its match to the same empirical data appears to be substantially worse. I suspect that this is because the value difference on the x-axis in Figure 1 is based on the e-values, while in Figure 5 it is based on the r-values from the Reval algorithm. However, the computation of the value difference measure on the two x-axes is not explicitly described in the figures or methods section and these details should be added to the manuscript. If my guess is correct, then I think it is misleading to plot the DDM fit to e-values against choice and RT curves derived from r-values. Comparing Figures 1 and 5, it seems that changing the axes creates an artificial impression that the DDM using e-values is much worse than the one fit using r-values.

      Relatedly, do model comparison metrics favor a DDM using r-values over one using e-values in any of the datasets tested? Such tests, which use the full distribution of response times without dividing the continuum of decision difficulty into arbitrary hard and easy bins, would be more convincing than the tests of RT differences between the categorical divisions of hard versus easy.
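
      As a rough sketch of the kind of comparison meant here, information criteria can be computed from the log-likelihood of each DDM fit to the full response-time distributions. The log-likelihoods, parameter counts, and trial number below are hypothetical placeholders, not results from the manuscript, and the assumption that the r-value model carries one extra parameter is mine.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical fit summaries; replace with the values from the actual fits.
fits = {
    "ddm_e_values": {"log_lik": -1250.0, "n_params": 4},
    "ddm_r_values": {"log_lik": -1210.0, "n_params": 5},  # assumed extra parameter
}
n_trials = 200  # hypothetical number of trials per participant

scores = {
    name: {
        "aic": aic(f["log_lik"], f["n_params"]),
        "bic": bic(f["log_lik"], f["n_params"], n_trials),
    }
    for name, f in fits.items()
}

# Positive values favour the r-value model under the respective criterion.
delta_aic = scores["ddm_e_values"]["aic"] - scores["ddm_r_values"]["aic"]
delta_bic = scores["ddm_e_values"]["bic"] - scores["ddm_r_values"]["bic"]
```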

      Revaluation and reduction in the imprecision of subjective value representations during (or after) a choice are not mutually exclusive. The fact that applying Reval in the forward trial order leads to lower deviance than applying it in the backwards order (Figure 7) suggests that revaluation does occur. It doesn't tell us if there is also a reduction in imprecision. A comparison of backwards Reval versus no Reval would indicate whether there is a reduction in imprecision in addition to revaluation. Model comparison metrics and plots of the deviance from the logistic regression fit using e-values against backward and forward Reval models would be useful to show the relative improvement for both forms of Reval.

      Did the analyses of BOLD activity shown in Figure 9 orthogonalize between the various e-value- and r-value-based regressors? I assume they were not because the idea was to let the two types of regressors compete for variance, but orthogonalization is common in fMRI analyses so it would be good to clarify that this was not used in this case. Assuming no orthogonalization, the unique variance for the r-value of the chosen option in a model that also includes the e-value of the chosen option is the delta term that distinguishes the r and e-values. The delta term is a scaled count of how often the food item was chosen and rejected in previous trials. It would be useful to know if the vmPFC BOLD activity correlates directly with this count or the entire r-value (e-value + delta). That is easily tested using two additional models that include only the r-value or only the delta term for each trial.
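
      The decomposition described above (r-value = e-value + delta term) can be made concrete with a minimal sketch. It assumes, as a simplification that I have not verified against the authors' implementation, that each choice adds a fixed `delta` to the chosen item and subtracts it from the rejected item; the function `revalue`, the item names, and all numbers are hypothetical.

```python
def revalue(e_values, trials, delta):
    """Sequentially update item values after each choice.

    e_values: dict mapping item -> explicitly reported value (e-value).
    trials:   list of (chosen_item, rejected_item) in presentation order.
    delta:    assumed fixed update applied after every choice.

    Returns the r-value of the chosen item on each trial (before that
    trial's update) and the final value of every item.
    """
    r = dict(e_values)  # r-values start at the e-values
    r_chosen_per_trial = []
    for chosen, rejected in trials:
        r_chosen_per_trial.append(r[chosen])
        r[chosen] += delta
        r[rejected] -= delta
    return r_chosen_per_trial, r


# Hypothetical ratings and choice sequence.
e_values = {"apple": 1.0, "chips": 0.5, "candy": 0.0}
trials = [("apple", "chips"), ("apple", "candy"), ("chips", "candy")]
r_chosen, r_final = revalue(e_values, trials, delta=0.25)

# The delta term alone: r-value minus e-value, i.e. delta times
# (number of times chosen minus number of times rejected).
delta_term = {item: r_final[item] - e_values[item] for item in e_values}
```

      Under this assumption, the two additional fMRI models suggested above would use either `r_chosen` or the per-trial delta term (r-value minus e-value of the chosen item) as the sole parametric regressor.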

      Please confirm that the correlation coefficients shown in Figure 11 B are autocorrelations in the MCMC chains at various lags. If this interpretation is incorrect, please give more detail on how these coefficients were computed and what they represent.
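
      For reference, the interpretation I have in mind is the standard sample autocorrelation of a single chain at a given lag, sketched below. The function name and the toy chain are hypothetical; whether Figure 11 B was computed this way is exactly what I am asking the authors to confirm.

```python
def autocorrelation(chain, lag):
    """Sample autocorrelation of one MCMC chain at a non-negative lag."""
    n = len(chain)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(chain)")
    mean = sum(chain) / n
    # Sum of squared deviations (n times the biased variance estimate).
    var = sum((x - mean) ** 2 for x in chain)
    if var == 0:
        raise ValueError("chain is constant; autocorrelation is undefined")
    # Sum of lagged cross-products, normalised by the same quantity.
    cov = sum((chain[t] - mean) * (chain[t + lag] - mean) for t in range(n - lag))
    return cov / var


# Toy chain for illustration only; real chains would be far longer.
toy_chain = [1.0, 2.0, 3.0, 4.0]
acf = [autocorrelation(toy_chain, k) for k in range(3)]
```

      Coefficients that decay quickly towards zero with increasing lag would indicate good mixing of the sampler.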

      The paper presents the ceDDM as a proof-of-principle type model that can reproduce certain features of the empirical data. There are other plausible modifications to bounded evidence accumulation (BEA) models that may also reproduce these features as well or better than the ceDDM. For example, a DDM in which the starting point bias is a function of how often the two items were chosen or rejected in previous trials. My point is not that I think other BEA models would be better than the ceDDM, but rather that we don't know because the tests have not been run. Naturally, no paper can test all potential models and I am not suggesting that this paper should compare the ceDDM to other BEA processes. However, it should clearly state what we can and cannot conclude from the results it presents.

      This work has important practical implications for many studies in the decision sciences that seek to understand how various factors influence choice outcomes. By better accounting for the context-specific nature of value construction, studies can gain more precise estimates of the effects of treatments of interest on decision processes. That said, there are limitations to the generalizability of these findings that should be noted.

      These limitations stem from the fact that the paper only analyzes choices between food items and the outcomes of the choices are not realized until the end of the study (i.e., participants do not eat the chosen item before making the next choice). This creates at least two important limitations. First, preferences over food items may be particularly sensitive to mindsets/bodily states. We don't yet know how large the choice deltas may be for other types of goods whose value is less sensitive to satiety and other dynamic bodily states. Second, the somewhat artificial situation of making numerous choices between different pairs of items without receiving or consuming anything may eliminate potential decreases in the preference for the chosen item that would occur in the wild outside the lab setting. It seems quite probable that in many real-world decisions, the value of a chosen good is reduced in future choices because the individual does not need or want multiples of that item. Naturally, this depends on the durability of the good and the time between choices. A decrease in the value of chosen goods is still an example of dynamic value construction, but I don't see how such a decrease could be produced by the ceDDM.

      This landmark paper introduces the generation and analysis of a connectome resource of the entire ventral nerve cord of a fruit fly which is one of the top model organisms to investigate how a nervous system forms and functions. The work introduces new and improved approaches - from tissue preparation to automated reconstruction - to generate a detailed connectome from a complex adult ventral nerve cord. This extensive new dataset provides cell type and lineage annotations, putative neurotransmitter expression information, and the potential to link to genetic driver lines, with compelling evidence to support the claims made.

    2. Reviewer #1 (Public Review):


      Drosophila is one of the most studied model organisms to understand how neural circuits form and function to control intricate animal behaviors. The ventral nerve cord (VNC) part of the fly's CNS serves as a sensory processing and motor output center just like our spinal cord. Over the last decade, the VNC has become a fruitful platform to understand neural circuits responsible for motor behavior such as walking and flying. The missing resource was the complete connectome of the VNC neurons. This study provides this needed resource. The authors documented their approaches on how to generate the data from tissue preparation to computer-assisted reconstruction in a simple manner and left the in-depth analysis of the network features of the connecting neurons to two other well-written companion articles.

      Strengths:
      Unlike many other previously published EM datasets, the authors presented a ready-to-view connectome dataset of the adult fly VNC. Readers, without needing permission, can access the dataset to find their neurons of interest and determine their synaptic partners with a few clicks. The authors also share their novel approaches in a detailed manner for others to reproduce similar EM volumes for other tissues.


      The reconstruction completion, around 50%, might be considered a weakness. However, the data appear to have ~50% completion across all different neuropils, suggesting that sampling is homogeneous and does not induce bias. Nevertheless, a higher percentage will give a more complete picture.

    3. Reviewer #2 (Public Review):


      Takemura et al. achieved a milestone in connectomics with their dense reconstruction of the Male Adult Nerve Cord (MANC) in Drosophila, revealing the neural circuitry of the primary premotor and motor domains in the CNS of the fruit fly. The team meticulously reconstructed neuron morphologies and synaptic connections and registered these data with light microscopy datasets (of driver lines for example), made neuronal lineage annotations and neurotransmitter predictions, providing the basis for new hypotheses about motor control. The dataset and methods are described here, while cell type annotations and characterisation of connectivity between brain descending neurons and motor neurons are provided in two companion papers, Marin et al. and Cheong, Eichler, Stürner et al., respectively. This dataset and analysis will provide a rich resource for future neuroscientific exploration.


      The authors fully utilise a wealth of tools and techniques developed over the course of over a decade to produce a new publicly available dataset with an impressive number of reconstructed neurons and synapses. The precision and recall of connections are as high as or higher than past datasets (e.g. the Hemibrain), pointing to the reliability of any downstream analyses performed on this connectome. These data are augmented with neurotransmitter identities, providing essential information for modelling and computational analysis. The MANC connectome can also be linked to genetic tools through registration to pre-existing light microscopy datasets, allowing experimentalists to test hypotheses made based on the connectome.


      This dataset presents the nerve cord connectome of just a single animal, so connectivity variability and validity will be hard to assess. However, it is bilaterally reconstructed, which does allow comparison between bilaterally symmetrical neurons on the left and right sides of the nerve cord, increasing confidence in connections observed on both sides. Damage occurred to the nerves during sample preparation, which will have to be considered when analysing sensory connectivity.

      Work described in this manuscript reveals the importance of the zinc transporter SLC30A1 in the antimicrobial function of macrophages, specifically against Salmonella. Cell-targeted deletion of the zinc transporter increased susceptibility of mice to systemic infection with Salmonella, leading to decreases in several cell functions such as nos2 expression. The authors argue that zinc homeostasis promotes macrophage cell function that is not conducive to the intracellular proliferation of Salmonella. This study provides novel and supportive evidence for a new pathway in nutritional immunity.

    2. Reviewer #1 (Public Review):

      This is an important and very well-conducted study providing novel evidence on the role of zinc homeostasis in the control of infection with the intracellular bacterium S. typhimurium, while also disentangling the underlying mechanisms and providing clear evidence for the importance of the spatio-temporal distribution of (free) zinc within the cell.


      It would be important to provide more information on the genotype of mice. It is rather unlikely that C57BL/6 mice survive up to two weeks after i.p. injection of 1 × 10^5 bacteria.

      To be sure that macrophages of Slc30A1 fl/fl LysMcre mice really have an impaired clearance of bacteria, it would be important to rule out an effect of Slc30A1 deletion on bacterial phagocytosis and containment (e.g., evaluation of bacterial numbers after 30 min of infection).

      Does the addition of zinc to macrophages negatively affect iNOS transcription as previously observed for the divalent metal iron and is a similar mechanism also employed (CEBPβ/NF-IL6 modulation) (Dlaska M et al. J Immunol 1999)?

      How does Zinc or TPEN supplementation to bacteria in LB medium affect the log growth of Salmonella?

    3. Reviewer #2 (Public Review):

      This paper explores the importance of zinc metabolism in host defense against the intracellular pathogen Salmonella Typhimurium. Using conditional mice with a deletion of the Slc30a1 zinc exporter, the authors show a critical role for zinc homeostasis in the pathogenesis of Salmonella. Specifically, mice deficient in the Slc30a1 gene in LysM+ myeloid cells are hypersusceptible to Salmonella infection, and their macrophages show altered phenotypes in response to Salmonella. The study adds important new information on the role metal homeostasis plays in microbe-host interactions. Despite the strengths, the manuscript has some weaknesses. The authors conclude that lack of slc30a1 in macrophages impairs nos2-dependent anti-Salmonella activity. However, this idea is not tested experimentally. In addition, the research presented on Mt1 is preliminary. The text related to Figure 7 could be deleted without affecting the overall impact of the findings.

    4. Reviewer #3 (Public Review):

      Na-Phatthalung et al observed that transcripts of the zinc transporter Slc30a1 were upregulated in Salmonella-infected murine macrophages and in human primary macrophages; they therefore sought to determine if, and how, Slc30a1 could contribute to the control of bacterial pathogens. Using a reporter mouse, the authors show that Slc30a1 expression increases in a subset of peritoneal and splenic macrophages of Salmonella-infected animals. Specific deletion of Slc30a1 in LysM+ cells resulted in a significantly higher susceptibility of mice to Salmonella infection which, counter to the authors' conclusions, is not explained by the small differences in the bacterial burden observed in vivo and in vitro. Although loss of Slc30a1 resulted in reduced iNOS levels in activated macrophages, the study lacks experiments that mechanistically link loss of NO-mediated bactericidal activity to Salmonella survival in Slc30a1-deficient cells. The additional deletion of Mt1, another zinc-binding protein, resulted in even lower nitrite levels of activated macrophages but only modest effects on Salmonella survival. By combining genetic approaches with molecular techniques that measure variables in macrophage activation and the labile zinc pool, Na-Phatthalung et al successfully demonstrate that Slc30a1 and metallothionein 1 regulate zinc homeostasis in order to modulate effective immune responses to Salmonella infection. The authors have done a lot of work and the information that Slc30a1 expression in macrophages contributes to control of Salmonella infection in mice is a new finding that will be of interest to the field. Whether the mechanism by which SLC30A1 controls bacterial replication and/or lethality of infection involves nitric oxide production by macrophages remains to be shown.

      Serotonin is an important neurotransmitter and its synaptic concentration is controlled by re-uptake by the sodium-coupled serotonin transporter SERT. The manuscript by Chan et al reports results from a systematic deep mutagenesis approach to study the surface expression and APP+ (5HT analogue) transport mechanism of the human serotonin transporter. The authors complement this experimental evidence with large-scale molecular simulations of the transporter in the presence of APP+. The use of deep mutagenesis and large-scale adaptive sampling simulations is impressive, and could contribute to understanding the structural requirements for folding and how transporters evolve to recognize different substrates.

    2. Reviewer #1 (Public Review):

      Serotonin is an important neurotransmitter and its synaptic concentration is controlled by re-uptake by the sodium-coupled serotonin transporter SERT. In this paper, some 6000 mutations of SERT were made and tested for surface expression and uptake of a serotonin analogue APP+. The SERT mutants were analysed and compared to the SERT structure and dynamics based on MD simulations. The authors have concluded that mutations located on surface exposed regions are tolerated whilst those involved in packing and structural integrity are not. Gain-of-function mutations map onto regions that in most cases favour opening of a solvent-exposed intracellular vestibule. Closure of the intracellular gate is thought to be rate-limiting to the transport cycle, and thus the evolutionary-based screen is consistent with the clustering of gain-of-function mutations.

      Strengths:
      This paper uses a large unbiased data-set to probe the evolution of the serotonin transporter SERT for the substrate APP+. They have been able to compare both localisation and transport data, which is an interesting data-set. Using MD simulations they are further able to provide some rational basis for the gain-of-function mutants.

      Weaknesses:
      They can only detect surface expression of myc-tagged SERT based on conjugation with a fluorescent anti-myc antibody. As such, they cannot distinguish between SERT mutants that abolish expression vs. those that are no longer trafficking to the plasma membrane. This is a downside, as it would have been interesting to know the fraction of SERT mutations that disrupt trafficking. Indeed, the relationship between misfolding and targeting is poorly understood beyond the calnexin-calreticulin cycle. Furthermore, there seems to be a gap between the large-scale mutagenesis data and the MD simulations (carried out in a separate publication) on which the main mechanistic conclusions seem to be based. Thus, overall, while the mutation data-set is impressive, it's not clear how this aids our mechanistic understanding of SERT.

    3. Reviewer #2 (Public Review):

      The manuscript by Chan et al reports results of a systematic mutagenesis approach to study the surface expression and APP+ transport mechanism of serotonin transporter. They complement this experimental evidence with large-scale molecular simulations of the transporter in the presence of APP+. The use of deep mutagenesis and large-scale adaptive sampling simulations is impressive and could be a very exciting contribution to the field.

      On the whole, the results appear to provide a fascinating insight into the effects of mutations on transport mechanisms, and how those interrelate with the structural fold and biophysical properties of a dynamic protein and its substrate pathways. A weakness of the conclusions based on the molecular simulation is that they rely on comparison with previously published work involving non-identical simulation systems (i.e. different protonation states).

      Conclusions in this work about the origins of the sodium:serotonin 1:1 stoichiometry should also be considered in the context of the fact that there are two sodium ions bound in the structures of SERT, and more work is needed to explain why this ion is not also released/co-transported.

      Some of the methods require additional information to be provided to be reproducible, for example, for the Transition Path Theory results, and so it is not possible to assess these conclusions with the manuscript in its current form.

    4. Reviewer #3 (Public Review):

      The results of the deep mutagenesis screen represent a wealth of information on the expression and function of SERT that everyone studying this protein will appreciate. However, as the authors explain, the screen identified mutations that increased APP+ transport but inhibited transport of the cognate substrate, 5-HT. Because of the methods used, 5-HT could not be used as a substrate, somewhat limiting the usefulness of the screen.

      However, the authors have taken advantage of this limitation to address the mechanistic features of SERT that discriminate between 5-HT and APP+. From the position of mutations that augment APP+ transport, they have identified the aqueous pathway created in inward facing SERT conformations as a region of importance. Based on the MD simulations, transition to inward facing conformations is facilitated by 5-HT but less so by APP+. The authors conclude, quite reasonably, that mutations interfering with the stability of inward-closed SERT states could overcome the reduced ability of APP+ to open the pathway.

      Another reasonable conclusion based on the mutant screen is that mutations detrimental to surface expression were found in packed hydrophobic regions of the protein, but similar mutations in the permeation pathways were less likely to decrease expression. The authors postulate that this provides an evolutionary advantage by maintaining the structural fold while allowing modification of ion and substrate binding and coupling sites, a reasonable but speculative conclusion.

      Not all gain-of-function mutations have to be specific to APP+. The authors point out that Ala173Gly converts SERT to the residue found in NET and DAT at this position. It would have been interesting to know how this mutation and others affect 5-HT transport. Indeed, the lack of any 5-HT transport measurements with the mutants is a glaring weakness of the manuscript.

      The authors provide a high quality genome of the xenacoelomorph worm Xenoturbella bocki and discuss its structure and evolution. Understanding the genomic structure of this group provides important insights into bilaterian evolution. The authors make a solid case that the data they present can support the placement of Xenacoelomorpha within the deuterostomes rather than as a sister group to all other bilaterians, but do not unequivocally reject the competing scenario.

    2. Reviewer #1 (Public Review):

      The authors report a high-quality genome assembly for a member of Xenacoelomorpha, a taxon that is at the center of one of the last remaining great controversies in animal evolution. The taxon and the species in question have "jumped around" the animal tree of life over the past 25 years, and seemed to have found their place as a sister-group to all remaining bilaterians. This hypothesis posits that the earliest split within Bilateria includes Xenacoelomorpha on the one hand and a clade known as Nephrozoa (Protostomia + Deuterostomia) on the other, and is thus referred to as the Nephrozoa hypothesis. Nephrozoa is supported by phylogenomic evidence, by a number of synapomorphic morphological characters in the Nephrozoa (namely, the presence of nephridia) and lack of some key bilaterian characters in Xenacoelomorpha, and by the presence of unique miRNAs in Nephrozoa.

      The Nephrozoa hypothesis has been challenged several times by the authors' groups who alternatively suggest placing Xenacoelomorpha within Deuterostomia as a sister group to a clade known as Ambulacraria. This hypothesis (the Xenambulacraria hypothesis) is supported by alternative phylogenomic datasets and by the shared presence of a number of unique molecular signatures. In this contribution, the authors aim to strengthen their case by providing full genome data for Xenoturbella bocki.

      The actual sequencing and analysis are technically and methodologically excellent. Some of the analyses were done several years ago using approaches that may now seem obsolete, but there is no reason not to include them. As a detailed report of a newly sequenced genome, the manuscript meets the highest standards.

      The authors emphasize a number of key findings. One is the fact that the genome is not as simple as one might expect from a "basal" taxon, and is on par with other bilaterian genomes and even more complex than the genome of secondarily simplified bilaterians. There is an implicit expectation here that the sister group to all Bilateria would represent the primitive state. This is of course not true, and the authors are aware of this, but it sometimes feels as though they are using this implicit assumption as a straw dog argument to say that since the genome is not as simple as expected, X. bocki must be nested within Bilateria. The authors get around this by acknowledging that their finding is consistent with a "weak version of the Nephrozoa hypothesis", which is essentially the Nephrozoa phylogenetic hypothesis without implicit assumptions of simplicity.

      Another finding is a refutation of the miRNA data supporting Nephrozoa. This is an important finding although it is somewhat flogging a dead horse, since there is already a fair amount of skepticism about the validity of the miRNA data (now over 20 years old) for higher-level phylogenetics.

      The finding that the authors feel is most important is gene presence-absence data that recovers a topology in which X. bocki is sister to Ambulacraria. The problem is that the same tree does not support the monophyly of Xenacoelomorpha. This may be an artifact of fast-evolving acoel genomes, as the authors suggest, but it still raises questions about the robustness of the data.

      In sum, the authors' results and analyses leave an open window for the Xenambulacraria hypothesis, but do not refute the Nephrozoa hypothesis. The manuscript is a valuable contribution to the debate but does not go a significant way towards its resolution.

      The manuscript has gone through several rounds of review and revision on a preprint server and is thus fairly clear of typos, inconsistencies and lack of clarity. The authors are honest and open in their interpretation of the results and their strengths.

    3. Reviewer #2 (Public Review):

      The manuscript describes the genome assembly and analysis of Xenoturbella bocki, a worm that bears many morphological features ascribed to basal bilateria. The authors aim to analyse this genome in an attempt to determine the phylogenetic position of X. bocki as a representative of Xenacoelomorpha and its associated acoelomorphs. In doing so, they want to inform the debate as to whether Xenacoelomorpha belongs among, or is in fact paraphyletic to, all bilaterians.

      This paper presents a high-quality assembly of the X. bocki genome. By virtue of the phylogenetic position of this species, this genome has considerable scientific interest. This assembly appears to be highly complete and is a strength of the paper. The further characterisation of the genome is well executed and presented. Solid results from this paper include a comprehensive description of the Hox genes, miRNA and neuropeptide repertoire, as well as a description of the linkage groups and how they relate to the ancestral linkage groups.

      Where this paper is weaker is in its central claims and questions: the phylogenetic position of Xenacoelomorpha, and whether X. bocki is a slowly evolving but otherwise representative member of this clade, remain insufficiently resolved.

      The authors have achieved the goal of describing the X. bocki genome very well. By contrast, it is unclear, based on the presented evidence, whether Xenacoelomorpha is truly a monophyletic group. The balance of the evidence seems to suggest that the X. bocki genome belongs within the bilateria group. However, it is unclear as to what is driving the position of the other acoels. Assuming that X. bocki and the other two species in that group are monophyletic, then the evidence will favour the authors' conclusion (but without clearly rejecting the alternatives).

      This paper will likely further animate the debate regarding this basal species, and also questions related to the ancestral characters of bilateria as a whole. In particular, the results from the HOX and paraHOX clusters may provide an interesting counterpoint to the previous results based on the acoels.

      The study presents a valuable finding on quantifying the orientation and organization of chondrocyte columns in the prenatal and postnatal growth plate cartilage using advanced 3D imaging and a sophisticated image analysis pipeline. The evidence supporting the authors' conclusions regarding the lack of columns in the fetal growth plate is considered inadequate due to technical caveats, inconsistencies in the data and corresponding model, and failure to correctly put the findings in context.

    2. Reviewer #1 (Public Review):

      Rubin et al. study chondrocyte columns in the prenatal and postnatal growth plate in 3D for the first time, using a novel analysis pipeline in which Confetti clones in the murine growth plate are analysed morphometrically. Prenatal chondrocytes were found not to be organised in columns parallel to the main orientation of the long bone, but rather, prenatal chondrocytes were commonly organised perpendicular to the main direction of growth. In the postnatal (P40) growth plate there was a diverse arrangement of columns, but more of the columns were vertically aligned.

      I enjoyed reading the work and the analysis is rigorous. However, I think that it is not valid to state that columns do not form in the embryo. The data only support the finding that strictly vertical columns do not form in the embryo, as the cells are still organised into columns, albeit with a range of orientations. I do not like the term "typically" aligned, as how can we know what is "typical" when orientation has never before been assessed in 3D? And the authors' data demonstrate that it is certainly not "typical" for chondrocytes to organise into vertical columns prenatally.

      It would be very interesting to delve deeper into the reason for the change in orientation of columns between pre- and post-natal. For example, does more circumferential growth happen prenatally as compared to postnatally? Is the rate of circumferential vs longitudinal growth different between prenatal and postnatal, and could the change in column orientation be responsible for a (possible) shift in the balance between longitudinal vs circumferential growth before vs after birth? The first sentence of the Discussion refers to the role of chondrocyte columns in driving bone elongation, but aren't they also involved in driving bone morphology?

      I feel that describing the activity of the cells as "mis-rotations" implies that the orientations are not intentional. It is likely not accidental or mistaken that the chondrocytes align in the ways they do: the diaphysis is largely for longitudinal growth while the epiphyses, and lateral expansion of the joint is also important. I find the data in Figure 4 fascinating, especially the variation in orientations between the regions of the growth plate (from proximal to distal), with the most lateral orientation at the most proximal and distal ends. It would be nice to see more discussion of these variations and what they may be contributing to.

      The abstract focuses solely on the analysis of columns prenatally and would benefit from the inclusion of the data from the postnatal growth plate and from the chondrocyte rotations.

    3. Reviewer #2 (Public Review):

      The origin and function of proliferative chondrocyte columns in the growth plate that are generally aligned with predicted longitudinal growth vectors have been robustly debated since the implementation of clonal analysis and live cell imaging techniques more than a decade ago. In particular, live cell imaging demonstrated that in the proliferative zone, most daughter pairs rotate fully or partially after division to form columns of stacked cells and a minority of pairs fail to rotate. These observations and others led to a mechanistic model of column formation, but limitations in the live cell imaging methods that only visualize a single round of division and rotation left open an important question - what is the effect of different rotation profiles on column formation, bone growth, and morphology?

      This manuscript describes the use of an inducible lineage tracing system in the mouse combined with a novel image analysis pipeline to analyze column formation over multiple cell divisions. The main conclusion is that many clones generate single columns in postnatal mice (as expected), but clones in embryonic growth plate cartilage form clusters distributed laterally, not aligned with longitudinal growth. These findings are interpreted to suggest that column formation is not required for long bone growth in the embryo and that lateral expansion of proliferative chondrocyte clusters may drive an increase in bone width.

      Although these findings are intriguing and potentially impactful, there are important caveats to the approach that generate significant uncertainty in both the measurements and the conclusions. (1) The claim that embryonic growth plate chondrocytes do not form columns conflicts with the observation of columnar stacks in the clusters. (2) Interpretation of nuclear elevation data is based on the unproven assumption that nuclei should be stacked in cell columns. (3) Clonal analysis of proliferative chondrocyte cell division and stacking behaviors is only valid if clone labeling is initiated in a proliferative chondrocyte, not when the founder cell is a resting chondrocyte. The data are insufficient to validate this absolute requirement.

    4. Reviewer #3 (Public Review):

      The manuscript by Rubin and Agrawal et al presents a very nice imaging analysis of clonal cell organization in the fetal and late juvenile mouse growth cartilages. The authors have performed a thorough quantification of the orientations of clusters and of clones of cells with respect to the growth axis. They conclude that growth cartilage is not as strictly 'columnar' as has been commonly described, especially at the fetal stage. There is value to having such quantifications in the literature as a reminder that interpretations of phenotypes need to be rooted in the cell biology of the stage at hand, as emphasized by the authors. However, although the approach is comprehensive, aspects of the quantification methods are not described adequately to determine if they are correct for the questions. There are also some inequivalent comparisons to prior literature and an oversight of important published observations showing that some of these conclusions have been known for decades, though not as thoroughly quantitative. There have long been observations that some growth cartilages do not have proliferative columns oriented in the axis of growth and that not all columns of a growth cartilage are perfectly organized; these facts do not negate the observations that columnar organization does exist, as re-confirmed here, and that it correlates with and contributes to rapid growth rates. Each of these points is further elaborated below.

      This fundamental work advances our understanding of the central coding and control mechanisms regulating sympathetic nervous system efferent signals to bone. The evidence supporting the conclusion is mostly convincing, although the inclusion of higher resolution images for certain data and further discussions would strengthen the study. This paper holds potential interest for skeletal biologists and neuroscientists who study the brain-bone sympathetic neural circuits.

    2. Reviewer #1 (Public Review):

      This manuscript presents, for the first time, the utilization of PRV viral transneuronal tracing to elucidate the central coding and control mechanisms governing sympathetic nervous system (SNS) efferent signals to bone. This groundbreaking work not only holds promising research prospects but also establishes a robust foundation for understanding the neural regulation of bone metabolism.

    3. Reviewer #2 (Public Review):

      Summary:
      In this study, the authors have used viral transneuronal tracing technology to identify for the first time the central sympathetic nervous system outflow sites that innervate bone.

      Strengths:
      The study provides a comprehensive atlas of the brain regions that potentially play a role in coding and decoding sympathetic nervous system signals to bone.

      Weaknesses:
      While the study provides compelling evidence for the brain-bone sympathetic nervous system neuroaxis, it is unclear if diseases that affect bone (e.g. diabetes, osteoporosis, kidney failure) disrupt brain-bone sympathetic neural circuits.

    4. Reviewer #3 (Public Review):

      It has been reported that the sympathetic nervous system (SNS) mediates bone metabolism and nociceptive functions. However, the exact localization and organization of the central SNS circuitry innervating bone and the brain sites have not been mapped, and efferent SNS outflow to bone has not yet been characterized. The authors used a pseudorabies virus (PRV) transneuronal tracing approach to identify central SNS outflow sites that innervate bone. The authors found that the central SNS outflow to bone originates from brain nuclei, sub-nuclei and regions of six brain divisions (midbrain and pons, hypothalamus, hindbrain medulla, forebrain, cerebral cortex, and thalamus). The authors provided compelling evidence for a brain-bone SNS neuroaxis that may regulate bone metabolism and nociceptive functions, which provides a greater understanding of the neural regulation of bone metabolism and should stimulate further research into bone pain and the neural regulation of bone metabolism. The authors may wish to discuss and summarize their results in more detail, to aid understanding of their findings and enhance the manuscript's utility for readers.

      This paper is valuable in that it provides a critical missing link between measures of structural connectivity and rhythmic tapping abilities, pointing to some interesting possibilities for how tapping synchronization is carried out. The methodology and findings are solid, and of interest to those studying the neural mechanisms of timing.

    2. Reviewer #1 (Public Review):

      Garcia-Saldivar and colleagues present a manuscript investigating connections between diffusion-weighted imaging (DWI) parameters and paced finger tapping measures. A cohort of human participants (n=32) performed a paced finger tapping task with a synchronization-continuation paradigm, in which they were required to listen to a paced metronome, begin tapping in synchrony with it, and then continue tapping at the same rate without it. Both auditory and visual metronomes were used, at a range of intervals. All subjects received structural scans measuring DWI, with an emphasis on superficial and deep white matter structures. This latter analysis was the most innovative, as it allowed the authors to examine microstructural effects in short-range cortical connections.

      Behaviorally, the authors replicated some well-known effects in paced finger tapping, with better performance for auditory over visual rhythms, negative lag-1 autocorrelations, and best performance at a rate of ~1.5 Hz. For the DWI analyses, a large number of correlations were observed across a wide variety of connections with various brain regions. The most salient effects were a relationship between asynchrony (for the auditory condition only) and connections between the right auditory and motor systems, around the interval of peak performance, as well as a "chronotopic" organization across parts of the corpus callosum, most notably in areas linking motor regions between hemispheres.

      Overall, this paper provides a critical missing link between measures of structural connectivity and rhythmic tapping abilities, pointing to some interesting possibilities for how tapping synchronization (at least for auditory intervals) is carried out. Negative aspects of the paper come from the largely exploratory aspects of the analysis, as well as potential biases from the low sample size.

    3. Reviewer #2 (Public Review):

      This is a valuable study of the relationships between aspects of white matter structure in the brain and the accuracy of tapping performance on auditory and visual versions of a synchronization-continuation task. The authors find brain-behaviour relationships between absolute asynchrony (precision of phase alignment between taps and stimulus events), but only for certain temporal rates (650 and 750 ms ISI, not 550, 850, or 950 ms ISI). Other behavioural metrics do not significantly correlate with white matter measures, and no visual condition behavioural metrics correlate either. The methodology and findings are solid, and of interest to those studying the neural mechanisms of timing.

      The question is interesting, as the neural mechanisms of timing, and the nature of how modality differences in timing arise, are important, given that certain modality differences in timing accuracy (e.g., auditory benefits relative to visual) are less striking in our closest evolutionary relatives. Overall, the methods are well-presented and both behavioural and neural measures are appropriate.

      The results are generally well-reported, although there is a lack of clarity about multiple comparison corrections for the number of separate behavioural metrics, different interval lengths examined, and the two sensory modalities.

      Some weaknesses:

      The use of absolute (unsigned) asynchrony as a measure of 'predictive' ability is not fully justified. Signed asynchrony may be a more informative measure of predictive ability, as (small) negative asynchronies (taps prior to event onset) are often interpreted as indicating prediction, whereas positive asynchronies (taps after the event onset) are not.

      The work may benefit from considering the 'phase' and 'period' nature of the different behavioural measures, as they may tap different aspects of timing. Separating the behavioural metrics into those reflecting phase synchrony versus period matching may be a useful distinction, as the period-related metrics are the ones that do not have evidence of correlation with brain metrics.

      The manuscript does not present a very clear framework for why certain measures might be predicted to correlate with white matter structure and others not, and the pattern of results is also not easily interpretable. This may just be the nature of the data, but it would help clarify if more justification for the selection of task and stimulus rates was presented, along with an idea of the predictions made by different theoretical approaches for what relationships between this particular set of behavioural and brain data might exist. Similarly, a more nuanced discussion might further explore the potential reasons for the lack of evidence for a relationship at shorter and longer auditory interval lengths, as well as for any of the visual condition measures.
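
      To make the signed-versus-absolute and phase-versus-period distinctions concrete, the following is a minimal sketch and not the authors' analysis. The function names and the tap times are hypothetical; it assumes each tap has already been paired with its stimulus event. Two tappers, one consistently early and one consistently late by 30 ms, have identical absolute asynchrony and identical period error, and differ only in signed asynchrony.

```python
import numpy as np

def signed_asynchrony(tap_times, stimulus_times):
    """Per-tap asynchrony in ms; negative means the tap preceded the stimulus."""
    return np.asarray(tap_times, dtype=float) - np.asarray(stimulus_times, dtype=float)

def summarize(tap_times, stimulus_times):
    """Phase-related measures (signed, absolute) and a period-related measure."""
    signed = signed_asynchrony(tap_times, stimulus_times)
    # Period error: mean inter-tap interval minus mean inter-stimulus interval
    period_error = np.diff(tap_times).mean() - np.diff(stimulus_times).mean()
    return {
        "mean_signed": float(signed.mean()),
        "mean_absolute": float(np.abs(signed).mean()),
        "period_error": float(period_error),
    }

# Hypothetical data: five stimuli at a 650 ms inter-stimulus interval
stimulus = np.arange(5) * 650.0
anticipating = stimulus - 30.0  # taps 30 ms before each event
lagging = stimulus + 30.0       # taps 30 ms after each event

early = summarize(anticipating, stimulus)
late = summarize(lagging, stimulus)
```

      In this toy case only `mean_signed` separates the anticipating tapper from the lagging one, which is the reviewer's point about what an unsigned measure can and cannot show.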

      Overall, the authors find white-matter structure relationships with absolute asynchrony measures during auditory (but not visual) synchronization-continuation at certain rates. These findings appear reasonably justified.

      eLife assessment

      We thank the Editors for identifying qualified reviewers. We agree that the “evidence supporting this claim (that ‘many breast cancer mutations are mildly deleterious’) is incomplete”. Much more detail is needed to state this decisively and we do not claim completeness here. As for validation, we carried out synthetic testing of the models as suggested by Reviewer #1 and the results seem good.

      Reviewer #1:

      We thank the Reviewer for a very thorough examination of not only the current paper but also our previous paper. We agree that the illustration material can be overwhelming and we plan to use the Reviewer’s advice in that matter. In addition, we originally put some textbook material in the Appendix, and arguably some of it may be considered superfluous.

      Most of the references the Reviewer provides are known to us, although it is likely we should cite and discuss more. All of the above will be included in the revision we are planning.

      The Reviewer is certainly correct that population growth and spatial effects play a major role in cancer. However, the effects of a constraining environment are quite strong, and the reality lies somewhere between the Moran and branching process models, which is exactly what we attempt to clarify. As for spatial effects, most tumors extracted in the clinic are dissected in bulk and sub-sampling is rare, so spatial information is rarely accessible.

      The subsequent point of importance concerns the weak specificity of the site frequency spectra (SFS) with respect to the underlying genetic and demographic forces. This cannot be denied. However, we just meant to state that our SFS are consistent with a model involving slightly deleterious passengers.

      Regarding the validation of the estimation procedures, which is a point well taken, we carried out synthetic testing of the models as suggested by Reviewer #1 and the results seem good. This will be discussed in full in the revision.

      In our view, the most important remark is the one concerning scaling of the models. The Reviewer is certainly correct that 100 stem cells are insufficient to drive a realistic tumor. However, what we had in mind, but did not explain sufficiently, is that a sample of 100 cells corresponds to average-depth coverage in bulk sequencing. Therefore, the strict interpretation is that the model mirrors what is observed in the sample. A more accurate approach would be to up-scale the model and then sample 100 cells from it. The Moran-type model can be up-scaled using a diffusion approximation, and we hope to include these computations in the revision. The associated criticism concerning tumor growth seems less relevant, since we experimented with more or less stringent constraints in our models.

      Reviewer #2:

      We thank Reviewer #2 for studying our paper and some very positive comments. Among others, the Reviewer underscores the fact that the Moran-type model generates SFS concordant with the data (with all necessary reservations). The Reviewer concurs with us that conditioning on non-extinction is not very common in the literature, while it should be.

      Like the Reviewer, we are somewhat puzzled by the differences in behavior between models A and B. Model B seems more parsimonious, but Model A looks more similar to the critical or slightly supercritical branching process. We will work to clarify these observations.

      This study uses numerical simulations to characterize and compare variants of two widely used mathematical models and then applies those models to inferring evolutionary parameters from breast cancer data. The copious numerical results will be of some interest to mathematical biologists working with similar models. The finding that many breast cancer mutations are mildly deleterious is valuable but the evidence supporting this claim is incomplete because the mathematical modelling and statistical methods are insufficiently justified and inadequately validated.

    3. Reviewer #1 (Public Review):

      This paper can be seen as an extension of a recent study by two of the same authors [1]. In the previous paper, the authors considered two variants of the Moran process, labelled Model A and Model B, and examined differences between the evolutionary dynamics of these two models. They further described the site frequency spectra, expected allele counts, and expected singleton counts of these models, building on analytical results from prior studies, and used numerical simulations to investigate the models' evolutionary dynamics. Finally, they compared the site frequency spectra of the two models (using numerical simulations) to spectra derived from a small breast cancer data set (two sets of three samples).

      In the new paper, the authors consider the same two Moran process variants (Model A and Model B) and some related branching processes. As before, they compare the site frequency spectra and various summary statistics of these models, but here they present only numerical simulations (except that some prior analytical results are summarized in Appendix A, which are never referred to in the main text and seem unconnected to the study). They then compare the site frequency spectra of these models (again using numerical simulations) to those derived from the same breast cancer samples as before and thus infer some evolutionary parameters.

      The first main conclusion is that the critical branching process and the Moran process models behave similarly and generate similar site frequency spectra. This finding is unsurprising (indeed, the authors acknowledge that the result "has been expected"). For a reasonably large population size, the population size in the critical branching process has been shown to vary relatively little over time and the model is thus essentially a continuous time Moran process (see, for example, Equation 8.55 in ref 2). Nor is it surprising that the authors see stronger similarities when they select only the subset of branching process replicates in which the final population size is particularly close to the initial population size (this is because, in these replicates, the population size likely varies even less than usual).
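
      The claim that a large critical branching process stays close to its initial size can be illustrated with a toy simulation. This is a sketch under stated assumptions, not the authors' model: it uses a discrete-generation process in which each cell either divides into two or dies with probability 1/2 (so the mean offspring number is 1), and the function name, seed, and population sizes are hypothetical. Under this scheme the variance of the population size grows roughly as the initial size times the number of generations, so relative fluctuations shrink as the initial size grows.

```python
import numpy as np

def critical_branching_sizes(n0, generations, rng):
    """Population sizes of a critical binary branching process, one entry per generation."""
    sizes = [n0]
    for _ in range(generations):
        # Each cell divides (2 offspring) or dies (0 offspring) with probability 1/2
        dividing = rng.binomial(sizes[-1], 0.5)
        sizes.append(2 * int(dividing))
    return np.array(sizes)

rng = np.random.default_rng(1)
small = critical_branching_sizes(100, 10, rng)
large = critical_branching_sizes(10_000, 10, rng)

# Relative change in size after 10 generations for the large population
relative_drift_large = abs(large[-1] - large[0]) / large[0]
```

      With 10,000 initial cells the expected relative standard deviation after 10 generations is about 3%, whereas with 100 cells it is about 30%, which is why the Moran-like behaviour is expected only for reasonably large populations.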

      The second main conclusion is that, although "the mutational SFS alone is not adequate" to quantify the strength of selection, "All fitted values for the selective disadvantage of passenger mutations are nonzero, supporting the view that they exert deleterious selection during tumorigenesis". Although the question of whether mildly deleterious mutations play an important role in cancer evolution is of considerable interest, it's debatable whether the results presented here help resolve the issue.

      Many prominent researchers have called into question whether cancer evolutionary parameters can be reliably inferred from site frequency spectra (e.g., [3-7]), even using sophisticated statistical methods. The statistical approach used here (though not named as such in the paper) is a crude kind of approximate Bayesian computation. To improve the accuracy of the results, it would have been better to have set reasonably vague priors for the uncertain mutation rates, rather than fixing them arbitrarily. It would also have been better to have chosen a likelihood function explicitly based on an analysis of the sampling and error distributions, rather than just summing the absolute logged deviations. It is well known that "Checking the model is crucial to statistical analysis" and "A good Bayesian analysis, therefore, should include at least some check of the adequacy of the fit of the model to the data and the plausibility of the model for the purposes for which the model will be used" [8]. The authors' failure to describe any attempt to validate or check their model, using simulated data or otherwise, casts doubt on the reliability of their inferences.
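
      The kind of check the reviewer asks for can be sketched with rejection-style approximate Bayesian computation on simulated data. Everything here is an illustrative assumption rather than the authors' procedure: the toy simulator draws Poisson counts around an expected spectrum theta / k, the prior on theta is log-uniform on [1, 1000], and the distance is a sum of absolute differences of log(1 + count), which stands in for the "absolute logged deviations" mentioned above. The validation step is simply to generate data with a known theta and see whether the accepted draws recover it.

```python
import numpy as np

def simulate_sfs(theta, n_classes, rng):
    """Toy simulator: Poisson counts around an expected spectrum theta / k."""
    k = np.arange(1, n_classes + 1)
    return rng.poisson(theta / k)

def log_distance(observed, simulated):
    """Sum of absolute deviations on a log(1 + count) scale."""
    return float(np.sum(np.abs(np.log1p(observed) - np.log1p(simulated))))

def abc_rejection(observed, n_draws, keep_fraction, rng):
    """Draw theta from a vague log-uniform prior and keep the closest simulations."""
    thetas = np.exp(rng.uniform(np.log(1.0), np.log(1000.0), size=n_draws))
    distances = np.array(
        [log_distance(observed, simulate_sfs(t, observed.size, rng)) for t in thetas]
    )
    n_keep = max(1, int(keep_fraction * n_draws))
    return thetas[np.argsort(distances)[:n_keep]]

# Validation on synthetic data: the true parameter is known in advance
rng = np.random.default_rng(0)
true_theta = 100.0
observed = simulate_sfs(true_theta, 20, rng)
posterior = abc_rejection(observed, n_draws=5000, keep_fraction=0.02, rng=rng)
posterior_median = float(np.median(posterior))
```

      If the accepted values did not concentrate near the known `true_theta`, that would signal a problem with the distance function or the simulator before any real data were analysed.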

      Putting aside the potential biassing effects of sampling error, measurement error, and the limitations of the authors' statistical method, it is well established that both population growth and spatial structure profoundly alter the shape of site frequency spectra in ways that can mimic the effects of selection (e.g. [9-11]). Indeed, Figures 3, 4 and 5 show that the critical and super-critical branching processes generate markedly different site frequency spectra. It follows that if the population dynamics and spatial structure of the mathematical model used for inference don't match those of the biological process that produced the data then any inferred evolutionary parameter values will be unreliable. Breast cancer has two indisputable ecological features that shape its evolutionary dynamics: the cell population expands by many orders of magnitude from a single cell, and the population is spatially structured. In the authors' mathematical model, the population size is initially 100 cells and either remains constant or varies little, and there is no spatial structure. These profound mismatches between model and data cast further doubt on what is supposed to be the paper's most important biological finding.

      In this paper the authors offer no justification for their decision to model breast cancer as a non-growing, non-spatial cell population. Nor do they engage with the extensive recent literature on the challenges of inferring evolutionary parameters from cancer site frequency spectra (they cite none of the many relevant papers listed at https://www.sottorivalab.org/neutral-evolution.html). Their 2022 paper [1] claims that, "it sometimes makes sense to consider cancer growth in the framework of constant-population models. Our models correspond to the situation in which a constant population of N "healthy" stem cells is gradually replaced by a growing clone of transformed cells with increasing fitness." No evidence was presented to support this hypothesis regarding breast cancer progression. On the other hand, a wealth of evidence supports the consensus view that, in breast cancer and other human solid tumours, the number of cells with unlimited proliferative potential is several orders of magnitude greater than 100 and grows over time (e.g. [12]).

      Analytic expressions for the site frequency spectra with neutral mutations are already known. It is well known that the site frequency spectrum of an exponentially growing population has a tail following a power law S_k ~ k^(-2) [13, 14]. Similarly, it is known that for the critical branching process or the Moran process, the site frequency spectrum at equilibrium is S_k ~ k^(-1) [13, 15]. Especially noteworthy yet uncited studies that use those results about site frequency spectra to make inferences based on sequencing data include ref 16, in which selection is inferred, and ref 17, in which evolutionary parameters of constant populations (healthy cell populations) are inferred.
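
      As a small illustration of the two power laws quoted above, the tail exponent of a spectrum can be read off as the slope of log S_k against log k. The sketch below uses idealised, noise-free spectra with a hypothetical prefactor of 1000; real spectra are noisy and, as noted elsewhere in this review, a fitted exponent alone does not identify the underlying process.

```python
import numpy as np

def tail_exponent(k, s_k):
    """Least-squares slope of log(S_k) against log(k)."""
    slope, _intercept = np.polyfit(np.log(k), np.log(s_k), 1)
    return float(slope)

k = np.arange(1, 51, dtype=float)
growing = 1000.0 * k ** -2.0   # idealised exponentially growing population
constant = 1000.0 * k ** -1.0  # idealised Moran or critical branching process

exponent_growing = tail_exponent(k, growing)
exponent_constant = tail_exponent(k, constant)
```

      On these idealised inputs the fitted slopes are -2 and -1 up to floating-point error.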

      Although the paper is well written, the figures are ineffective in communicating the results. As others have put it, "A figure is meant to express an idea or introduce some facts or a result that would be too long (or nearly impossible) to explain only with words" and "If your figure is able to convey a striking message at first glance, chances are increased that your article will draw more attention from the community" [18]. On the contrary, Figures 3, 4, 5 and 6 are bewilderingly complicated, crowded, and repetitive. These figures comprise no fewer than fifty-six plots, each containing numerous curves or histograms, spread across four pages. To compare the results of different scenarios, the reader is presumably expected to put these figures side by side and try to spot the differences, hampered by inconsistent axis ranges, absence of axis labels, absence of titles, absence of legends, and unreliable captions ("cyan" seems to refer to pale blue, and "orange" to something closer to red). For example, the only notable difference between Figures 3 and 4 is in the shape of a single green curve in panel I. In the main text of a published paper, one would expect fewer, more carefully curated figures drawing attention to salient features, so that the reader can infer the main results with minimal effort. The rest can be put in supplementary figures.

      In summary, this paper adds somewhat to our understanding of some standard mathematical models; whether it tells us anything new about cancer is open to debate.

      References

      (1) Kurpas, Monika K., and Marek Kimmel. "Modes of selection in tumors as reflected by two mathematical models and site frequency spectra." Frontiers in Ecology and Evolution 10 (2022): 889438.
      (2) Bailey, Norman TJ. The elements of stochastic processes with applications to the natural sciences. John Wiley & Sons, 1964.
      (3) Tarabichi, Maxime, et al. "Neutral tumor evolution?." Nature Genetics 50.12 (2018): 1630-1633.
      (4) McDonald, Thomas O., Shaon Chakrabarti, and Franziska Michor. "Currently available bulk sequencing data do not necessarily support a model of neutral tumor evolution." Nature Genetics 50.12 (2018): 1620-1623.
      (5) Balaparya, Abdul, and Subhajyoti De. "Revisiting signatures of neutral tumor evolution in the light of complexity of cancer genomic data." Nature Genetics 50.12 (2018): 1626-1628.
      (6) Noorbakhsh, Javad, and Jeffrey H. Chuang. "Uncertainties in tumor allele frequencies limit power to infer evolutionary pressures." Nature Genetics 49.9 (2017): 1288-1289.
      (7) Bozic, Ivana, Chay Paterson, and Bartlomiej Waclaw. "On measuring selection in cancer from subclonal mutation frequencies." PLoS Computational Biology 15.9 (2019): e1007368.
      (8) Neher, Richard A., and Oskar Hallatschek. "Genealogies of rapidly adapting populations." Proceedings of the National Academy of Sciences 110.2 (2013): 437-442.
      (9) Gelman, Andrew, et al. Bayesian data analysis (Third Edition). Chapman and Hall/CRC, 2014.
      (10) Fusco, Diana, et al. "Excess of mutational jackpot events in expanding populations revealed by spatial Luria-Delbrück experiments." Nature Communications 7.1 (2016): 12760.
      (11) Noble, Robert, et al. "Spatial structure governs the mode of tumour evolution." Nature Ecology & Evolution 6.2 (2022): 207-217.
      (12) Lawson, Devon A., et al. "Single-cell analysis reveals a stem-cell program in human metastatic breast cancer cells." Nature 526.7571 (2015): 131-135.
      (13) Gunnarsson, Einar B., Kevin Leder, and Jasmine Foo. "Exact site frequency spectra of neutrally evolving tumors: A transition between power laws reveals a signature of cell viability." Theoretical Population Biology 142 (2021): 67-90.
      (14) Durrett, Richard. Branching Process Models of Cancer. Springer, 2015.
      (15) Durrett, Richard. Probability Models for DNA Sequence Evolution. Springer Science & Business Media, 2008.
      (16) Williams, Mark J., et al. "Quantification of subclonal selection in cancer from bulk sequencing data." Nature Genetics 50.6 (2018): 895-903.
      (17) Moeller, Marius E., et al. "Measures of genetic diversification in somatic tissues at bulk and single-cell resolution." eLife 12 (2024): RP89780.
      (18) Rougier, Nicolas P., Michael Droettboom, and Philip E. Bourne. "Ten simple rules for better figures." PLoS Computational Biology 10.9 (2014): e1003833.

    4. Reviewer #2 (Public Review):


      In this manuscript, the authors present a comparison of two models of cancer evolution with advantageous drivers and deleterious passengers: a fixed-population "Moran" model, and a "Branching Process" (BP) model with dynamic population size. The Moran model is more mathematically-tractable, but since cancer is a disease of uncontrolled growth, it is unclear to me how clinically-relevant it is to consider a model with constant population size. Intriguingly, both models can explain observed Site Frequency Spectrums (SFSs) in three breast cancers, which suggests that the Moran model may have some value. This distinction between the two models is addressed well.


      The comparisons of the various BP models (extinction/non-extinction, and balanced/supercritical) are very interesting. The survivability of rare, fitness-disadvantaged clones has huge implications for treatment resistance in general - drug resistant clones are very often disadvantaged in the absence of drug. Clinical sequencing is, most decidedly, investigating population dynamics conditioned on non-extinction, however most published models do not condition on non-extinction - an unfortunate community oversight that this publication rectifies.

      Site Frequency Spectrums in three breast cancers are measured with unprecedented resolution to my knowledge (allele abundances below one in a thousand).

      Detailed description of the behavior of the various models.


      I do not believe Moran B is a useful theoretical distinction from Moran A. Incorporating fitness effects into the birth process, instead of the death process, is generally mathematically equivalent when time is measured in generations (or cell divisions). Visible differences between the two models in Figures 2-6 by all accounts seem to be due to the fact that Moran B experiences more evolution in the balanced/driver-dominated case, and less evolution in the passenger-dominated case. We generally do not use arbitrary time steps for this reason: we quantify time in 'generations'.

      This investigation marks an important advancement in our understanding of motor thalamus connectivity, illustrating a complex integration of inputs that reshapes previous models. The study utilizes compelling methodologies that expose a dynamic synaptic network, although the evidence of triple-input convergence on individual neurons and for multiple driver type inputs onto motor thalamic neurons remains incomplete. Despite this, the findings provide a persuasive rationale for revisiting our perceptions of the thalamic role in motor control, with a call for further studies to substantiate the breadth of these functional interactions.

    2. Reviewer #1 (Public Review):

      The manuscript demonstrates an analysis of the synaptic organization within the motor thalamus, emphasizing the interplay between the ventrolateral (VL) and ventroanterior (VA) nuclei and their respective inputs. The primary aim is to unravel the complexities of synaptic interactions among the motor cortex's layer 5 (M1L5), the cerebellum (Cb), and the basal ganglia output nuclei (GPi and SNr), which converge upon the VA/VL nuclei of the motor thalamus. This examination is executed using a combination of anatomical tracing, optogenetics, and electrophysiological recordings in mouse brain slices, which together yield novel insights into the motor control circuitry.

      The study uncovers that contrary to traditional models that presumed segregation, some motor thalamic neurons simultaneously integrate inputs from the cerebellum and basal ganglia. Furthermore, a subset of these neurons also receive convergent inputs from M1L5 and basal ganglia, underscoring the complexity of these synaptic networks. Notably, the study reveals that both M1L5 and Cb inputs exhibit driver-type synaptic properties, suggesting a significant impact on thalamic relay neurons.

      The functional implications of this synaptic convergence suggest a complex gating mechanism by the inhibitory outputs of the basal ganglia, which could modulate information flow within the motor thalamus. This modulation is significant not only for transthalamic information processing but also for the integration of cerebellar inputs to the motor cortex. The study also highlights direct projections from M1L5 to the motor thalamus, indicating a potential direct influence on thalamic activity, in addition to the known indirect influence through the cortico-basal ganglia-thalamo-cortical loop.

      The manuscript suggests that the traditional understanding of motor thalamic connectivity requires reconsideration, and it emphasizes the necessity of further investigation to understand fully the functional implications of this synaptic convergence. Future research may focus on more direct demonstrations of triple-input convergence and its behavioral consequences, as well as cross-species comparative studies to enhance the findings' applicability.

      The study provides valuable contributions to our knowledge of the motor thalamus, illuminating its intricate synaptic architecture and setting the stage for future explorations that will deepen our comprehension of motor control and thalamic function.

    3. Reviewer #2 (Public Review):

      This study assesses how inputs from primary motor cortex layer 5 (M1L5), basal ganglia output nuclei (GPi and SNr), and cerebellum (Cb) converge onto motor thalamus nuclei (VA/VL).

      Methodology includes anatomical tracing, optogenetics and electrophysiological recordings in mouse brain slices.

      The major findings are:

      - Some motor thalamic neurons receive input from both the cerebellum and the basal ganglia. This is contrary to the common belief that these two inputs are segregated in the motor thalamus.

      - Some motor thalamus neurons receive converging input from both motor cortex (M1L5) and basal ganglia.

      - Both M1L5 and Cb inputs to the motor thalamus have driver-type synaptic properties, indicating a strong influence on thalamic relay neurons.

      Functional implications are:

      - Given the inhibitory nature of basal ganglia output neurons, the converging inputs can allow the basal ganglia to gate information flow through the motor thalamus. This applies to transthalamic information, i.e., information conveyed through the thalamus across cortical regions, as well as to cerebellar information flow to motor cortex.

      - The direct projection from M1L5 to motor thalamus suggests that motor cortex can affect motor thalamic activity not only indirectly, through the traditional cortico-basal ganglia-thalamo-cortical loop, but also through direct projections.

      The study is convincing and has important implications for the field. Methodology involves elegant viral techniques.

      The main weakness is that there is no direct functional demonstration of all three inputs, from motor cortex, cerebellum, and basal ganglia, converging onto the same cells in motor thalamus. All the recordings concern dual-area stimulations, and the anatomical studies show a very small overlap of all three inputs onto motor thalamus.

      This paper presents a new method for separating organelles in an unbiased way. The method is applied to the separation of distinct subpopulations of insulin vesicles. There are concerns around whether the vesicles measured are in fact insulin vesicles and whether the observed changes in vesicle populations upon glucose stimulation are biologically meaningful, and thus it is difficult to assess at this stage how well the technique performs. This paper is likely to be of wide interest to cell biologists studying a variety of compartments, as well as to researchers in the beta cell field.

    2. Reviewer #1 (Public Review):

      This manuscript presents an exciting new method for separating insulin secretory granules using insulator-based dielectrophoresis (iDEP) of immunolabeled vesicles. The method has the advantage of being able to separate vesicles by subtle biophysical differences that do not need to be known by the experimenter, and hence could in principle be used to separate any type of organelle in an unbiased way. Any individual organelle ("particle") will have a characteristic ratio of electrokinetic to dielectrophoretic mobilities (EKMr) that will determine where it migrates in the presence of an electric field. Particles with different EKMr will migrate differently and thus can be separated. The present manuscript is primarily a methods paper to show the feasibility of the iDEP technique applied to insulin vesicles. Experiments are performed on cultured cells in low or high glucose, with the conclusion that there are several distinct subpopulations of insulin vesicles in both conditions, but that the distributions in the two conditions are different. As it is already known that glucose induces release of mature insulin vesicles and stimulates new vesicle biosynthesis and maturation, this finding is not necessarily new, but is intended as a proof of principle experiment to show that the technique works. This is a promising new technology based on solid theory that has the possibility to transform the study of insulin vesicle subpopulations, itself an emerging field. The technique development is a major strength of the paper. Also, cellular fractionation and iDEP experiments are performed well, and it is clear that the distribution of vesicle populations is different in the low and high glucose conditions. However, more work is needed to characterize the vesicle populations being separated, leaving open the possibility that the separated populations are not only insulin vesicles, but might consist of other compartments as well. 
It is also unclear whether the populations might represent immature and mature vesicles, distinct pools of mature vesicles such as the readily releasable pool and the reserve pool, or vesicles of different age. Without a better characterization of these populations, it is not possible to assess how well the iDEP technique is doing what is claimed.

      Major comments:

      (1) There is no attempt to relate the separated populations of vesicles to known subpopulations of insulin vesicles such as immature and mature vesicles, or the more recently characterized Syt9 and Syt7 vesicle subpopulations that differ in protein and lipid composition (Kreutzberger et al. 2020). Given that it is unclear exactly what populations of vesicles will be immunolabeled (see point #2 below), it is also possible that some of the "subpopulations" are other compartments being separated in addition to insulin vesicles. It will be important to examine other markers on these separated populations or to perform EM to show that they look like insulin vesicles.

      (2) An antibody to synaptotagmin V is used to immunolabel vesicles, but there has been confusion between synaptotagmins V and IX in the literature and it isn't clear what exactly is being recognized by this antibody (this reviewer actually thinks it is Syt 9). If it is indeed recognizing Syt 9, it might already be labeling a restricted population of insulin vesicles (Kreutzberger et al. 2020). The specificity of this antibody should be clarified. Furthermore, Figure 2 is not convincing at showing that this synaptotagmin antibody specifically labels insulin vesicles nor is there convincing colocalization of this synaptotagmin antibody with insulin vesicles. In the image shown, several cells show very weak or no staining of both insulin and the synaptotagmin. The highlighted cell appears to show insulin mainly in a perinuclear structure (probably the Golgi) rather than in mature vesicles (which should be punctate), and insulin is not particularly well-colocalized with the synaptotagmin. Other cells in the image appear to have even less colocalization of insulin and synaptotagmin, and there is no quantification of colocalization. It seems possible that this antibody is recognizing other compartments in the cell, which would change the interpretation of the populations measured in the iDEP experiments. It would also be good to perform synaptotagmin staining under glucose-stimulating conditions, in case this alters the localization.

      (3) The EKMr values of the vesicle populations between the low and high glucose conditions don't seem to precisely match. It is unclear if this is just a technical limitation in comparing between experiments or instead suggests that glucose stimulation does not just change the proportion of vesicles in the subpopulations (i.e. the relative fluorescent intensities measured), but rather the nature of the subpopulations (i.e. they have distinct biophysical characteristics). This again gets to the issue of what these vesicle subpopulations represent. If glucose stimulation is simply converting immature to mature vesicles, one might expect it to change the proportion of vesicles, but not the biophysical properties of each subpopulation.

      (4) The title of the paper promises "isolation" of insulin vesicles, but the manuscript only presents separation and no isolation of the separated populations. Isolation of the separated populations is important to be able to better define what these populations are (see point #1 above). Isolation is also critical if this is to be a valuable technique in the future. Yet the paper is unclear on whether it is actually technically feasible to isolate the populations separated by iDEP. In line 367, it states "this method provides a mechanism for the isolation and concentration of fractions which show the largest difference between the two population patterns for further bioanalysis (imaging, proteomics, lipidomics, etc.)." However, in line 361 it says "developing the capability to port the collected individual boluses will enable downstream analyses such as mass spectrometry or electron microscopy," suggesting that true isolation of these populations is not yet feasible. This should be clarified.

    3. Reviewer #2 (Public Review):

      This manuscript used DC-iDEP, a technology previously used on other organelle preparations, to isolate insulin secretory granules from INS1 cells based on differences in dielectrophoretic and electrokinetic properties of synaptotagmin V positive insulin granules.

      The major motivation presented for this work is to provide a methodology to allow for more sensitive isolation of subpopulations of granules, allowing better understanding of the biochemical composition of these populations. This manuscript clearly demonstrates the ability of this technology to separate these subpopulations, which will allow for biochemical characterization of insulin granules in future studies.

      After proving these subpopulations can be observed, this method was then utilized to show there are shifts in these subpopulations when granules are isolated from glucose stimulated cells. Overall the method of isolation is novel and could provide a tool for further characterization of purified secretory granules.

      The observation of glucose stimulation causing shifts in subpopulations is unsurprising. Glucose stimulation could cause a depletion of insulin and other secretory content from a subset of granules. It would be expected that this loss of content would cause a shift in electrochemical properties of the granules, but this is a nice confirmation that the isolation method has the sensitivity to delineate these changes.

      Major comments:

      (1) It is unclear what Synaptotagmin isoform is being looked at. Synaptotagmin V and IX have been repetitively interchanged in the literature. See note in syt IX section of "Moghadam and Jackson 2013 Front. Endocrinology" or read "Fukuda and Sagi-Eisenberg Calcium Bind Proteins 2008".

      The 386 aa. isoform that is abundant in PC12 cells has been robustly observed in INS1 cells in multiple studies and has been frequently referred to as syt IX. The sequence the antibody was raised against should be obtained from the company where it was purchased, mapped by sequence to the corresponding Synaptotagmin isoform, and clarified in the text.

      (2) Immunofluorescence of insulin and syt V is confusing. The example images do not appear to show robust punctate structures that are characteristic of secretory granules (in both the insulin and syt V stain).

      (3) In the discussion it says, "Finally, this method provides a mechanism for the isolation and concentration of fractions which show the largest difference between the two population patterns for further bioanalysis (imaging, proteomics, lipidomics, etc.) that otherwise would not be possible given the low-abundance components of these subpopulations."

      It would help to elaborate more on the yield and concentrations of isolated granules. This would give a better sense of what level of biochemical characterization could be performed on sub-populations of granules.

    4. Reviewer #3 (Public Review):

      The manuscript from Barekatain et al. is investigating heterogeneity within the population of insulin vesicles from an insulinoma cell line (INS-1E) in response to glucose stimulation. Prevailing dogma in the beta-cell field suggests that there are distinct pools of mature insulin granules, such as ready-releasable and a reserve pool, which contribute to distinct phases of insulin release in response to glucose stimulation. Whether these pools (and others) are distinct in protein/lipid composition or other aspects is not known, but has been suggested. In this manuscript, the authors use density gradient sedimentation to enrich for insulin vesicles, noting the existence of a number of co-purifying contaminants (ER and mitochondrial markers). Following immunolabeling with synaptotagmin V and fluorescent-conjugated secondary antibodies, insulin vesicles were applied to a microfluidic device and separated by dielectrophoretic and electrokinetic forces following an applied voltage. The equilibrium between these opposing forces was used to physically separate insulin granules. Here some differences were observed in the insulin (Syt V positive) granule populations, when isolated from cells that were either non-stimulated or stimulated with glucose, which has been suggested previously by other studies as noted by the authors; however in the current manuscript, the inclusion of a number of control experiments may provide a better context for what the data reveal about these changes.

      The major strength of the paper is in the use of the novel, highly sophisticated methodology to examine physical attributes of insulin granules and thus begin to provide some insight into the existence of distinct insulin granule populations within a beta-cell; these include insulin granules that are maturing, membrane-docked (i.e. readily releasable), in reserve, newly-synthesized, aged, etc. Whether physical differences exist between these various granule pools is not known. In this capacity, the technical abilities of the current manuscript may begin to offer some insight into whether these perceived distinctions are physical.

      The major weakness of the manuscript is that the study falls short in terms of linking the biology to the sophisticated changes observed and primarily focuses on differences in response to glucose. Without knowing what the various populations of granules are, it is challenging to understand what the changes in response to glucose mean.

      Specific concerns are as follows:

      (1) There is confusion on what the DC-iDEP separation between unstimulated and stimulated cells reveals. Do these changes reflect maturation state of granules, nascent vs. old granules? Ready-releasable vs. reserve pool? The comments in the text seem to offer all possibilities.

      (2) It is unclear what we can infer regarding the physical changes of granules between the stimulated states of the cells. Without an understanding of the magnitude of the effect, it is unclear how biologically significant these changes are. For example, what degree of lipid or protein remodeling would be necessary to give a similar change?

      (3) The reliance on a single vesicle marker, Syt V, is concerning given that granule remodeling is the focus.

      (4) Additional confirmation that the isolated vesicles are in fact insulin granules would be helpful. As noted, granules were gradient enriched, but did carry contaminants. Note that the microscopy image provided does not provide any real validation for this marker.

      Further confirmation that the immune-isolated vesicles are in fact insulin granules should be included. EM with immunogold labeling post-SytV enrichment would be a potential methodology to confirm.

      (5) It would be useful to understand if the observed effects are specific to the INS-1E cell line or are a more universal effect of glucose on beta-cells.

    1. eLife assessment

      This is an important computational study that applies the machine learning method of bilinear modeling to the problem of relating gene expression to connectivity. Specifically, the author attempts to use transcriptomic data from mouse retinal neurons to predict their known connectivity with promising results. On revision, the approach was tested against a second data set from C. elegans. A limited number of genes studied in this second dataset may have resulted in performance that matched but did not exceed prior models. However, taken together, the results were felt to provide solid evidence for the value of the approach.

    1. eLife assessment

      In this important study, the authors report a novel measurement of the Escherichia coli chemotactic response and demonstrate that these bacteria display an attractant response to potassium, which is connected to intracellular pH level. The experimental evidence provided is convincing and the work will be of interest to microbiologists studying chemotaxis.

    2. Author response:

      The following is the authors’ response to the previous reviews.

      eLife assessment

      In this important study, the authors report a novel measurement of the Escherichia coli chemotactic response and demonstrate that these bacteria display an attractant response to potassium, which is connected to intracellular pH level. Whilst the experiments are mostly convincing, there are some confounders regards pH changes and fluorescent proteins that remain to be addressed.

      Public Reviews:

      Reviewer #1 (Public Review):


      This paper shows that E. coli exhibits a chemotactic response to potassium by measuring both the motor response (using a bead assay) and the intracellular signaling response (CheY phosphorylation level via FRET) to step changes in potassium concentration. They find that an increase in potassium concentration induces a considerable attractant response, with amplitude comparable to aspartate, and cells can quickly adapt (and generally over-adapt). The authors propose that the mechanism for potassium response is through modifying intracellular pH; they find both that potassium modifies pH and other pH modifiers induce similar attractant responses. It is also shown, using Tar- and Tsr-only mutants, that these two chemoreceptors respond to potassium differently. Tsr has a standard attractant response, while Tar has a biphasic response (repellent-like then attractant-like). Finally, the authors use computer simulations to study the swimming response of cells to a periodic potassium signal secreted from a biofilm and find a phase delay that depends on the period of oscillation.


      The finding that E. coli can sense and adapt to potassium signals and the connection to intracellular pH is quite interesting and this work should stimulate future experimental and theoretical studies regarding the microscopic mechanisms governing this response. The evidence (from both the bead assay and FRET) that potassium induces an attractant response is convincing, as is the proposed mechanism involving modification of intracellular pH. The updated manuscript controls for the impact of pH on the fluorescent protein brightness that can bias the measured FRET signal. After correction the response amplitude and sharpness (hill coefficient) are comparable to conventional chemoattractants (e.g. aspartate), indicating the general mechanisms underlying the response may be similar. The authors suggest that the biphasic response of Tar mutants may be due to pH influencing the activity of other enzymes (CheA, CheR or CheB), which will be an interesting direction for future study.


      The measured response may be biased by adaptation, especially for weak potassium signals. For other attractant stimuli, the response typically shows a low plateau before it recovers (adapts). In the case of potassium, the FRET signal does not have an obvious plateau following the stimuli of small potassium concentrations, perhaps due to the faster adaptation compared to other chemoattractants. It is possible cells have already partially adapted when the response reaches its minimum, so the measured response may be a slight underestimate of the true response. Mutants without adaptation enzymes appear to be sensitive to potassium only at much larger concentrations, where the pH significantly disrupts the FRET signal; more accurate measurements would require development of new mutants and/or measurement techniques.

      We acknowledge and appreciate the reviewer's concerns regarding the potential impact of adaptation on the measured response magnitude. We have estimated the effect of adaptation on the measured response magnitude. The half-time of adaptation at 30 mM KCl was measured to be approximately 80 s, corresponding to a time constant of t = 80/ln(2) = 115.4 s, which is significantly longer than the time required for medium exchange in the flow chamber (less than 10 s). Consequently, the relative effect of adaptation on the measured response magnitude should be less than 1-exp(-10/t) = 8.3%. Even for the fastest adaptation (at the lowest KCl concentration) we measured, the effect should be less than 20%, which is within experimental uncertainties. Nevertheless, we agree that developing new techniques to measure the dose-response curve more precisely would be beneficial.
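      The bound quoted above can be reproduced in a few lines. The following is only an illustrative sketch, assuming adaptation follows a single exponential; the function and variable names are hypothetical and are not taken from the manuscript.

```python
import math

def adaptation_bound(half_time_s, exchange_time_s):
    """Return (time constant, upper bound on the fractional response loss).

    Assumption: adaptation is a single exponential, so
    tau = t_half / ln(2), and the fraction of the response lost
    during medium exchange is at most 1 - exp(-t_exchange / tau).
    """
    tau = half_time_s / math.log(2)
    bound = 1.0 - math.exp(-exchange_time_s / tau)
    return tau, bound

# Values stated in the response: 80 s half-time at 30 mM KCl, <10 s exchange.
tau, bound = adaptation_bound(half_time_s=80.0, exchange_time_s=10.0)
print(f"tau = {tau:.1f} s, bound = {bound:.1%}")
```

      With the stated inputs this gives a time constant of about 115.4 s and a bound of about 8.3%, matching the figures in the response.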

      Reviewer #2 (Public Review):

      Zhang et al investigated the biophysical mechanism of potassium-mediated chemotactic behavior in E coli. Previously, it was reported by Humphries et al that the potassium waves from oscillating B subtilis biofilm attract P aeruginosa through chemotactic behavior of motile P aeruginosa cells. It was proposed that K+ waves alter PMF of P aeruginosa. However, the mechanism of this behaviour remained elusive. In this study, Zhang et al demonstrated that motile E coli cells accumulate in regions of high potassium levels. They found that this behavior likely results from the chemotaxis signalling pathway, mediated by an elevation of intracellular pH. Overall, a solid body of evidence is provided to support the claims. However, the impacts of pH on the fluorescent proteins need to be better evaluated. In its current form, the evidence is insufficient to say that the fluorescence intensity ratio results from FRET. It may well be an artefact of pH change.

      The authors have now carefully evaluated the impact of pH on their FRET sensor by examining the YFP and CFP fluorescence with a no-receptor mutant. The authors used this data to correct the impact of pH on their FRET sensor. This is an improvement, but the mathematical operation of this correction needs clarification. This is particularly important because, looking at the data, it is not fully clear that the correction was done properly. For instance, 3 mM KCl gives a 0.98 FRET signal in both Fig. 3 and Fig. S4, but there is almost no difference between the blue and red lines in Fig. 3. Fig. S4 is very informative, but it does not address the concern raised by both reviewers that the FRET reporter may not be a reliable tool here due to pH change.

      We apologize for not making the correction process clear. We corrected the impact of pH on the original signals for both CFP and YFP channels by

      $$S_{\mathrm{corr}} = \frac{S_{\mathrm{orig}}}{C_{\mathrm{pH}}},$$

      where $S_{\mathrm{corr}}$ and $S_{\mathrm{orig}}$ represent the pH-corrected and original PMT signal (CFP or YFP channel) from the moment of addition of L mM KCl to the moment of its removal, respectively, and $C_{\mathrm{pH}}$ is the correction factor, which is the ratio of PMT signal post- to pre-KCl addition for the no-receptor mutant at L mM KCl, for the CFP or YFP channel as shown in Fig. S5. The pH-corrected FRET response is then calculated as the ratio of the pH-corrected YFP to the pH-corrected CFP signals, normalized by the pre-stimulus ratio.

      As shown in Author response image 1, which represents the same data as Fig. 3A and Fig. S5A, the original normalized FRET responses to 3 mM KCl are 0.967 for the wild-type strain (Fig. 3) and 0.981 for the no-receptor strain (Fig. S5). The standard deviation of the FRET values under steady-state conditions is 0.003. Thus, the difference in responses between the wild-type and no-receptor strains is significant and clearly exceeds the standard deviation. The pH correction factors CpH at 3 mM KCl are 1.004 for the YFP signal and 1.016 for the CFP signal. Consequently, the pH-corrected FRET responses are 0.967 × 1.016/1.004 = 0.979 for the wild-type and 0.981 × 1.016/1.004 = 0.993 for the no-receptor strain. The reason the pH-corrected FRET response for the no-receptor strain is 0.993 instead of the expected 1.000 is that this value represents the lowest observed response rather than the average value for the FRET response.
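      The arithmetic above can be checked with a short script. This is a minimal sketch, assuming each channel's signal is divided by its own correction factor, which is what the quoted numbers imply; the function name is hypothetical.

```python
def ph_corrected_fret(fret_orig, c_yfp, c_cfp):
    """Apply per-channel pH correction to a normalized YFP/CFP ratio.

    Assumption: each channel is divided by its own correction factor,
    so the YFP/CFP ratio is multiplied by c_cfp / c_yfp.
    """
    return fret_orig * c_cfp / c_yfp

# Correction factors at 3 mM KCl, as stated in the response.
C_YFP, C_CFP = 1.004, 1.016

for strain, fret in [("wild-type", 0.967), ("no-receptor", 0.981)]:
    print(strain, round(ph_corrected_fret(fret, C_YFP, C_CFP), 3))
```

      Rounded to three decimals, the script reproduces the values 0.979 and 0.993 given in the response.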

      The detailed mathematical operation for correcting the pH impact has now been included in the “FRET assay” section of Materials and Methods.

      Author response image 1.

      Chemotactic response of the wild-type strain (A, HCB1288-pVS88) and the no-receptor strain (B, HCB1414-pVS88) to stepwise addition and removal of KCl. The blue solid line denotes the original normalized signal. Downward and upward arrows indicate the time points of addition and removal of 3 mM KCl, respectively. The horizontal red dashed line denotes the original normalized FRET response value to 3 mM KCl.

      The authors show the FRET data with both KCl and K2SO4, concluding that the chemotactic response mainly resulted from potassium ions. However, this was only measured by FRET. It would be more convincing if the motility assay in Fig1 is also performed with K2SO4. The authors did not address this point. In light of complications associated with the use of the FRET sensor, this experiment is more important.

      We thank the reviewer for the suggestion. We agree that additional confirmation with a motility assay is important. To address this, we have now measured the response of the motor rotational signal to 15 mM K2SO4 using the bead assay and compared it with the response to 30 mM KCl. The results are shown in Fig. S2. The response of motor CW bias to 15 mM K2SO4 exhibited an attractant response, characterized by a decreased CW bias upon the addition of K2SO4, followed by an over-adaptation that is qualitatively similar to the response to 30 mM KCl. However, there were notable differences in the adaptation time and the presence of an overshoot. Specifically, the adaptation time to K2SO4 was shorter compared to that for KCl, and there was a notable overshoot in the CW bias during the adaptation phase. These differences may have resulted from the weaker response to K2SO4 (Fig. S1B) and additional modifications due to CysZ-mediated cellular uptake of sulfate (Zhang et al., Biochimica et Biophysica Acta 1838,1809–1816 (2014)). The faster adaptation and overshoot complicated the chemotactic drift in the microfluidic assay as in Fig. 1, such that we were unable to observe a noticeable drift in a K2SO4 gradient under the same experimental conditions used for the KCl gradient.

      The response of motor rotational signal to 15 mM K2SO4 has been added to Fig. S2.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      (1) The response curve and adaptation level/time in the main text (Fig. 4) should be replaced by the corrected counterparts (currently in Fig. S5). The current version is especially confusing because Fig. 6 shows the corrected response, but the difference from Fig. 4 is not mentioned.

      We thank the reviewer for the suggestion. We have now merged the results of the original Fig. S5 into Fig. 4.

      a. The discussion of the uncorrected response with small hill coefficient and potentially negative cooperativity was left in the text (lines 223-234), but the new measurements show this is not true for the actual response. This should be removed or significantly rephrased.

      We thank the reviewer for the suggestion. We have now removed the statement about potentially negative cooperativity and added the corrected results for the actual response.

      (2) It may be helpful to restate the definition of f_m in the methods (near Eq. 3-4).

      Thank you for the suggestion. We have now restated the definition of fm and fL below Eq. 3-4: “In the denominator on the right-hand side of Eq. 3, the two terms within the parentheses of exponential expression represent the methylation-dependent (fm) and ligand-dependent (fL) free energy, respectively.”

    3. Reviewer #1 (Public Review):


      This paper shows that E. coli exhibits a chemotactic response to potassium by measuring both the motor response (using a bead assay) and the intracellular signaling response (CheY phosphorylation level via FRET) to step changes in potassium concentration. They find that an increase in potassium concentration induces a considerable attractant response, with amplitude comparable to aspartate, and cells can quickly adapt (and generally over-adapt). The authors propose that the mechanism for potassium response is through modifying intracellular pH; they find both that potassium modifies pH and other pH modifiers induce similar attractant responses. It is also shown, using Tar- and Tsr-only mutants, that these two chemoreceptors respond to potassium differently. Tsr has a standard attractant response, while Tar has a biphasic response (repellent-like then attractant-like). Finally, the authors use computer simulations to study the swimming response of cells to a periodic potassium signal secreted from a biofilm and find a phase delay that depends on the period of oscillation.


      The finding that E. coli can sense and adapt to potassium signals and the connection to intracellular pH is quite interesting and this work should stimulate future experimental and theoretical studies regarding the microscopic mechanisms governing this response. The evidence (from both the bead assay and FRET) that potassium induces an attractant response is convincing, as is the proposed mechanism involving modification of intracellular pH. The updated manuscript controls for the impact of pH on the fluorescent protein brightness that can bias the measured FRET signal. After correction the response amplitude and sharpness (hill coefficient) are comparable to conventional chemoattractants (e.g. aspartate), indicating the general mechanisms underlying the response may be similar. The authors suggest that the biphasic response of Tar mutants may be due to pH influencing the activity of other enzymes (CheA, CheR or CheB), which will be an interesting direction for future study.


      The measured response may be biased by adaptation, especially for weak potassium signals. For other attractant stimuli, the response typically shows a low plateau before it recovers (adapts). In the case of potassium, the FRET signal does not have an obvious plateau following the stimuli of small potassium concentrations, perhaps due to the faster adaptation compared to other chemoattractants. It is possible cells have already partially adapted when the response reaches its minimum, so the measured response may be a slight underestimate of the true response. Mutants without adaptation enzymes appear to be sensitive to potassium only at much larger concentrations, where the pH significantly disrupts the FRET signal; more accurate measurements would require the development of new mutants and/or measurement techniques.

      Note added after the second revision: The authors made a reasonable argument regarding the effects of adaptation, which were estimated to be small.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):


      The manuscript "Self-inhibiting percolation and viral spreading in epithelial tissue" describes a 5-state cellular automaton model of the development of an infection. The model is motivated and qualitatively justified by time-resolved measurements of expression levels of viral, interferon-producing, and antiviral genes. The model is set up in such a way that the crucial difference in outcomes (infection spreading vs. confinement) depends on the initial fraction of special virus-sensing cells. Those cells (denoted as 'type a') cannot be infected and do not support the propagation of infection, but rather inhibit it in a somewhat autocatalytic way. Presumably, such feedback makes the transition between two outcomes very sharp: a minor variation in the concentration of 'a' cells results in a qualitative change from one outcome to another. As in any percolation-like system, the transition between propagation and inhibition of infection goes through a critical state with all its attributes: a power-law distribution of the cluster size (corresponding to the fraction of infected cells) with a fairly universal exponent and a cutoff at the upper limit of this distribution.


      The proposed model suggests an explanation for the apparent diversity of outcomes of viral infections such as COVID.

      Author response: We thank the referee for the concise and accurate summary of our work.


      Those are not real points of weakness, though I think addressing them would substantially improve the manuscript.

      Author response: Below we will address these point by point.

      The key point in the manuscript is the reduction of actual biochemical processes to the NOVAa rules. I think more could be said about it, be it referring to a set of well-known connections between expression states of cells and their reaction to infection or justifying it as an educated guess.

      Author response: We have now improved this part in the model section. We have added a few sentences explaining how the cell state transitions are motivated by the UMAP results:

      “The cell state transitions triggered by IFN signaling or viral replication are known in viral infection, but how exactly the transitions are orchestrated for specific infections is poorly understood. The UMAP cell state distribution hints at possible preferred transitions between states. The closer two cell states are on the UMAP, the more likely transitions between them are, all else being equal. For instance, the antiviral state (𝐴) is easily established from a susceptible cell (𝑂), but not from the fully virus-hijacked cell (𝑉 ). The IFN-secreting cell state (𝑁) requires the co-presence of the viral and antiviral genes and thus the cell cluster is located between the antiviral state (𝐴) and virus-infected state (𝑉 ) but distant from the susceptible cells (𝑂).

      Inspired by the UMAP data visualization (Fig. 1a), we propose the following transitions between five main discrete cell states”

      Another aspect where the manuscript could be improved would be to look a little beyond the strange and 'not-so-relevant for a biomedical audience' focus on the percolation critical state. While the presented calculation of the precise percolation threshold and the critical exponent confirm the numerical skills of the authors, the probability that an actual infected tissue is right at the threshold is negligible. So in addition to the critical properties, it would be interesting to learn about the system not exactly at the threshold: For example, how the speed of propagation of infection depends on subcritical p_a and what is the cluster size distribution for supercritical p_a.

      Author response: We agree that further exploring the model away from the critical threshold is worthwhile. While our main focus has been on explaining the large degree of heterogeneity in outcomes, which is readily explained as a consequence of the sharp threshold-like behavior, we now include plots of the time evolution of the infection (as well as the remaining states) for subcritical values of p_a. The plots can be found in Figure S4 of the supplement.

      Reviewer #2 (Public Review):

      Xu et al. introduce a cellular automaton model to investigate the spatiotemporal spreading of viral infection. In this study, the authors first analyze the single-cell RNA sequencing data from experiments and identify four clusters of cells at 48 hours post-viral infection, including susceptible cells (O), infected cells (V), IFN-secreting cells (N), and antiviral cells (A). Next, a cellular automaton model (NOVAa model) is introduced by assuming the existence of a transient pre-antiviral state (a). The model consists of an LxL lattice; each site represents one cell. The cells change their state following the rules depending on the interaction of neighboring cells. The model introduces a key parameter, p_a, representing the fraction of pre-antiviral state cells. Cell apoptosis is omitted in the model. Model simulations show a threshold-like behavior of the final attack rate of the virus when p_a changes continuously. There is a critical value p_c, so that when p_a < p_c, infections typically spread to the entire system, while at a higher p_a > p_c, the propagation of the infected state is inhibited. Moreover, the radius R that quantifies the diffusion range of N cells may affect the critical value p_c; a larger R yields a smaller value of the critical value p_c. The structure of clusters is different for different values of R; greater R leads to a different microscopic structure with fewer A and N cells in the final state. Compared with the single-cell RNA seq data, which implies a low fraction of IFN-positive cells - around 1.7% - the model simulation suggests R=5. The authors also explored a simplified version of the model, the OVA model, with only three states. The OVA model also has an outbreak size. The OVA model shows dynamics similar to the NOVAa model. However, the change in microstructure as a function of the IFN range R observed in the NOVAa model is not observed in the OVA model.

      Author response: We thank the referee for the comprehensive summary of our work.

      Data and model simulation mainly support the conclusions of this paper, but some weaknesses should be considered or clarified.

      Author response: Thank you - we will address these point by point below.

      (1) In the automaton model, the authors introduce a parameter p_a, representing the fraction of pre-antiviral state cells. The authors wrote: "The parameter p_a can also be understood as the probability that an O cell will switch to the N or A state when exposed to the virus or IFNs, respectively." Nevertheless, biologically, the fraction of pre-antiviral state cells is not the same as the probability that an O cell switches to the N or A state. Moreover, in the numerical scheme, the cell state changes according to the deterministic rules N(O)=a and N(a)=A. Hence, the probability p_a is not applied in the model simulation. The exact meaning of the parameter p_a should be clarified.

      Author response: We acknowledge that this was an imprecise formulation, and have now changed it.

      What we tried to convey with that comment was that, alternatively to having a certain fraction of cells be in the a state initially, one could instead have devised a model in which each O-state cell simply had a probability to act as an a-state cell upon exposure to the virus or to interferons, i.e. to switch to an N state (if exposed to virus) or to the A state (if exposed to interferons). In this simplified model, there would be no functional difference, since it would simply amount to whether each cell had a probability to be designated an a-cell initially (as in our model), or upon exposure. So our remark mainly served to explain that the role of the p_a parameter is simply to encode that a certain fraction of virus-naive cells behave this way (whether predetermined or not).

      (2) The current model is deterministic. However, biologically, considering the probabilistic model may be more realistic. Are the results valid when the probability update strategy is considered? By the probability model, the cells change their state randomly to the state of the neighbor cells. The probability of cell state changes may be relevant for the threshold of p_a. It is interesting to know how the random response of cells may affect the main results and the critical value of p_a.

      Author response: This is a good point - we are firm believers in the importance of stochasticity. We should note that even the current model has a level of stochasticity, since we choose the cells to be updated with a constant probability rate - we choose N cells to update in each timestep, with replacement.

      However, based on your suggestion, we simulated a version of the dynamics which included stochastic conversion, i.e. each action of a cell on a nearby cell happens only with a probability p_conv (and the original model is recovered as the p_conv=1 scenario). Of course, this slows down the dynamics (or effectively rescales time by a factor p_conv), but crucially we find that it does not appreciably affect the location of the threshold p_c. Below we include a parameter scan across p_a values for R=1 and p_conv=0.5, which shows that the threshold continues to appear at around p_a=27%.

      We now discuss these findings in the supplement and include the figure below as Fig. S5.

      Author response image 1.
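      The stochastic-conversion variant described in this response can be illustrated with a minimal Python sketch. It is not the authors' C++ implementation: the rule set is a simplified assumption (V converts a neighboring O to V and a neighboring a to N; N converts a neighboring a to A), the IFN range is fixed to nearest neighbors as a stand-in for R=1, any effect of IFNs on O cells is omitted, and the names `simulate`, `neighbors`, and `PRE` are hypothetical.

```python
import random

# Cell states; PRE stands for the pre-antiviral "a" state.
O, V, N, A, PRE = "O", "V", "N", "A", "a"


def neighbors(i, j, L):
    """Four nearest neighbors on an L x L lattice with periodic boundaries."""
    return [((i + 1) % L, j), ((i - 1) % L, j), (i, (j + 1) % L), (i, (j - 1) % L)]


def simulate(L=20, p_a=0.3, p_conv=1.0, max_steps=2000, seed=0):
    """Return the infected fraction (N + V) / L^2 after max_steps timesteps.

    Assumed, simplified rules (not taken from the authors' code):
      V acting on O   -> O becomes V
      V acting on a   -> a becomes N
      N acting on a   -> a becomes A
    Each attempted action succeeds only with probability p_conv,
    so p_conv = 1 recovers the rule set without stochastic conversion.
    """
    rng = random.Random(seed)
    grid = [[PRE if rng.random() < p_a else O for _ in range(L)] for _ in range(L)]
    grid[L // 2][L // 2] = V  # seed the infection at a single lattice point

    for _ in range(max_steps):
        # One timestep: pick L*L cells at random, with replacement.
        for _ in range(L * L):
            i, j = rng.randrange(L), rng.randrange(L)
            src = grid[i][j]
            if src not in (V, N):
                continue  # only V and N cells act on their neighbors
            ti, tj = rng.choice(neighbors(i, j, L))
            if rng.random() >= p_conv:
                continue  # the action fails with probability 1 - p_conv
            tgt = grid[ti][tj]
            if src == V and tgt == O:
                grid[ti][tj] = V
            elif src == V and tgt == PRE:
                grid[ti][tj] = N
            elif src == N and tgt == PRE:
                grid[ti][tj] = A

    infected = sum(cell in (V, N) for row in grid for cell in row)
    return infected / (L * L)
```

      Under these assumptions, scanning `p_a` at fixed `p_conv` (for example `simulate(p_a=x, p_conv=0.5)` over a range of `x` and several seeds) would be one way to check whether the threshold location moves; lowering `p_conv` only thins the successful actions, which is why it is expected to rescale time rather than change the final state.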

      (3) Figure 2 shows a critical value p_c = 27.8% following a simulation on a lattice with dimension L = 1000. However, it is unclear if dimension changes may affect the critical value.

      Author response: Re-running the simulations on a lattice 4x as large (i.e. L=2000) yields a similar critical value of 27-28% for R=1, so we are confident that finite-size effects do not play a major role at L=1000 and beyond. For R=5, however, we find that a minimum lattice size greater than L=1000 is necessary to determine the critical threshold. Concretely, we find that the threshold value p_c for R=5 changes somewhat when the lattice size is increased from 1000 to 2000, but is invariant under a change from 2000 to 3000, so we conclude that L=2000 is sufficient for R=5. The p_c value for R=5 cited in the manuscript (~0.4%) was determined from simulations at L=2000.

      Reviewer #3 (Public Review):


      This study considers how to model distinct host cell states that correspond to different stages of a viral infection: from naïve and susceptible cells to infected cells and a minority of important interferon-secreting cells that are the first line of defense against viral spread. The study first considers the distinct host cell states by analyzing previously published single-cell RNAseq data. Then an agent-based model on a square lattice is used to probe the dependence of the system on various parameters. Finally, a simplified version of the model is explored, and shown to have some similarity with the more complex model, yet lacks the dependence on the interferon range. By exploring these models one gains an intuitive understanding of the system, and the model may be used to generate hypotheses that could be tested experimentally, telling us "when to be surprised" if the biological system deviates from the model predictions.

      Author response: Thank you for the summary! We agree with the role that you describe for a model such as this one.


      -  Clear presentation of the experimental findings and a clear logical progression from these experimental findings to the modeling.

      -  The modeling results are easy to understand, revealing interesting behavior and percolation-like features.

      -  The scaling results presented span several decades and are therefore compelling.

      -  The results presented suggest several interesting directions for theoretical follow-up work, as well as possible experiments to probe the system (e.g. by stimulating or blocking IFN secretion).


      -  Since the "range" of IFN is an important parameter, it makes sense to consider lattice geometries other than the square lattice, which is somewhat pathological. Perhaps a hexagonal lattice would generalize better.

      -  Tissues are typically three-dimensional, not two-dimensional. (Epithelium is an exception). It would be interesting to see how the modeling translates to the three-dimensional case. Percolation transitions are known to be very sensitive to the dimensionality of the system.

      Author response: We agree that probing different lattice geometries (2- and 3-dimensional alike) would be interesting and worthwhile. However, for this manuscript, we prefer to confine the analysis to the current, simple case. We do agree, however, that an extensive exploration of the role of geometry is an interesting future possibility.

      -  The fixed time-step of the agent-based modeling may introduce biases. I would consider simulating the system with Gillespie dynamics where the reaction rates depend on the ambient system parameters.

      -  Single-cell RNAseq data typically involves data imputation due to the high sparsity of the measured gene expression. More information could be provided on this crucial data processing step since it may significantly alter the experimental findings.

      Justification of claims and conclusions:

      The claims and conclusions are well justified.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      It is necessary to explain what UMAP does. Is clustering done in the space of twenty-something original dimensions or 2D? How UMAP1 and UMAP2 are selected and are those the same in all plots?

      Author response: We have now added a few sentences to clarify the point raised above - the second snippet explains how clustering is performed:

      “As a dimension reduction algorithm, UMAP is a manifold learning technique that favors the preservation of local distances over global distances (McInnes et al., 2018; Becht et al., 2019). It constructs a weighted graph from the data points and optimizes the graph layout in the low-dimensional space.”

      “We cluster the cells with the principal components analysis (PCA) results from their gene expression. With the first 16 principal components, we calculate k-nearest neighbors and construct the shared nearest neighbor graph of the cells then optimize the modularity function to determine clusters. We present the cluster information on the UMAP plane and use the same UMAP coordinates for all the plots in this paper hereafter.”
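      The shared-nearest-neighbor step in the quoted passage can be sketched in pure Python. This is only an illustration under stated assumptions: the input is taken to be cells already projected onto their leading principal components, edge weights are taken to be the Jaccard overlap of the k-nearest-neighbor sets (a common convention, not confirmed by the source), the modularity optimization that follows is omitted, and the function names `knn_sets` and `snn_graph` are hypothetical.

```python
import math


def knn_sets(points, k):
    """For each point, the set holding itself and its k nearest neighbors."""
    sets = []
    for i, p in enumerate(points):
        # Brute-force Euclidean distances to every other point.
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        sets.append({j for _, j in dists[:k]} | {i})
    return sets


def snn_graph(points, k):
    """Shared-nearest-neighbor graph as {(i, j): weight} with i < j.

    The weight is the Jaccard overlap of the two neighbor sets;
    pairs with no shared neighbors get no edge.
    """
    nbrs = knn_sets(points, k)
    edges = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            shared = len(nbrs[i] & nbrs[j])
            if shared:
                edges[(i, j)] = shared / len(nbrs[i] | nbrs[j])
    return edges
```

      A community-detection routine that maximizes modularity would then be run on the weighted graph returned by `snn_graph` to obtain the clusters.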

      Figure 1, what do bars in the upper right corners of panels d, e, f, and g indicate? Does "Averaged" refer to a time average? Something is missing in "Cell proportions are labeled with corresponding colors in a)".

      Author response: Thank you - we have now modified the figure caption. The bars in the upper right corners of panels d, e, f are color keys for gene expression: the brighter the color, the higher the gene expression.

      “Averaged” gene expression refers to the mean expression of that particular gene across the cells within each indicated cluster.

      The lines in c) correspond to cell proportions in different states at different time points. The same state in a) and c) is shown in the same color.

      Line 46, "However" does not sound right in this context. Would "Also" be better?

      Author response: We agree and have corrected it in the revised manuscript.

      Line 96, "The viral genes are also partially expressed in these cells, but different from the 𝑁 cluster, the antiviral genes are fully expressed (Fig. S1 and S2)." The sentence needs to be rephrased.

      Author response: We have rephrased the sentence: “As in the N cluster, the viral gene E is barely detected in these cells, indicating incomplete viral replication. However, in contrast to the N cluster, the antiviral genes are expressed to their full extent (Fig. S1 and S2).”

      Line 126, missing "be"; "large" -> "larger".

      Author response: Thank you, we have now corrected these typos.

      Line 139-140 The logical link between ignoring apoptosis and the diffusion of IFN is unclear.

      Author response: We modified the sentence as “Here, we assume that the secretion of IFNs by the 𝑁 cells is a faster process than possible apoptosis (Wen et al., 1997; Tesfaigzi, 2006) of these cells and that the diffusion of IFNs to the neighborhood is not significantly affected by apoptosis.”

      Fig. 2a Do the yellow arrows show the effect of IFN and the purple arrows the propagation of viral infection?

      Author response: That is correct. We have added this information to the figure caption: “The straight black arrows indicate transitions between cell states. The curved yellow arrows indicate the effects of IFNs on activating antiviral states. The curved purple arrows indicate viral spread to cells with 𝑂 and 𝑎 states.”

      Fig. 3, n(s) as the axis label vs P(s) in the text? How do the curves in panel a) look when the p_a is well above or below p_c?

      Author response: Thank you. We have edited the labels in the figure to reflect the symbols used in the text.

      Boundary conditions? From Fig. 4, apparently periodic?

      Author response: Yes, we use periodic boundary conditions in the model. We clarify it in the model section now (last sentence).

      It will be good to see a plot with time dependences of all cell types for a couple of values of p_a, illustrating propagation and cessation of the infection.

      Author response: We agree, and have added a Figure S4 in the supplement which explores exactly that. Thank you for the suggestion.

      A verbal qualitative description of why p_a has such importance and how the infection is terminated for large p_a would help.

      Reviewer #2 (Recommendations For The Authors):

      Below are two minor comments:

      (1) In the single-cell RNA sequencing data analysis, the authors describe the cell clusters O, V, A, and N. However, showing how the clusters are identified from the data might be more straightforward.

      Author response: Technically, we cluster the cells using principal components analysis (PCA) results of their gene expression. With the first 16 principal components, we calculate k-nearest neighbors and construct the shared nearest neighbor graph of the cells and then optimize the modularity function to determine clusters. We manually annotate the clusters with O, V, A, and N based on the detected abundance of viral genes, antiviral genes, and IFNs.

      (2) In Figure 3, what does n(s) mean in Figure 3a? And what is the meaning of the distribution P(s) of infection clusters? It may be stated clearly.

      Author response: The use of n(s) was inconsistent, and we have now edited the figure to instead say P(s), to harmonize it with the text. P(s) is the distribution of cluster sizes, s, expressed as a fraction of the whole system. In other words, once a cluster has reached its final size, we record s=(N+V)/L^2 where N and V are the number of N and V state cells in the cluster (note that, by design, each simulation leads to a single cluster, since we seed the infection in one lattice point). We now indicate more clearly in the caption and the main text what exactly P(s) and s refer to.
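      The definition s=(N+V)/L^2 given in this response can be written out as a small Python helper. This is a sketch only: the grid representation (a square list of lists holding single-letter state labels) and the function name `cluster_size` are assumptions made for illustration and are not taken from the authors' repository.

```python
def cluster_size(grid):
    """Final cluster size s = (N + V) / L^2 for one finished simulation.

    `grid` is assumed to be an L x L list of lists whose entries are the
    state labels "O", "V", "N", "A" or "a". Because each run is seeded at
    a single lattice point, all N and V cells belong to one cluster.
    """
    L = len(grid)
    infected = sum(cell in ("N", "V") for row in grid for cell in row)
    return infected / (L * L)
```

      Collecting `cluster_size` over many independent runs at a fixed p_a gives the sample from which the distribution P(s) is estimated.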

      Reviewer #3 (Recommendations For The Authors):

      - Would the authors kindly share the simulation code with the community? Also, the data analysis code should be shared to follow current best practices. This needs to be standard practice in all publications. I would go as far as to say that in 2024 publishing a data analysis / simulation study without sharing the relevant code should be ostracized by the community.

      Author response: We absolutely agree and have created a GitHub repository in which we share the C++ source code for the simulations and a Python notebook for plotting. The public repository can be found at https://github.com/BjarkeFN/ViralPercolation. We add this information in supplement under section “Code availability”.


      - I would avoid the use of the wording "critical" threshold since this is almost guaranteed to infuriate a certain type of reader.


      - Line 265 has a curious use of " ... " which should be replaced with something more appropriate.

      Author response: Thank you for pointing it out! We have corrected this typo.

    2. Reviewer #1 (Public Review):


      The manuscript describes a model based on 5-state cellular automata of development of an infection. The model is motivated and qualitatively justified by time-resolved measurements of expression levels of viral, interferon-producing, and antiviral genes. The model is set up in such a way that the crucial difference in outcomes (infection spreading vs. confinement) depends on the initial fraction of special virus-sensing cells. Those cells (denoted as 'type a') cannot be infected and do not support the propagation of infection, but rather inhibit it in a somewhat autocatalytic way. Presumably, such feedback makes the transition between two outcomes very sharp: a minor variation in concentration of 'a' cells results in qualitative change from one outcome to another. As in any percolation-like system, the transition between propagation and inhibition of infection goes through a critical state with all its attributes, including a power-law distribution of the cluster size (corresponding to the fraction of infected cells) with a fairly universal exponent and a cutoff at the upper limit of this distribution.


      The proposed model suggests a well-justified explanation for the frequently observed yet puzzling diversity of outcomes of viral infections such as COVID.



      This study presents a cellular automaton model to study the dynamics of virus-induced signalling and innate host defense against viruses such as SARS-CoV-2 in epithelial tissue. The simulations and data analysis are convincing and represent a valuable contribution that would be of interest to researchers studying the dynamics of viral propagation.

    4. Reviewer #2 (Public Review):

      Xu et al. introduce a cellular automaton model to investigate the spatiotemporal spreading of viral infection. In this study, the authors first analyze the single-cell RNA sequencing data from experiments and identify four clusters of cells at 48 hours post-viral infection, including susceptible cells (O), infected cells (V), IFN-secreting cells (N), and antiviral cells (A). Next, a cellular automaton model (NOVAa model) is introduced by assuming the existence of a transient pre-antiviral state (a). The model consists of an LxL lattice; each site represents one cell. The cells change their state following rules that depend on the interaction of neighboring cells. The model introduces a key parameter, p_a, representing the fraction of pre-antiviral state cells. Cell apoptosis is omitted in the model. Model simulations show a threshold-like behavior of the final attack rate of the virus when p_a changes continuously. There is a critical value p_c, so that when p_a < p_c, infections typically spread to the entire system, while at a higher p_a > p_c, the propagation of the infected state is inhibited. Moreover, the radius R that quantifies the diffusion range of N cells may affect the critical value p_c; a larger R yields a smaller critical value p_c. The authors further examine the result with a stochastic version of the dynamics, and the main findings are unchanged. The structure of clusters is different for different values of R; greater R leads to a different microscopic structure with fewer A and N cells in the final state. Compared with the single-cell RNA seq data, which implies a low fraction of IFN-positive cells of around 1.7%, the model simulation suggests R=5. The authors also explored a simplified version of the model, the OVA model, with only three states. The OVA model also has an outbreak size. The OVA model shows dynamics similar to the NOVAa model. However, the change in microstructure as a function of the IFN range R observed in the NOVAa model is not observed in the OVA model.

    5. Reviewer #3 (Public Review):


      This study considers how to model distinct host cell states that correspond to different stages of a viral infection: from naïve and susceptible cells to infected cells and a minority of important interferon-secreting cells that are the first line of defense against viral spread. The study first considers the distinct host cell states by analyzing previously published single-cell RNAseq data. Then an agent-based model on a square lattice is used to probe the dependence of the system on various parameters. Finally, a simplified version of the model is explored, and shown to have some similarity with the more complex model, yet lacks the dependence on the interferon range. By exploring these models one gains an intuitive understanding of the system, and the model may be used to generate hypotheses that could be tested experimentally, telling us "when to be surprised" if the biological system deviates from the model predictions.


      - Clear presentation of the experimental findings and a clear logical progression from these experimental findings to the modeling.
      - The modeling results are easy to understand, revealing interesting behavior and percolation-like features.
      - The scaling results presented span several decades and are therefore compelling.
      - The results presented suggest several interesting directions for theoretical follow-up work, as well as possible experiments to probe the system (e.g. by stimulating or blocking IFN secretion).


      - The fixed time-step of the agent-based modeling may introduce biases. I would consider simulating the system with Gillespie dynamics where the reaction rates depend on the ambient system parameters.
      - Single-cell RNAseq data requires careful handling or it may generate false leads. The strength of the RNAseq evidence presented is not clear.

      Two places where the manuscript could be extended:

      - Since the "range" of IFN is an important parameter, it makes sense to consider lattice geometries other than the square lattice, which is somewhat pathological. Perhaps a hexagonal lattice would generalize better.
      - Tissues are typically three-dimensional, not two-dimensional. (Epithelium is an exception). It would be interesting to see how the modeling translates to the three-dimensional case. Percolation transitions are known to be very sensitive to the dimensionality of the system.

      Justification of claims and conclusions:

      The claims and conclusions are well justified.

      This valuable study reports that actin-related proteins may be involved in transcriptional regulation during spermatogenesis. The supporting data remain incomplete, and more extensive disentanglement from the canonical role of these actin-related proteins and the experimental validation of in silico predictions are required. This work will be of interest to reproductive biologists and other researchers working on non-canonical roles of actin and actin-related proteins.

    2. Reviewer #1 (Public Review):


      This study offers a new perspective. ACTL7A and ACTL7B play roles in epigenetic regulation in spermiogenesis. Actin-like 7 A (ACTL7A) is essential for acrosome formation, fertilization, and early embryo development. ACTL7A variants cause acrosome detachment responsible for male infertility and early embryonic arrest. It has been reported that ACTL7A is localized on the acrosome in mouse sperm (Boëda et al., 2011). Previous studies have identified ACTL7A mutations (c.1118G>A:p.R373H; c.1204G>A:p.G402S; c.1117C>T:p.R373C). All these variants were located in the actin domain and were predicted to be pathogenic, affecting the number of hydrogen bonds or the arrangement of nearby protein structures (Wang et al., 2023; Xin et al., 2020; Zhao et al., 2023; Zhou et al., 2023). This work used AI to model the role of ACTL7A/B in the nucleosome remodeling complex and proposed a testis-specific conformation of the SRCAP complex. This is different from previous studies.


      This study provides a new perspective to reveal the additional roles of these proteins.


      The results section contains a substantial background description. However, the results and discussion sections require streamlining. There is a lack of mutual support for data between the sections, and direct data to support the authors' conclusions are missing.

    3. Reviewer #2 (Public Review):


      How the dynamics of gene expression accompany cell fate and cellular morphological changes is important for our understanding of the molecular mechanisms that govern development and disease. The phenomenon is particularly prominent during spermatogenesis, the process by which spermatogonial stem cells develop into sperm through a series of steps of cell division, differentiation, meiosis, and cellular morphogenesis. The intricacy of various aspects of cellular processes and gene expression during spermatogenesis remains to be fully understood. In this study, the authors found that the testis-specific actin-related proteins ACTL7A and ACTL7B (which usually participate in modifying cells' cytoskeletal systems) were expressed and localized in the nuclei of mouse spermatocytes and spermatids. Based on this observation, the authors analyzed protein sequence conservation of ACTL7B across dozens of species and identified a putative nuclear localization sequence (NLS), a type of sequence that is often responsible for the nuclear import of the proteins that carry it. Using molecular biology experiments in a heterologous cell system, the authors verified the potential role of this internal NLS and found that it could indeed facilitate the nuclear localization of marker proteins when expressed in cells. Using gene-deleted mouse models they generated previously, the authors showed that deletion of Actl7b caused changes in gene expression and mis-localization of nucleosomal histone H3 and the chromatin regulators histone deacetylase HDAC1 and 2, supporting their proposed roles of ACTL7B in regulating gene expression. The authors further used AlphaFold 2 to model the potential protein complexes that could be formed between the ARPs (ACTL7A and ACTL7B) and known chromatin modifiers, such as the INO80 and SWI/SNF complexes, and found that, consistent with previous findings, it is likely that ACTL7A and ACTL7B interact with the chromatin-modifying complexes by binding cooperatively to their alpha-helical HSA domain. These results suggest that ACTL7B possesses novel functions in regulating chromatin structure and thus gene expression beyond conventional roles of cytoskeleton regulation, providing alternative pathways for understanding how gene expression is regulated during spermatogenesis and the etiology of relevant infertility diseases.


      The authors provided sufficient background to the study and discussion of the results. Based on their previous research, this study utilized numerous methods, including the protein complex structural modeling method AlphaFold 2 Multimer, to further investigate the functional roles of ACTL7B. The results presented here are in general of good quality. The identification of a potential internal NLS in ACTL7B is mostly convincing, in line with the phenotypes presented in the gene deletion model.


      While the study offered an interesting new look at the functions of ARP proteins during spermatogenesis, some of the study is mainly theoretical speculation, including the protein complex formation. Some of the results may need further experimental verification, for example, the differentially expressed genes that were found in potentially spermatogenic cells at different developmental stages, in order to support the conclusions and avoid undermining the significance of the study.

    4. Reviewer #3 (Public Review):

      In this manuscript, Pierre Ferrer and colleagues explore the exciting possibility that, in the male germ line, the composition and function of deeply conserved chromatin remodeling complexes is fine-tuned by the addition of testis-specific actin-related proteins (ARPs). In this regard, the Authors aim to extend previously reported non-canonical (transcriptional) roles of ARPs in somatic cells to the unique developmental context of the germ line. The manuscript is focused on the potential regulatory role in post-meiotic transcription of two ARPs: ACTL7A and ACTL7B (particularly the latter). The canonical function of both testis-specific ARPs in spermatogenesis is well established, as they have been previously shown to be required for the extensive cellular morphogenesis program driving post-meiotic development (spermiogenesis). Disentangling the actual functions of ACTL7A and ACTL7B as transcriptional regulators from their canonical role in the profound morphological reshaping of post-meiotic cells (a process that also deeply impacts nuclear architecture and regulation) represents a key challenge in terms of interpreting the reported findings (see below).

      The authors begin by documenting, via fluorescence microscopy, the intranuclear localization of ACTL7B. This ARP is convincingly shown to accumulate in the nucleus of spermatocytes and spermatids. Using a series of elegant reporter-based experiments in a somatic cell line, the authors map the driver of this nuclear accumulation to a potential NLS sequence in the ACTL7B actin-like body domain. Ferrer and colleagues then performed a testicular RNA-seq analysis in ACTL7B KO mice to define the putative role of ACTL7B in male germ cell transcription. They report substantial changes to the testicular transcriptome - particularly the upregulation of several classes of genes - in ACTL7B KO mice. However, wild-type testes were used as controls for this experiment, thus introducing a clear confounding effect to the analysis (ACTL7B KO testes have extensive post-meiotic defects due to the canonical role of ACTL7B in spermatid development). Then, the authors employ cutting-edge AI-driven approaches to predict that both ACTL7A and ACTL7B are likely to bind to four key chromatin remodeling complexes. Although these predictions are based on a robust methodology, they would certainly benefit from experimental validation. Finally, the authors associate the loss of ACTL7B with decreased lysine acetylation and lower levels of the HDAC1 and HDAC3 chromatin remodelers in the nucleus of developing spermatids.

      Globally, these data may provide important insight into the unique processes male germ cells employ to sustain their extraordinarily complex transcriptional program. Furthermore, the concept that (comparably younger) testis-specific proteins can be incorporated into ancient chromatin remodeling complexes to modulate their function in the germ line is timely and exciting.

      It is my opinion that the manuscript would benefit from additional experimental validation to better support the authors' conclusions. In particular, I believe that addressing two critical points would substantially strengthen the message of the manuscript:

      (1) The proposed role of ACTL7B in post-meiotic transcriptional regulation temporally overlaps with the protein's previously reported canonical functions in spermiogenesis (PMID: 36617158 and 37800308). Indeed, the canonical functions of ACTL7B have been shown to have a profound effect at the level of spermatid morphology and to impact nuclear organization. This potentially renders the observed transcriptional deregulation in ACTL7B KO testes an indirect consequence of spermatid morphology defects. I acknowledge that it is experimentally difficult to disentangle the proposed intranuclear roles of ACTL7B from the protein's well-documented cytoplasmic function. Perhaps the generation of a NLS-scrambled ACTL7B variant could offer some insight. In light of the substantial investment this approach would represent, I would suggest, as an alternative, that instead of using wild-type testes as controls for the transcriptome and chromatin localization assays, the authors consider the possibility of using testicular tissue from a mutant with similarly abnormal spermiogenesis but due to transcription-independent defects. This would, in my opinion, offer a more suitable baseline to compare ACTL7B KO testes with.

      (2) The manuscript would greatly benefit if experimental validation of the AI-driven predictions were to be provided (in terms of the binding capacity of ACTL7A and ACTL7B to key chromatin remodeling complexes), all the more so since it seems that the authors have the technical expertise / available mass spectrometry data required for this purpose (lines 664-665). Still on this topic, given the predicted interactions of ACTL7A and ACTL7B with the SRCAP, EP400, SMARCA2 and SMARCA4 complexes (Figure 7), it is rather counter-intuitive that the Authors chose for their immunofluorescence assays, in ACTL7B KO testes, to determine the chromatin localization of HDAC1 and HDAC3, rather than that of any of the above four complexes.

      The authors develop a novel genetic strategy for specific and comprehensive labeling of axo-axonic cells, also referred to as chandelier cells, in the mouse brain. The approach and analysis are rigorous such that the data convincingly support the key conclusions, including the expanded distribution of axo-axonic cells throughout the brain. This study provides important new information about the distribution of a significant neuronal cell type, as well as new tools for future studies. This work will be of broad interest to neuroscientists who work on the anatomical and functional organization of neural circuits.

    2. Reviewer #2 (Public Review):


      The goals of this study were to develop a genetic approach that would specifically and comprehensively target axo-axonic cells (AACs) throughout the brain and then to describe the patterns and characteristics of the targeted AACs in multiple, selected brain regions. The investigators have been successful in providing the most complete description to date of the regional distribution of putative AACs (pAACs) throughout the brain. The supporting evidence is convincing, and the findings should serve as a guide for more detailed studies of AACs within each brain region and lead to new insights into the connectivity and functional organization of this important group of GABAergic interneurons.


      The study has numerous strengths. A major strength is the development of a unique intersectional genetic strategy that uses cell lineage (Nkx2.1) and molecular (Unc5b or Pthlh) markers to identify AACs specifically and, apparently, nearly completely throughout the mouse brain. While AACs have been described previously in the cerebral cortex, hippocampus and amygdala, there has been no specific genetic marker that selectively identifies all AACs in these regions.

      Importantly, the current genetic strategy labels pAACs in additional brain regions, including the claustrum-insular complex, extended amygdala, and several olfactory centers in which AACs have not been previously recognized. In general, the findings provide support for the specificity of the methods for targeting AACs and include several examples of labeling near markers of axon initial segments, providing validation of their AAC identity.

      The descriptions and numerous low magnification images of the brain provide a roadmap for subsequent, detailed studies of AACs in numerous brain regions. The overview and summaries of the findings in the Abstract, Introduction and Discussion are particularly clear and helpful in placing the extensive regional descriptions of AACs in context.


      Considering the unique and striking characteristics of AACs, it would have been ideal to include a clear, high-resolution confocal image of an AAC from the Unc5b;Nkx2.1 mouse that would display the beauty of these cells with their numerous cartridges of axon terminals, emanating from a single AAC. While several cells are illustrated, the processes are often obscured by other labeling or the background created by the blue DAPI labeling. A high-resolution image of an isolated cell would not only support the identity of the cells as AACs but also demonstrate the potential advantages of their labeling for more detailed anatomical and neurophysiological studies. High magnification views of the axon terminals adjacent to AnkG-labeled axon initial segments are included and provide strong support for the identity of the cells. However, they cannot convey the extensiveness and patterns of the axonal arborizations of these cells.

      The intersectional genetic methods included use of the lineage marker Nkx2.1 with either Unc5b or Pthlh as the molecular marker. As described, the mice with intersectional targeting of Nkx2.1 and Unc5b appear to show the most specific brain-wide labeling for AACs, and the majority of the descriptions are from these mice. The targeting with Nkx2.1 and Pthlh is less convincing and there appears to be a disconnect between the descriptions and the images. While the descriptions emphasize that the labeling is very similar in the two types of mice, the images suggest distinct differences, including labeling of non-AACs in striatum and layer 4 of the cortex in the Pthlh;Nkx2.1 mouse, as described in the manuscript. In addition, the Pthlh;Nkx2.1 mouse has more labeled cells in some regions and fewer in others. Perhaps it would be more accurate to present the Pthlh;Nkx2.1 mouse as differing from the Unc5b;Nkx2.1 mouse, but useful for AAC labeling in select regions and under some conditions, such as following tamoxifen administration at specific ages. As currently presented, the inclusion of the Pthlh;Nkx2.1 mouse detracts from the otherwise convincing argument that the Unc5b;Nkx2.1 mouse provides a specific and comprehensive way to identify AACs.

    3. Reviewer #3 (Public Review):


      Raudales et al. aimed at providing an insight into the brain-wide distribution and synaptic connectivity of bona fide GABAergic inhibitory interneuron subtypes focusing on the axo-axonic cell (AAC), one of the most distinctive interneuron subtypes, which innervates the axon initial segments of glutamatergic projection neurons. They establish intersectional genetic strategies that enable them to specifically and comprehensively capture AACs based on their lineage (Nkx2.1) and marker expression (Unc5b, Pthlh). They find that AACs are deployed across essentially all the pallium-derived brain structures as well as anterior olfactory nucleus, taenia tecta, and lateral septum. They show that AACs in distinct areas and layers of the neocortex as well as different subregions of the hippocampal formation display unique soma and synaptic density and morphological variations. Rabies virus-based retrograde monosynaptic input tracing reveals that AACs in the neocortex, the hippocampus, and the basolateral amygdala receive synaptic inputs from common as well as specific brain regions and supports the utility of this novel genetic approach. This study elucidates brain-wide neuroanatomical features and morphological variations of AACs with solid techniques and analysis. Their novel AAC-targeting strategies will facilitate the study of their development and function in different brain regions. The conclusions in this paper are well supported by the data. However, there are a few minor comments.

      (1) The authors added a description about validation of ChCs in the method section: "Validation was conducted with high-magnification confocal microscopy and defined by a cell exhibiting at least two RFP-labelled axons colocalized with AIS labelled by AnkyrinG or Phospho-IκBα". However, this does not clearly define pAACs themselves. If they follow these criteria, an RFP-labeled cell exhibiting only one synaptic cartridge that is colocalized with an AIS should be a pAAC. Is this what the authors are trying to say?

      On the other hand, in the response to reviewers, the authors apparently define pAACs in a different way, in which they focus more on the number of cells exhibiting cartridges that are associated with AISs in a certain anatomical region rather than the number of cartridges per cell.

      "For BNST we did not positively identify more than a few exhibiting overlap with AnkyrinG/IκBα, so we currently leave them as pAACs"

      "Putative AAC (pAACs) refers to populations in which relatively few single cell examples of AACs exhibiting co-localized cartridges were found"

      The authors need to directly define pAACs.

      (2) In the response to reviewers, the authors claimed that both Pthlh and Unc5b mice are useful for studying developing AACs. It would be nice if they included this content in the text (e.g. Discussion).

    4. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):


      In this manuscript, the authors set out to develop genetic tools that can specifically and comprehensively label Axo-Axonic Cells (AACs), also known as Chandelier cells. These AACs possess unique morphological and connectivity features, making them an ideal subject for studying various aspects of cell types across different experimental methods. To achieve both specificity and comprehensiveness in AAC labeling, the authors employ an intersectional strategy that combines lineage origin and molecular markers. This approach successfully targets AACs across the mouse brain and reveals their widespread distribution in various brain structures beyond the previously known regions. Additionally, the authors utilize rabies transneuronal labeling to provide a comprehensive overview of AACs, their variations, and input sources throughout the brain. This experimental approach offers a powerful model system for investigating the role of AACs in circuit development and function across diverse brain regions.


      Genetic Tools and Specificity: The authors' genetic tools show qualitative evidence of specificity for AACs, opening new avenues for targeted research on these cells. The use of intersectional strategies enhances the precision of AAC labeling.

      Widespread Distribution: The study significantly broadens our understanding of AAC distribution, revealing their presence in brain regions beyond what was previously documented. This expanded knowledge is a valuable contribution to the field.

      Transneuronal Labeling: The inclusion of rabies transneuronal labeling provides a comprehensive view of AACs, their variations, and input sources, allowing for a more holistic understanding of their role in neural circuits.


      Quantitative Analysis: While the claim of specificity appears qualitatively convincing, the manuscript could be improved with more quantitative analysis.

      We are glad that the reviewers appreciated our multimodal and brain-wide characterizations of the AAC population. We include many qualitative AAC examples and would like to highlight the quantitative nature of our whole brain cell body and cartridge analyses, made possible by transgenic targeting and our serial two-photon tomography imaging platform (STP). In addition to providing this brain wide AAC atlas, we also propose AACs as perhaps one of the best case examples for a bona fide cell type, which may inspire further in-depth anatomical and functional studies of AACs, and efforts to capture other ground truth cell types.

      Comprehensiveness Claim: The assertion of comprehensiveness, implying labeling "almost all" AACs in all brain regions, is challenging to substantiate conclusively. Acknowledging the limitations of proving complete comprehensiveness and discussing them in the discussion section would be more appropriate than asserting it in the results section.

      We thank the reviewer for this suggestion and have revised the results and discussion sections accordingly. The issue of how to assess comprehensiveness in AAC labeling is a fair and important point, as dense brain-wide AAC labeling has not been achieved and assessed before. Previous studies had used less efficient and specific methods for capturing AACs, primarily in select areas of cortex, hippocampus, and amygdala. These AAC populations are recapitulated by our genetic strategies with higher density and specificity. It does not seem that we have missed any previously reported AAC populations; in fact, we discovered multiple previously unreported populations. Further evidence supporting our “comprehensive” labeling of AACs is that two independent Unc5b and Pthlh transgenic strategies showed very similar AAC distribution patterns (Fig. 1 Suppl. 3). However, we recognize that probably the only way to fully assess “completeness” of labeling may be to compare with anatomical ground truth, such as by dense EM reconstruction of all AACs across the brain volume. This is currently not technically possible but may become feasible in the future.

      Local Inputs: While the manuscript focuses on inter-areal inputs to AACs, it would benefit from exploring local inputs as well. Identifying the local neurons that target AACs and analyzing their patterns could provide valuable insights into AAC function within specific brain regions.

      This is a good suggestion. However, our serial two-photon tomography imaging platform does not have the capability for reliably preserving tissue sections for immunohistochemical processing afterward. Additionally, though our starter AAV injections were limited to 100-150 nL, there were far too many input cells labelled at the injection site to resolve individual input cells and correlate them with their synaptic partners (e.g. a rabies-labelled pyramidal cell within the injection site may still project to a starter cell a few hundred microns away). Thus, our rabies input mapping was best suited for characterizing long-range inputs, which were the focus here. For studying local inputs to AACs, future studies could combine very dilute starter AAV injections with multi-marker characterization of cell types by immunohistochemistry or FISH.

      Discussion Focus: The discussion section should delve deeper into the biological implications of the findings, moving beyond technical significance. Exploring similarities and differences in input patterns between AACs and other cell types, and linking them to the locations of starter cells or specific connectivity patterns in the brain, would enrich the discussion. For instance, investigating whether input patterns can be predicted based on the locations of starter cells or connectivity specificity could provide valuable insights.

      We thank the reviewer for this suggestion. We have expanded the discussion to include more on the relevance and implications of our input mapping results to different starter populations of AACs.

      Reviewer #2 (Public Review):


      The goals of this study were to develop a genetic approach that would specifically and comprehensively target axo-axonic cells (AACs) throughout the brain and then to describe the patterns and characteristics of the targeted AACs in multiple, selected brain regions. The investigators have been successful in providing the most complete description of the regional distribution of putative AACs (pAACs) throughout the brain to date. The supporting evidence is convincing, even though incomplete in some brain regions. The findings should serve as a guide for more detailed studies of AACs within each brain region and lead to new insights into the connectivity and functional organization of this important group of GABAergic interneurons.


      The study has numerous strengths. A major strength is the development of a unique intersectional genetic strategy that uses cell lineage (Nkx2.1) and molecular (Unc5b or Pthlh) markers to identify AACs specifically and, apparently, nearly completely throughout the mouse brain. While AACs have been described previously in the cerebral cortex, hippocampus, and amygdala, there has been no specific genetic marker that selectively identifies all AACs in these regions.

      The current genetic strategy has labeled pAACs in a large number of additional brain regions, including the claustrum-insular complex, extended amygdala, and several olfactory centers. In general, the findings provide support for the specificity of the methods for targeting AACs, and include some examples of labeling near markers of axon initial segments. However, the investigators are careful to refer to labeled neurons as "putative AACs" as they have not been fully characterized and their identity has not been verified.

      The descriptions and numerous low-magnification images of the brain provide a roadmap for subsequent, detailed studies of AACs in numerous brain regions. The overview and summaries of the findings in the Abstract, Introduction, and Discussion are particularly clear and helpful in placing the extensive regional descriptions of AACs in context.


      One weakness of the study is the lack of an illustration of the high-resolution cell labeling that can be achieved with the methods, including labeling of numerous rows of axon terminals in contact with axon initial segments. The initial images of the brain-wide distribution of putative AACs are necessarily presented at low magnification. Although the authors indicate that the cells have "highly characteristic AAC labeling patterns throughout the neocortex, hippocampus and BLA", these morphological details cannot be visualized by the reader at the current magnification, even when the images are enlarged on the computer screen. Some of the details become evident in later Figures, but an initial illustration of single cell labeling with confocal microscopy, or tracing of their characteristic axonal arbors, would support the specificity of the labeling in the low magnification images.

      We thank the reviewer for the suggestion. We have now added high-resolution images showing the colocalization of AAC axon boutons (cartridges) along AnkG positive postsynaptic axon initial segments in Fig. 2 Suppl. 1, Figure 1 panels a, d, e, and Fig. 4 panels b, c. These images unequivocally demonstrate AAC identity and specificity.

      Table 1 indicates that the AAC identity of the cells has been validated in many brain regions but not in all. The methods used for validation have not been described and should be included for completeness. The authors are careful to acknowledge that labeled cells in some regions have not been validated and refer to such cells as pAACs.

      Validation was defined by colocalization of RFP-labelled AAC cartridges and AnkyrinG or Phospho-IκBα-labelled axon initial segments, imaged by confocal microscopy. We provide high-magnification examples throughout figures 2-6 and supplements. We have also tried to clarify this better in the methods section entitled “Immunohistochemistry.” Putative AAC (pAACs) refers to populations in which relatively few single cell examples of AACs exhibiting co-localized cartridges were found, largely due to the sparsity of the low tamoxifen dosage used (see response above).

      The intersectional genetic methods included the use of the lineage marker Nkx2.1 with either Unc5b or Pthlh as the molecular marker. As described, the mice with intersectional targeting of Nkx2.1 and Unc5b appear to show the most specific brain-wide labeling for AACs, and the majority of the descriptions are from these mice. The targeting with Nkx2.1 and Pthlh is less convincing. The title for Figure 1 Supplemental Figure 3 suggests a similar AAC distribution in the Pthlh;Nkx2.1 mouse compared to the Unc5b;Nkx2.1 mouse. However, the descriptions of the individual panels suggest a number of inconsistencies and non-AAC labeling. The heavy labeling in the caudate and cells in layer 4 is particularly problematic. Based on the data presented, it appears that heavy labeling achieved in these mice could not be relied on for specific labeling of all AACs, although specific labeling could be achieved under some conditions, such as following tamoxifen administration at select ages.

      The reviewer is correct about Pthlh being less specific for AACs than Unc5b when crossed to a constitutive Nkx2.1 recombinase driver line. The Pthlh/Nkx2.1 intersection labeled a set of layer 4 cells in somatosensory cortex and dense cells in striatum, which are clearly not AACs. But these are the only main differences compared to the Unc5b/Nkx2.1 intersection. As the reviewer points out, it is only when Pthlh is crossed to an inducible Nkx2.1-CreER line and induced embryonically with tamoxifen that there is more specific AAC labeling (at least in cortex). We included this data as well as the intersection with VIP-Cre in case either of these is useful to researchers studying fate-mapping of AACs or bipolar cell interneurons. We have also revised the title of Fig. 1 Suppl. 3 to better convey this.

      The methods for dense labeling and single-cell labeling are described briefly in the methods. Some discussion of the development of the methods would be useful, including how it was determined that methods for heavy labeling identified AACs specifically and completely.

      We have added a description of the development of these methods to the methods section entitled “Animals.”

      Reviewer #3 (Public Review):


      Raudales et al. aimed at providing an insight into the brain-wide distribution and synaptic connectivity of bona fide GABAergic inhibitory interneuron subtypes focusing on the axo-axonic cell (AAC), one of the most distinctive interneuron subtypes, which innervates the axon initial segments of glutamatergic projection neurons. They establish intersectional genetic strategies that enable them to specifically and comprehensively capture AACs based on their lineage (Nkx2.1) and marker expression (Unc5b, Pthlh). They find that AACs are deployed across essentially all the pallium-derived brain structures as well as the anterior olfactory nucleus, taenia tecta, and lateral septum. They show that AACs in distinct areas and layers of the neocortex as well as different subregions of the hippocampal formation display unique soma and synaptic density and morphological variations. Rabies virus-based retrograde monosynaptic input tracing reveals that AACs in the neocortex, the hippocampus, and the basolateral amygdala receive synaptic inputs from common as well as specific brain regions and supports the utility of this novel genetic approach. This study elucidates brain-wide neuroanatomical features and morphological variations of AACs with solid techniques and analysis. Their novel AAC-targeting strategies will facilitate the study of their development and function in different brain regions. The conclusions in this paper are well supported by the data. However, there are a few comments to strengthen this study.

      (1) The definition of putative AAC (pAAC) is unclear and Table 1 may not be accurate. Although the authors find synaptic cartridges of RFP-labeled cells in the claustro-insular complex and the dorsal endopiriform nuclei, they still consider these cells as pAACs (not validated). The authors claim that without examining the presence of synaptic cartridges, RFP-labeled cells in the hypothalamus and the bed nuclei of the stria terminalis (BNST) are pAACs while those in the L4 of the somatosensory cortex in Pthlh;Nkx2.1;Ai65 mice are non-AACs. In Table 1, the BNST is supposed to contain AACs (validated), but in the text, the authors claim that RFP-labeled cells in the BNST are pAACs. Could the authors clarify how AACs, pAACs, and non-AACs are defined?

      We thank the reviewer for their interest and comments on our work. Please see our response to reviewer 2 for clarification on pAACs. Additionally, we have clarified in the methods under “Immunohistochemistry” how we defined AACs, pAACs, and non-AACs. For BNST we did not positively identify more than a few exhibiting overlap with AnkyrinG/IκBα, so we currently leave them as pAACs; Table 1 has been corrected to reflect this.

      (2) The intersectional strategies presented in this study could also specifically capture developing AACs. If so, how early are AACs labeled in the brain? It would also be nice if the authors could add a simple schematic like Fig. 1a showing the time course of Pthlh expression.

      We thank the reviewer for suggesting the application of our method in studying AAC development. As the onset of Unc5b expression is in the early postnatal period, tamoxifen induction of Unc5b-CreER in early postnatal days can enable studies of AAC neurite and synapse development, maturation, and plasticity. Similarly, Pthlh expression in the brain is relatively low/absent at P4 and present at P14 and later timepoints. The Pthlh-Flp;Nkx2.1-Cre intersection can be used to study postnatal AAC development and plasticity.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      While the claim of specificity appears qualitatively convincing, additional quantitative analysis would make the authors' claim much stronger. For example, in Figure 4 (f-h), where the authors show an overlap of AAC axons with AnkG labeling, there also appears to be a region of AAC axon lacking adjacent AnkG labeling. The authors could quantify the fraction of cartridges that overlap with AnkG labeling in different brain regions, potentially strengthening their claim that pAACs are AACs as well as providing important documentation of the diversity or homogeneity of compartment targeting across the brain.

      As mentioned previously, we only performed AnkG co-labeling analysis on low-dose tamoxifen/sparsely labelled samples in which we could readily differentiate individual cells. This was performed on samples with the Ai65 cytoplasmic reporter—for validation purposes we could positively identify co-labelled cartridges, but it would be more difficult to accurately identify any cartridges not co-labeled (since the entire axon was labelled with RFP). For precisely identifying and mapping AAC cartridge locations we found the intersectional synaptophysin-EGFP reporter (Fig. 2k-n) to be a more precise method for specifically labeling the “cartridge” segment of AAC axons. However, we did not try AnkG staining on samples from this reporter line, as they were set aside for STP imaging.

      Regarding the claim of comprehensiveness, labeling "almost all" AACs in all brain regions is a high standard and challenging to demonstrate conclusively. The study already significantly expands our understanding of AAC distribution, and the authors might consider discussing the limitations of proving complete comprehensiveness in the discussion rather than claiming it in the results section.

      We again thank the reviewer for this critique. As mentioned above, we have revised the results and discussion sections to better convey this point.

      Furthermore, the manuscript's connectivity section primarily focuses on inter-areal inputs to AACs, but it could benefit from exploring local inputs as well. By identifying the local neurons that target AACs, the authors could ask if there is any general property or rule of the local projections to AACs across the brain, or at least within the cortex. Moreover, a clear indication of the injection site would be helpful, particularly in Figure 7, where there seems to be some discrepancy between the histograms and fluorescent images regarding local projections. The histograms of Figure 7 seem to indicate that the local projection to AACs is a small fraction of all the presynaptic neurons. However, the fluorescent image for the SSp seems to suggest otherwise, with many fluorescent cells in the injected area.

      We thank the reviewer for these comments. Regarding the local inputs in the rabies tracing datasets, we were limited (as mentioned above) by our STP platform’s inability to preserve tissue for immunohistochemical labeling and by our relatively dense starter cell labeling. Instead, our focus here was on long-range inputs (i.e. outside the ipsilateral ARA area of injection), which were simply not known for these AAC populations. We have revised the Figure 7 legend and added a description in the methods section to more clearly indicate that we only included long-range input projections in the Figure 7 histograms.

      In the discussion, the authors should delve more into the biological implications of their findings rather than solely emphasizing the technical significance. They could explore the similarities and differences in input patterns between AACs and other cell types, potentially linking them to the locations of their starter cells or specific connectivity patterns in the brain. For example, the authors could check if the input patterns could be predicted from the projections to the layers where their starter cells are located (either from an atlas like the Allen Connectivity Atlas, or from retrograde rabies injections in the same locations). Can the differences between the input patterns to PVCs and AACs be explained by their location, or do they reflect some specificity of connections?

      Thank you for the extensive comment. We address this point above, and have revised our discussion accordingly.

      Reviewer #2 (Recommendations For The Authors):

      The Figure legends vary in completeness and quality.

      (1) The legend for Figure 1 is very informative, and section e-g serves as a useful guide, as the legend includes the names of the brain regions related to the abbreviations and also indicates the specific panels that show the identified structures. Because of the large number of structures and the number of panels in each Figure, it would be ideal to follow the same pattern in the remaining figures.

      (2) Several edits are needed in the legend for Figure 1 Supplement Figure 1. The descriptions of a-f could be improved by providing general terms to describe the brain regions associated with the latter list of abbreviations (as has been done with the identification of the cerebral cortex, hippocampus, and olfactory centers and their related panels). One suggestion would be to write out insula, claustrum, and endopiriform prior to listing the abbreviations (AI, CLA, EP) (b-c) and adding amygdaloid complex and extended amygdala before the abbreviations (COA, BLA, MeA) (d-f) and (BST) (d).

      We thank the reviewer, as the suggestion of further expanding the abbreviations is a good one. As such, we have revised/reorganized the anatomical abbreviations in the figure legends for Figure 1 Supplement Figures 1, 2, and 3.

      Descriptions for Panels g-j require editing to link the appropriate panels and the descriptions. Panels for BSTpr appear to be g-h (rather than f-g) and i-j (rather than h-i).

      We have fixed this typo in the legend for Figure 1 Supplement Figure 1.

      Descriptions for Panels k-n could be edited to include abbreviations for the identified brain regions. For example, include the abbreviation ARHP after arcuate nuclei and indicate panels m-n (rather than j-l); include PVP after paraventricular and indicate panel n (rather than m); include DMPH after dorsomedial nuclei and indicate k-m (rather than j-l).

      Thank you for the suggestion. We have expanded the abbreviations in Figure 1 Supplement 1 accordingly.

      Reviewer #3 (Recommendations For The Authors):

      (1) Please clarify if tdTomato, EGFP (from helper AAVs), and RFP (from rabies virus) are native signals or IHC signals in legends.

      We have added the descriptors “native” or “stained” to all figure legends containing fluorescent images.

      (2) Fig. 4b and c: Please add insets of high-magnification images showing AAC boutons along AnkG-labeled AISs.

      We have added these insets to Fig. 4b and c.

      (3) Fig. 7S1: It appears that d and e are reversed. Judging from the positions of starter cells, d is for PV-Cre? Please make sure. It is also better to draw the laminar border in d and e.

      The original genotype labels are correct for Fig. 7S1 d and e. We have added the laminar borders as suggested.

      (4) Fig. 9b: Just for consistency, please label with the name of the helper AAV.


      (5) Line 617: intragranular>>>infragranular?

      Corrected, thank you.

      (6) It may be unclear to some readers if the images in the figures are from confocal or STP. The authors may want to clarify that all images in the figures are generated by confocal microscopy in the method section.

      We have clarified this better in the methods section, “Microscopy and image analysis.”

      (7) The authors should clarify that STP was used to map input cells to the brain in the result section.

      We have added this description in the results section.

    1. eLife assessment

      This useful study provides a novel method to detect sleep cycles based on variations in the slope of the power spectrum from electroencephalography signals. The method, dispensing with time-consuming and potentially subjective manual identification of sleep cycles, is supported by solid evidence and analyses but some aspects could be better illustrated and the source of the discrepancies between classical and fractal cycles should be identified. This study will be of interest to researchers and clinicians working on sleep and brain dynamics.

    2. Reviewer #1 (Public Review):


      In this study, Rosenblum et al. introduce a novel and automatic way of calculating sleep cycles from human EEG. Previous results have shown that the slope of the non-oscillatory component of the power spectrum (called the aperiodic or fractal component) changes with the sleep stage. Building on this, the authors present an algorithm that extracts the continuous-time fluctuations in the fractal slope and propose that peaks in this variable can be used to identify sleep cycle limits. Cycles defined in this way are termed "fractal cycles". The main focus of the article is a comparison of fractal and classical, manually defined sleep cycles in numerous datasets.
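      The general idea of turning peaks in a slope time series into cycle limits can be sketched as follows. This is only an illustration under stated assumptions, not the authors' algorithm: the function names, the moving-average smoothing, and the `min_distance` parameter are all hypothetical choices made here for clarity.

```python
from typing import List, Tuple


def moving_average(values: List[float], window: int) -> List[float]:
    """Centered moving average; the window shrinks at the edges of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def find_peaks(values: List[float], min_distance: int) -> List[int]:
    """Indices of local maxima that are at least `min_distance` samples apart."""
    candidates = [
        i for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] >= values[i + 1]
    ]
    # Accept the highest peaks first and drop any candidate too close to one
    # that has already been accepted (hypothetical rule, assumed for this sketch).
    kept: List[int] = []
    for i in sorted(candidates, key=lambda k: values[k], reverse=True):
        if all(abs(i - j) >= min_distance for j in kept):
            kept.append(i)
    return sorted(kept)


def cycles_from_peaks(peaks: List[int]) -> List[Tuple[int, int]]:
    """Treat each interval between successive peaks as one cycle."""
    return list(zip(peaks[:-1], peaks[1:]))


# Toy slope series (arbitrary units, invented for illustration).
slopes = [0, 1, 2, 1, 0, 1, 3, 1, 0, 1, 2, 1, 0]
peaks = find_peaks(slopes, min_distance=2)
cycles = cycles_from_peaks(peaks)
```

      In practice the smoothing window and the minimum peak distance would strongly affect how many cycles are found, which is one reason the criteria for agreement with classical scoring matter.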


      The manuscript amply illustrates through examples the strong overlap between fractal and classical cycle identification. Accordingly, a high percentage of cycles (81%) can be matched one-to-one between methods, and sleep cycle duration is well correlated (around R = 0.5). Moreover, the methods track certain global changes in sleep structure in different populations: shorter cycles in children and longer cycles in patients medicated with REM-suppressing anti-depressants. Finally, a major strength of the results is that they show similar agreement between fractal and classical sleep cycle length in 5 different data sets, showing that the method is robust to changes in recording settings and methods.

      These results suggest that the fractal cycle methodology could provide a valuable new method to study sleep architecture and avoid the time-consuming steps of manual cycle identification. Moreover, it has the potential to be applied to animal studies which rarely deal with sleep cycle structure.


      The match between fractal and classical cycles is not one-to-one. For example, the fractal method identifies a correlation between age and cycle duration in adults that is not apparent with the classical method. This raises the question as to whether differences are due to one method being more reliable than another or whether they are also identifying different underlying biological differences. It is not clear for example whether the agreement between the two methods is better or worse than between two human scorers, which generally serve as a gold standard to validate novel methods. The authors provide some insight into differences between the methods that could account for differences in results. However, given that the fractal method is automatic it would be important to clearly identify criteria for recordings in which it will produce similar results to the classical method.

    3. Reviewer #2 (Public Review):


      This study focused on using strictly the slope of the power spectral density (PSD) to perform automated sleep scoring and evaluation of the durations of sleep cycles. The method appears to work well because the slope of the PSD is highest during slow-wave sleep, and lowest during waking and REM sleep. Therefore, when smoothed and analyzed across time, there are cyclical variations in the slope of the PSD, fit using an IRASA (Irregularly resampled auto-spectral analysis) algorithm proposed by Wen & Liu (2016).
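      For readers unfamiliar with the quantity being tracked, the slope of a power spectrum can be illustrated with an ordinary least-squares fit in log-log coordinates. This is a deliberately simplified sketch and not IRASA: a plain fit like this one does not separate oscillatory peaks from the fractal component, which is the purpose of the resampling step in IRASA. The function name and the synthetic spectrum are assumptions made for illustration.

```python
import math
from typing import Sequence


def spectral_slope(freqs: Sequence[float], power: Sequence[float]) -> float:
    """Least-squares slope of log10(power) against log10(frequency)."""
    xs = [math.log10(f) for f in freqs]
    ys = [math.log10(p) for p in power]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic 1/f^2 spectrum (invented data): the fitted slope should be close to -2.
freqs = [1.0, 2.0, 4.0, 8.0, 16.0]
power = [f ** -2 for f in freqs]
slope = spectral_slope(freqs, power)
```

      Applying such a fit to successive short windows of the recording would yield the slope time series whose cyclical variation the method relies on.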


      The main novelty of the study is that the non-fractal (oscillatory) components of the PSD that are more typically used during sleep scoring can be essentially ignored because the key information is already contained within the fractal (slope) component. The authors show that for the most part, results are fairly consistent between this and conventional sleep scoring, but in some cases show disagreements that may be scientifically interesting.


      One weakness of the study, from my perspective, was that the IRASA fits to the data (e.g. the PSD, such as in Figure 1B), were not illustrated. One cannot get a sense of whether or not the algorithm is based entirely on the fractal component or whether the oscillatory component of the PSD also influences the slope calculations. This should be better illustrated, but I assume the fits are quite good.

      The cycles detected using IRASA are called fractal cycles. I appreciate the use of a simple term for this, but I am also concerned whether it could be potentially misleading? The term suggests there is something fractal about the cycle, whereas it's really just that the fractal component of the PSD is used to detect the cycle. A more appropriate term could be "fractal-detected cycles" or "fractal-based cycle" perhaps?

      The study performs various comparisons of the durations of sleep cycles evaluated by the IRASA-based algorithm vs. conventional sleep scoring. One concern I had was that it appears cycles were simply identified by their order (first, second, etc.) but were not otherwise matched. This is problematic because, as evident from examples such as Figure 3B, sometimes one cycle conventionally scored is matched onto two fractal-based cycles. In the case of the Figure 3B example, it would be more appropriate to compare the duration of conventional cycle 5 vs. fractal cycle 7, rather than 5 vs. 5, as it appears is currently being performed.

      There are a few statements in the discussion that I felt were not well supported. L629: about the "little biological foundation" of categorical definitions, e.g. for REM sleep or wake? I cannot agree with this statement as written. The same applies to "the gradual nature of typical biological processes". Surely the action potential is not gradual and there are many other examples of all-or-none biological events.

      The authors appear to acknowledge a key point, which is that their methods do not discriminate between awake and REM periods. Thus their algorithm essentially detected cycles of slow-wave sleep alternating with wake/REM. Judging by the examples provided this appears to account for both the correspondence between fractal-based and conventional cycles, as well as their disagreements during the early part of the sleep cycle. While this point is acknowledged in the discussion section around L686, I am surprised that the authors then argue against this correspondence on L695. I did not find the "not-a-number" controls to be convincing. No examples were provided of such cycles, and it's hard to understand how positive z-values of the slopes are possible without the presence of some wake unless N1 stages are sufficient to provide a detected cycle (in which case, the argument still holds except that it is alternations between slow-wave sleep and N1 that could be what drives the detection).

      To me, it seems important to make clear whether the paper is proposing a different definition of cycles that could be easily detected without considering fractals or spectral slopes, but simply adjusting what one calls the onset/offset of a cycle, or whether there is something fundamentally important about measuring the PSD slope. The paper seems to be suggesting the latter but my sense from the results is that it's rather the former.

    4. Author response:

      We thank the reviewers and editors for their review and assessment of our manuscript and comprehensive feedback. The manuscript will be revised to address all the reviewers’ comments. Specifically, to address the comment of Reviewer 1 and the editor regarding the lack of quantitative comparison between the classical and fractal cycle approaches and identification of the source of the discrepancies between classical and fractal cycles, we plan to perform and report the following analyses and comparisons:

      (1) Intra-method reliability

      a) Classical cycles. An additional scorer will independently define onsets and offsets of all classical sleep cycles for all datasets and mark sleep cycles with skipped REM sleep. Likewise, we will perform automatic sleep cycle detection. We will add a new Supplementary table showing the averaged cycle durations obtained by the two scorers and automatic algorithm as well as the inter-scorer rate agreement and update the Supplemental Excel file with corresponding information for each cycle for each participant for each dataset.

      b) Fractal cycles. We will correlate the durations of fractal cycles calculated using the parameters defined in the Main text with those calculated using different parameters, namely, the longer and shorter smoothing window lengths, higher and lower minimum peak prominence. Likewise, we will correlate the durations of fractal cycles calculated using frontal vs other available electrodes.

      (2) Origin of method differences

      In the current version of our Manuscript, we describe a few possible sources of discrepancies between classical and fractal cycle durations and numbers. Following the suggestion of one of the reviewers, in the revised Manuscript, we will quantify the sources of discrepancies between the two methods in order to identify the “criteria for recordings in which fractal cycles will produce similar results to the classical method”. Specifically, we will calculate the correlation between the difference in classical vs fractal sleep cycle durations on one side, and either the amplitudes of fractal descend/ascend, relative durations of cycles with skipped REM sleep and wake after sleep onset, or peak flatness on the other side.    

      In addition, we will include a new figure, illustrating the goodness of fit of the data as assessed by the IRASA method. Likewise, we will update Supplementary File 1 (that shows classical and fractal sleep cycles for each participant) with marks that highlight the onsets and offsets of sleep cycles as well as the cycles with skipped REM sleep.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Reviewer #1 (Public Review):

      In this paper, the authors evaluate the utility of brain age derived metrics for predicting cognitive decline by performing a 'commonality' analysis in a downstream regression that enables the different contribution of different predictors to be assessed. The main conclusion is that brain age derived metrics do not explain much additional variation in cognition over and above what is already explained by age. The authors propose to use a regression model trained to predict cognition ('brain cognition') as an alternative suited to applications of cognitive decline. While this is less accurate overall than brain age, it explains more unique variance in the downstream regression.  

      Importantly, in this revision, we clarified that we did not intend to use Brain Cognition as an alternative approach. This is because, by design, the variation in fluid cognition explained by Brain Cognition should be higher or equal to that explained by Brain Age. Here we made this point more explicit and further stated that the relationship between Brain Cognition and fluid cognition indicates the upper limit of Brain Age’s capability in capturing fluid cognition. By examining what was captured by Brain Cognition, over and above Brain Age and chronological age via the unique effects of Brain Cognition, we were able to quantify the amount of co-variation between brain MRI and fluid cognition that was missed by Brain Age. 

      REVISED VERSION: while the authors have partially addressed my concerns, I do not feel they have addressed them all. I do not feel they have addressed the weight instability and concerns about the stacked regression models satisfactorily.

      Please see our responses to Reviewer #1 Public Review #3 below

      I also must say that I agree with Reviewer 3 about the limitations of the brain age and brain cognition methods conceptually. In particular that the regression model used to predict fluid cognition will by construction explain more variance in cognition than a brain age model that is trained to predict age. This suffers from the same problem the authors raise with brain age and would indeed disappear if the authors had a separate measure of cognition against which to validate and were then to regress this out as they do for age correction. I am aware that these conceptual problems are more widespread than this paper alone (in fact throughout the brain age literature), so I do not believe the authors should be penalised for that. However, I do think they can make these concerns more explicit and further tone down the comments they make about the utility of brain cognition. I have indicated the main considerations about these points in the recommendations section below. 

      Thank you so much for raising this point. We now have the following statement in the introduction and discussion to address this concern (see below). 

      Briefly, we made it explicit that, by design, the variation in fluid cognition explained by Brain Cognition should be higher or equal to that explained by Brain Age. That is, the relationship between Brain Cognition and fluid cognition indicates the upper limit of Brain Age’s capability in capturing fluid cognition. More importantly, by examining what was captured by Brain Cognition, over and above Brain Age and chronological age via the unique effects of Brain Cognition, we were able to quantify the amount of co-variation between brain MRI and fluid cognition that was missed by Brain Age. And this is the third goal of this present study. 

      From Introduction:

      “Third and finally, certain variation in fluid cognition is related to brain MRI, but to what extent does Brain Age not capture this variation? To estimate the variation in fluid cognition that is related to the brain MRI, we could build prediction models that directly predict fluid cognition (i.e., as opposed to chronological age) from brain MRI data. Previous studies found reasonable predictive performances of these cognition-prediction models, built from certain MRI modalities (Dubois et al., 2018; Pat, Wang, Anney, et al., 2022; Rasero et al., 2021; Sripada et al., 2020; Tetereva et al., 2022; for review, see Vieira et al., 2022). Analogous to Brain Age, we called the predicted values from these cognition-prediction models, Brain Cognition. The strength of an out-of-sample relationship between Brain Cognition and fluid cognition reflects variation in fluid cognition that is related to the brain MRI and, therefore, indicates the upper limit of Brain Age’s capability in capturing fluid cognition. That is, by design, the variation in fluid cognition explained by Brain Cognition should be higher or equal to that explained by Brain Age. Consequently, if we included Brain Cognition, Brain Age and chronological age in the same model to explain fluid cognition, we would be able to examine the unique effects of Brain Cognition that explain fluid cognition beyond Brain Age and chronological age. These unique effects of Brain Cognition, in turn, would indicate the amount of co-variation between brain MRI and fluid cognition that is missed by Brain Age.”

      From Discussion:

      “Third, by introducing Brain Cognition,  we showed the extent to which Brain Age indices were not able to capture the variation in fluid cognition that is related to brain MRI. More specifically, using Brain Cognition allowed us to gauge the variation in fluid cognition that is related to the brain MRI, and thereby, to estimate the upper limit of what Brain Age can do. Moreover, by examining what was captured by Brain Cognition, over and above Brain Age and chronological age via the unique effects of Brain Cognition, we were able to quantify the amount of co-variation between brain MRI and fluid cognition that was missed by Brain Age.

      From our results, Brain Cognition, especially from certain cognition-prediction models such as the stacked models, has relatively good predictive performance, consistent with previous studies (Dubois et al., 2018; Pat, Wang, Anney, et al., 2022; Rasero et al., 2021; Sripada et al., 2020; Tetereva et al., 2022; for review, see Vieira et al., 2022). We then examined Brain Cognition using commonality analyses (Nimon et al., 2008) in multiple regression models having a Brain Age index, chronological age and Brain Cognition as regressors to explain fluid cognition. Similar to Brain Age indices, Brain Cognition exhibited large common effects with chronological age. But more importantly, unlike Brain Age indices, Brain Cognition showed large unique effects, up to around 11%. As explained above, the unique effects of Brain Cognition indicated the amount of co-variation between brain MRI and fluid cognition that was missed by a Brain Age index and chronological age. This missing amount was relatively high, considering that Brain Age and chronological age together explained around 32% of the total variation in fluid cognition. Accordingly, if a Brain Age index was used as a biomarker along with chronological age, we would have missed an opportunity to improve the performance of the model by around one-third of the variation explained.” 
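      The unique and common effects discussed above can be illustrated with a minimal sketch. It is not the authors' analysis code: the simulated variables (`age`, `brain_age`, `brain_cognition`) and the helper names are hypothetical, and only the unique effects and the total of all common effects are computed, rather than the full commonality decomposition of Nimon et al. (2008). The unique effect of a regressor is taken as the drop in R2 when that regressor is removed from the full model.

```python
import numpy as np


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


def unique_effects(predictors, y):
    """Return full-model R^2, per-regressor unique effects and the summed common effects."""
    names = list(predictors)
    full = r_squared(np.column_stack([predictors[n] for n in names]), y)
    unique = {}
    for name in names:
        others = np.column_stack([predictors[n] for n in names if n != name])
        # Unique effect: variance explained only when this regressor is present.
        unique[name] = full - r_squared(others, y)
    # Whatever is left is shared among two or more regressors.
    common = full - sum(unique.values())
    return full, unique, common


# Simulated, purely illustrative data (not the study's data).
rng = np.random.default_rng(0)
n = 500
age = rng.normal(size=n)
extra = rng.normal(size=n)  # brain-related signal unrelated to age
predictors = {
    "age": age,
    "brain_age": age + rng.normal(scale=0.5, size=n),
    "brain_cognition": -0.5 * age + extra + rng.normal(scale=0.5, size=n),
}
y = -0.6 * age + 0.5 * extra + rng.normal(scale=0.7, size=n)

full, unique, common = unique_effects(predictors, y)
print(f"full R2 = {full:.3f}, common = {common:.3f}")
for name, value in unique.items():
    print(f"unique effect of {name}: {value:.3f}")
```

      In this toy setup the hypothetical Brain Age variable carries no information beyond age, so its unique effect should be close to zero, whereas the hypothetical Brain Cognition variable retains a sizeable unique effect.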

      This is a reasonably good paper and the use of a commonality analysis is a nice contribution to understanding variance partitioning across different covariates. I have some comments that I believe the authors ought to address, which mostly relate to clarity and interpretation 

      Reviewer #1 Public Review #1

      First, from a conceptual point of view, the authors focus exclusively on cognition as a downstream outcome. I would suggest the authors nuance their discussion to provide broader considerations of the utility of their method and on the limits of interpretation of brain age models more generally. 

      Thank you for your comments on this issue. 

      We now discuss the broader considerations in detail:

      (1) the consistency between our findings on fluid cognition and other recent works on brain disorders, 

      (2) the difference between studies investigating the utility of Brain Age in explaining cognitive functioning, including ours and others (e.g., Butler et al., 2021; Cole, 2020, 2020; Jirsaraie, Kaufmann, et al., 2023) and those explaining neurological/psychological disorders (e.g., Bashyam et al., 2020; Rokicki et al., 2021)


      (3) suggested solutions we and others made to optimise the utility of Brain Age for both cognitive functioning and brain disorders.

      From Discussion:

      “This discrepancy between the predictive performance of age-prediction models and the utility of Brain Age indices as a biomarker is consistent with recent findings (for review, see Jirsaraie, Gorelik, et al., 2023), both in the context of cognitive functioning (Jirsaraie, Kaufmann, et al., 2023) and neurological/psychological disorders (Bashyam et al., 2020; Rokicki et al., 2021). For instance, combining different MRI modalities into the prediction models, similar to our stacked models, often leads to the highest performance of age prediction models, but does not likely explain the highest variance across different phenotypes, including cognitive functioning and beyond (Jirsaraie, Gorelik, et al., 2023).”

      “There is a notable difference between studies investigating the utility of Brain Age in explaining cognitive functioning, including ours and others (e.g., Butler et al., 2021; Cole, 2020, 2020; Jirsaraie, Kaufmann, et al., 2023) and those explaining neurological/psychological disorders (e.g., Bashyam et al., 2020; Rokicki et al., 2021). We consider the former as a normative type of study and the latter as a case-control type of study (Insel et al., 2010; Marquand et al., 2016). Those case-control Brain Age studies focusing on neurological/psychological disorders often build age-prediction models from MRI data of largely healthy participants (e.g., controls in a case-control design or large samples in a population-based design), apply the built age-prediction models to participants without vs. with neurological/psychological disorders and compare Brain Age indices between the two groups. On the one hand, this means that case-control studies treat Brain Age as a method to detect anomalies in the neurological/psychological group (Hahn et al., 2021). On the other hand, this also means that case-control studies have to ignore underfitted models when applying prediction models built from largely healthy participants to participants with neurological/psychological disorders (i.e., Brain Age may predict chronological age well for the controls, but not for those with a disorder). On the contrary, our study and other normative studies focusing on cognitive functioning often build age prediction models from MRI data of largely healthy participants and apply the built age prediction models to participants who are also largely healthy. Accordingly, the age prediction models for explaining cognitive functioning in normative studies, while not allowing us to detect group-level anomalies, do not suffer from being under-fitted. This unfortunately might limit the generalisability of our study into just the normative type of study. Future work is still needed to test the utility of brain age in the case-control case.”

      “Next, researchers should not select age-prediction models based solely on age-prediction performance. Instead, researchers could select age-prediction models that explained phenotypes of interest the best. Here we selected age-prediction models based on a set of features (i.e., modalities) of brain MRI. This strategy was found effective not only for fluid cognition as we demonstrated here, but also for neurological and psychological disorders as shown elsewhere (Jirsaraie, Gorelik, et al., 2023; Rokicki et al., 2021). Rokicki and colleagues (2021), for instance, found that, while integrating across MRI modalities led to age prediction models with the highest age-prediction performance, using only T1 structural MRI gave age-prediction models that were better at classifying Alzheimer’s disease. Similarly, using only cerebral blood flow gave age-prediction models that were better at classifying mild/subjective cognitive impairment, schizophrenia and bipolar disorder. 

      As opposed to selecting age-prediction models based on a set of features, researchers could also select age-prediction models based on modelling methods. For instance, Jirsaraie and colleagues (2023) compared gradient tree boosting (GTB) and deep-learning brain network (DBN) algorithms in building age-prediction models. They found GTB to have higher age prediction performance but DBN to have better utility in explaining cognitive functioning. In this case, an algorithm with better utility (e.g., DBN) should be used for explaining a phenotype of interest. Similarly, Bashyam and colleagues (2020) built different DBN-based age-prediction models, varying in age-prediction performance. The DBN models with a higher number of epochs corresponded to higher age-prediction performance. However, DBN-based age-prediction models with a moderate (as opposed to higher or lower) number of epochs were better at classifying Alzheimer’s disease, mild cognitive impairment and schizophrenia. In this case, a model from the same algorithm with better utility (e.g., those DBN with a moderate epoch number) should be used for explaining a phenotype of interest.

      Accordingly, this calls for a change in research practice, as recently pointed out by Jirsaraie and colleagues (2023, p7), “Despite mounting evidence, there is a persisting assumption across several studies that the most accurate brain age models will have the most potential for detecting differences in a given phenotype of interest”. Future neuroimaging research should aim to build age-prediction models that are not necessarily good at predicting age, but at capturing phenotypes of interest.”

      Reviewer #1 Public Review #2

      Second, from a methods perspective, there is not a sufficient explanation of the methodological procedures in the current manuscript to fully understand how the stacked regression models were constructed. I would request that the authors provide more information to enable the reader to better understand the stacked regression models used to ensure that these models are not overfit. 

      Thank you for allowing us an opportunity to clarify our stacked model. We made additional clarification to make this clearer (see below). We wanted to confirm that we did not use test sets to build a stacked model in both lower and higher levels of the Elastic Net models. Test sets were there just for testing the performance of the models.  

      From Methods:

      “We used nested cross-validation (CV) to build these prediction models (see Figure 7). We first split the data into five outer folds, leaving each outer fold with around 100 participants. This number of participants in each fold is to ensure the stability of the test performance across folds. In each outer-fold CV loop, one of the outer folds was treated as an outer-fold test set, and the rest was treated as an outer-fold training set. Ultimately, looping through the nested CV resulted in a) prediction models from each of the 18 sets of features as well as b) prediction models that drew information across different combinations of the 18 separate sets, known as “stacked models.” We specified eight stacked models: “All” (i.e., including all 18 sets of features),  “All excluding Task FC”, “All excluding Task Contrast”, “Non-Task” (i.e., including only Rest FC and sMRI), “Resting and Task FC”, “Task Contrast and FC”, “Task Contrast” and “Task FC”. Accordingly, there were 26 prediction models in total for both Brain Age and Brain Cognition.

      To create these 26 prediction models, we applied three steps for each outer-fold loop. The first step aimed at tuning prediction models for each of 18 sets of features. This step only involved the outer-fold training set and did not involve the outer-fold test set. Here, we divided the outer-fold training set into five inner folds and applied inner-fold CV to tune hyperparameters with grid search. Specifically, in each inner-fold CV, one of the inner folds was treated as an inner-fold validation set, and the rest was treated as an inner-fold training set. Within each inner-fold CV loop, we used the inner-fold training set to estimate parameters of the prediction model with a particular set of hyperparameters and applied the estimated model to the inner-fold validation set. After looping through the inner-fold CV, we, then, chose the prediction models that led to the highest performance, reflected by coefficient of determination (R2), on average across the inner-fold validation sets. This led to 18 tuned models, one for each of the 18 sets of features, for each outer fold.

      The second step aimed at tuning stacked models. Same as the first step, the second step only involved the outer-fold training set and did not involve the outer-fold test set. Here, using the same outer-fold training set as the first step, we applied tuned models, created from the first step, one from each of the 18 sets of features, resulting in 18 predicted values for each participant. We, then, re-divided this outer-fold training set into new five inner folds. In each inner fold, we treated different combinations of the 18 predicted values from separate sets of features as features to predict the targets in separate “stacked” models. Same as the first step, in each inner-fold CV loop, we treated one out of five inner folds as an inner-fold validation set, and the rest as an inner-fold training set. Also as in the first step, we used the inner-fold training set to estimate parameters of the prediction model with a particular set of hyperparameters from our grid. We tuned the hyperparameters of stacked models using grid search by selecting the models with the highest R2 on average across the inner-fold validation sets. This led to eight tuned stacked models.

      The third step aimed at testing the predictive performance of the 18 tuned prediction models from each of the set of features, built from the first step, and eight tuned stacked models, built from the second step. Unlike the first two steps, here we applied the already tuned models to the outer-fold test set. We started by applying the 18 tuned prediction models from each of the sets of features to each observation in the outer-fold test set, resulting in 18 predicted values. We then applied the tuned stacked models to these predicted values from separate sets of features, resulting in eight predicted values. 

      To demonstrate the predictive performance, we assessed the similarity between the observed values and the predicted values of each model across outer-fold test sets, using Pearson’s r, coefficient of determination (R2) and mean absolute error (MAE). Note that for R2, we used the sum of squares definition (i.e., R2 = 1 – (sum of squares residuals/total sum of squares)) per a previous recommendation (Poldrack et al., 2020). We considered the predicted values from the outer-fold test sets of models predicting age or fluid cognition, as Brain Age and Brain Cognition, respectively.”
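      The three steps described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the data are simulated, three small feature sets stand in for the 18 sets, only a single "All" stacked model is built, and the hyperparameter grid is invented for the example. The scikit-learn `r2_score` function uses the sum-of-squares definition of R2 mentioned in the text.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import GridSearchCV, KFold

rng = np.random.default_rng(0)
n = 200
# Hypothetical data: three feature sets stand in for the 18 sets in the text.
feature_sets = {f"set_{i}": rng.normal(size=(n, 10)) for i in range(3)}
y = sum(X[:, 0] for X in feature_sets.values()) + rng.normal(scale=0.5, size=n)

# Assumed grid, chosen only for illustration.
grid = {"alpha": [0.01, 0.1, 1.0], "l1_ratio": [0.1, 0.5, 0.9]}


def tune(X, target, seed):
    """Grid search over five inner folds; returns the model refit on all of X."""
    inner = KFold(n_splits=5, shuffle=True, random_state=seed)
    search = GridSearchCV(ElasticNet(max_iter=10_000), grid, cv=inner, scoring="r2")
    return search.fit(X, target).best_estimator_


def stack(models, rows):
    """Predicted values from each feature-set model, one column per set."""
    return np.column_stack(
        [models[name].predict(feature_sets[name][rows]) for name in feature_sets]
    )


outer = KFold(n_splits=5, shuffle=True, random_state=0)
stacked_pred = np.empty(n)
for train, test in outer.split(np.arange(n).reshape(-1, 1)):
    # Step 1: tune one model per feature set, using the outer-fold training set only.
    base = {name: tune(X[train], y[train], seed=1) for name, X in feature_sets.items()}
    # Step 2: tune the stacked model on the same training set, re-divided
    # into new inner folds (different seed), with base predictions as features.
    stacker = tune(stack(base, train), y[train], seed=2)
    # Step 3: apply the already tuned models to the held-out outer-fold test set.
    stacked_pred[test] = stacker.predict(stack(base, test))

print("Pearson r:", np.corrcoef(y, stacked_pred)[0, 1])
print("R2:", r2_score(y, stacked_pred))
print("MAE:", mean_absolute_error(y, stacked_pred))
```

      The outer-fold test indices never enter `tune`, which is the property the authors emphasise: the test sets are used only to evaluate the tuned models.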

      Author response image 1.

      Diagram of the nested cross-validation used for creating predictions for models of each set of features as well as predictions for stacked models. 

      Note some previous research, including ours (Tetereva et al., 2022), splits the observations in the outer-fold training set into layer 1 and layer 2 and applies the first and second steps to layers 1 and 2, respectively. Here we decided against this approach and used the same outer-fold training set for both first and second steps in order to avoid potential bias toward the stacked models. This is because, when the data are split into two layers, predictive models built for each separate set of features only use the data from layer 1, while the stacked models use the data from both layers 1 and 2. In practice with large enough data, these two approaches might not differ much, as we demonstrated previously (Tetereva et al., 2022).

      Reviewer #1 Public Review #3

      Please also provide an indication of the different regression strengths that were estimated across the different models and cross-validation splits. Also, how stable were the weights across splits? 

      The focus of this article is on the predictions. Still, it is informative for readers to understand how stable the feature importance (i.e., Elastic Net coefficients) is. To demonstrate the stability of feature importance, we now examined the rank stability of feature importance using Spearman’s ρ (see Figure 4). Specifically, we correlated the feature importance between two prediction models of the same features, used in two different outer-fold test sets. Given that there were five outer-fold test sets, we computed 10 Spearman’s ρ for each prediction model of the same features. We found Spearman’s ρ to vary dramatically in both age-prediction (range = .31-.94) and fluid cognition-prediction (range = .16-.84) models. This means that some prediction models were much more stable in their feature importance than others. This is probably due to various factors such as a) the collinearity of features in the model, b) the number of features (e.g., 71,631 features in functional connectivity, which were further reduced to 75 PCAs, as compared to 19 features in subcortical volume based on the ASEG atlas), c) the penalisation of coefficients either with ‘Ridge’ or ‘Lasso’ methods, which resulted in reduction as a group of features or selection of a feature among correlated features, respectively, and d) the predictive performance of the models. Understanding the stability of feature importance is beyond the scope of the current article. As mentioned by Reviewer 1, “The predictions can be stable when the coefficients are not,” and we chose to focus on the prediction in the current article.
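      The rank-stability computation described above can be sketched as below. The coefficient matrix is simulated and the variable names are hypothetical; the sketch only shows how five sets of coefficients (one per outer fold) yield 10 pairwise Spearman's ρ values.

```python
from itertools import combinations

import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_folds, n_features = 5, 75

# Hypothetical Elastic Net coefficients: a pattern shared across outer folds
# plus fold-specific noise. Real coefficients would come from the fitted models.
shared = rng.normal(size=n_features)
coefs = shared + rng.normal(scale=0.5, size=(n_folds, n_features))

rhos = []
for i, j in combinations(range(n_folds), 2):
    # Rank correlation of feature importance between two outer folds.
    rho, _ = spearmanr(coefs[i], coefs[j])
    rhos.append(rho)

print(f"{len(rhos)} pairs, range = {min(rhos):.2f}-{max(rhos):.2f}, mean = {np.mean(rhos):.2f}")
```

      With five folds there are 5 × 4 / 2 = 10 unordered pairs, which matches the 10 values reported per prediction model.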

      Author response image 2.

      Stability of feature importance (i.e., Elastic Net Coefficients) of prediction models. Each dot represents rank stability (reflected by Spearman’s ρ) in the feature importance between two prediction models of the same features, used in two different outer-fold test sets. Given that there were five outer-fold test sets, there were 10 Spearman’s ρs for each prediction model.  The numbers to the right of the plots indicate the mean of Spearman’s ρ for each prediction model.  

      Reviewer #1 Public Review #4

      Please provide more details about the task designs, MRI processing procedures that were employed on this sample in addition to the regression methods and bias correction methods used. For example, there are several different parameterisations of the elastic net, please provide equations to describe the method used here so that readers can easily determine how the regularisation parameters should be interpreted.  

      Thank you for the opportunity to provide more methodological details.

      First, for the task design, we included the following statements:

      From Methods:

      “HCP-A collected fMRI data from three tasks: Face Name (Sperling et al., 2001), Conditioned Approach Response Inhibition Task (CARIT) (Somerville et al., 2018) and VISual MOTOR (VISMOTOR) (Ances et al., 2009). 

      First, the Face Name task (Sperling et al., 2001) taps into episodic memory. The task had three blocks. In the encoding block [Encoding], participants were asked to memorise the names of faces shown. These faces were then shown again in the recall block [Recall] when the participants were asked if they could remember the names of the previously shown faces. There was also the distractor block [Distractor] occurring between the encoding and recall blocks. Here participants were distracted by a Go/NoGo task. We computed six contrasts for this Face Name task: [Encode], [Recall], [Distractor], [Encode vs. Distractor], [Recall vs. Distractor] and [Encode vs. Recall].

      Second, the CARIT task (Somerville et al., 2018) was adapted from the classic Go/NoGo task and taps into inhibitory control. Participants were asked to press a button to all [Go] but not to two [NoGo] shapes. We computed three contrasts for the CARIT task: [NoGo], [Go] and [NoGo vs. Go]. 

      Third, the VISMOTOR task (Ances et al., 2009) was designed to test simple activation of the motor and visual cortices. Participants saw a checkerboard with a red square either on the left or right. They needed to press a corresponding key to indicate the location of the red square. We computed just one contrast for the VISMOTOR task: [Vismotor], which indicates the presence of the checkerboard vs. baseline.” 

      Second, for MRI processing procedures, we included the following statements.

      From Methods:

      “HCP-A provides details of parameters for brain MRI elsewhere (Bookheimer et al., 2019; Harms et al., 2018). Here we used MRI data that were pre-processed by the HCP-A with recommended methods, including the MSMALL alignment (Glasser et al., 2016; Robinson et al., 2018) and ICA-FIX (Glasser et al., 2016) for functional MRI. We used multiple brain MRI modalities, covering task functional MRI (task fMRI), resting-state functional MRI (rsfMRI) and structural MRI (sMRI), and organised them into 19 sets of features.”

      “Sets of Features 1-10: Task fMRI contrast (Task Contrast)

      Task contrasts reflect fMRI activation relevant to events in each task. Bookheimer and colleagues (2019) provided detailed information about the fMRI in HCP-A. Here we focused on the pre-processed task fMRI Connectivity Informatics Technology Initiative (CIFTI) files with a suffix, “_PA_Atlas_MSMAll_hp0_clean.dtseries.nii.” These CIFTI files encompassed both the cortical mesh surface and subcortical volume (Glasser et al., 2013). Collected using the posterior-to-anterior (PA) phase, these files were aligned using MSMAll (Glasser et al., 2016; Robinson et al., 2018), linearly detrended (see https://groups.google.com/a/humanconnectome.org/g/hcp-users/c/ZLJc092h980/m/GiihzQAUAwAJ) and cleaned from potential artifacts using ICA-FIX (Glasser et al., 2016).

      To extract Task Contrasts, we regressed the fMRI time series on the convolved task events using a double-gamma canonical hemodynamic response function via FMRIB Software Library (FSL)’s FMRI Expert Analysis Tool (FEAT) (Woolrich et al., 2001). We kept FSL’s default high pass cutoff at 200s (i.e., .005 Hz). We then parcellated the contrast ‘cope’ files, using the Glasser atlas (Gordon et al., 2016) for cortical surface regions and FreeSurfer’s automatic segmentation (aseg) (Fischl et al., 2002) for subcortical regions. This resulted in 379 regions, whose number was, in turn, the number of features for each Task Contrast set of features.”

      “Sets of Features 11-13: Task fMRI functional connectivity (Task FC)

      Task FC reflects functional connectivity (FC) among the brain regions during each task, which is considered an important source of individual differences (Elliott et al., 2019; Fair et al., 2007; Gratton et al., 2018). We used the same CIFTI file, “_PA_Atlas_MSMAll_hp0_clean.dtseries.nii”, as for the task contrasts. Unlike Task Contrasts, here we treated the double-gamma, convolved task events as regressors of no interest and focused on the residuals of the regression from each task (Fair et al., 2007). We computed these regressors in FSL and regressed them out in nilearn (Abraham et al., 2014). Following previous work on task FC (Elliott et al., 2019), we applied a highpass filter at .008 Hz. For parcellation, we used the same atlases as Task Contrast (Fischl et al., 2002; Glasser et al., 2016). We computed Pearson’s correlations for each pair of the 379 regions, resulting in a table of 71,631 non-overlapping FC indices for each task. We then applied r-to-z transformation and principal component analysis (PCA) with 75 components (Rasero et al., 2021; Sripada et al., 2019, 2020). Note that, to avoid data leakage, we conducted the PCA on each training set and applied its definition to the corresponding test set. Accordingly, there were three sets of 75 features for Task FC, one for each task.
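      The Task FC steps described above can be sketched roughly as follows. This is a minimal illustration rather than the authors’ pipeline: the function names (`n_region_pairs`, `fc_features`, `reduce_fc`) are hypothetical, the clipping of correlations before the r-to-z step is an added safeguard, and the input is assumed to be already parcellated, filtered residual time series with one column per region.

```python
import numpy as np
from sklearn.decomposition import PCA


def n_region_pairs(n_regions):
    """Number of unique region pairs, excluding the diagonal."""
    return n_regions * (n_regions - 1) // 2


def fc_features(timeseries):
    """Vectorise one participant's functional connectivity.

    timeseries: array of shape (n_frames, n_regions), assumed to be the
    residual time series after parcellation (hypothetical input).
    Returns the Fisher z-transformed upper triangle of the
    region-by-region Pearson correlation matrix.
    """
    r = np.corrcoef(timeseries, rowvar=False)      # (n_regions, n_regions)
    iu = np.triu_indices_from(r, k=1)              # each pair once, no diagonal
    r_pairs = np.clip(r[iu], -0.999999, 0.999999)  # assumption: keep arctanh finite
    return np.arctanh(r_pairs)                     # r-to-z transformation


def reduce_fc(train_fc, test_fc, n_components=75):
    """Fit PCA on the training set only, then apply it to the test set.

    Fitting on the training fold alone is what prevents information
    from the test fold leaking into the component definitions.
    """
    pca = PCA(n_components=n_components)
    train_scores = pca.fit_transform(train_fc)
    test_scores = pca.transform(test_fc)
    return train_scores, test_scores, pca
```

      With 379 regions, `n_region_pairs(379)` gives the 71,631 FC indices reported in the text.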

      Set of Features 14: Resting-state functional MRI functional connectivity (Rest FC)

      Similar to Task FC, Rest FC reflects functional connectivity (FC) among the brain regions, except that Rest FC occurred during the resting (as opposed to task-performing) period. HCP-A collected Rest FC from four 6.42-min (488 frames) runs across two days, yielding about 26 min of data (Harms et al., 2018). On each day, the study scanned two runs of Rest FC, starting with anterior-to-posterior (AP) and then with posterior-to-anterior (PA) phase encoding polarity. We used the “rfMRI_REST_Atlas_MSMAll_hp0_clean.dscalar.nii” file that was preprocessed and concatenated across the four runs. We applied the same computations (i.e., highpass filter, parcellation, Pearson’s correlations, r-to-z transformation and PCA) as for Task FC.

      Sets of Features 15-18: Structural MRI (sMRI)

      sMRI reflects individual differences in brain anatomy. The HCP-A used an established preprocessing pipeline for sMRI (Glasser et al., 2013). We focused on four sets of features: cortical thickness, cortical surface area, subcortical volume and total brain volume. For cortical thickness and cortical surface area, we used Destrieux’s atlas (Destrieux et al., 2010; Fischl, 2012) from FreeSurfer’s “aparc.stats” file, resulting in 148 regions for each set of features. For subcortical volume, we used the aseg atlas (Fischl et al., 2002) from FreeSurfer’s “aseg.stats” file, resulting in 19 regions. For total brain volume, we had five FreeSurfer-based features: “FS_IntraCranial_Vol” or estimated intra-cranial volume, “FS_TotCort_GM_Vol” or total cortical grey matter volume, “FS_Tot_WM_Vol” or total cortical white matter volume, “FS_SubCort_GM_Vol” or total subcortical grey matter volume and “FS_BrainSegVol_eTIV_Ratio” or ratio of brain segmentation volume to estimated total intracranial volume.”

      Third, for regression methods and bias correction methods used, we included the following statements:

      From Methods:

      “For the machine learning algorithm, we used Elastic Net (Zou & Hastie, 2005). Elastic Net is a general form of penalised regressions (including Lasso and Ridge regression), allowing us to simultaneously draw information across different brain indices to predict one target variable. Penalised regressions are commonly used for building age-prediction models (Jirsaraie, Gorelik, et al., 2023). Previously we showed that the performance of Elastic Net in predicting cognitive abilities is on par with, if not better than, many non-linear and more complicated algorithms (Pat, Wang, Bartonicek, et al., 2022; Tetereva et al., 2022). Moreover, Elastic Net coefficients are readily explainable, allowing us to explain how our age-prediction and cognition-prediction models made the prediction from each brain feature (Molnar, 2019; Pat, Wang, Bartonicek, et al., 2022) (see below).

      Elastic Net simultaneously minimises the weighted sum of the features’ coefficients. The degree of penalty to the sum of the features’ coefficients is determined by a shrinkage hyperparameter ‘α’: the greater the α, the more the coefficients shrink, and the more regularised the model becomes. Elastic Net also includes another hyperparameter, ‘ℓ1 ratio’, which determines the degree to which the sum of either the squared (known as ‘Ridge’; ℓ1 ratio=0) or absolute (known as ‘Lasso’; ℓ1 ratio=1) coefficients is penalised (Zou & Hastie, 2005). The objective function of Elastic Net as implemented by sklearn (Pedregosa et al., 2011) is defined as:

      $$\underset{\beta}{\operatorname{argmin}} \left( \frac{\lVert y - X\beta \rVert_2^2}{2\, n_{\text{samples}}} + \alpha \cdot \ell_1\,\text{ratio} \cdot \lVert \beta \rVert_1 + 0.5 \cdot \alpha \cdot (1 - \ell_1\,\text{ratio}) \cdot \lVert \beta \rVert_2^2 \right)$$

      where X is the features, y is the target, and β is the coefficient. In our grid search, we tuned two Elastic Net hyperparameters: α using 70 numbers in log space, ranging from .1 to 100, and ℓ1 ratio using 25 numbers in linear space, ranging from 0 to 1.
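      As a rough illustration of the objective and the grid described here, the sketch below writes out sklearn’s Elastic Net objective (ignoring the intercept) and sets up a nested cross-validation with sklearn’s `ElasticNet`, `GridSearchCV` and `KFold`. The function names, the use of five folds in the inner loop, the shuffling seed and the mean-absolute-error scoring are assumptions made for the example; the text specifies only the two hyperparameter grids and the five-fold nested design.

```python
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV, KFold


def elastic_net_objective(X, y, beta, alpha, l1_ratio):
    """sklearn's Elastic Net objective for a given coefficient vector.

    The intercept is left out for simplicity (assumption).
    """
    n_samples = X.shape[0]
    residual = y - X @ beta
    fit = residual @ residual / (2 * n_samples)
    l1_penalty = alpha * l1_ratio * np.abs(beta).sum()
    l2_penalty = 0.5 * alpha * (1 - l1_ratio) * (beta @ beta)
    return fit + l1_penalty + l2_penalty


# Grids as described in the text: 70 values in log space from .1 to 100,
# and 25 values in linear space from 0 to 1.
ALPHAS = np.logspace(-1, 2, 70)
L1_RATIOS = np.linspace(0, 1, 25)


def nested_cv_predict(X, y, alphas=ALPHAS, l1_ratios=L1_RATIOS,
                      n_folds=5, seed=0):
    """Out-of-fold predictions from a nested cross-validation.

    Hyperparameters are tuned on inner folds of each outer training set,
    so the outer test fold never informs the tuning.
    """
    outer = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
    predictions = np.empty(len(y), dtype=float)
    best_params = []
    for train_idx, test_idx in outer.split(X):
        inner = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
        search = GridSearchCV(
            ElasticNet(max_iter=10000),
            {"alpha": alphas, "l1_ratio": l1_ratios},
            cv=inner,
            scoring="neg_mean_absolute_error",  # assumption, not from the text
        )
        search.fit(X[train_idx], y[train_idx])
        predictions[test_idx] = search.predict(X[test_idx])
        best_params.append(search.best_params_)
    return predictions, best_params
```

      The full 70 × 25 grid is costly to run; passing shorter `alphas` and `l1_ratios` arrays is a quick way to try the function on toy data.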

      To understand how Elastic Net made a prediction based on different brain features, we examined the coefficients of the tuned model. Elastic Net coefficients can be considered as feature importance, such that more positive Elastic Net coefficients lead to more positive predicted values and, similarly, more negative Elastic Net coefficients lead to more negative predicted values (Molnar, 2019; Pat, Wang, Bartonicek, et al., 2022). While the magnitude of Elastic Net coefficients is regularised (thus making it difficult for us to interpret the magnitude itself directly), we could still indicate that a brain feature with a higher magnitude weighs relatively more strongly in making a prediction. Another benefit of Elastic Net as a penalised regression is that the coefficients are less susceptible to collinearity among features as they have already been regularised (Dormann et al., 2013; Pat, Wang, Bartonicek, et al., 2022).

      Given that we used five-fold nested cross validation, different outer folds may have different degrees of ‘α’ and ‘ℓ1 ratio’, so the final coefficients may differ across folds. For instance, for certain sets of features, penalisation may not play a big part (i.e., higher or lower ‘α’ leads to similar predictive performance), resulting in different ‘α’ for different folds. To remedy this in the visualisation of Elastic Net feature importance, we refitted the Elastic Net model to the full dataset without splitting it into five folds and visualised the coefficients on brain images using the Brainspace (Vos De Wael et al., 2020) and Nilearn (Abraham et al., 2014) packages. Note that, unlike other sets of features, Task FC and Rest FC were modelled after data reduction via PCA. Thus, for Task FC and Rest FC, we first multiplied the absolute PCA scores (extracted from the ‘components_’ attribute of ‘sklearn.decomposition.PCA’) with Elastic Net coefficients and then summed the multiplied values across the 75 components, leaving 71,631 ROI-pair indices.
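
      The back-projection of Elastic Net coefficients from PCA components to ROI pairs might look like the following sketch. The function names are hypothetical, and the helper that reshapes the ROI-pair vector into a symmetric matrix assumes the pairs were taken from the upper triangle in row-major order, which the text does not state.

```python
import numpy as np


def roi_pair_importance(pca_components, enet_coef):
    """Project Elastic Net coefficients back onto ROI pairs.

    pca_components: array of shape (n_components, n_pairs), e.g. the
        `components_` attribute of a fitted sklearn PCA.
    enet_coef: array of shape (n_components,), one coefficient per component.
    Returns one importance value per ROI pair.
    """
    pca_components = np.asarray(pca_components, dtype=float)
    enet_coef = np.asarray(enet_coef, dtype=float)
    # Absolute loadings weighted by each component's coefficient.
    weighted = np.abs(pca_components) * enet_coef[:, np.newaxis]
    # Sum across components, leaving one value per ROI pair.
    return weighted.sum(axis=0)


def to_symmetric_matrix(pair_values, n_regions):
    """Place ROI-pair values into a symmetric region-by-region matrix.

    Assumes upper-triangle, row-major pair ordering (assumption).
    """
    mat = np.zeros((n_regions, n_regions))
    iu = np.triu_indices(n_regions, k=1)
    mat[iu] = pair_values
    return mat + mat.T
```

      For 75 components and 379 regions, `pca_components` would have shape (75, 71631) and the result would hold 71,631 ROI-pair values.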


    1. eLife assessment

      This valuable work describes a new protein factor that is required for filamentous phage assembly. Convincing evidence is provided for the binding of PSB15 to the packaging signal of the single-stranded DNA, Trx, and cardiolipin, and a mechanism for how the phage DNA is targeted to the assembly site in the bacterial inner membrane is presented. The work will be of interest to microbiologists.