  1. Last 7 days
    1. Reviewer #1 (Public review):

      Summary:

      This computational modelling study addresses the important question of how neurons can learn non-linear functions using biologically realistic plasticity mechanisms. The study extends the previous related work on metaplasticity by Khodadadi et al. (2025), using the same detailed biophysical model and basic study design, while significantly simplifying the synaptic plasticity rule by removing non-linearities, reducing the number of free parameters, and limiting plasticity to only excitatory synapses. The rule itself is supervised by the presence or absence of a binary dopamine reward signal, and gated by separate calcium-sensitive thresholds for potentiation and depression. The author shows that, when paired with a strong form of dendritic non-linearity called a "plateau potential" and appropriate pre-existing dendritic clustering of features, this simpler learning mechanism can solve a non-linear classification task similar to the classic XOR logic operator, with equal or better performance than the previous publication. The primary claims of this publication are that metaplasticity is required for learning non-linear feature classification, and that simultaneous dynamics in two separate thresholds (for potentiation and depression) are critical in this process. By systematically studying the properties of a biophysically plausible supervised learning rule, this paper adds interesting insights into the mechanics of learning complex computations in single neurons.

      Strengths:

      The simplified form of the learning rule makes it easier to understand and study than previous metaplasticity rules, and makes the conclusions more generalizable, while preserving biological realism. Since similar biophysical mechanisms and dynamics exist in many different cell types across the whole brain, the proposed rule could easily be integrated into a wide range of computational models specializing in brain regions beyond the striatum (which is the focus of this study), making it of broad interest to computational neuroscientists. The general approach of systematically fixing or modifying each variable while observing the effects and interactions with other variables is sound and brings great clarity to understanding the dynamic properties and mechanics of the proposed learning rule.

      Weaknesses:

      General notes

      (1) The credibility of the main claims is mainly limited by the very narrow range of model parameters that was explored, including several seemingly arbitrary choices that were not adequately justified or explored.

      (2) The choice to use a morphologically detailed biophysical model, rather than a simpler multi-compartment model, adds a great deal of complexity that further increases uncertainty as to whether the conclusions can generalize beyond the specific choices of model and morphology studied in this paper.

      (3) The requirement for pre-existing synaptic clustering, while not implausible, greatly limits the flexibility of this rule to solve non-linear problems more generally.

      (4) In order to claim that two thresholds are truly necessary, the author would have to show that other well-known rules with a single threshold (e.g., BCM) cannot solve this problem. No such direct head-to-head comparisons are made, raising the question of whether the same task could be achieved without having two separate plasticity thresholds.

      Specific notes

      (1) Regarding the limited hyperparameter search:

      (a) On page 5, the author introduces the upper LTP threshold Theta_LTP. It is not clear why this upper threshold is necessary when the weights are already bounded by w_max. Since w_max is just another hyperparameter, why not set it to a lower value if the goal is to avoid excessively strong synapses? The values of w_max and Theta_LTP appear to have been chosen arbitrarily, but this question could be resolved by doing a proper hyperparameter search over w_max in the absence of an upper Theta_LTP.

      (b) The author does not explore the effect of having separate learning rates for theta_LTP and theta_LTD, which could also improve learning performance in the NFBP. A more comprehensive exploration of these parameters would make the inclusion of theta_max (and the specific value chosen) a lot less arbitrary.

      (c) Figure 4 Supplements 3-4: The author shows results for a hyperparameter search of the learning rule parameters, which is important to see. However, the parameter search is very limited: only 3 parameter values were tried, and there is no explanation or rationale for choosing these specific parameters. In particular, the metaplasticity learning rates do not even span one order of magnitude. If the author wants to claim that the learning rule is insensitive to this parameter, it should be explored over a much broader range of values (e.g., something like the range [0.1-10]).
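As an illustration of the kind of sweep suggested in (c), a log-spaced grid covers two orders of magnitude with only a handful of simulations. This is a minimal sketch, assuming a scaling factor applied to a baseline metaplasticity learning rate; the name `scale_grid`, the number of points, and the endpoints 0.1 and 10 are illustrative choices, not values taken from the manuscript.

```python
import numpy as np

# Hypothetical sweep: nine log-spaced multipliers from 0.1x to 10x
# of a baseline metaplasticity learning rate (illustrative values only).
scale_grid = np.logspace(-1, 1, num=9)

for scale in scale_grid:
    # A real sweep would run the simulation here with
    # learning_rate = baseline * scale and record task performance.
    print(f"scale = {scale:.3f}")
```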

      (2) Regarding the similarity to BCM, the author would ideally directly implement the BCM learning rule in their model, but at the least the author could have shown whether a slight variant of their rule presented here can be effective: for example having a single (plastic, not fixed) Ca-dependent threshold that applies to both LTP and LTD, with a single learning rate parameter.
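For reference, the single-threshold structure described in (2) can be sketched with the textbook rate-based BCM rule. This is a minimal sketch under stated assumptions, not the biophysical model of the manuscript: postsynaptic activity `y` stands in for the calcium signal, and the function name `bcm_step` and the values of `eta` and `tau_theta` are hypothetical.

```python
import numpy as np

def bcm_step(w, x, theta, eta=0.01, tau_theta=10.0):
    """One rate-based BCM update with a single sliding threshold.

    The same threshold theta separates depression (y < theta) from
    potentiation (y > theta), and one learning rate eta scales both.
    """
    y = float(np.dot(w, x))                  # postsynaptic activity
    dw = eta * x * y * (y - theta)           # sign set by y relative to theta
    theta_new = theta + (y ** 2 - theta) / tau_theta  # threshold tracks y^2
    return w + dw, theta_new

w = np.array([0.5, 0.5])
x = np.array([1.0, 1.0])
w_low, _ = bcm_step(w, x, theta=2.0)   # activity below threshold: depression
w_high, _ = bcm_step(w, x, theta=0.5)  # activity above threshold: potentiation
print(w_low, w_high)
```

A head-to-head comparison would replace the two separate thresholds with this one plastic threshold and check whether the task can still be learned.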

      (3) This paper is extremely similar (and essentially an extension) to the work of Khodadadi et al. (2025). Yet this paper is not mentioned at all in the introduction, and the relation between these papers is not made clear until the discussion, leaving me initially puzzled as to what problems this paper addresses that have not already been extensively solved. The introduction could be reworked to make this connection clearer while pointing out the main differences in approach (e.g., the important distinction between "boosting" nonlinearities and plateau potentials).

      (4) The introduction is missing some citations of other recent work that has addressed single-neuron non-linear computation and learning, such as Gidon et al. (2020) and Jones & Kording (2021).

      (5) Figure 1: The figure prominently features mGluR next to the CaV channel, but there is no mention of mGluR in the introduction. The introduction should be updated to include this.

      (6) Could the author explain why there is a non-monotonic increase/decrease in the [Ca]_L in Figure 2B_4? Perhaps my confusion comes from not understanding what a single line represents. Does each line represent the [Ca] in a single spine (and if so, which spine), or is each line an average of all the spines in a given stim condition?

      (7) Row 124 (page 4): L-type Ca microdomains (in which ions don't diffuse and therefore don't interact with Ca_NMDA) is a critical assumption of this model. The references for this appear only in the discussion, so when reading this paper, I found myself a bit confused about why the same ion is treated as two completely independent variables with separate dynamics. Highlighting the assumption (with citations) a bit more clearly in the results section when describing the rule would help with understanding.

      (8) Row 149 (page 5): The current formulation of the update rule is not actually multiplicative. The fact that the update is weight-dependent alone does not make it a multiplicative rule, and judging by equation (1) it appears to simply be an additive rule with a weight regularization term that guarantees weight bounds. For example, a similar weight-dependent update is also a core component of BTSP (Milstein et al. 2021; Galloni et al. 2025), which is another well-known *additive* rule. An actual multiplicative rule implies that the update itself is applied via a multiplication, i.e. w_new = w_old * delta_w.

      For an example of a genuinely multiplicative rule, see Cornford et al. 2024, "Brain-like learning with exponentiated gradients". Multiplicative rules have very different properties to additive rules, since larger weights tend to grow quickly while small weights shrink towards 0.
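The distinction in (8) can be made concrete with a small numerical comparison. This is a minimal sketch, not the manuscript's Equation 1: `additive_soft_bound` is a generic weight-dependent additive update, `multiplicative_update` is an exponentiated-gradient style update, `drive` is a hypothetical signed learning signal (positive means potentiate), and all parameter values are illustrative.

```python
import numpy as np

def additive_soft_bound(w, drive, eta=0.1, w_min=0.0, w_max=1.0):
    """Additive update whose step is scaled by the distance to the bound."""
    scale = np.where(drive > 0, w_max - w, w - w_min)
    return w + eta * drive * scale

def multiplicative_update(w, drive, eta=0.1):
    """Multiplicative update: the old weight is multiplied by a factor."""
    return w * np.exp(eta * drive)

w = np.array([0.1, 0.9])
drive = np.array([1.0, 1.0])

add_step = additive_soft_bound(w, drive) - w
mul_step = multiplicative_update(w, drive) - w
print(add_step, mul_step)
```

Under the additive soft-bound form the small weight takes the larger step toward `w_max`, whereas under the multiplicative form the step is proportional to the current weight, and a weight of exactly zero never changes.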

      (9) Equation 1 (page 5): Shouldn't the depression term be written as: (w_min - w)? This term would be negative if w is larger than w_min, leading to LTD. As it is written now, a large w and small w_min would just cause further potentiation instead of depression.
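The sign argument in (9) can be checked with simple arithmetic. This is a minimal sketch, assuming the bound term enters the depression update with a positive coefficient; the function name and the values of `w`, `w_min`, and `w_max` are hypothetical, and the manuscript's Equation 1 is not reproduced here.

```python
def bound_terms(w, w_min=0.0, w_max=1.0):
    """Return candidate soft-bound factors for a weight w (illustrative only)."""
    ltp_term = w_max - w          # non-negative below w_max: drives potentiation
    ltd_as_suggested = w_min - w  # non-positive above w_min: drives depression
    ltd_sign_flipped = w - w_min  # non-negative above w_min: would potentiate
    return ltp_term, ltd_as_suggested, ltd_sign_flipped

ltp, ltd_ok, ltd_flipped = bound_terms(w=0.8)
print(ltp, ltd_ok, ltd_flipped)
```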

      (10) In the introduction, the teaching signal is described in binary terms (DA peak or DA pause), but in Equation 1 it actually appears to take on 3 different values. Could the author clarify what the difference is between a "DA pause" and the "no DA" condition? The way I read it, pause = absence of DA = no DA.

      (11) Figure 3: In these experimental simulations, DA feedback comes in 400ms after the stimulus. The author could motivate this choice a bit better and explain the significance of this delay. Clearly, the equations have a delta_t term, but as far as the learning algorithm is concerned, it seems like learning would be more effective at delta_t=0. Is the choice of 400ms mainly motivated by experimental observations? On a related note, is it meaningful that the 200ms delta_t before the next stimulus is shorter than the 400ms pause from the first stimulus? Wouldn't the DA that arrives shortly before a stimulus also have an effect on the learning rule?

      (12) Figure 4C: How is it possible that the theta_LTP value goes higher than the upper threshold (dashed line)? Equation 3 implies that it should always be lower.

      (13) Row 429 (page 11): The statement that "without metaplasticity the NFBP cannot be solved" is overly general and not supported by the evidence presented. There exist many papers in which people solve similar non-linear feature learning problems with Hebbian or other bio-plausible rules that don't have metaplasticity. A more accurate statement that can be made here is that the specific rule presented in this paper requires metaplasticity.

      (14) The methods section does not make any mention of publicly available code or a GitHub repository. The author should add a link to the code and put some effort into improving the documentation so that others can more easily assess the code and reproduce the simulations.

    1. [Enter FRIAR LAURENCE] Friar Laurence. Romeo, come forth; come forth, thou fearful man: Affliction is enamour'd of thy parts, And thou art wedded to calamity. [Enter ROMEO] Romeo. Father, what news? what is the prince's doom? What sorrow craves acquaintance at my hand, That I yet know not? Friar Laurence. Too familiar Is my dear son with such sour company: I bring thee tidings of the prince's doom. Romeo. What less than dooms-day is the prince's doom? Friar Laurence. A gentler judgment vanish'd from his lips, Not body's death, but body's banishment. Romeo. Ha, banishment! be merciful, say 'death;' For exile hath more terror in his look, Much more than death: do not say 'banishment.' Friar Laurence. Hence from Verona art thou banished: Be patient, for the world is broad and wide. Romeo. There is no world without Verona walls, But purgatory, torture, hell itself. Hence-banished is banish'd from the world, And world's exile is death: then banished, Is death mis-term'd: calling death banishment, Thou cutt'st my head off with a golden axe, And smilest upon the stroke that murders me. Friar Laurence. O deadly sin! O rude unthankfulness! Thy fault our law calls death; but the kind prince, Taking thy part, hath rush'd aside the law, And turn'd that black word death to banishment: This is dear mercy, and thou seest it not. Romeo. 'Tis torture, and not mercy: heaven is here, Where Juliet lives; and every cat and dog And little mouse, every unworthy thing, Live here in heaven and may look on her; But Romeo may not: more validity, More honourable state, more courtship lives In carrion-flies than Romeo: they may seize On the white wonder of dear Juliet's hand And steal immortal blessing from her lips, Who even in pure and vestal modesty, Still blush, as thinking their own kisses sin; But Romeo may not; he is banished: Flies may do this, but I from this must fly: They are free men, but I am banished. And say'st thou yet that exile is not death? Hadst thou no poison mix'd, no sharp-ground knife, No sudden mean of death, though ne'er so mean, But 'banished' to kill me?—'banished'? O friar, the damned use that word in hell; Howlings attend it: how hast thou the heart, Being a divine, a ghostly confessor, A sin-absolver, and my friend profess'd, To mangle me with that word 'banished'?

      The friar tells Romeo that the prince has shown him mercy: instead of having him killed, he has decided to banish him. Romeo insists that this is worse than execution because he would be separated from Juliet.

    2. Benvolio. I pray thee, good Mercutio, let's retire: The day is hot, the Capulets abroad, And, if we meet, we shall not scape a brawl; For now, these hot days, is the mad blood stirring. Mercutio. Thou art like one of those fellows that when he enters the confines of a tavern claps me his sword upon the table and says 'God send me no need of thee!' and by the operation of the second cup draws it on the drawer, when indeed there is no need. Benvolio. Am I like such a fellow? Mercutio. Come, come, thou art as hot a Jack in thy mood as any in Italy, and as soon moved to be moody, and as soon moody to be moved. Benvolio. And what to? Mercutio. Nay, an there were two such, we should have none shortly, for one would kill the other. Thou! why, thou wilt quarrel with a man that hath a hair more, or a hair less, in his beard, than thou hast: thou wilt quarrel with a man for cracking nuts, having no other reason but because thou hast hazel eyes: what eye but such an eye would spy out such a quarrel? Thy head is as full of quarrels as an egg is full of meat, and yet thy head hath been beaten as addle as an egg for quarrelling: thou hast quarrelled with a man for coughing in the street, because he hath wakened thy dog that hath lain asleep in the sun: didst thou not fall out with a tailor for wearing his new doublet before Easter? with another, for tying his new shoes with old riband? and yet thou wilt tutor me from quarrelling! Benvolio. An I were so apt to quarrel as thou art, any man should buy the fee-simple of my life for an hour and a quarter. Mercutio. The fee-simple! O simple! Benvolio. By my head, here come the Capulets. Mercutio. By my heel, I care not. [Enter TYBALT and others] Tybalt. Follow me close, for I will speak to them. Gentlemen, good den: a word with one of you. Mercutio. And but one word with one of us? couple it with something; make it a word and a blow. Tybalt. You shall find me apt enough to that, sir, an you will give me occasion. Mercutio. Could you not take some occasion without giving? Tybalt. Mercutio, thou consort'st with Romeo,— Mercutio. Consort! what, dost thou make us minstrels? an thou make minstrels of us, look to hear nothing but discords: here's my fiddlestick; here's that shall make you dance. 'Zounds, consort! Benvolio. We talk here in the public haunt of men: Either withdraw unto some private place, And reason coldly of your grievances, Or else depart; here all eyes gaze on us. Mercutio. Men's eyes were made to look, and let them gaze; I will not budge for no man's pleasure, I.

      Benvolio is trying to get Mercutio to leave because it is hot outside and their rivals, the Capulets, are about. Mercutio jokes that Benvolio is actually the one who likes to start fights and is easily angered.

    1. Specificity: A thesis statement must concentrate on a specific area of a general topic. As you may recall, the creation of a thesis statement begins when you choose a broad subject and then narrow down its parts until you pinpoint a specific aspect of that topic. For example, health care is a broad topic, but a proper thesis statement would focus on a specific area of that topic, such as options for individuals without health-care coverage. Precision: A strong thesis statement must be precise enough to allow for a coherent argument and to remain focused on the topic.

      Specificity: narrowing down the thesis. Precision: hitting the nail on the head, so to speak, with the thesis statement.

    1. Genetic Variants Were Associated With the Prognosis of Head and Neck Squamous Carcinoma

      [Paper-level Aggregated] PMCID: PMC7099049

      Evidence Type(s): Prognostic, Functional

      Justification: Prognostic: The SNPs rs16879870, rs2641256, rs2761591, and rs854936 were significantly associated with HNSCC survival, indicating their potential role in predicting patient outcomes. Functional: The genotypes of rs16879870 and rs854936 were significantly associated with the expression of genes GJB7 and RTN4R, respectively, suggesting a functional impact on gene expression related to HNSCC.

      Gene→Variant (gene-first): FGFR4(2264):AUC from 0 NA:rs16879870 SCIMP(388325):rs2641256 DCDC1(341019):rs2761591 NA:rs854936

      Genes: FGFR4(2264) NA SCIMP(388325) DCDC1(341019)

      Variants: AUC from 0 rs16879870 rs2641256 rs2761591 rs854936

    1. Investigating the Feasibility of Targeted Next-Generation Sequencing to Guide the Treatment of Head and Neck Squamous Cell Carcinoma

      [Paper-level Aggregated] PMCID: PMC6333965

      Evidence Type(s): Oncogenic, Prognostic

      Justification: Oncogenic: The text mentions that mutations in PIK3CA (E545K) and CDKN2A (R58X) are potentially targetable, indicating their role in cancer development. Additionally, TP53 mutations (R209Q/W, R243W/Q) are associated with cell cycle deregulation, further supporting their oncogenic potential. Prognostic: The text states that mutations in TP53, CDKN2A, and CCND1 are significantly associated with poorer overall survival, indicating their prognostic value in the context of cancer outcomes.

      Gene→Variant (gene-first): PIK3CA(5290):E545K TP53(7157):R209Q/W TP53(7157):R243W/Q CDKN2A(1029):R58X

      Genes: PIK3CA(5290) TP53(7157) CDKN2A(1029)

      Variants: E545K R209Q/W R243W/Q R58X

    1. eLife Assessment

      This valuable study describes significant differences in prey capture behavior between PSD-95 knock-out and wild-type mice, despite prior work by the same authors showing only modest visual deficits in the former. The data convincingly demonstrate that prey capture performance in PSD-95 knock-out mice improves under monocular viewing conditions. However, this finding alone was inadequate to support the interpretation of results as revealing a deficit in binocular visual integration, especially given the lack of eye and head tracking data or consideration of alternative explanations for the observed behavior.

    2. Reviewer #1 (Public review):

      Summary:

      PSD95 has long been studied in detail to understand molecular mechanisms of synaptic plasticity as related to specific cell types (excitatory), circuits (visual cortex), and circuit development and function (ocular dominance plasticity). While much was known about the molecular and cellular details of its function, it remained unclear whether and how it might contribute to the development of specific aspects of visual perception. While overall vision is preserved in PSD95 KO (knockout) mice, studying natural, visually guided prey capture behavior revealed robust, yet specific, perturbations to binocular processing during the behavior.

      Strengths:

      A major strength of the paper is being able to quantify precise measures of the visual aspects versus the motor aspects of prey pursuit. Comparing changes in behavior due to monocular occlusion was particularly revealing, showing that mice indeed employ binocular summation to extract visual cues useful for prey pursuit. This result further suggested that in cases with poor binocular vision, monocular input can improve perceptual and behavioral processes, as it does in human subjects with comparable challenges.

      The study not only provided a useful finding regarding the function of PSD95, but also outlined a useful general approach toward identifying and quantifying specific deficits in binocular summation. This is likely to broadly impact studies of visual system development, behavior, and neural circuit function. The careful attention to details, observations, and openness about subject variance will also be helpful to those studying specific visual pursuit and natural prey capture behavior in the mouse.

      Weaknesses:

      Lack of eye movement monitoring and detailed head movement analysis preclude total certainty for the interpretation of observed behaviors.

    3. Author response:

      We thank the reviewers for their thorough and constructive evaluation of our manuscript titled “PSD-95 drives binocular vision maturation critical for predation”. The reviewers raised several important conceptual and technical points. Here, we address and provide additional context on the major themes and outline our planned revisions.

      We acknowledge that the current prey capture task cannot directly adjudicate between binocular vision impairments and sensorimotor processing deficits in PSD-95 KO mice. However, we did not observe any major impairment supporting a sensorimotor processing deficit, in contrast to a major impairment in line with binocular vision impairment. Evidence from Huang et al. (2015) [1], Favaro et al. (2018) [2] and our data with the visual water task (VWT), which requires identical sensorimotor but differential visual processing, clearly demonstrated intact visual acuity but impaired orientation discrimination in PSD-95 KO mice. Therefore, we believe that a binocular integration deficit is the most likely explanation of PSD-95 KO binocular impairments. In line with this, it is unlikely that aberrations in binocular eye movements account for the observations. We appreciate that alternative explanations remain possible and merit explicit discussion. Accordingly, we intend to expand the discussion of these alternatives.

      Importantly, we will provide additional experimental data demonstrating that knock-down of PSD-95 in V1 but not in superior colliculus, significantly decreases orientation discrimination analyzed with the VWT, as we had shown for PSD-95 KO mice (while control knock-down does not have this effect). We believe that this new evidence better delineates the potential neuroanatomical locus of the PSD-95-associated deficits.

      Furthermore, we will provide additional head movement analyses, as suggested by Reviewer 1. Specifically, we will investigate the head angle in relation to the cricket (azimuth) in time (±1 second) around prey contact under light and dark conditions.

      We will also address the potential impact of PSD-95 KO learning deficits. We agree that there are more impairments in the PSD-95 KO brain, as has been published previously. But strikingly, the binocular impairment was dominating the sensory processing. This cannot be convincingly explained by learning deficits. In fact, we have observed improved learning of PSD-95 KO mice with some tasks (e.g. cocaine conditioned place preference) [3], but no significant differences in the VWT [1,2]. Learning differences were described for another PSD-95 mouse line, expressing the N-terminus with two PDZ domains [4]. To avoid potential learning dependent confounds, we have chosen salient stimuli, like water aversion, and prey capture to reduce impacts of potential learning defects.

      We agree on the strength of random dot stereograms for isolating stereoscopic computations. However, this approach requires special filters in front of either eye, which renders it unsuitable for the VWT. The lengthy training with the less salient stimulus of water reward could potentially add confounds from PSD-95 KO deficits. Thus, we think that this would be something for future experiments to allow for integration of different visual inputs. However, the combined improved performance of WT mice with binocular vision for prey capture (depth percept) and orientation discrimination (summation) already supports the importance of binocular vision in mice and the dominant defect in PSD-95 KO mice.

      Finally, we will address the other points raised by the reviewers through clearer exposition and reorganization of the manuscript.

      Once again, we would like to thank the reviewers for their thoughtful and constructive feedback, which we believe will substantially strengthen the manuscript.

      (1) Huang, X., Stodieck, S. K., Goetze, B., Cui, L., Wong, M. H., Wenzel, C., Hosang, L., Dong, Y., Löwel, S., and Schlüter, O. M. (2015). Progressive maturation of silent synapses governs the duration of a critical period. Proc. Natl. Acad. Sci. 112, E3131–E3140. https://doi.org/10.1073/pnas.1506488112.

      (2) Favaro, P.D., Huang, X., Hosang, L., Stodieck, S., Cui, L., Liu, Y., Engelhardt, K.-A., Schmitz, F., Dong, Y., Löwel, S., et al. (2018). An opposing function of paralogs in balancing developmental synapse maturation. PLOS Biol. 16, e2006838. https://doi.org/10.1371/journal.pbio.2006838.

      (3) Shukla, A., Beroun, A., Panopoulou, M., Neumann, P.A., Grant, S.G., Olive, M.F., Dong, Y., and Schlüter, O.M. (2017). Calcium‐permeable AMPA receptors and silent synapses in cocaine‐conditioned place preference. EMBO J. 36, 458–474. https://doi.org/10.15252/embj.201695465.

      (4) Migaud, M., Charlesworth, P., Dempster, M., Webster, L.C., Watabe, A.M., Makhinson, M., He, Y., Ramsay, M.F., Morris, R.G.M., Morrison, J.H., et al. (1998). Enhanced long-term potentiation and impaired learning in mice with mutant postsynaptic density-95 protein. Nature 396, 433–439. https://doi.org/10.1038/24790.

    1. Reviewer #1 (Public review):

      Review of the revised submission:

      I thank the authors for their detailed consideration of my comments and for the additional data, analyses, and clarifications they have incorporated. The new behavioral experiments, quantification of targeted manipulations, and expanded methodological details strengthen the manuscript and address many of my initial concerns. While some questions remain for future work, the authors' careful responses and the additional evidence provided help resolve the main issues I raised, and I am generally satisfied with the revisions.

      Review of original submission:

      Summary

      In this article, Kawanabe-Kobayashi et al., aim to examine the mechanisms by which stress can modulate pain in mice. They focus on the contribution of noradrenergic neurons (NA) of the locus coeruleus (LC). The authors use acute restraint stress as a stress paradigm and found that following one hour of restraint stress mice display mechanical hypersensitivity. They show that restraint stress causes the activation of LC NA neurons and the release of NA in the spinal cord dorsal horn (SDH). They then examine the spinal mechanisms by which LC→SDH NA produces mechanical hypersensitivity. The authors provide evidence that NA can act on alphaA1Rs expressed by a class of astrocytes defined by the expression of Hes (Hes+). Furthermore, they found that NA, presumably through astrocytic release of ATP following NA action on alphaA1Rs Hes+ astrocytes, can cause an adenosine-mediated inhibition of SDH inhibitory interneurons. They propose that this disinhibition mechanism could explain how restraint stress can cause the mechanical hypersensitivity they measured in their behavioral experiments.

      Strengths:

      (1) Significance. Stress profoundly influences pain perception; resolving the mechanisms by which stress alters nociception in rodents may explain the well-known phenomenon of stress-induced analgesia and/or facilitate the development of therapies to mitigate the negative consequences of chronic stress on chronic pain.

      (2) Novelty. The authors' findings reveal a crucial contribution of Hes+ spinal astrocytes in the modulation of pain thresholds during stress.

      (3) Techniques. This study combines multiple approaches to dissect circuit, cellular, and molecular mechanisms including optical recordings of neural and astrocytic Ca2+ activity in behaving mice, intersectional genetic strategies, cell ablation, optogenetics, chemogenetics, CRISPR-based gene knockdown, slice electrophysiology, and behavior.

      Weaknesses:

      (1) Mouse model of stress. Although chronic stress can increase sensitivity to somatosensory stimuli and contribute to hyperalgesia and anhedonia, particularly in the context of chronic pain states, acute stress is well known to produce analgesia in humans and rodents. The experimental design used by the authors consists of a single one-hour session of restraint stress followed by 30 min to one hour of habituation and measurement of cutaneous mechanical sensitivity with von Frey filaments. This acute stress behavioral paradigm corresponds to the conditions in which the clinical phenomenon of stress-induced analgesia is observed in humans, as well as in animal models. Surprisingly, however, the authors measured that this acute stressor produced hypersensitivity rather than antinociception. This discrepancy is significant and requires further investigation.

      (2) Specifically, is the hypersensitivity to mechanical stimulation also observed in response to heat or cold on a hotplate or coldplate?

      (3) Using other stress models, such as a forced swim, do the authors also observe acute stress-induced hypersensitivity instead of stress-induced antinociception?

      (4) Measurement of stress hormones in blood would provide an objective measure of the stress of the animals.

      (5) Results:

      (a) Optical recordings of Ca2+ activity in behaving rodents are particularly useful to investigate the relationship between Ca2+ dynamics and the behaviors displayed by rodents.

      (b) The authors report an increase in Ca2+ events in LC NA neurons during restraint stress: Did mice display specific behaviors at the time these Ca2+ events were observed such as movements to escape or orofacial behaviors including head movements or whisking?

      (c) Additionally, are similar increases in Ca2+ events in LC NA neurons observed during other stressful behavioral paradigms versus non-stressful paradigms?

      (d) Neuronal ablation to reveal the function of a cell population.

      (e) The proportion of LC NA neurons and LC→SDH NA neurons expressing DTR-GFP and ablated should be quantified (Figures 1G and J) to validate the methods and permit interpretation of the behavioral data (Figures 1H and K). Importantly, the nocifensive responses and behavior of these mice in other pain assays in the absence of stress (e.g., hotplate) and a few standard assays (open field, rotarod, elevated plus maze) would help determine the consequences of cell ablation on processing of nociceptive information and general behavior.

      (f) Confirmation of LC NA neuron function with other methods that alter neuronal excitability or neurotransmission instead of destroying the circuit investigated, such as chemogenetics or optogenetics, would greatly strengthen the findings. Optogenetics is used in Figure 1M, N, but excitation of LC→SDH NA neuron terminals is tested instead of inhibition (to mimic ablation), and in naïve mice instead of stressed mice.

      (g) Alpha1Ars. The authors noted that "Adra1a mRNA is also expressed in INs in the SDH".

      (h) The authors should comprehensively indicate what other cell types present in the spinal cord and neurons projecting to the spinal cord express alpha1Ars and what is the relative expression level of alpha1Ars in these different cell types.

      (i) The conditional KO of alpha1Ars specifically in Hes5+ astrocytes and not in other cell types expressing alpha1Ars should be quantified and validated (Figure 2H).

      (j) Depolarization of SDH inhibitory interneurons by NA (Figure 3). The authors bath-applied NA, which presumably activates all NA receptors present in the preparation.

      (k) The authors' model (Figure 4H) implies that NA released by LC→SDH NA neurons leads to the inhibition of SDH inhibitory interneurons by NA. In other experiments (Figure 1L, Figure 2A), the authors used optogenetics to promote the release of endogenous NA in SDH by LC→SDH NA neurons. This approach would investigate the function of NA endogenously released by LC NA neurons at presynaptic terminals in the SDH and at physiological concentrations and would test the model more convincingly compared to the bath application of NA.

      (l) As for other experiments, the proportion of Hes5+ astrocytes that express hM3Dq, and the absence of expression in other cells, should be quantified and validated to interpret behavioral data.

      (m) Showing that the effect of CNO is dose-dependent would strengthen the authors' findings.

      (n) The proportion of SG neurons for which CNO bath application resulted in a reduction in recorded sIPSCs is not clear.

      (o) A1Rs. The specific expression of Cas9 and guide RNAs, and the specific KD of A1Rs, in inhibitory interneurons but not in other cell types expressing A1Rs should be quantified and validated.

      (6) Methods:

      It is unclear how fiber photometry is performed using "optic cannula" during restraint stress while mice are in a 50ml falcon tube (as shown in Figure 1A).

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public reviews:

      Reviewer #1 (Public review):

      Summary:

      In this article, Kawanabe-Kobayashi et al. aim to examine the mechanisms by which stress can modulate pain in mice. They focus on the contribution of noradrenergic (NA) neurons of the locus coeruleus (LC). The authors use acute restraint stress as a stress paradigm and find that, following one hour of restraint stress, mice display mechanical hypersensitivity. They show that restraint stress causes the activation of LC NA neurons and the release of NA in the spinal cord dorsal horn (SDH). They then examine the spinal mechanisms by which LC→SDH NA produces mechanical hypersensitivity. The authors provide evidence that NA can act on alpha1ARs expressed by a class of astrocytes defined by the expression of Hes5 (Hes5+). Furthermore, they found that NA, presumably through astrocytic release of ATP following NA action on alpha1ARs on Hes5+ astrocytes, can cause an adenosine-mediated inhibition of SDH inhibitory interneurons. They propose that this disinhibition mechanism could explain how restraint stress can cause the mechanical hypersensitivity they measured in their behavioral experiments.

      Strengths:

      (1) Significance. Stress profoundly influences pain perception; resolving the mechanisms by which stress alters nociception in rodents may explain the well-known phenomenon of stress-induced analgesia and/or facilitate the development of therapies to mitigate the negative consequences of chronic stress on chronic pain.

      (2) Novelty. The authors' findings reveal a crucial contribution of Hes5+ spinal astrocytes in the modulation of pain thresholds during stress.

      (3) Techniques. This study combines multiple approaches to dissect circuit, cellular, and molecular mechanisms including optical recordings of neural and astrocytic Ca2+ activity in behaving mice, intersectional genetic strategies, cell ablation, optogenetics, chemogenetics, CRISPR-based gene knockdown, slice electrophysiology, and behavior.

      Weaknesses:

      (1) Mouse model of stress. Although chronic stress can increase sensitivity to somatosensory stimuli and contribute to hyperalgesia and anhedonia, particularly in the context of chronic pain states, acute stress is well known to produce analgesia in humans and rodents. The experimental design used by the authors consists of a single one-hour session of restraint stress followed by 30 min to one hour of habituation and measurement of cutaneous mechanical sensitivity with von Frey filaments. This acute stress behavioral paradigm corresponds to the conditions in which the clinical phenomenon of stress-induced analgesia is observed in humans, as well as in animal models. Surprisingly, however, the authors measured that this acute stressor produced hypersensitivity rather than antinociception. This discrepancy is significant and requires further investigation.

      We thank the reviewer for evaluating our work and for highlighting both its strengths and weaknesses. As stated by the reviewer, numerous studies have reported acute stress-induced antinociception. However, as shown in a new additional table (Table S1) in which we have summarized previously published data using the acute restraint stress model employed in our present study, most studies reporting antinociceptive effects of acute restraint stress assessed behavioral responses to heat stimuli or formalin. This observation is consistent with the findings from our previous study (Uchiyama et al., Mol Brain, 2022 (PMID: 34980215)). The present study also confirms that acute restraint stress reduces behavioral responses to noxious heat (see also our response to Comment #2 below). In contrast to the robust and consistent antinociceptive effects observed with thermal stimuli, some studies evaluating behavioral responses to mechanical stimuli have reported stress-induced hypersensitivity (see Table S1), which aligns with our current findings. Taken together, these data support our original notion that the effects of acute stress on pain-related behaviors depend on several factors, including the nature, duration, and intensity of the stressor, as well as the sensory modality assessed in behavioral tests. We have incorporated this discussion and Table S1 into the revised manuscript (lines 344-353). Furthermore, we have slightly modified the text including the title, replacing "pain facilitation" with "mechanical pain hypersensitivity" to more accurately reflect our research focus and the conclusion of this study that LC<sup>→SDH</sup> NAergic signaling to spinal astrocytes is required for stress-induced mechanical pain hypersensitivity. Finally, while mouse models of stress could provide valuable insights, the clinical relevance of stress-induced mechanical pain hypersensitivity remains to be elucidated and requires further investigation. 
      We hope these clarifications address your concerns.

      (2) Specifically, is the hypersensitivity to mechanical stimulation also observed in response to heat or cold on a hotplate or coldplate?

      Thank you for your important comment. We have now conducted additional behavioral experiments to assess responses to heat using the hot-plate test. We found that mice subjected to restraint stress did not exhibit behavioral hypersensitivity to heat stimuli; instead, they displayed antinociceptive responses (Figure S2; lines 95-98). These results are consistent with our previous findings (Uchiyama et al., Mol Brain, 2022 (PMID: 34980215)) as well as numerous other reports (Table S1).

      (3) Using other stress models, such as a forced swim, do the authors also observe acute stress-induced hypersensitivity instead of stress-induced antinociception?

      As suggested by the reviewer, we conducted a forced swim test. We found that mice subjected to forced swimming, which has been reported to produce analgesic effects on thermal stimuli (Contet et al., Neuropsychopharmacology, 2006 (PMID: 16237385)), did not exhibit any change in mechanical pain sensitivity (Figure S2; lines 98-99). In contrast, a previous study demonstrated that mechanical pain sensitivity is enhanced by other stress models, such as exposure to an elevated open platform for 30 min (Kawabata et al., Neuroscience, 2023 (PMID: 37211084)). Taken together with our data showing that changes in mechanosensory behavior induced by restraint stress depend on the duration of exposure (Figure S1), and that restraint stress also produced an antinociceptive effect on heat stimuli (Figure S2), these findings indicate that stress-induced modulation of pain is a complex phenomenon influenced by multiple factors, including the stress model, intensity, and duration, as well as the sensory modality used for behavioral testing (lines 100-103).

      (4) Measurement of stress hormones in blood would provide an objective measure of the stress of the animals.

      A previous study demonstrated that plasma levels of the stress hormone corticosterone are elevated following a 1-hour exposure to restraint stress in mice (Kim et al., Sci Rep, 2018 (PMID: 30104581)), using a stress protocol similar to that employed in our current study. We have added this information and cited the paper (lines 104-105).

      (5) Results:

      (a) Optical recordings of Ca2+ activity in behaving rodents are particularly useful to investigate the relationship between Ca2+ dynamics and the behaviors displayed by rodents.

      In the optical recordings of Ca<sup>2+</sup> activity in LC neurons, we monitored mouse behavior during stress exposure. We have now included a video of this in the revised manuscript (video; lines 111-114).

      (b) The authors report an increase in Ca2+ events in LC NA neurons during restraint stress: Did mice display specific behaviors at the time these Ca2+ events were observed such as movements to escape or orofacial behaviors including head movements or whisking?

      By reanalyzing the temporal relationship between Ca<sup>2+</sup> events and mouse behavior during stress exposure, we found that the Ca<sup>2+</sup> transients and escape behaviors (struggling) occurred almost simultaneously (video). A similar temporal correlation is also observed in Ca<sup>2+</sup> responses in the bed nucleus of the stria terminalis (Luchsinger et al., Nat Commun, 2021 (PMID: 34117229)). The video file has been included in the revised manuscript (video; lines 111-113, 552-553, 573-575).

      Additionally, as described in the Methods section and shown in Figure S2 of the initial version (now Figure S3), non-specific signals or artifacts—such as those caused by head movements—were corrected (although such responses were minimal in our recordings).

      (c) Additionally, are similar increases in Ca2+ events in LC NA neurons observed during other stressful behavioral paradigms versus non-stressful paradigms?

      We appreciate the reviewer's valuable suggestion. Because the present study focused on acute restraint stress, we did not measure Ca<sup>2+</sup> events in LC-NA neurons in other stress models. However, a recent study has shown that social defeat stress increases Ca<sup>2+</sup> responses in LC-NA neurons (Seiriki et al., bioRxiv, https://www.biorxiv.org/content/10.1101/2025.03.07.641347v1).

      (d) Neuronal ablation to reveal the function of a cell population.

      This method has been widely used in numerous previous studies as an effective experimental approach to investigate the role of specific neuronal populations—including SDH-projecting LC-NA neurons (Ma et al., Brain Res, 2022 (PMID: 34929182); Kawanabe et al., Mol Brain, 2021 (PMID: 33971918))—in CNS function.

      (e) The proportion of LC NA neurons and LC→SDH NA neurons expressing DTR-GFP and ablated should be quantified (Figures 1G and J) to validate the methods and permit interpretation of the behavioral data (Figures 1H and K). Importantly, the nocifensive responses and behavior of these mice in other pain assays in the absence of stress (e.g., hotplate) and a few standard assays (open field, rotarod, elevated plus maze) would help determine the consequences of cell ablation on processing of nociceptive information and general behavior.

      As suggested, we conducted additional experiments to quantitatively analyze the number of LC<sup>→SDH</sup>-NA neurons. We used WT mice injected with AAVretro-Cre into the SDH (L4 segment) and AAV-FLEx[DTR-EGFP] into the LC. In these mice, 4.4% of total LC-NA neurons [positive for tyrosine hydroxylase (TH)] expressed DTR-GFP, representing the LC<sup>→SDH</sup>-NA neuronal population (Figure S4; lines 126-127). Furthermore, treatment with DTX successfully ablated the DTR-expressing LC<sup>→SDH</sup>-NA neurons. Importantly, the neurons quantified in this analysis were specifically those projecting to the L4 segment of the SDH; therefore, the total number of SDH-projecting LC-NA neurons across all spinal segments is expected to be much higher.

      We also performed the rotarod and paw-flick tests to assess motor function and thermal sensitivity following ablation of LC<sup>→SDH</sup>-NA neurons. No significant differences were observed between the ablated and control groups (Figure S5; lines 131-134), indicating that ablation of these neurons does not produce non-specific behavioral deficits in motor function or other sensory modalities.

      (f) Confirmation of LC NA neuron function with other methods that alter neuronal excitability or neurotransmission instead of destroying the circuit investigated, such as optogenetics or chemogenetics, would greatly strengthen the findings. Optogenetics is used in Figure 1M, N but excitation of LC<sup>→SDH</sup> NA neuron terminals is tested instead of inhibition (to mimic ablation), and in naïve mice instead of stressed mice.

      We appreciate the reviewer’s comment. The optogenetic approach is useful for manipulating neuronal excitability; however, prolonged light illumination (> tens of seconds) can lead to undesirable tissue heating, ionic imbalance, and rebound spikes (Wiegert et al., Neuron, 2017 (PMID: 28772120)), making it difficult to apply in our experiments, in which mice are exposed to stress for 60 min. For this reason, we decided to employ the cell-ablation approach in stress experiments, as it is more suitable than optogenetic inhibition. In addition, as described in our response to weakness (1)-a) by Reviewer 3 (Public review), we have now demonstrated the specific expression of DTRs in NA neurons in the LC, but not in A5 or A7 (Figure S4; lines 127-128), confirming the specificity of LC<sup>→SDH</sup>-NAergic pathway targeting in our study. Chemogenetics represents another promising approach to further strengthen our findings on the role of LC<sup>→SDH</sup>-NA neurons, but this will be an important subject for future studies, as it will require extensive experiments to assess, for example, the effectiveness of chemogenetic inhibition of these neurons during 60 min of restraint stress, as well as optimization of key parameters (e.g., systemic DCZ doses).

      (g) Alpha1Ars. The authors noted that "Adra1a mRNA is also expressed in INs in the SDH".

      The expression of α<sub>1A</sub>Rs in inhibitory interneurons in the SDH is consistent with our previous findings (Uchiyama et al., Mol Brain, 2022 (PMID: 34980215)) as well as with scRNA-seq data (http://linnarssonlab.org/dorsalhorn/, Häring et al., Nat Neurosci, 2018 (PMID: 29686262)).

      (h) The authors should comprehensively indicate what other cell types present in the spinal cord and neurons projecting to the spinal cord express alpha1Ars and what is the relative expression level of alpha1Ars in these different cell types.

      According to the scRNA-seq data (https://seqseek.ninds.nih.gov/genes, Russ et al., Nat Commun, 2021 (PMID: 34588430); http://linnarssonlab.org/dorsalhorn/, Häring et al., Nat Neurosci, 2018 (PMID: 29686262)), we confirmed that α<sub>1A</sub>Rs are predominantly expressed in astrocytes and inhibitory interneurons in the spinal cord. Also, an α<sub>1A</sub>R-expressing excitatory neuron population (Glut14) expresses Tacr1, GPR83, and Tac1 mRNAs, markers that are known to be enriched in projection neurons of the SDH. This raises the possibility that α<sub>1A</sub>Rs may also be expressed in a subset of projection neurons, although further experiments are required to confirm this. In DRG neurons, α<sub>1A</sub>R expression was detected to some extent, but its level seems to be much lower than in the spinal cord (http://linnarssonlab.org/drg/, Usoskin et al., Nat Neurosci, 2015 (PMID: 25420068)). Consistent with this, primary afferent glutamatergic synaptic transmission has been shown to be unaffected by α<sub>1A</sub>R agonists (Kawasaki et al., Anesthesiology, 2003 (PMID: 12606912); Li and Eisenach, JPET, 2001 (PMID: 11714880)). This information has been incorporated into the Discussion section (lines 317-319).

      (i) The conditional KO of alpha1Ars specifically in Hes5+ astrocytes and not in other cell types expressing alpha1Ars should be quantified and validated (Figure 2H).

      We have previously shown a selective KO of α<sub>1A</sub>R in Hes5<sup>+</sup> astrocytes in the same mouse line (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)). This information has been included in the revised text (lines 166-167).

      (j) Depolarization of SDH inhibitory interneurons by NA (Figure 3). The authors bath-applied NA, which presumably activates all NA receptors present in the preparation.

      We believe that the reviewer’s concern may pertain to the possibility that NA acts on non-Vgat<sup>+</sup> neurons, thereby indirectly causing depolarization of Vgat<sup>+</sup> neurons. As described in the Methods section of the initial version, in our electrophysiological experiments we added four antagonists of excitatory and inhibitory neurotransmitter receptors, namely CNQX (AMPA receptor), MK-801 (NMDA receptor), bicuculline (GABA<sub>A</sub> receptor), and strychnine (glycine receptor), to the artificial cerebrospinal fluid to block synaptic inputs from other neurons to the recorded Vgat<sup>+</sup> neurons. Since this method has been widely used for this purpose in previous studies (Wu et al., J Neurosci, 2004 (PMID: 15140934); Liu et al., Nat Neurosci, 2010 (PMID: 20835251)), it is reasonable to conclude that NA directly acts on the recorded SDH Vgat<sup>+</sup> interneurons to produce excitation (lines 193-196).

      (k) The authors' model (Figure 4H) implies that NA released by LC→SDH NA neurons leads to the inhibition of SDH inhibitory interneurons by NA. In other experiments (Figure 1L, Figure 2A), the authors used optogenetics to promote the release of endogenous NA in SDH by LC→SDH NA neurons. This approach would investigate the function of NA endogenously released by LC NA neurons at presynaptic terminals in the SDH and at physiological concentrations and would test the model more convincingly compared to the bath application of NA.

      We appreciate the reviewer’s valuable comment. As noted, optogenetic stimulation of LC<sup>→SDH</sup>-NA neurons would indeed be useful to test this model. However, in our case, it is technically difficult to investigate the responses of Vgat<sup>+</sup> inhibitory neurons and Hes5<sup>+</sup> astrocytes to NA endogenously released from LC<sup>→SDH</sup>-NA neurons. This would require the use of Vgat-Cre or Hes5-CreERT2 mice, but employing these lines precludes the use of NET-Cre mice, which are necessary for specific and efficient expression of ChrimsonR in LC<sup>→SDH</sup>-NA neurons. Nevertheless, all of our experimental data consistently support the proposed model, and we hope the reviewer will agree that this holds without additional experiments, which are difficult to conduct because of technical limitations (lines 382-388).

      (l) As for other experiments, the proportion of Hes5+ astrocytes that express hM3Dq, and the absence of expression in other cells, should be quantified and validated to interpret behavioral data.

      We thank the reviewer for raising this point. In our experiments, we used an HA-tag (fused with hM3Dq) to confirm hM3Dq expression. However, it is difficult to precisely analyze individual astrocytes because, as shown in Figure 3J, the boundaries of many HA-tag<sup>+</sup> astrocytes are indistinguishable. This seems to be due to the membrane localization of HA-tag, the complex morphology of astrocytes, and their tile-like distribution pattern (Baldwin et al., Trends Cell Biol, 2024 (PMID: 38180380)). Nevertheless, our previous study demonstrated that ~90% of astrocytes in the superficial laminae are Hes5<sup>+</sup> (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)), and intra-SDH injection of AAV-hM3Dq labeled the majority of superficial astrocytes (Figure 3J). Thus, AAV-FLEx[hM3Dq] injection into Hes5-CreERT2 mice allows efficient expression of hM3Dq in Hes5<sup>+</sup> astrocytes in the SDH. Importantly, our previous studies using Hes5-CreERT2 mice have confirmed that hM3Dq is not expressed in other cell types (neurons, oligodendrocytes, or microglia) (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652); Kagiyama et al., Mol Brain, 2025 (PMID: 40289116)). This information regarding the cell-type specificity has now been briefly described in the revised version (lines 218-219).

      (m) Showing that the effect of CNO is dose-dependent would strengthen the authors' findings.

      Thank you for your comment. We have now demonstrated a dose-dependent effect of CNO on Ca<sup>2+</sup> responses in SDH astrocytes (Figure S7; lines 225-228; please see our response to Major Point (4) from Reviewer #2 (Recommendations for the Authors)). In addition, we confirmed that the effect of CNO is not nonspecific, as CNO application did not alter sIPSCs in spinal cord slices prepared from mice lacking hM3Dq expression in astrocytes (Figure S7; lines 225-228).

      (n) The proportion of SG neurons for which CNO bath application resulted in a reduction in recorded sIPSCs is not clear.

      We have included individual data points in each bar graph to more clearly illustrate the effect of CNO on each neuron (Figure 3L, N).

      (o) A1Rs. The specific expression of Cas9 and guide RNAs, and the specific KD of A1Rs, in inhibitory interneurons but not in other cell types expressing A1Rs should be quantified and validated.

      In addition to the data demonstrating the specific expression of SaCas9 and sgAdora1 in Vgat<sup>+</sup> inhibitory neurons shown in Figure 3G of the initial version, we have now conducted the same experiments with a different sample and confirmed this specificity: SaCas9 (detected via HA-tag) and sgAdora1 (detected via mCherry) were expressed in PAX2<sup>+</sup> inhibitory neurons (Author response image 1). Furthermore, as shown in Figure 3H and I in the initial version, the functional reduction of A<sub>1</sub>Rs in inhibitory neurons was validated by electrophysiological recordings. Together, these results support the successful deletion of A<sub>1</sub>Rs in inhibitory neurons.

      Author response image 1.

      Expression of HA-tag and mCherry in inhibitory neurons (a different sample from Figure 3G). SaCas9 (yellow, detected by HA-tag) and mCherry (magenta) expression in the PAX2<sup>+</sup> inhibitory neurons (cyan) at 3 weeks after intra-SDH injection of AAV-FLEx[SaCas9-HA] and AAV-FLEx[mCherry]-U6-sgAdora1 in Vgat-Cre mice. Arrowheads indicate genome-edited Vgat<sup>+</sup> cells. Scale bar, 25 µm.

      (6) Methods:

      It is unclear how fiber photometry is performed using "optic cannula" during restraint stress while mice are in a 50ml falcon tube (as shown in Figure 1A).

      We apologize for the omission of this detail in the Methods section. To monitor Ca<sup>2+</sup> events in LC-NA neurons during restraint stress, we created a narrow slit on the top of the conical tube, allowing mice to undergo restraint stress while connected to the optic fiber (see video). This information has now been added to the Methods section (lines 552-553).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Scientific rigor:

      It is unclear if the normal distribution of the data was determined before selecting statistical tests.

      We apologize for omitting this description. For all statistical analyses in this study, we first assessed the normality of the data and then selected appropriate statistical tests accordingly. We have added this information to the revised manuscript (lines 711-712).

      (2) Nomenclature:

      (a) Mouse Genome Informatics (MGI) nomenclature should be used to describe mouse genotypes (i.e., gene name in italic, only first letter is capitalized, alleles in superscript).

      (b) FLEx should be used instead of flex.

      Thank you for the suggestion. We have corrected these terms (including FLEx) according to MGI nomenclature.

      Reviewer #2 (Public review):

      Summary:

      This study investigates the role of spinal astrocytes in mediating stress-induced pain hypersensitivity, focusing on the LC (locus coeruleus)-to-SDH (spinal dorsal horn) circuit and its mechanisms. The authors aimed to delineate how LC activity contributes to spinal astrocytic activation under stress conditions, explore the role of noradrenaline (NA) signaling in this process, and identify the downstream astrocytic mechanisms that influence pain hypersensitivity.

      The authors provide strong evidence that 1-hour restraint stress-induced pain hypersensitivity involves the LC-to-SDH circuit, where NA triggers astrocytic calcium activity via alpha1a adrenoceptors (alpha1aRs). Blockade of alpha1aRs on astrocytes - but not on Vgat-positive SDH neurons - reduced stress-induced pain hypersensitivity. These findings are rigorously supported by well-established behavioral models and advanced genetic techniques, uncovering the critical role of spinal astrocytes in modulating stress-induced pain.

      However, the study's third aim - to establish a pathway from astrocyte alpha1aRs to adenosine-mediated inhibition of SDH-Vgat neurons - is less compelling. While pharmacological and behavioral evidence is intriguing, the ex vivo findings are indirect and lack a clear connection to the stress-induced pain model. Despite these limitations, the study advances our understanding of astrocyte-neuron interactions in stress-pain contexts and provides a strong foundation for future research into glial mechanisms in pain hypersensitivity.

      Strengths:

      The study is built on a robust experimental design using a validated 1-hour restraint stress model, providing a reliable framework to investigate stress-induced pain hypersensitivity. The authors utilized advanced genetic tools, including retrograde AAVs, optogenetics, chemogenetics, and subpopulation-specific knockouts, allowing precise manipulation and interrogation of the LC-SDH circuit and astrocytic roles in pain modulation. Clear evidence demonstrates that NA triggers astrocytic calcium activity via alpha1aRs, and blocking these receptors effectively reduces stress-induced pain hypersensitivity.

      Weaknesses:

      Despite its strengths, the study presents indirect evidence for the proposed NA-to-astrocyte(alpha1aRs)-to-adenosine-to-SDH-Vgat neurons pathway, as the link between astrocytic adenosine release and stress-induced pain remains unclear. The ex vivo experiments, including NA-induced depolarization of Vgat neurons and chemogenetic stimulation of astrocytes, are challenging to interpret in the stress context, with the high CNO concentration raising concerns about specificity. Additionally, the role of astrocyte-derived D-serine is tangential and lacks clarity regarding its effects on SDH Vgat neurons. The astrocyte calcium signal "dip" after LC optostimulation-induced elevation is presented without any interpretation.

      We appreciate the reviewer's careful reading of our paper. According to the reviewer's comments, we have performed new additional experiments and added some discussion in the revised manuscript (please see the point-by-point responses below).

      Reviewer #2 (Recommendations for the authors):

      The astrocyte-mediated pathway of NA-to-astrocyte (alpha1aRs)-to-adenosine-to-SDH Vgat neurons (A1R) in the context of stress-induced pain hypersensitivity requires more direct evidence. While the data showing that the A1R antagonist CPT inhibits stress-induced hypersensitivity and that stress combined with Aβ fiber stimulation increases pERK in the SDH are intriguing, these findings primarily support the involvement of A1R on Vgat neurons and are only behaviorally consistent with SDH-Vgat neuronal A1R knockdown. The role of astrocytes in this pathway in vivo remains indirect. The ex vivo chemogenetic Gq-DREADD stimulation of SDH astrocytes, which reduced sIPSCs in Vgat neurons in a CPT-dependent manner, needs revision with non-DREADD+CNO controls to validate specificity. Furthermore, the ex vivo bath application of NA causing depolarization in Vgat neurons, blocked by CPT, adds complexity to the data, leaving me wondering how astrocytes are involved in such processes, and it does not directly connect to stress-induced pain hypersensitivity. These findings are potentially useful but require additional refinement to establish their relevance to the stress model.

      We thank the reviewer for the insightful feedback. First, regarding the role of astrocytes in this pathway in vivo, we showed in the initial version that mechanical pain hypersensitivities induced by intrathecal NA injection and by acute restraint stress were attenuated by both pharmacological blockade and Vgat<sup>+</sup> neuron-specific knockdown of A<sub>1</sub>Rs (Figure 4A, B). Given that NA- and stress-induced pain hypersensitivity is mediated by α<sub>1A</sub>R-dependent signaling in Hes5<sup>+</sup> astrocytes (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652); this study), these findings provide in vivo evidence supporting the involvement of the NA → Hes5<sup>+</sup> astrocyte (via α<sub>1A</sub>Rs) → adenosine → Vgat<sup>+</sup> neuron (via A<sub>1</sub>Rs) pathway. As noted in the reviewer’s major comment (2), in vivo monitoring of adenosine dynamics in the SDH during stress exposure would further substantiate the astrocyte-to-neuron signaling pathway. However, we did not detect clear signals, potentially due to several technical limitations (see our response below). Acknowledging this limitation, we have now added a new paragraph at the end of the Discussion section to address this issue. Second, the specificity of the effect of CNO has now been validated by additional experiments (see our response to major point (4)). Third, the reviewer’s concern regarding the action of NA on Vgat<sup>+</sup> neurons has also been addressed (see our response to major point (3) below).

      Major points:

      (1) The in vivo pharmacology using DCK to antagonize D-serine signaling from alpha1a-activated astrocytes is tangential, as there is limited evidence on how Vgat neurons (among many others) respond to D-serine. This aspect requires more focused exploration to substantiate its relevance.

      We propose that the site of action of D-serine in our neural circuit model is the NMDA receptors (NMDARs) on excitatory neurons, a notion supported by our previous findings (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652); Kagiyama et al., Mol Brain, 2025 (PMID: 40289116)). However, we cannot exclude the possibility that D-serine also acts on NMDARs expressed by Vgat<sup>+</sup> inhibitory neurons. Nevertheless, given that intrathecal injection of D-serine in naïve mice induces mechanical pain hypersensitivity (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)), it appears that the pronociceptive effect of D-serine in the SDH is primarily associated with enhanced pain processing and transmission, presumably via NMDARs on excitatory neurons. We have added this point to the Discussion section in the revised manuscript (lines 325-330).

      (2) Additionally, employing GRAB-Ado sensors to monitor adenosine dynamics in SDH astrocytes during NA signaling would significantly strengthen conclusions about astrocyte-derived adenosine's role in the stress model.

      We agree with the reviewer’s comment. Following this suggestion, we attempted to visualize NA-induced adenosine (and ATP) dynamics using GRAB-ATP and GRAB-Ado sensors (Wu et al., Neuron, 2022 (PMID: 34942116); Peng et al., Science, 2020 (PMID: 32883833)) in acutely isolated spinal cord slices from mice after intra-SDH injection of AAV-hSyn-GRAB<sub>ATP1.0</sub> and -GRAB<sub>Ado1.0</sub>. We confirmed expression of these sensors in the SDH (Author response image 2a) and observed increased signals after bath application of ATP (0.1 or 1 µM) or adenosine (1 µM) (Author response image 2b, c). However, we were unable to detect clear signals following NA stimulation (Author response image 2b, c). The reason for this lack of detectable changes remains unclear. If the release of adenosine from astrocytes is a highly localized phenomenon, it may be measurable using high-resolution microscopy capable of detecting adenosine levels at the synaptic level and more sensitive sensors. Further investigation will therefore be required (lines 340-341).

      Author response image 2.

      Ex vivo imaging of GRAB-ATP and GRAB-Ado sensors. (a) Representative images of GRAB<sub>ATP1.0</sub> (left, green) or GRAB<sub>Ado1.0</sub> (right, green) expression in the SDH at 3 weeks after SDH injection of AAV-hSyn-GRAB<sub>ATP1.0</sub> or AAV-hSyn-GRAB<sub>Ado1.0</sub> in Hes5-CreERT2 mice. Scale bar, 200 µm. (b) Left: Representative fluorescence images showing GRAB<sub>ATP1.0</sub> responses before and after perfusion with NA or ATP. Right: Representative traces showing responses to ATP (0.1 and 1 µM) or NA (10 µM). (c) Left: Representative fluorescence images showing GRAB<sub>Ado1.0</sub> responses before and after perfusion with NA or adenosine (Ado). Right: Representative traces showing responses to Ado (0.01, 0.1, and 1 µM), NA (10 µM), or no application (negative control).

      (3) The interpretation of Figure 3D is challenging. The manuscript implies that 20 μM NA acts on Adra1a receptors on Vgat neurons to depolarize them, but this concentration should also activate Adra1a on astrocytes, leading to adenosine release and potential inhibition of depolarization. The observation of depolarization despite these opposing mechanisms requires explanation, as does the inhibition of depolarization by bath-applied A1R agonist. Of note, 20 μM NA is a high concentration for Adra1a activation, typically responsive at nanomolar levels. The discussion should reconcile this with prior studies indicating dose-dependent effects of NA on pain sensitivity (e.g., Reference 22).

      Like the reviewer, we also considered that bath-applied NA could activate α<sub>1A</sub>Rs expressed on Hes5<sup>+</sup> astrocytes. To clarify this point, we performed additional patch-clamp recordings and found that knockdown of A<sub>1</sub>Rs in Vgat<sup>+</sup> neurons tended to increase the proportion of Vgat<sup>+</sup> neurons with NA-induced depolarizing responses (Figure S8). Therefore, it is conceivable that NA-induced excitation of Vgat<sup>+</sup> neurons involves both a direct effect of NA activating α<sub>1A</sub>Rs in Vgat<sup>+</sup> neurons and indirect inhibitory signaling from NA-stimulated Hes5<sup>+</sup> astrocytes via adenosine (lines 298-300).

      The concentration of NA used in our ex vivo experiments is higher than that typically used in vitro with α<sub>1A</sub>R-expressing cell lines or primary culture cells, but is comparable to concentrations used in other studies employing spinal cord slices (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652); Baba et al., Anesthesiology, 2000 (PMID: 10691236); Lefton et al., Science, 2025 (PMID: 40373122)). In slice experiments, drugs must diffuse through the tissue to reach target cells, resulting in a concentration gradient. Therefore, higher drug concentrations are generally necessary in slice experiments, in contrast to cultured cell experiments, where drugs are directly applied to target cells. Importantly, we have previously shown that the pharmacological effects of 20 μM NA on Vgat<sup>+</sup> neurons and Hes5<sup>+</sup> astrocytes are abolished by loss of α<sub>1A</sub>Rs in these cells (Uchiyama et al., Mol Brain, 2022 (PMID: 34980215); Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)), confirming the specificity of these NA actions.

      Regarding the dose-dependent effect of NA on pain sensitivity, NA-induced pain hypersensitivity is abolished in Hes5<sup>+</sup> astrocyte-specific α<sub>1A</sub>R-KO mice (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)), indicating that this behavior is mediated by α<sub>1A</sub>Rs expressed on Hes5<sup>+</sup> astrocytes. In contrast, the suppression of pain sensitivity by high doses of NA was unaffected in the KO mice (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)), suggesting that other adrenergic receptors may contribute to this phenomenon. Clarifying the responsible receptors will require future investigation.

      (4) In Figure 3K-M, the CNO concentration used (100 μM) is unusually high compared to standard doses (1 to a few μM), raising concerns about potential off-target effects. Including non-hM3Dq controls and using lower CNO concentrations are essential to validate the specificity of the observed effects. Similarly, the study should clarify whether astrocyte hM3Dq stimulation alone (without NA) would induce hyperpolarization in Vgat neurons and how this interacts with NA-induced depolarization.

      We acknowledge that the concentration of CNO used in our experiments is relatively high compared to that used in other reports. However, in our experiments, application of CNO at 1, 10, and 100 μM induced Ca<sup>2+</sup> increases in GCaMP6-expressing astrocytes in spinal cord slices in a concentration-dependent manner (Figure S7). Among these, 100 μM CNO most effectively replicated the NA-induced Ca<sup>2+</sup> signals in astrocytes. Based on these findings, we selected this concentration for use in both the current and previous studies (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)). Importantly, to rule out non-specific effects, we conducted control experiments using spinal cord slices from mice that did not express hM3Dq in astrocytes and confirmed that CNO had no effect on Ca<sup>2+</sup> responses in astrocytes or on sIPSCs in substantia gelatinosa (SG) neurons (Figure S7; lines 223-228). Thus, although the CNO concentration used is relatively high, the observed effects of CNO are not non-specific but result from the chemogenetic activation of hM3Dq-expressing astrocytes.

      In this study, we used Hes5-CreERT2 and Vgat-Cre mice to manipulate gene expression in Hes5<sup>+</sup> astrocytes and Vgat<sup>+</sup> neurons, respectively. In order to fully address the reviewer’s comment, the use of both Cre lines is necessary. However, simultaneous and independent genetic manipulation in each cell type using Cre activity alone is not feasible with the current genetic tools. We have mentioned this as a technical limitation in the Discussion section (lines 382-388).

      (5) The role of D-serine released by hM3Dq-stimulated astrocytes in (separately) modulating sub-types of neurons including excitatory neurons and Vgat positives needs more detailed discussion. If no effect of D-serine on Vgat neurons is observed, this should be explicitly stated, and the discussion should address why this might be the case.

      As mentioned in our response to Major Point (1) above, we have added a discussion of this point in the revised manuscript (lines 325-330).

      (6) Finally, the observed "dip" in astrocyte calcium signals below baseline following the large peaks with LC optostimulation should be discussed further, as understanding this phenomenon could provide valuable insights into astrocytic signaling dynamics in the context of single acute or repetitive chronic stress.

      Thank you for your comment. We found that this phenomenon was not affected by pretreatment with the α<sub>1A</sub>R-specific antagonist silodosin (Author response image 3), which effectively suppressed Ca<sup>2+</sup> elevations evoked by stimulation of LC-NA neurons (Figure 2F). This implies that the phenomenon is independent of α<sub>1A</sub>R signaling. Elucidating the detailed underlying mechanism remains an important direction for future investigation.

      Author response image 3.

      The observed "dip" in astrocyte Ca<sup>2+</sup> signals was not affected by pretreatment with the α<sub>1A</sub>R-specific antagonist silodosin. Representative traces of astrocytic GCaMP6m signals in response to optogenetic stimulation of LC<sup>→SDH</sup>-NAergic axons/terminals in a spinal cord slice. Each trace shows the GCaMP6m signal before and after optogenetic stimulation (625 nm, 1 mW, 10 Hz, 5 ms pulse duration, 10 s). Slices were pretreated with silodosin (40 nM) for 5 min prior to stimulation.

      Reviewer #3 (Public review):

      Summary:

      This is an exciting and timely study addressing the role of descending noradrenergic systems in nocifensive responses. While it is well-established that spinally released noradrenaline (aka norepinephrine) generally acts as an inhibitory factor in spinal sensory processing, this system is highly complex. Descending projections from the A6 (locus coeruleus, LC) and the A5 regions typically modulate spinal sensory processing and reduce pain behaviours, but certain subpopulations of LC neurons have been shown to mediate pronociceptive effects, such as those projecting to the prefrontal cortex (Hirshberg et al., PMID: 29027903).

      The study proposes that descending cerulean noradrenergic neurons potentiate touch sensation via alpha-1 adrenoceptors on Hes5+ spinal astrocytes, contributing to mechanical hyperalgesia. This finding is consistent with prior work from the same group (dd et al., PMID:). However, caution is needed when generalising about LC projections, as the locus coeruleus is functionally diverse, with differences in targets, neurotransmitter co-release, and behavioural effects. Specifying the subpopulations of LC neurons involved would significantly enhance the impact and interpretability of the findings.

      Strengths:

      The study employs state-of-the-art molecular, genetic, and neurophysiological methods, including precise CRISPR and optogenetic targeting, to investigate the role of Hes5+ astrocytes. This approach is elegant and highlights the often-overlooked contribution of astrocytes in spinal sensory gating. The data convincingly support the role of Hes5+ astrocytes as regulators of touch sensation, coordinated by brain-derived noradrenaline in the spinal dorsal horn, opening new avenues for research into pain and touch modulation.

      Furthermore, the data support a model in which superficial dorsal horn (SDH) Hes5+ astrocytes act as non-neuronal gating cells for brain-derived noradrenergic (NA) signalling through their interaction with substantia gelatinosa inhibitory interneurons. Locally released adenosine from NA-stimulated Hes5+ astrocytes, following acute restraint stress, may suppress the function of SDH-Vgat+ inhibitory interneurons, resulting in mechanical pain hypersensitivity. However, the spatially restricted neuron-astrocyte communication underlying this mechanism requires further investigation in future studies.

      Weaknesses

      (1) Specificity of the LC Pathway targeting

      The main concern lies with how definitively the LC pathway was targeted. Were other descending noradrenergic nuclei, such as A5 or A7, also labelled in the experiments? The authors must convincingly demonstrate that the observed effects are mediated exclusively by LC noradrenergic terminals to substantiate their claims (i.e. "we identified a circuit, the descending LC→SDH-NA neurons").

      (a) For instance, the direct vector injection into the LC likely results in unspecific effects due to the extreme heterogeneity of this nucleus and retrograde labelling of the A5 and A7 nuclei from the LC (i.e., Li et al., PMID: 26903420).

      We appreciate the reviewer's valuable comments. To address this point, we performed additional experiments and demonstrated that intra-SDH injection of AAVretro-Cre followed by intra-LC injection of AAV2/9-EF1α-FLEx[DTR-EGFP] specifically results in DTR expression in NA neurons of the LC, but not of the A5 or A7 regions (Figure S4; lines 127-128). These results confirm the specificity of targeting the LC<sup>→SDH</sup>-NAergic pathway in our study.

      (b) It is difficult to believe that the intersectional approach described in the study successfully targeted LC→SDH-NA neurons using AAVrg vectors. Previous studies (e.g., PMID: 34344259 or PMID: 36625030) demonstrated that similar strategies were ineffective for spinal-LC projections. The authors should provide detailed quantification of the efficiency of retrograde labelling and specificity of transgene expression in LC neurons projecting to the SDH.

      Thank you for your comment. As we described in our response to the weakness (5)-e) of Reviewer #1 (Public review), our additional analysis showed that, under our experimental conditions, expression of genes (for example DTR) was observed in 4.4% of NA (TH<sup>+</sup>) neurons in the LC (Figure S4; lines 126-127).

      The reason for this difference between the previous studies and our current study is unclear; however, it is likely attributable to methodological differences, including the type of viral vectors employed, species differences (mouse (PMID: 34344259, our study) vs. rat (PMID: 36625030)), the amount of AAV injected into the SDH (300 nL at three sites (PMID: 34344259), and 300 nL at a single site (our study)) and LC (500 nL at a single site (PMID: 34344259), and 300 nL at a single site (our study)), as well as the depth of AAV injection in the SDH (200–300 µm from the dorsal surface of the spinal cord (PMID: 34344259), and 120–150 µm in depth from the surface of the dorsal root entry zone (our study)).

      (c) Furthermore, it is striking that the authors observed a comparably strong phenotypical change in Figure 1K despite fewer neurons being labelled, compared to Figure 1H and 1N with substantially more neurons being targeted. Interestingly, the effect in Figure 1K appears more pronounced but shorter-lasting than in the comparable experiment shown in Figure 1H. This discrepancy requires further explanation.

      Although only a representative section of the LC was shown in the initial version, LC<sup>→SDH</sup>-NA neurons are distributed rostrocaudally throughout the LC, as previously reported (Llorca-Torralba et al., Brain, 2022 (PMID: 34373893)). Our additional experiments analyzing multiple sections of the anterior and posterior regions of the LC have now revealed that approximately sixty LC<sup>→SDH</sup>-NA neurons express DTR, and these neurons are eliminated following DTX treatment (Figure S4; lines 126-128) (it should be noted that these neurons specifically project to the L4 segment of the SDH, and the total number of LC<sup>→SDH</sup>-NA neurons is likely much higher). Considering the specificity of LC<sup>→SDH</sup>-NAergic pathway targeting demonstrated in our study (as described above), together with the fact that primary afferent sensory fibers from the plantar skin of the hindpaw predominantly project to the L4 segment of the SDH, these data suggest that the observed behavioral changes are attributable to the loss of these neurons and that ablation of even a relatively small number of NA neurons in the LC can have a significant impact on behavior. We have added this hypothesis in the Discussion section (lines 373-382).

      Regarding the data in Figures 1H and 1K, as the reviewer pointed out, a statistically significant difference was observed at 90 min in mice with ablation of LC-NA neurons, but not in those with LC<sup>→SDH</sup>-NA neuron ablation. This is likely due to a slightly higher threshold in the control group at this time point (Figure 1K), and it remains unclear whether there is a mechanistic difference between the two groups at this specific time point.

      (d) A valuable addition would be staining for noradrenergic terminals in the spinal cord for the intersectional approach (Figure 1J), as done in Figures 1F/G. LC projections terminate preferentially in the SDH, whereas A5 projections terminate in the deep dorsal horn (DDH). Staining could clarify whether circuits beyond the LC are being ablated.

      As suggested, we performed DTR immunostaining in the SDH; however, we did not detect any DTR immunofluorescence there. A similar result was also observed in the spinal terminals of DTR-expressing primary afferent fibers (our unpublished data). The reason for this is unclear, but to the best of our knowledge, no studies have clearly shown DTR expression at presynaptic terminals, which may be because the action of DTX on the neuronal cell body is necessary for cell ablation. Nevertheless, as described in our response to the weakness (5)-f) by Reviewer 1 (Public review), we have now confirmed the specific expression of DTR in the LC, but not in the A5 and A7 regions (Figure S4; lines 127-128).

      (e) Furthermore, different LC neurons often mediate opposite physiological outcomes depending on their projection targets; for example, dorsal LC neurons projecting to the prefrontal cortex (PFCx) are pronociceptive, while ventral LC neurons projecting to the SC are antinociceptive (PMIDs: 29027903, 34344259, 36625030). Given this functional diversity, direct injection into the LC is likely to result in nonspecific effects.

      To avoid behavioral outcomes resulting from a mixture of facilitatory and inhibitory effects caused by activating the entire population of LC-NA neurons, we employed a specific manipulation targeting LC<sup>→SDH</sup>-NA neurons using AAV vectors. The specificity of this manipulation was confirmed in our previous study (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)) and in the current study (Figure S4). Using this approach, we previously demonstrated that LC neurons can exert pronociceptive effects via astrocytes in the SDH (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)). This pronociceptive role is further supported by the current study, which uses a more selective manipulation of LC<sup>→SDH</sup>-NA neurons through a NET-Cre mouse line. In addition, intrathecal administration of relatively low doses of NA in naïve mice clearly induces mechanical pain hypersensitivity. Nevertheless, we have also acknowledged that several recent studies have reported an inhibitory role of LC<sup>→SDH</sup>-NA neurons in spinal nociceptive signaling. The reason for these differing behavioral outcomes remains unclear, but several methodological differences may underlie the discrepancy. First, the degree of LC<sup>→SDH</sup>-NA neuronal activity may play a role. Although direct comparisons between studies reporting pro- and anti-nociceptive effects are difficult, our previous studies demonstrated that intrathecal administration of high doses of NA in naïve mice does not induce mechanical pain hypersensitivity (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)). Second, the sensory modality used in behavioral testing may be a contributing factor as the pronociceptive effect of NA appears to be selectively observed in responses to mechanical, but not thermal, stimuli (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)). This sensory modality-selective effect is also evident in mice subjected to acute restraint stress (Table S1). 
      Therefore, the role of LC<sup>→SDH</sup>-NA neurons in modulating nociceptive signaling in the SDH is more complex than previously appreciated, and their contribution to pain regulation should be reconsidered in light of factors such as NA levels, sensory modality, and experimental context. In revising the manuscript, we have included some points described above in the Discussion (lines 282-291).

      Conclusion on Specificity: The authors are strongly encouraged to address these limitations directly, as they significantly affect the validity of the conclusions regarding the LC pathway. Providing more robust evidence, acknowledging experimental limitations, and incorporating complementary analyses would greatly strengthen the manuscript.

      We appreciate the reviewer’s comments. We fully acknowledge the limitations raised and agree that addressing them directly is important for the rigor of our conclusions on the LC pathway. To this end, we have performed additional experiments (e.g., Figure A and S4), which are now included in the revised manuscript. Furthermore, we have added a new paragraph on experimental limitations at the end of the Discussion section (lines 373-408). We believe these new data substantially strengthen the validity of our findings and have clarified these points in the Discussion section.

      (2) Discrepancies in Data

      (a) Figures 1B and 1E: The behavioural effect of stress on PWT (Figure 1E) persists for 120 minutes, whereas Ca2+ imaging changes (Figure 1B) are only observed in the first 20 minutes, with signal attenuation starting at 30 minutes. This discrepancy requires clarification, as it impacts the proposed mechanism.

      Thank you for your important comment. As pointed out by the reviewer, there is a difference between the duration of behavioral responses and Ca<sup>2+</sup> events, although the exact time point at which the PWT begins to decline remains undetermined (as behavioral testing cannot be conducted during stress exposure). A similar temporal difference was also observed following intraplantar injection of capsaicin (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)); while LC<sup>→SDH</sup>-NA neuron-mediated astrocytic Ca<sup>2+</sup> responses in SDH astrocytes last for 5–10 min after injection, behavioral hypersensitivity peaks around 60 min post-injection and gradually returns to baseline over the subsequent 60–120 min. These findings raise the possibility that astrocyte-mediated pain hypersensitivity in the SDH may involve a sustained alteration in spinal neural function, such as central sensitization. We have added this hypothesis to the Discussion section of the revised manuscript (lines 399-408), as it represents an important direction for future investigation.

      (b) Figure 4E: The effect is barely visible, and the tissue resembles "Swiss cheese," suggesting poor staining quality. This is insufficient for such an important conclusion. Improved staining and/or complementary staining (e.g., cFOS) are needed. Additionally, no clear difference is observed between Stress+Ab stim. and Stress+Ab stim.+CPT, raising doubts about the robustness of the data.

      As suggested, we performed c-FOS immunostaining and obtained clearer results (Figure 4E,F; lines 243-252). We also quantitatively analyzed the number of c-FOS<sup>+</sup> cells in the superficial laminae, and the results are consistent with those obtained from the pERK experiments.

      (c) Discrepancy with Existing Evidence: The claim regarding the pronociceptive effect of LC→SDH-NAergic signalling on mechanical hypersensitivity contrasts with findings by Kucharczyk et al. (PMID: 35245374), who reported no facilitation of spinal convergent (wide-dynamic range) neuron responses to tactile mechanical stimuli, but potent inhibition to noxious mechanical von Frey stimulation. This discrepancy suggests alternative mechanisms may be at play and raises the question of why noxious stimuli were not tested.

      In our experiments, ChrimsonR expression was observed in the superficial and deeper laminae of the spinal cord (Figure S6). Due to the technical limitations of the optical fibers used for optogenetics, the light stimulation could only reach the superficial laminae; therefore, it may not have affected the activity of neurons (including WDR neurons) located in the deeper laminae. Furthermore, the study by Kucharczyk et al. (Brain, 2022 (PMID: 35245374)) employed a stimulation protocol that differed from ours, applying continuous stimulation over several minutes. Given that the levels of NA released from LC<sup>→SDH</sup>-NAergic terminals in the SDH increase with the duration of terminal stimulation (as shown in Figure 2B), longer stimulation may result in higher levels of NA in the SDH. Considering also our data indicating that the pro- and anti-nociceptive effects of NA are dose dependent (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)), these differences may be related to LC<sup>→SDH</sup>-NA neuron activity, NA levels in the SDH, and the differential responses of SDH neurons in the superficial versus deeper laminae (lines 388-395).

      (3) Sole reliance on Von Frey testing

      The exclusive use of von Frey as a behavioural readout for mechanical sensitisation is a significant limitation. This assay is highly variable, and without additional supporting measures, the conclusions lack robustness. Incorporating other behavioural measures, such as the adhesive tape removal test to evaluate tactile discomfort, the needle floor walk corridor to assess sensitivity to uneven or noxious surfaces, or the kinetic weight-bearing test to measure changes in limb loading during movement, could provide complementary insights. Physiological tests, such as the Randall-Selitto test for noxious pressure thresholds or CatWalk gait analysis to evaluate changes in weight distribution and gait dynamics, would further strengthen the findings and allow for a more comprehensive assessment of mechanical sensitisation.

      Thank you for your suggestion. Based on our previous findings that Hes5<sup>+</sup> astrocytes in the SDH selectively modulate mechanosensory signaling (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)), the present study focused on behavioral responses to mechanical stimuli using von Frey filaments. As we have not previously conducted most of the behavioral tests suggested by the reviewer, and as we currently lack the necessary equipment for these tests (e.g., Randall–Selitto test, CatWalk gait analysis, and weight-bearing test), we were unable to include them in this study. However, it will be of great interest in future research to investigate whether activation of the LC<sup>→SDH</sup>-NA neuron-to-SDH Hes5<sup>+</sup> astrocyte signaling pathway similarly sensitizes behavioral responses to other types of mechanical stimuli and also to investigate the sensory modality-selective pro- and antinociceptive role of LC<sup>→SDH</sup>-NAergic signaling in the SDH (lines 396-399).

      Overall Conclusion

      This study addresses an important and complex topic with innovative methods and compelling data. However, the conclusions rely on several assumptions that require more robust evidence. Specificity of the LC pathway, experimental discrepancies, and methodological limitations (e.g., sole reliance on von Frey) must be addressed to substantiate the claims. With these issues resolved, this work could significantly advance our understanding of astrocytic and noradrenergic contributions to pain modulation.

      We have made every effort to address the reviewer’s concerns through additional experiments and analyses. Based on the new control data presented, we believe that our explanation is reasonable and acceptable. Although additional data cannot be provided on some points due to methodological constraints and limitations of the techniques currently available in our laboratory, we respectfully submit that the evidence presented sufficiently supports our conclusions.

      Reviewer #3 (Recommendations for the authors):

      A lot of beautiful and challenging-to-collect data is presented. Sincere congratulations to all the authors on this achievement!

      Notwithstanding, please carefully reconsider the conclusions regarding the LC pathway, as additional evidence is required to ensure their specificity and robustness.

      We thank the reviewer for the kind comments and for raising an important point regarding the LC pathway. The reviewer’s feedback prompted us to conduct additional investigations to further strengthen the validity of our conclusions. We have incorporated these new data and analyses into the revised manuscript, and we believe that these revisions substantially enhance the robustness and reliability of our findings.

    1. In random moments, you find yourself wondering if you should just burn it all to the ground and start over.

      I would make this bolder - on first read I'm thinking that I would maybe even make this the header because it's very striking and head-nod-y

    1. Reviewer #1 (Public review):

      Summary:

      This study by Howe and colleagues investigates the role of the posterolateral cortical amygdala (plCoA) in mediating innate responses to odors, specifically attraction and aversion. By combining optogenetic stimulation, single-cell RNA sequencing, and spatial analysis, the authors identify a topographically organized circuit within plCoA that governs these behaviors. They show that specific glutamatergic neurons in the anterior and posterior regions of plCoA are responsible for driving attraction and avoidance, respectively, and that these neurons project to distinct downstream regions, including the medial amygdala and nucleus accumbens, to control these responses.

      Strengths:

      The major strength of the study is the thoroughness of the experimental approach, which combines advanced techniques in neural manipulation and mapping with high-resolution molecular profiling. The identification of a topographically organized circuit in plCoA and the connection between molecularly defined populations and distinct behaviors is a notable contribution to understanding the neural basis of innate motivational responses. Additionally, the use of functional manipulations adds depth to the findings, offering valuable insights into the functionality of specific neuronal populations.

      Weaknesses:

      Previously described weaknesses in the study's methods and interpretation were fully addressed during revision. Locomotor behavior of the mice during head-fixed imaging experiments was added and analysis of the correlation of locomotion with neural activity was also added.

      This work provides significant insights into the neural circuits underlying innate behaviors and opens new avenues for further research. The findings are particularly relevant for understanding the neural basis of motivational behaviors in response to sensory stimuli, and the methods used could be valuable for researchers studying similar circuits in other brain regions. If the authors address the methodological issues raised, this work could have a substantial impact on the field, contributing to both basic neuroscience and translational research on the neural control of behavior.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study by Howe and colleagues investigates the role of the posterolateral cortical amygdala (plCoA) in mediating innate responses to odors, specifically attraction and aversion. By combining optogenetic stimulation, single-cell RNA sequencing, and spatial analysis, the authors identify a topographically organized circuit within plCoA that governs these behaviors. They show that specific glutamatergic neurons in the anterior and posterior regions of plCoA are responsible for driving attraction and avoidance, respectively, and that these neurons project to distinct downstream regions, including the medial amygdala and nucleus accumbens, to control these responses.

      Strengths:

      The major strength of the study is the thoroughness of the experimental approach, which combines advanced techniques in neural manipulation and mapping with high-resolution molecular profiling. The identification of a topographically organized circuit in plCoA and the connection between molecularly defined populations and distinct behaviors is a notable contribution to understanding the neural basis of innate motivational responses. Additionally, the use of functional manipulations adds depth to the findings, offering valuable insights into the functionality of specific neuronal populations.

      Weaknesses:

      There are some weaknesses in the study's methods and interpretation. The lack of clarity regarding the behavior of the mice during head-fixed imaging experiments raises the possibility that restricted behavior could explain the absence of valence encoding at the population level.

      We agree with the idea that head-fixation may alter the state of the animal and the neural encoding of odor. To address this, we have provided further analysis of walking behavior during the imaging sessions, which is provided in Figure S2. Overall, we could not identify any clear patterns in locomotor behavior that are odor-specific. Moreover, when neural activity was sorted depending on the behavioral state (walking, pausing or fleeing) we didn’t observe any apparent patterns in odor-evoked neural activity. This is now discussed in the Results and Limitations sections of the manuscript.

      Furthermore, while the authors employ chemogenetic inhibition of specific pathways, the rationale for this choice over optogenetic inhibition is not fully addressed, and this could potentially affect the interpretation of the results.

      The rationale was logistical. First, inhibition over a timescale of minutes is problematic because prolonged optical stimulation generates heat. Second, our behavioral apparatus has a narrow height between the ceiling and floor, making tethering difficult. This is now explained in the Results section. The trade-off of using chemogenetics is that we are silencing neurons and not specific projections. However, because we find that NAc- and MeA-projecting neurons have little shared collateralization, we believe the conclusion of divergent pathways still stands. This is now discussed in the Limitations section.

      Additionally, the choice of the mplCoA for manipulation, rather than the more directly implicated anterior and posterior subregions, is not well-explained, which could undermine the conclusions drawn about the topographic organization of plCoA.

      We targeted the middle region of plCoA because it contains a mixture of cell types found in both the anterior and posterior plCoA, allowing us to test the hypothesis that cell types, not intra-plCoA location, elicit different responses. Had we targeted the anterior or posterior regions, we would expect to simply recapitulate the result from activation of random cells in each region. As a result, we think stimulation in the middle plCoA is a better test for the contribution of cell types. We have now clarified this in the text.

      Despite these concerns, the work provides significant insights into the neural circuits underlying innate behaviors and opens new avenues for further research. The findings are particularly relevant for understanding the neural basis of motivational behaviors in response to sensory stimuli, and the methods used could be valuable for researchers studying similar circuits in other brain regions. If the authors address the methodological issues raised, this work could have a substantial impact on the field, contributing to both basic neuroscience and translational research on the neural control of behavior.

      Reviewer #2 (Public review):

      Summary:

      The manuscript by the Root laboratory and colleagues describes how the posterolateral cortical amygdala (plCoA) generates valenced behaviors. Using a suite of methods, the authors demonstrate that valence encoding is mediated by several factors, including spatial localization of neurons within the plCoA, glutamatergic markers, and projection. The manuscript shows convincingly that multiple features (spatial, genetic, and projection) contribute to overall population encoding of valence. Overall, the authors conduct many challenging experiments, each of which contains the relevant controls, and the results are interpreted within the framework of their experiments.

      Strengths:

      - For a first submission the manuscript is well constructed, containing lots of data sets and clearly presented, in spite of the abundance of experimental results.

      - The authors should be commended for their rigorous anatomical characterizations and posthoc analysis. In the field of circuit neuroscience, this is rarely done so carefully, and when it is, often new insights are gleaned as is the case in the current manuscript.

      - The combination of molecular markers, behavioral readouts and projection mapping together substantially strengthen the results.

      - The focus on this relatively understudied brain region in the context of valence is well appreciated, exciting and novel.

      Weaknesses:

      - Interpretation of calcium imaging data is very limited and requires additional analysis, and behavioral responses specific to odors should be considered. If there are neural responses, behavioral epochs and responses to those neuronal responses should be displayed and analyzed.

      We have now considered this, see response above.

      - The effect of odor habituation is not considered.

      We considered this, but we did not find any apparent differences in valence encoding as measured by the proportion of neurons with significant valence scores across trials (see Figure 1J).

      - Optogenetic data in the two subregions relies on very careful viral spread and fiber placement. The current anatomy results provided should be clear about the spread of virus in A-P, and D-V axis, providing coordinates for this, to ensure readers the specificity of each sub-zone is real.

      We were careful to exclude animals with improper targeting. The spread of virus is detailed in Figures S3, S8 & S9.

      - The choice of behavioral assays across the two regions doesn't seem balanced and would benefit from more congruency.

      The 4-quadrant assay was chosen because this study builds on our prior experiments that demonstrate a role for the plCoA in innate behavior. It is noteworthy that the responses to odor seen in this assay are generally in agreement with other olfactory behavioral assays, so one wouldn’t predict a different result. Moreover, the approach and avoidance responses measured in this assay are precisely the behaviors we wish to understand. We did examine other non-olfactory behavioral readouts (Figures S3, S8), and didn’t observe any effect of manipulation of these pathways.

      - Rationale for some of the choices of photo-stimulation experiment parameters isn't well defined.

      The parameters for photo-stimulation were based on those used in our past work (Root et al., 2014). We used a gradient of frequency from 1-10 Hz based on the idea that odor likely exists in a gradient and this was meant to mimic a potential gradient, though we don’t know if it exists. The range in stimulation frequencies appears to align with the actual rate of firing of plCoA neurons (Iurilli et al., 2017).

      Reviewer #3 (Public review):

      Summary:

      Combining electrophysiological recording, circuit tracing, single cell RNAseq, and optogenetic and chemogenetic manipulation, Howe and colleagues have identified a graded division between anterior and posterior plCoA and determined the molecular characteristics that distinguish the neurons in this part of the amygdala. They demonstrate that the expression of slc17a6 is mostly restricted to the anterior plCoA whereas slc17a7 is more broadly expressed. Through both anterograde and retrograde tracing experiments, they demonstrate that the anterior plCoA neurons preferentially projected to the MEA whereas those in the posterior plCoA preferentially innervated the nucleus accumbens. Interestingly, optogenetic activation of the aplCoA drives avoidance in a spatial preference assay whereas activating the pplCoA leads to preference. The data support a model that spatially segregated and molecularly defined populations of neurons and their projection targets carry valence specific information for the odors. The discoveries represent a conceptual advance in understanding plCoA function and innate valence coding in the olfactory system.

      Strengths:

      The strongest evidence supporting the model comes from single cell RNASeq, genetically facilitated anterograde and retrograde circuit tracing, and optogenetic stimulation. The evidence clearly demonstrates two molecularly defined cell populations with differential projection targets. Stimulating the two populations produced opposite behavioral responses.

      Weaknesses:

      There are a couple of inconsistencies that may be addressed by additional experiments and careful interpretation of the data.

      Stimulating aplCoA or slc17a6 neurons results in spatial avoidance, and stimulating pplCoA or slc17a7 neurons drives approach behaviors. On the other hand, the authors and others in the field also show that there is no apparent spatial bias in odor-driven responses associated with odor valence. This discrepancy may be addressed better. A possibility is that odor-evoked responses are recorded from populations outside of those defined by slc17a6/a7. This may be addressed by marking activated cells and identifying their molecular markers. A second possibility is that optogenetic stimulation activates a broad set of neurons and does not recapitulate the sparseness of odor responses. It is not known whether sparse activation by optogenetic stimulation can still drive approach or avoidance behaviors.

      We agree that marking specific genetically or projection-defined neurons could help to clarify whether some neurons have more selective valence responses. However, we are not able to perform these experiments at the moment. We have included new data demonstrating that sparser optogenetic activation evokes behaviors similar in magnitude to those evoked by broader activation (see Figure S4).

      The authors show that inhibiting slc17a7 neurons blocks approaching behaviors toward 2-PE. Consistent with this result, inhibiting NAc projection neurons also inhibits approach responses. However, inhibiting aplCoA or slc17a6 neurons does not reduce aversive response to TMT, but blocking MEA projection neurons does. The latter two pieces of evidence are not consistent with each other. One possibility is that the MEA projecting neurons may not be expressing slc17a6. It is not clear from the retrograde labeling experiments what percentage of MEA- and NAc-projecting neurons express slc17a6 and slc17a7. It is possible that neurons expressing neither VGluT1 nor VGluT2 could drive aversive or appetitive responses. This possibility may also explain why silencing slc17a6 neurons does not block avoidance.

      We have now performed RNAscope staining on retrograde tracing to better define this relationship. Although the VGluT2 and VGluT1 neurons have biased projections to the MeA and NAc, respectively, there is some nuance detailed in Figure S10. Generally, MeA-projecting neurons are predominantly VGluT2+, whereas about 20% of NAc-projecting neurons express both. Some (less than 35%) retrogradely labeled neurons were not detected as VGluT1 or VGluT2 positive, suggesting that other populations could also contribute. We agree that the discrepancy between MeA-projection and VGluT2 silencing is likely due to incomplete targeting of the MeA-projecting population with the VGluT2-cre line. This is included in the Discussion section.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Main:

      (1) For the head-fixed imaging experiments, what is the behavior of the mice during odor exposure? Could the weak reliability of individual neurons be due to a lack of approach or avoidance behavior? Could restricted behavior also explain the lack of valence encoding at the population level?

      We agree that this is a limitation of head-fixed recordings. In the revised manuscript we did attempt to characterize their behavioral response, and look for correlations in odor representation. Although we did find different patterns of odor-evoked walking behavior, these patterns were not reliable or specific to particular odors (Figure S2). For example, one might expect aversive odors to pause walking or elicit a fast fleeing-like response, but we did not observe any apparent differences for locomotion between odors as all odors evoked a mixture of responses (Figure S2A-D, text lines 208-232). We then examined responses to odor depending on the behavioral state (walking, pausing or fleeing) and didn’t observe any apparent patterns in odor responses (Figure S2E,F). Lastly, we acknowledge in the text that the lack of valence encoding may be an artifact of head-fixation (see lines 849-857).

      (2) For the optogenetic manipulations of Vglut1 and Vglut2 neurons, why was the injection and fiber targeted to the medial portion of the plCoA, if the hypothesis was that these glutamatergic neuron populations in different regions (anterior or posterior) are responsible for approach and avoidance? 

      We targeted the middle region of plCoA because it contains a mixture of cell types found in both the anterior and posterior plCoA, allowing us to test the hypothesis that cell types, not intra-plCoA location, elicit different responses. Had we targeted the anterior or posterior regions, we would expect to simply recapitulate the result from activation of random cells in each region. As a result, we think stimulation in the middle plCoA is a better test for the contribution of cell types. We have clarified this in the text (Lines 417-419).

      Could this explain the lack of necessity with the DREADD experiments? 

      For the loss-of-function experiments, a larger volume of virus was injected to cover a larger area and we did confirm targeting of the appropriate areas. However, it is always possible that the lack of necessity is due to incomplete silencing.

      Further, why was an optogenetic inhibition approach not utilized? 

      Although optogenetic inhibition could have plausibly been used instead, we chose chemogenetic inhibition for two reasons: First, for minutes-long periods of inhibition, optical illumination poses the risk of introducing heat-related effects (Owen et al., 2019). In fact, we first tried optical inhibition but controls exhibited unusually large variance. Second, it is more feasible in our assay as it has a narrow height between the floor and lid that complicates tethering to an optic fiber. Past experiments overcame this with a motorized fiber retraction system (Root et al., 2014), but this is highly variable with user-dependent effects, so we found chemogenetics to be a more practical strategy. We have added a sentence to explain the rationale (see lines 561-563).

      (3) The specific subregion of the nucleus accumbens that was targeted should be named, as distinct parts of the nucleus accumbens can have very different functions. 

      We attempted to define specific subregions of the nucleus accumbens and found that the plCoA projection is not specific to the shell or core, anterior or posterior; rather, it broadly innervates the entire structure. We have added a note about this in the manuscript (see lines 470-471). Given that we did not find notable subregion-specific outputs within the NAc, targeting was directed to the middle region of NAc, with coordinates stated in the methods.

      (4) Why was an intersectional DREADD approach used to inhibit the projection pathways, as opposed to optogenetic inhibition? The DREADD approach could potentially affect all projection targets, and the authors might want to address how this could influence the interpretation of the results.

      This is partly addressed above in point 2. As for interpretation, we acknowledge that the intersectional approach silences the neurons projecting to a given target and not the specific projection and we have been careful with the wording. Although this may complicate the conclusion, we did map the collaterals for NAc- and MeA-projecting neurons and find that neurons do not appreciably project to both targets and have minimal projections to other targets. We have now taken care to state that we silence the neurons projecting to a structure, not silencing the projection, and we acknowledge this caveat. However, since the MeA- and NAc-projecting neurons appear to be distinct from each other (largely not collateralizing to each other), the conclusion that these divergent pathways are required still stands. We have added discussion of this in the Limitations section (see lines 859-863).

      Minor:

      (1) Line 402 needs a reference.

      We have added the missing reference (now line 441).

      (2) The Supplemental Figure labeling in the main text should be checked carefully.

      Thank you for pointing this out. We have fixed the prior errors.

      (3) Panel letter D is missing from Figure 2.

      This has been fixed.

      Reviewer #2 (Recommendations for the authors):

      Major Concerns, additional experiments:

      - In the calcium imaging experiments mice were presented with the same odor many times. Overall responses to odor presentations were quite variable and appear to habituate dramatically (Figure S1F). The general conclusion from these experiments is a lack of consistent valence-specific responses of individual neurons, but I wonder if this conclusion is slightly premature. A few potential explanatory factors may need additional attention. First, despite recording video of the mouse's face during experiments, no behavioral response to any odor is described. Is it possible these odors when presented in head-fixed conditions do not have the same valence?

      Yes, we agree that this is a possibility. We have added a discussion in the Limitations section (see lines 849-857). We have also added additional behavioral analysis discussed below.

      On trials with neural responses are there behavioral responses that could be quantified? 

      We have now added data in which we attempt to characterize their behavioral response, to look for correlations in odor representation (see lines 208-228). Although we did observe different patterns of odor-evoked walking behavior, these patterns were not reliable or specific to particular odors (Figure S2). One might expect aversive odors to pause walking or elicit a fast fleeing-like response, but we did not observe any apparent differences for locomotion between odors (Figure S2A-D). Next, we examined responses to odor depending on the behavioral state (walking, pausing or fleeing) and didn’t observe any meaningful differences in odor responses (Figure S2E,F). Lastly, we acknowledge that the odor representation may be different in freely moving animals that exhibit dynamic responses to odor (see lines 849-857).

      - Habituation seems to play a prominent role in the neural signals, is there a larger contribution of valence if you look only at the first delivery (or some subset of the 20 presentations) of an odor type for a given trial? 

      Indeed, we considered this, but we did not find any apparent differences in valence encoding as measured by the proportion of neurons with significant valence scores across trials (see Figure 1J).

      - Is it reasonable to exclude valence encoding as a possibility when largely neurons were unresponsive to the positive valence odors (2PE and peanut) chosen when looking at the average cluster response (Figure 1F)? 

      It is true that we see fewer neurons responding to the appetitive odors (Figure 1H) and smaller average responses within the cluster, but some neurons do respond robustly. If these were valence responses, we would predict that neural responses should be similarly selective, but we do not observe any such selectivity. The sparseness of responses to appetitive odors does cause the average cluster analysis (Figure 1F) to show muted responses to these odors, consistent with the decreased responsivity to appetitive odors. Moreover, single neuron response analysis reveals that a given neuron is not more likely to respond to appetitive or aversive odors with any selectivity greater than chance. For these reasons, we think it is reasonable to conclude an absence of valence responses, which is consistent with the conclusion from another report (Iurilli et al., 2017).

      - While the preference and aversion assay with 4 corners is an interesting set-up and provides a lot of data for this particular manuscript, it would be helpful to test additional behaviors to determine whether these circuits are more conserved. As it stands the current manuscript relies on very broad claims using a single behavioral readout. Some attempts to use head-fixed approaches with more defined odor delivery timelines and/or additional valenced behavioral readouts are warranted.

      We appreciate the suggestion, but are not able to perform these experiments at the moment. The 4-quadrant assay was chosen because it builds on our prior experiments that demonstrate a role for the plCoA in innate behavior. It is noteworthy that the responses to odor seen in this assay are generally in agreement with other olfactory behavioral assays, so one wouldn’t predict a different result. The approach and avoidance responses measured in this assay are precisely the behaviors we wish to understand. Moreover, we did examine other non-olfactory behavioral readouts (Figures S3, S8), and didn’t observe any effect of manipulation of these pathways. Lastly, we have tried to define parameters for head-fixed behavior that would permit correlation of neural responses with behavior, including longer stimulations and closed-loop locomotion control of odor concentration, but were unsuccessful at establishing parameters that generated reliable behavioral responses. We acknowledge that one limitation of the study is that behavior was tested with only two odors, leaving open whether the circuits are more broadly necessary for other odors.

      Minor comments:

      • Please define PID in the Results when it is first introduced.

      Done (see line 154)

      • Line 412 Figure S5C-N should be Figure S6C-N.

      Fixed. Now Figure S8C-N due to additional figures (see line 451).

      • Throughout the Discussion it would be helpful if the authors referred to specific Figure panels that support their statements (e.g. lines 654-656 "[...] which is supported by other findings presented here showing that both VGluT2+ and VGluT1+ neurons project to MeA, while the projection to NAc is almost entirely composed of VGluT1+ neurons").

      Thank you for the suggestion. We have added figure references in the discussion.

      • Line 778 "producing" should be "produce".

      Corrected (see line 840)

      • The figures are very busy, especially all the manipulations. The authors are commended for including each data point, but they might consider a more subtle design (translucent lines only for each animal, and one mean dot for the SEM), just to reduce the overall clutter of an already overwhelming figure set. But this is ultimately left to the authors to resolve and style to their liking. 

      Thank you for the suggestion. We have tried some different styles but like the original best.

      Reviewer #3 (Recommendations for the authors):

      If within reach, I suggest that the authors determine the percentage of retrogradely labeled neurons projecting to NAc or MEA that express VGluT1 and VGluT2.

      We have done this for the middle region of plCoA, which has the greatest mixture of cell types (see Figure S10, lines 504-517). We find that the MeA-projecting neurons are mostly VGluT2+ with a minority that express both VGluT1 and VGluT2. NAc-projecting neurons are primarily VGluT1+ with about 20% expressing VGluT2 as well.

      It would also be nice to sparsely label aplCoA and pplCoA using ChR2 to see if sparse activation drives approach or avoidance.

      We agree that it would be useful to vary the sparseness of the ChR2 expression, to see if it produces similar results. We examined this using sparsely labeled odor ensembles, as previously done (Root et al., 2014). Briefly, we used the Arc-CreER mouse to label TMT-responsive neurons with a cre-dependent ChR2 AAV vector targeted to the anterior or posterior regions, whereas previously we had broadly targeted the entirety of plCoA. We had established that this labeling method captures about half of the active cells detected by Arc expression, which is on the order of hundreds of neurons rather than the thousands labeled by broad cre-independent expression. Remarkably, we get effects similar in magnitude that are not significantly different from those with broader activation of the anterior or posterior domains (see new Figure S4, lines 267-288). It still remains possible that there is a threshold number of neurons that are necessary to elicit behavior, but that is beyond the scope of the current study. However, these data indicate that the effect of activating anterior and posterior domains is not an artifact of broad stimulation.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      Manuscript number: RC-2025-03280

      Corresponding author(s): Stephan Gruber

      1. General Statements [optional]

      First, we would like to thank the editor at Review Commons for the efficient handling of our manuscript. We also apologize for our delayed response.

      We are grateful to all three reviewers for their careful evaluation of our work and for their constructive feedback, which will provide a valuable basis for improving the figures and the text, as described below. We expect to be able to complete the revision following the plan described below quickly.

      We note that the reviewer reports (Rev. #1 and Rev. #3) made us realize that the manuscript text was misleading on the following point. Although we used the purified ATP hydrolysis–deficient Smc protein for sybody isolation, this does not restrict the selection to a specific conformation. As described in detail in Vazquez-Nunez et al. (Figure 5), this mutant displays the ATP-engaged conformation only in a smaller fraction of complexes (~25% in the presence of ATP and DNA), consistent with prior in vivo observations reported by Diebold-Durand et al. (Figure 5). Rather than limiting the selection to a particular configuration, our aim was to reduce the prevalence of the predominant rod state in order to broaden the range of conformations represented during sybody selection. Consistent with this interpretation, only a small number of isolated sybodies show strong conformation-specific binding in the presence or absence of ATP/DNA, as observed by ELISA (now included in the manuscript). We will revise the manuscript text accordingly to clarify this point.

      2. Description of the planned revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Gosselin et al. develop a method to target protein activity using synthetic single-domain nanobodies (sybodies). They screen a library of sybodies using ribosome/phage display generated against the Bacillus Smc-ScpAB complex. Specifically, they use an ATP hydrolysis deficient mutant of SMC so as to identify sybodies that will potentially disrupt Smc-ScpAB activity. They next screen their library in vivo, using growth defects in rich media as a read-out for Smc activity perturbation. They identify 14 sybodies that mirror the smc deletion phenotype, including defective growth in fast-growth conditions, as well as chromosome segregation defects. The authors use a clever approach by making chimeras between Bacillus and S. pneumoniae Smc to narrow down to specific regions within the Bacillus Smc coiled-coil that are likely targets of the sybodies. Using ATPase assays, they find that the sybodies either impede DNA-stimulated ATP hydrolysis or hyperactivate ATP hydrolysis (even in the absence of DNA). The authors propose that the sybodies may likely be locking Smc-ScpAB in the "closed" or "open" state via interaction with the specific coiled-coil region on Smc. I have a few comments that the authors should consider:

      Major comments:

      1. Lack of direct in vitro binding measurements: The authors do not provide measurements of sybody affinities, binding/unbinding kinetics, or stoichiometries with respect to Smc-ScpAB. Additionally, do the sybodies preferentially interact with Smc in the ATP/DNA-bound state? And do the sybodies affect the interaction of ScpAB with SMC? It is understandable that such measurements for 14 sybodies are challenging, and not essential for this study. Nonetheless, it would be informative to have biochemical characterization of sybody interaction with the Smc-ScpAB complex for at least 1-2 candidate sybodies described here.

      We agree with the reviewer that adding such data would be reassuring and that obtaining solid data using purified components is not easy even for a smaller selection of sybodies. We have data that show direct binding of Smc to sybodies by various methods including ELISA, pull-downs and by biophysical methods (GCI). Initially, we omitted these data from the manuscript as we are convinced that the mapping data obtained with chimeric SMC proteins is more definitive and relevant. During the revision we will incorporate the ELISA data showing direct binding and also indicating a lack of preference for a specific state of Smc.

      2. Many modes of sybody binding to Smc are plausible: The authors provide an elaborate discussion of sybodies locking the Smc-ScpAB complex in open/closed states. However, in the absence of structural support, the mechanistic inferences may need to be tempered. For example, is it not also possible for the sybodies to bind the inner interface of the coiled-coil, resulting in steric hindrance to coiled-coil interactions? It is also possible that sybody interaction disrupts ScpAB interaction (as data ruling this possibility out have not been provided). Thus, other potential mechanisms would be worth considering/discussing. In this direction, did AlphaFold reveal any potential insights into putative binding locations?

      We have attempted to map the binding by structure prediction; however, so far, even the latest versions of AlphaFold are not able to clearly delineate the binding interface. Indeed, many ways of binding are possible, including disruption of ScpAB interaction. However, since the main binding site is located on the SMC coiled coils, the latter scenario would likely be an indirect consequence of altered coiled coil configuration, consistent with our current interpretation.

      3. Sybody expression in vivo: Have the authors estimated sybody expression in vivo? Are they all expressed to similar levels?

      We have tagged selected sybodies with GFP and performed live cell imaging. This showed that they are all roughly equally expressed and that they localize as foci in the cell, presumably by binding to Smc complexes loaded onto the chromosome at ParB/parS sites. We will include these data in the revised version of the manuscript.

      4. Sybodies should phenocopy ATP hydrolysis mutant of Smc: The sybodies were screened against an ATP hydrolysis deficient mutant of Smc, with the rationale that these sybodies would interfere with this step of the Smc duty cycle. Does the expression of the sybodies in vivo phenocopy the ATP hydrolysis deficient mutant of Smc? Could the authors consider any phenotypic read-outs that can indicate whether the sybody action results in an smc-null effect or specifically an ATP hydrolysis deficient effect?

      As alluded to above, we think that our selection gave rise to sybodies that bind various, possibly multiple, Smc conformations. Consistent with this idea, the phenotypes are similar to those of the null mutant rather than the ATP-hydrolysis defective EQ mutant, which displays even more severe growth phenotypes. We will add the following notes to the text:

      “These conditions favour ATP-engaged particles alongside the typically predominant ATP-disengaged rod-shaped state (add Vazquez Nunez et al., 2021).”

      “ELISA data confirm that nearly all clones bind Smc-ScpAB; however, their binding shows little or no dependence on the presence of ATP or DNA.”

      Minor comments: 1. It was surprising that no sybodies were found that could target both bacillus and spneu Smc. For example, sybodies targeting the head regions of Smc that might work in a more universal manner. Could the authors comment on the coverage of the sybodies across the protein structure?

      It is rather common that sybodies (like antibodies and nanobodies) exhibit strong affinity differences between highly conserved proteins (> 90 % identity). The underlying reasons for such strong discrimination are i) location of less conserved residues primarily at the target protein surface and ii) the large interaction interface between sybody and target, which offers multiple vulnerabilities for disturbance, in particular through bulky side chains resulting in steric clashes. Another frequently observed phenomenon is sybody binding to a dominant epitope, which also often applies to nanobodies and antibodies. Good examples of this are the dominant epitopes on SARS-CoV-2 RBDs.

      Growth curves (Fig. S3) show a large jump in recovery in growth under sybody induction conditions. Could the authors address this observation here and in the text?

      We suppose that this recovery represents suppressor mutants and/or (more likely) improved growth in the absence of functional Smc during nutrient limitation (see Gruber et al., 2013 and Wang et al., 2013). We will add this statement to the text.

      3. L41 - Sentence correction: Loop can be removed.

      Ah, yes, sorry for this confusing error. Thank you.

      4. L525 - bsuSmc 'E': extra E can be removed.

      To do. Thank you.

      5. References need to be properly formatted.

      To do. Thank you.

      6. The authors should add in figure legend for Fig 1i) details on representation of the purple region, and explain the grey strokes for orientation of the loop.

      To do.

      7. How many cells were analysed in the cell biological assays? Legends should include this information.

      To be included.

      Reviewer #1 (Significance (Required)):

      Overall, this is an impressive study that uses an elegant strategy to find inhibitors of protein activity in vivo. The manuscript is clearly written and the experiments are logical and well-designed. The findings from the study will be significant to the broad field of genome biology, synthetic biology and also SMC biology. Specifically, the coiled coil domain of SMC proteins has been proposed to be of high functional value. The authors have elegantly identified key coiled-coil regions that may be important for function and, in parallel, demonstrated the potential of synthetic sybodies/designed binders for inhibiting protein activity.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Review: "Single Domain Antibody Inhibitors Target the Coiled Coil Arms of the Bacillus subtilis SMC complex" by Ophélie Gosselin et al, Review Commons RC-2025-03280 Structural Maintenance of Chromosome proteins (SMCs), a family of proteins found in almost all organisms, are organizers of DNA. They accomplish this by a process known as loop extrusion, wherein double-stranded DNA is actively reeled in and extruded into loops. Although SMCs are known to have several DNA binding regions, the exact mechanism by which they facilitate loop extrusion is not understood but is believed to entail large conformational changes. There are currently several models for loop extrusion, including one wherein the coiled coil (CC) arms open, but there is a lack of insightful experimentation and analysis to confirm any of these models. The work presented aims to provide much-needed new tools to investigate these questions: conformation-selective sybodies (synthetic nanobodies) that are likely to alter the CC opening and closing reactions. The authors produced, isolated, and expressed sybodies that specifically bound to Bacillus subtilis Smc-ScpAB. Using chimeric Smc constructs, where the coiled coils were partly replaced with the corresponding sequences from Streptococcus pneumoniae, the authors revealed that the isolated sybodies all targeted the same 4N CC element of the Smc arms. This region is likely disrupted by the sybodies either by stopping the arms from opening (correctly) or forcing them to stay open (enough). Disrupting these functional elements is suggested to cause the Smc-dependent chromosome organization lethal phenotype, implying that arm opening and closing is a key regulatory feature of bacterial Smc-ScpAB. In summary, the authors present a new method for trapping bacterial Smc's in certain conformations using synthetic antibodies. 
Using these antibodies, they have pinpointed the (previously suggested) 4N region of the coiled coils as an essential site for the opening and closing of the Smc coiled coil arms and shown that hindering these reactions blocks Smc-driven chromosomal organization. The work has important implications for how we might elucidate the mechanism of DNA loop extrusion by SMC complexes.

      Some specific comments:

      Line 75: "likely stabilizing otherwise rare intermediates of the conformational cycle." - sorry, why is that being concluded? Why not stabilizing longer-lived conformations?

      We will clarify this statement!

      Line 89: Sorry, possibly our lack of understanding: why first ribosome and then phage display?

      Ribosome display allows screening of around 10^12 sybodies per selection round (technically unrestricted library size), while for phage display, the library size is restricted to around 10^9 sybodies because production of a phage library requires transformation of the phagemid plasmid into E. coli, which introduces a diversity bottleneck. This is why the sybody platform starts off with ribosome display. It switches to phage display from round 2 onwards because the output of the initial round of ribosome display is around 10^6 sybodies, which can be easily transferred into the phage display format. Phage display is used to minimize selection biases. For more information, please consult the original sybody paper (PMID: 29792401).

      Line 100: Why was only lethality selected? Less severe phenotypes not clear enough?

      Yes, colony size is more difficult to score robustly, as the sizes of individual transformant colonies can vary quite widely. The number of isolated sybodies was at the limit of further analysis.

      Line 106: Could it be tested somehow if convex and concave library sybodies fold in Bs?

      We did not focus on the non-functional sybody candidates, and only sybodies of the loop library turned out to cause functional consequences at the cellular level. Notably, we will include gfp-imaging showing that non-lethal sybodies are expressed at levels similar to those of toxic sybodies. Given the identical scaffold of concave and loop sybodies (they only differ in their CDR3 length), we expect that the concave sybodies fold in the cytoplasm of B. subtilis. For the convex sybodies, which have a different scaffold, this will be tested.

      Line 125: Could Pxyl be repressed by glucose?

      To our knowledge and experience, repression by glucose (catabolite repression) does not work well in this context in B. subtilis.

      Line 131: The SMC replacement strain is a cool experiment and removes a lot of doubts!

      Thank you! (we agree 😊)

      Line 141: The mapping is good and looks reliable, but looks and feels like a tour de force? Of course, some cryo-EM would have been lovely (lines 228-229 understood, it has been tried!).

      Yes, we have made several attempts at structural biology. Unfortunately, Smc-ScpAB is not well suited for cryo-EM in our hands and crystallography with Smc fragments and sybodies did not yield well-diffracting crystals.

      Line 179: Mmmh. Do we not assume DNA binding on top of the dimerised heads to open the CC (clamp)?

      We will clarify the text here.

      Line 187: Having sybodies that presumably keep the CC together (closing) and some that do not allow them to come together correctly (opening) is really cool and probably important going forward.

      Thank you!

      Figure 1 Ai is not very colour-blind friendly.

      We are sorry for this oversight. We will try to make the color scheme more inclusive. Thank you for the notification.

      Optional: did the authors see any spontaneous mutations emerge that bypass the lethal phenotype of sybody expression?

      No, we did not observe spontaneous mutations suppressing the phenotype, possibly due to the limited number of cell generations observed. We tried to avoid suppressors by limiting growth, but this may indeed be a good future approach to further fine-map the binding sites and to obtain insights into the mechanism of inhibition.

      Optional: we think it would be nice to try some biochemical experiment with BMOE/cysteine-crosslinked B. subtilis Smc in the mid-region (4N or next to it) of the Smc coiled coils to try to further strengthen the story. Some of the authors are experts in this technique and strains might already exist?

      We have indeed tried to study the impact of sybody binding on Smc conformation by cysteine cross-linking. However, we were not convinced by the results and thus prefer not to draw any conclusions from them. We will add a corresponding note to the text.

      Reviewer #2 (Significance (Required)):

      The authors present a new method for trapping bacterial Smc's in certain conformations using synthetic antibodies. Using these antibodies, they have pinpointed the (previously suggested) 4N region of the coiled coils as an essential site for the opening and closing of the Smc coiled coil arms and shown that hindering these reactions blocks Smc-driven chromosomal organization. The work has important implications for how we might elucidate the mechanism of DNA loop extrusion by SMC complexes.

      Thank you!

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Gosselin et al. use the sybody technology to study effects of in vivo inhibition of the Bacillus subtilis SMC complex. Smc proteins are central DNA binding elements of several complexes that are vital for chromosome dynamics in almost all organisms. Sybodies are selected from three different libraries of the single domain antibodies, using the "transition state" mutant Smc. They identify 14 such mutant sybodies that are lethal when expressed in vivo, because they prevent proper function of Smc. The authors present evidence suggesting that all obtained sybodies bind to a coiled-coil region close to the Smc "neck", and thereby interfere with the Smc activity cycle, as evidenced by defective ATPase activity when Smc is bound to DNA. The study is well done and presented and shows that the strategy is very potent in finding a means to quickly turn off a protein's function in vivo, much quicker than depleting the protein.

      The authors also draw conclusions on the molecular mode of action of the SMC complex. They provide a number of suggestive experiments, but in my view mostly indirect evidence for such a mechanism.

      My main criticism is that the authors have used a single, catalytically trapped form of SMC. They speculate why they only obtain sybodies from one library, and then only identify sybodies that bind to a rather small part of the large Smc protein. While the approach is definitely valuable, it is biased towards sybodies that bind to Smc in a quite special way, it seems. Using wild type Smc would be interesting, to make more robust statements about the action of sybodies potentially binding to different parts of Smc.

      As explained above, we are quite confident that the Smc ATPase mutation did not bias the selection in an obvious way. The surprising bias towards coiled coil binding sites likely has other explanations: the coiled coils probably form a preferred epitope recognized by sybodies.

      Line 105: "Alternatively, the other libraries did not produce good binders or these sybodies were not stably expressed in B. subtilis." This could be tested using Western blotting - I am assuming sybody antibodies are commercially available. However, this test is not important for the overall study; it would just clarify a minor point.

      While there are antibody fragments available to augment the size of sybodies (PMID: 40108246), these recognize 3D-epitopes and are thus not suited for Western blotting. We did not follow up on the negative results much, but would like to point out again that there are several biases that likely emerge for the same reason (bias to library, bias to coiled coil binding site). If correct, then likely few other sybodies are effectively lethal in B. subtilis, with the exception of the ones isolated and characterized. We have added this notion to the manuscript. We have also tested the expression of non-lethal sybodies by gfp-tagging and imaging. These results will be included in the revision.

      Fig. 2B: it is odd to count Spo0J foci per cell, as it is clear from the images that several origins must be present within the fluorescent foci. I am fine with the "counting" method, as the images show there is a clear segregation defect when sybodies are expressed. I believe the authors should state, though, that this is not a replication block, but a failure to segregate origins.

      We agree that this is an important point and will add a corresponding comment to the text.

      Testing binding sites of sybodies to the SMC complex is done in an indirect manner, by using chimeric Smc constructs. I am surprised that the authors have not used in vitro crosslinking: the authors can purify Smc, and mass spectrometry analyses would identify sites where sybodies are crosslinked to Smc. Again, I am fine with the indirect method, but the authors make quite concrete statements on binding based on non-inhibition of chimeric Smc; I can see alternative explanations why a chimera may not be targeted.

      We have made several attempts at testing direct binding, with mixed outcomes, and decided not to include those results in light of the stronger and more relevant in vivo mapping. However, we will add ELISA results and briefly discuss grating-coupled interferometry (GCI) data and pull-downs.

      Smc-disrupting sybodies affect the ATPase activity in one of two ways. Again, rather indirect experiments. This leads to the point "Revealing Smc arm dynamics through synthetic binders" in the discussion. The authors are quite careful in stating that their experiments are suggestive of a certain mode of action of Smc, which is warranted.

      In line 245, they state: "More broadly, the study demonstrates how synthetic binders can trap, stabilize, or block transient conformations of active chromatin-associated machines, providing a powerful means to probe their mechanisms in living cells." This is of course a possible scenario for the use of sybodies, but the study does not really trap Smc in a transient conformation; at least this is not clearly shown.

      We agree and will carefully rephrase this statement. Thank you.

      Overall, it is an interesting study, with a well-presented novel technology, and a limited gain of knowledge on SMC proteins.

      We respectfully disagree with the last point, since our unique results highlight the importance of the Smc coiled coils, which are otherwise largely neglected in the SMC literature, likely (at least in part) due to the mild effect of single point mutations on coiled coil dynamics.

      Reviewer #3 (Significance (Required)):

      The work describes the gaining and use of single-binder antibodies (sybodies) to interfere with the function of proteins in bacteria. Using this technology for the SMC complex, the authors demonstrate that they can obtain a significant number of binders that target a defined region in SMC and thereby interfere with the ATPase cycle.

      The study does not present a strong gain of knowledge of the mode of action of the SMC complex.

      As pointed out above, we respectfully disagree with this assertion.


      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.


      4. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.

      As pointed out above, there are a few minor points that we prefer not to address experimentally. In particular, we do not consider it necessary to determine the expression levels of sybodies that were non-inhibitory. We also wish to note that we attempted to obtain additional structural and biochemical data and to that end performed cryo-EM, crystallography and cysteine cross-linking experiments. Unfortunately, we did not obtain sybody complex structures, and the cross-linking data were not conclusive. We also wish to note that the first author has finished her PhD and left the lab, which limits our capacity to add additional experiments. However, as the reviewers also pointed out, the main conclusions are well supported by the data already.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #1

      Evidence, reproducibility and clarity

      Gosselin et al. develop a method to target protein activity using synthetic single-domain nanobodies (sybodies). They screen a library of sybodies using ribosome/phage display generated against the Bacillus Smc-ScpAB complex. Specifically, they use an ATP hydrolysis deficient mutant of SMC so as to identify sybodies that will potentially disrupt Smc-ScpAB activity. They next screen their library in vivo, using growth defects in rich media as a read-out for Smc activity perturbation. They identify 14 sybodies that mirror the smc deletion phenotype, including defective growth in fast-growth conditions as well as chromosome segregation defects. The authors use a clever approach by making chimeras between Bacillus and S. pneumoniae Smc to narrow down to specific regions within the Bacillus Smc coiled-coil that are likely targets of the sybodies. Using ATPase assays, they find that the sybodies either impede DNA-stimulated ATP hydrolysis or hyperactivate ATP hydrolysis (even in the absence of DNA). The authors propose that the sybodies may be locking Smc-ScpAB in the "closed" or "open" state via interaction with the specific coiled-coil region on Smc. I have a few comments that the authors should consider:

      Major comments:

      1. Lack of direct in vitro binding measurements: The authors do not provide measurements of sybody affinities, binding/unbinding kinetics, or stoichiometries with respect to Smc-ScpAB. Additionally, do the sybodies preferentially interact with Smc in the ATP/DNA-bound state? And do the sybodies affect the interaction of ScpAB with SMC? It is understandable that such measurements for 14 sybodies are challenging, and not essential for this study. Nonetheless, it would be informative to have biochemical characterization of sybody interaction with the Smc-ScpAB complex for at least 1-2 candidate sybodies described here.
      2. Many modes of sybody binding to Smc are plausible: The authors provide an elaborate discussion of sybodies locking the Smc-ScpAB complex in open/closed states. However, in the absence of structural support, the mechanistic inferences may need to be tempered. For example, is it not also possible for the sybodies to bind the inner interface of the coiled-coil, resulting in steric hindrance to coiled-coil interactions? It is also possible that sybody interaction disrupts ScpAB interaction (as data ruling this possibility out have not been provided). Thus, other potential mechanisms would be worth considering/discussing. In this direction, did AlphaFold reveal any potential insights into putative binding locations?
      3. Sybody expression in vivo: Have the authors estimated sybody expression in vivo? Are they all expressed to similar levels?
      4. Sybodies should phenocopy ATP hydrolysis mutant of Smc: The sybodies were screened against an ATP hydrolysis deficient mutant of Smc, with the rationale that these sybodies would interfere with this step of the Smc duty cycle. Does the expression of the sybodies in vivo phenocopy the ATP hydrolysis deficient mutant of Smc? Could the authors consider any phenotypic read-outs that can indicate whether the sybody action results in an smc-null effect or specifically an ATP hydrolysis deficient effect?

      Minor comments:

      1. It was surprising that no sybodies were found that could target both bacillus and spneu Smc. For example, sybodies targeting the head regions of Smc that might work in a more universal manner. Could the authors comment on the coverage of the sybodies across the protein structure?
      2. Growth curves (Fig. S3) show a large jump in recovery in growth under sybody induction conditions. Could the authors address this observation here and in the text?
      3. L41- Sentence correction: Loop can be removed.
      4. L525 - bsuSmc 'E' :extra E can be removed.
      5. References need to be properly formatted.
      6. The authors should add in figure legend for Fig 1i) details on representation of the purple region, and explain the grey strokes for orientation of the loop.
      7. How many cells were analysed in the cell biological assays? Legends should include this information.

      Significance

      Overall, this is an impressive study that uses an elegant strategy to find inhibitors of protein activity in vivo. The manuscript is clearly written and the experiments are logical and well-designed. The findings from the study will be significant to the broad field of genome biology, synthetic biology and also SMC biology. Specifically, the coiled coil domain of SMC proteins has been proposed to be of high functional value. The authors have elegantly identified key coiled-coil regions that may be important for function and, in parallel, demonstrated the potential of synthetic sybodies/designed binders for inhibiting protein activity.

    1. In his early years at the head of Membertou, the nation’s financial situation was dismal; today, its total annual revenue, which includes fishing, gaming, and real-estate projects, is over sixty-two million dollars. The success has been communal: some of the profits go to investment in future economic development, and the rest goes to social programming, housing, education, and dividends received by individual nation members.

      Chief Terry Paul is inspired by the Choctaw approach, which welcomed gambling, big industry, and such -> brings this to Cape Breton and brings Membertou out of poverty between the 80s and now -> communal wealth -> reinvested into community and social programming, even dividends.

      He was the architect of the Clearwater deal.

    1. M. Krempe was not equally docile; and in my condition at that time, of almost insupportable sensitiveness, his harsh blunt encomiums gave me even more pain than the benevolent approbation of M. Waldman. “D–n the fellow!” cried he; “why, M. Clerval, I assure you he has outstript us all. Aye, stare if you please; but it is nevertheless true. A youngster who, but a few years ago, believed Cornelius Agrippa as firmly as the gospel, has now set himself at the head of the university; and if he is not soon pulled down, we shall all be out of countenance. – Aye, aye,” continued he, observing my face expressive of suffering, “M. Frankenstein is modest; an excellent quality in a young man. Young men should be diffident of themselves, you know, M. Clerval; I was myself when young: but that wears out in a very short time.”

      Encomium means a speech or writing that praises someone or something highly.

  2. Feb 2026
    1. THE AMERICAN YAWP

6. A New Nation

“The Federal Pillars,” from The Massachusetts Centinel, August 2, 1789. Library of Congress.

I. Introduction; II. Shays’s Rebellion; III. The Constitutional Convention; IV. Ratifying the Constitution; V. Rights and Compromises; VI. Hamilton’s Financial System; VII. The Whiskey Rebellion and Jay’s Treaty; VIII. The French Revolution and the Limits of Liberty; IX. Religious Freedom; X. The Election of 1800; XI. Conclusion; XII. Primary Sources; XIII. Reference Material

I. Introduction

On July 4, 1788, Philadelphians turned out for a “grand federal procession” in honor of the new national constitution. Workers in various trades and professions demonstrated. Blacksmiths carted around a working forge, on which they symbolically beat swords into farm tools. Potters proudly carried a sign paraphrasing from the Bible, “The potter hath power over his clay,” linking God’s power with an artisan’s work and a citizen’s control over the country. Christian clergymen meanwhile marched arm-in-arm with Jewish leaders. The grand procession represented what many Americans hoped the United States would become: a diverse but cohesive, prosperous nation.1 Over the next few years, Americans would celebrate more of these patriotic holidays. In April 1789, for example, thousands gathered in New York to see George Washington take the presidential oath of office. 
That November, Washington called his fellow citizens to celebrate with a day of thanksgiving, particularly for “the peaceable and rational manner” in which the government had been established.2 But the new nation was never as cohesive as its champions had hoped. Although the officials of the new federal government—and the people who supported it—placed great emphasis on unity and cooperation, the country was often anything but unified. The Constitution itself had been a controversial document adopted to strengthen the government so that it could withstand internal conflicts. Whatever the later celebrations, the new nation had looked to the future with uncertainty. Less than two years before the national celebrations of 1788 and 1789, the United States had faced the threat of collapse.

II. Shays’s Rebellion

Daniel Shays became a divisive figure, to some a violent rebel seeking to upend the new American government, to others an upholder of the true revolutionary virtues Shays and others fought for. This contemporary depiction of Shays and his accomplice Job Shattuck portrays them in the latter light as rising “illustrious from the Jail.” Unidentified artist, Daniel Shays and Job Shattuck, 1787. Wikimedia.

In 1786 and 1787, a few years after the Revolution ended, thousands of farmers in western Massachusetts were struggling under a heavy burden of debt. Their problems were made worse by weak local and national economies. Many political leaders saw both the debt and the struggling economy as a consequence of the Articles of Confederation, which provided the federal government with no way to raise revenue and did little to create a cohesive nation out of the various states. The farmers wanted the Massachusetts government to protect them from their creditors, but the state supported the lenders instead. As creditors threatened to foreclose on their property, many of these farmers, including Revolutionary War veterans, took up arms. 
Led by a fellow veteran named Daniel Shays, these armed men, the “Shaysites,” resorted to tactics like the patriots had used before the Revolution, forming blockades around courthouses to keep judges from issuing foreclosure orders. These protesters saw their cause and their methods as an extension of the “Spirit of 1776”; they were protecting their rights and demanding redress for the people’s grievances. Governor James Bowdoin, however, saw the Shaysites as rebels who wanted to rule the government through mob violence. He called up thousands of militiamen to disperse them. A former Revolutionary general, Benjamin Lincoln, led the state force, insisting that Massachusetts must prevent “a state of anarchy, confusion and slavery.”3 In January 1787, Lincoln’s militia arrested more than one thousand Shaysites and reopened the courts. Daniel Shays and other leaders were indicted for treason, and several were sentenced to death, but eventually Shays and most of his followers received pardons. Their protest, which became known as Shays’s Rebellion, generated intense national debate. While some Americans, like Thomas Jefferson, thought “a little rebellion now and then” helped keep the country free, others feared the nation was sliding toward anarchy and complained that the states could not maintain control. For nationalists like James Madison of Virginia, Shays’s Rebellion was a prime example of why the country needed a strong central government. “Liberty,” Madison warned, “may be endangered by the abuses of liberty as well as the abuses of power.”4

III. The Constitutional Convention

The uprising in Massachusetts convinced leaders around the country to act. After years of goading by James Madison and other nationalists, delegates from twelve of the thirteen states met at the Pennsylvania state house in Philadelphia in the summer of 1787. Only Rhode Island declined to send a representative. 
The delegates arrived at the convention with instructions to revise the Articles of Confederation. The biggest problem the convention needed to solve was the federal government’s inability to levy taxes. That weakness meant that the burden of paying back debt from the Revolutionary War fell on the states. The states, in turn, found themselves beholden to the lenders who had bought up their war bonds. That was part of why Massachusetts had chosen to side with its wealthy bondholders over poor western farmers.5 James Madison, however, had no intention of simply revising the Articles of Confederation. He intended to produce a completely new national constitution. In the preceding year, he had completed two extensive research projects—one on the history of government in the United States, the other on the history of republics around the world. He used this research as the basis for a proposal he brought with him to Philadelphia. It came to be called the Virginia Plan, named after Madison’s home state.6

James Madison was a central figure in the reconfiguration of the national government. Madison’s Virginia Plan was a guiding document in the formation of a new government under the Constitution. John Vanderlyn, Portrait of James Madison, 1816. Wikimedia.

The Virginia Plan was daring. Classical learning said that a republican form of government required a small and homogenous state: the Roman republic, or a small country like Denmark, for example. Citizens who were too far apart or too different could not govern themselves successfully. Conventional wisdom said the United States needed to have a very weak central government, which should simply represent the states on certain matters they had in common. Otherwise, power should stay at the state or local level. But Madison’s research had led him in a different direction. He believed it was possible to create “an extended republic” encompassing a diversity of people, climates, and customs. 
The Virginia Plan, therefore, proposed that the United States should have a strong federal government. It was to have three branches (legislative, executive, and judicial) with power to act on any issues of national concern. The legislature, or Congress, would have two houses, in which every state would be represented according to its population size or tax base. The national legislature would have veto power over state laws.7 Other delegates to the convention generally agreed with Madison that the Articles of Confederation had failed. But they did not agree on what kind of government should replace them. In particular, they disagreed about the best method of representation in the new Congress. Representation was an important issue that influenced a host of other decisions, including deciding how the national executive branch should work, what specific powers the federal government should have, and even what to do about the divisive issue of slavery. For more than a decade, each state had enjoyed a single vote in the Continental Congress. William Paterson’s New Jersey Plan proposed to keep things that way. The Connecticut delegate Roger Sherman, furthermore, argued that members of Congress should be appointed by the state legislatures. Ordinary voters, Sherman said, lacked information, were “constantly liable to be misled” and “should have as little to do as may be” about most national decisions.8 Large states, however, preferred the Virginia Plan, which would give their citizens far more power over the legislative branch. James Wilson of Pennsylvania argued that since the Virginia Plan would vastly increase the powers of the national government, representation should be drawn as directly as possible from the public. No government, he warned, “could long subsist without the confidence of the people.”9 Ultimately, Roger Sherman suggested a compromise. 
Congress would have a lower house, the House of Representatives, in which members were assigned according to each state’s population, and an upper house, which became the Senate, in which each state would have one vote. This proposal, after months of debate, was adopted in a slightly altered form as the Great Compromise: each state would have two senators, who could vote independently. In addition to establishing both types of representation, this compromise also counted three-fifths of a state’s enslaved population for representation and tax purposes. The delegates took even longer to decide on the form of the national executive branch. Should executive power be in the hands of a committee or a single person? How should its officeholders be chosen? On June 1, James Wilson moved that the national executive power reside in a single person. Coming only four years after the American Revolution, that proposal was extremely contentious; it conjured up images of an elected monarchy.10 The delegates also worried about how to protect the executive branch from corruption or undue control. They endlessly debated these questions, and not until early September did they decide the president would be elected by a special electoral college. In the end, the Constitutional Convention proposed a government unlike any other, combining elements copied from ancient republics and English political tradition but making some limited democratic innovations—all while trying to maintain a delicate balance between national and state sovereignty. It was a complicated and highly controversial scheme.   IV. Ratifying the Constitution Delegates to the Constitutional Convention assembled, argued, and finally agreed in this room, styled in the same manner as during the Convention. Photograph of the Assembly Room, Independence Hall, Philadelphia, Pennsylvania. Wikimedia. Creative Commons Attribution-Share Alike 3.0 Unported. 
The convention voted to send its proposed Constitution to Congress, which was then sitting in New York, with a cover letter from George Washington. The plan for adopting the new Constitution, however, required approval from special state ratification conventions, not just Congress. During the ratification process, critics of the Constitution organized to persuade voters in the different states to oppose it. Importantly, the Constitutional Convention had voted down a proposal from Virginia’s George Mason, the author of Virginia’s state Declaration of Rights, for a national bill of rights. This omission became a rallying point for opponents of the document. Many of these Anti-Federalists argued that without such a guarantee of specific rights, American citizens risked losing their personal liberty to the powerful federal government. The pro-ratification Federalists, on the other hand, argued that including a bill of rights was not only redundant but dangerous; it could limit future citizens from adding new rights.11 Citizens debated the merits of the Constitution in newspaper articles, letters, sermons, and coffeehouse quarrels across America. Some of the most famous, and most important, arguments came from Alexander Hamilton, John Jay, and James Madison in the Federalist Papers, which were published in various New York newspapers in 1787 and 1788.12 The first crucial vote came at the beginning of 1788 in Massachusetts. At first, the Anti-Federalists at the Massachusetts ratifying convention probably had the upper hand, but after weeks of debate, enough delegates changed their votes to narrowly approve the Constitution. But they also approved a number of proposed amendments, which were to be submitted to the first Congress. This pattern—ratifying the Constitution but attaching proposed amendments—was followed by other state conventions. 
The most high-profile convention was held in Richmond, Virginia, in June 1788, when Federalists like James Madison, Edmund Randolph, and John Marshall squared off against equally influential Anti-Federalists like Patrick Henry and George Mason. Virginia was America’s most populous state, it had produced some of the country’s highest-profile leaders, and the success of the new government rested upon its cooperation. After nearly a month of debate, Virginia voted 89 to 79 in favor of ratification.13 On July 2, 1788, Congress announced that a majority of states had ratified the Constitution and that the document was now in effect. Yet this did not mean the debates were over. North Carolina, New York, and Rhode Island had not completed their ratification conventions, and Anti-Federalists still argued that the Constitution would lead to tyranny. The New York convention would ratify the Constitution by just three votes, and finally Rhode Island would ratify it by two votes—a full year after George Washington was inaugurated as president.   V. Rights and Compromises Although debates continued, Washington’s election as president cemented the Constitution’s authority. By 1793, the term Anti-Federalist would be essentially meaningless. Yet the debates produced a piece of the Constitution that seems irreplaceable today. Ten amendments were added in 1791. Together, they constitute the Bill of Rights. James Madison, against his original wishes, supported these amendments as an act of political compromise and necessity. He had won election to the House of Representatives only by promising his Virginia constituents such a list of rights. There was much the Bill of Rights did not cover. Women found no special protections or guarantee of a voice in government. Many states continued to restrict voting only to men who owned significant amounts of property. And slavery not only continued to exist; it was condoned and protected by the Constitution. 
Of all the compromises that formed the Constitution, perhaps none would be more important than the compromise over the slave trade. Americans generally perceived the transatlantic slave trade as more violent and immoral than slavery itself. Many northerners opposed it on moral grounds. But they also understood that letting southern states import more Africans would increase their political power. The Constitution counted each enslaved individual as three fifths of a person for purposes of representation, so in districts with many enslaved people, the white voters had extra influence. On the other hand, the states of the Upper South also welcomed a ban on the Atlantic trade because they already had a surplus of enslaved laborers. Banning importation meant enslavers in Virginia and Maryland could get higher prices when they sold their enslaved laborers to states like South Carolina and Georgia that were dependent on a continued slave trade. New England and the Deep South agreed to what was called a “dirty compromise” at the Constitutional Convention in 1787. New Englanders agreed to include a constitutional provision that protected the foreign slave trade for twenty years; in exchange, South Carolina and Georgia delegates had agreed to support a constitutional clause that made it easier for Congress to pass commercial legislation. As a result, the Atlantic slave trade resumed until 1808 when it was outlawed for three reasons. First, Britain was also in the process of outlawing the slave trade in 1807, and the United States did not want to concede any moral high ground to its rival. Second, the Haitian Revolution (1791–1804), a successful slave revolt against French colonial rule in the West Indies, had changed the stakes in the debate. The image of thousands of armed Black revolutionaries terrified white Americans. 
Third, the Haitian Revolution had ended France’s plans to expand its presence in the Americas, so in 1803, the United States had purchased the Louisiana Territory from the French at a fire-sale price. This massive new territory, which had doubled the size of the United States, had put the question of slavery’s expansion at the top of the national agenda. Many white Americans, including President Thomas Jefferson, thought that ending the external slave trade and dispersing the domestic slave population would keep the United States a white man’s republic and perhaps even lead to the disappearance of slavery. The ban on the slave trade, however, lacked effective enforcement measures and funding. Moreover, instead of freeing illegally imported Africans, the act left their fate to the individual states, and many of those states simply sold intercepted enslaved people at auction. Thus, the ban preserved the logic of property ownership in human beings. The new federal government protected slavery as much as it expanded democratic rights and privileges for white men.14   VI. Hamilton’s Financial System Alexander Hamilton saw America’s future as a metropolitan, commercial, industrial society, in contrast to Thomas Jefferson’s nation of small farmers. While both men had the ear of President Washington, Hamilton’s vision proved most appealing and enduring. John Trumbull, Portrait of Alexander Hamilton, 1806. Wikimedia. President George Washington’s cabinet choices reflected continuing political tensions over the size and power of the federal government. The vice president was John Adams, and Washington chose Alexander Hamilton to be his secretary of the treasury. Both men wanted an active government that would promote prosperity by supporting American industry. However, Washington chose Thomas Jefferson to be his secretary of state, and Jefferson was committed to restricting federal power and preserving an economy based on agriculture. 
Almost from the beginning, Washington struggled to reconcile the Federalist and Republican (or Democratic-Republican) factions within his own administration.15 Alexander Hamilton believed that self-interest was the “most powerful incentive of human actions.” Self-interest drove humans to accumulate property, and that effort created commerce and industry. According to Hamilton, government had important roles to play in this process. First, the state should protect private property from theft. Second, according to Hamilton, the state should use human “passions” and “make them subservient to the public good.”16 In other words, a wise government would harness its citizens’ desire for property so that both private individuals and the state would benefit. Hamilton, like many of his contemporary statesmen, did not believe the state should ensure an equal distribution of property. Inequality was understood as “the great & fundamental distinction in Society,” and Hamilton saw no reason why this should change. Instead, Hamilton wanted to tie the economic interests of wealthy Americans, or “monied men,” to the federal government’s financial health. If the rich needed the government, then they would direct their energies to making sure it remained solvent.17 Hamilton, therefore, believed that the federal government must be “a Repository of the Rights of the wealthy.”18 As the nation’s first secretary of the treasury, he proposed an ambitious financial plan to achieve just that. The first part of Hamilton’s plan involved federal “assumption” of state debts, which were mostly left over from the Revolutionary War. The federal government would assume responsibility for the states’ unpaid debts, which totaled about $25 million. Second, Hamilton wanted Congress to create a bank—a Bank of the United States. The goal of these proposals was to link federal power and the country’s economic vitality. 
Under the assumption proposal, the states’ creditors (people who owned state bonds or promissory notes) would turn their old notes in to the treasury and receive new federal notes of the same face value. Hamilton foresaw that these bonds would circulate like money, acting as “an engine of business, and instrument of industry and commerce.”19 This part of his plan, however, was controversial for two reasons. First, many taxpayers objected to paying the full face value on old notes, which had fallen in market value. Often the current holders had purchased them from the original creditors for pennies on the dollar. To pay them at full face value, therefore, would mean rewarding speculators at taxpayer expense. Hamilton countered that government debts must be honored in full, or else citizens would lose all trust in the government. Second, many southerners objected that they had already paid their outstanding state debts, so federal assumption would mean forcing them to pay again for the debts of New Englanders. Nevertheless, President Washington and Congress both accepted Hamilton’s argument. By the end of 1794, 98 percent of the country’s domestic debt had been converted into new federal bonds.20 Hamilton’s plan for a Bank of the United States, similarly, won congressional approval despite strong opposition. Thomas Jefferson and other Republicans argued that the plan was unconstitutional; the Constitution did not authorize Congress to create a bank. Hamilton, however, argued that the bank was not only constitutional but also important for the country’s prosperity. The Bank of the United States would fulfill several needs. It would act as a convenient depository for federal funds. It would print paper banknotes backed by specie (gold or silver). Its agents would also help control inflation by periodically taking state bank notes to their banks of origin and demanding specie in exchange, limiting the amount of notes the state banks printed. 
Furthermore, it would give wealthy people a vested interest in the federal government’s finances. The government would control just 20 percent of the bank’s stock; the other 80 percent would be owned by private investors. Thus, an “intimate connexion” between the government and wealthy men would benefit both, and this connection would promote American commerce. In 1791, therefore, Congress approved a twenty-year charter for the Bank of the United States. The bank’s stocks, together with federal bonds, created over $70 million in new financial instruments. These spurred the formation of securities markets, which allowed the federal government to borrow more money and underwrote the rapid spread of state-chartered banks and other private business corporations in the 1790s. For Federalists, this was one of the major purposes of the federal government. For opponents who wanted a more limited role for industry, however, or who lived on the frontier and lacked access to capital, Hamilton’s system seemed to reinforce class boundaries and give the rich inordinate power over the federal government. Hamilton’s plan, furthermore, had another highly controversial element. In order to pay what it owed on the new bonds, the federal government needed reliable sources of tax revenue. In 1791, Hamilton proposed a federal excise tax on the production, sale, and consumption of a number of goods, including whiskey.   VII. The Whiskey Rebellion and Jay’s Treaty Grain was the most valuable cash crop for many American farmers. In the West, selling grain to a local distillery for alcohol production was typically more profitable than shipping it over the Appalachians to eastern markets. Hamilton’s whiskey tax thus placed a special burden on western farmers. It seemed to divide the young republic in half: geographically between the East and West, economically between merchants and farmers, and culturally between cities and the countryside. 
In the fall of 1791, sixteen men in western Pennsylvania, disguised in women’s clothes, assaulted a tax collector named Robert Johnson. They tarred and feathered him, and the local deputy marshals seeking justice met similar fates. They were robbed and beaten, whipped and flogged, tarred and feathered, and tied up and left for dead. The rebel farmers also adopted other protest methods from the Revolution and Shays’s Rebellion, writing local petitions and erecting liberty poles. For the next two years, tax collections in the region dwindled. Then, in July 1794, groups of armed farmers attacked federal marshals and tax collectors, burning down at least two tax collectors’ homes. At the end of the month, an armed force of about seven thousand, led by the radical attorney David Bradford, robbed the U.S. mail and gathered about eight miles east of Pittsburgh. President Washington responded quickly. First, Washington dispatched a committee of three distinguished Pennsylvanians to meet with the rebels and try to bring about a peaceful resolution. Meanwhile, he gathered an army of thirteen thousand militiamen in Carlisle, Pennsylvania. On September 19, Washington became the only sitting president to lead troops in the field, though he quickly turned over the army to the command of Henry Lee, a Revolutionary hero and the current governor of Virginia. As the federal army moved westward, the farmers scattered. Hoping to make a dramatic display of federal authority, Alexander Hamilton oversaw the arrest and trial of a number of rebels. Many were released because of a lack of evidence, and most of those who remained, including two men sentenced to death for treason, were soon pardoned by the president. The Whiskey Rebellion had shown that the federal government was capable of quelling internal unrest. But it also demonstrated that some citizens, especially poor westerners, viewed it as their enemy.21 Around the same time, another national issue also aroused fierce protest. 
Along with his vision of a strong financial system, Hamilton also had a vision of a nation busily engaged in foreign trade. In his mind, that meant pursuing a friendly relationship with one nation in particular: Great Britain. America’s relationship with Britain since the end of the Revolution had been tense, partly because of warfare between the British and French. Their naval war threatened American shipping, and the impressment of men into Britain’s navy terrorized American sailors. American trade could be risky and expensive, and impressment threatened seafaring families. Nevertheless, President Washington was conscious of American weakness and was determined not to take sides. In April 1793, he officially declared that the United States would remain neutral.22 With his blessing, Hamilton’s political ally John Jay, who was currently serving as chief justice of the Supreme Court, sailed to London to negotiate a treaty that would satisfy both Britain and the United States. Jefferson and Madison strongly opposed these negotiations. They mistrusted Britain and saw the treaty as the American state favoring Britain over France. The French had recently overthrown their own monarchy, and Republicans thought the United States should be glad to have the friendship of a new revolutionary state. They also suspected that a treaty with Britain would favor northern merchants and manufacturers over the agricultural South. In November 1794, despite their misgivings, John Jay signed a “treaty of amity, commerce, and navigation” with the British. Jay’s Treaty, as it was commonly called, required Britain to abandon its military positions in the Northwest Territory (especially Fort Detroit, Fort Mackinac, and Fort Niagara) by 1796. Britain also agreed to compensate American merchants for their losses. The United States, in return, agreed to treat Britain as its most prized trade partner, which meant tacitly supporting Britain in its current conflict with France. 
Unfortunately, Jay had failed to secure an end to impressment.23 For Federalists, this treaty was a significant accomplishment. Jay’s Treaty gave the United States, a relatively weak power, the ability to stay officially neutral in European wars, and it preserved American prosperity by protecting trade. For Jefferson’s Republicans, however, the treaty was proof of Federalist treachery. The Federalists had sided with a monarchy against a republic, and they had submitted to British influence in American affairs without even ending impressment. In Congress, debate over the treaty transformed the Federalists and Republicans from temporary factions into two distinct (though still loosely organized) political parties.   VIII. The French Revolution and the Limits of Liberty The mounting body count of the French Revolution included that of the queen and king, who were beheaded in a public ceremony in early 1793, as depicted in the engraving. While Americans disdained the concept of monarchy, the execution of King Louis XVI was regarded by many Americans as an abomination, an indication of the chaos and savagery reigning in France at the time. Charles Monnet (artist), Antoine-Jean Duclos and Isidore-Stanislas Helman (engravers), Day of 21 January 1793 the death of Louis Capet on the Place de la Révolution, 1794. Wikimedia. In part, the Federalists were turning toward Britain because they feared the most radical forms of democratic thought. In the wake of Shays’s Rebellion, the Whiskey Rebellion, and other internal protests, Federalists sought to preserve social stability. The course of the French Revolution seemed to justify their concerns. In 1789, news had arrived in America that the French had revolted against their king. Most Americans imagined that liberty was spreading from America to Europe, carried there by the returning French heroes who had taken part in the American Revolution. Initially, nearly all Americans had praised the French Revolution. 
Towns all over the country hosted speeches and parades on July 14 to commemorate the day it began. Women had worn neoclassical dress to honor republican principles, and men had pinned revolutionary cockades to their hats. John Randolph, a Virginia planter, named two of his favorite horses Jacobin and Sans-Culotte after French revolutionary factions.24 In April 1793, a new French ambassador, “Citizen” Edmond-Charles Genêt, arrived in the United States. During his tour of several cities, Americans greeted him with wild enthusiasm. Citizen Genêt encouraged Americans to act against Spain, a British ally, by attacking its colonies of Florida and Louisiana. When President Washington refused, Genêt threatened to appeal to the American people directly. In response, Washington demanded that France recall its diplomat. In the meantime, however, Genêt’s faction had fallen from power in France. Knowing that a return home might cost him his head, he decided to remain in America. Genêt’s intuition was correct. A radical coalition of revolutionaries had seized power in France. They initiated a bloody purge of their enemies, the Reign of Terror. As Americans learned about Genêt’s impropriety and the mounting body count in France, many began to have second thoughts about the French Revolution. Americans who feared that the French Revolution was spiraling out of control tended to become Federalists. Those who remained hopeful about the revolution tended to become Republicans. Not deterred by the violence, Thomas Jefferson declared that he would rather see “half the earth desolated” than see the French Revolution fail. “Were there but an Adam and an Eve left in every country, and left free,” he wrote, “it would be better than as it now is.”25 Meanwhile, the Federalists sought closer ties with Britain. Despite the political rancor, in late 1796 there came one sign of hope: the United States peacefully elected a new president. 
For now, as Washington stepped down and executive power changed hands, the country did not descend into the anarchy that many leaders feared. The new president was John Adams, Washington’s vice president. Adams was less beloved than the old general, and he governed a deeply divided nation. The foreign crisis also presented him with a major test. In response to Jay’s Treaty, the French government authorized its vessels to attack American shipping. To resolve this, President Adams sent envoys to France in 1797. The French insulted these diplomats. Some officials, whom the Americans code-named X, Y, and Z in their correspondence, hinted that negotiations could begin only after the Americans offered a bribe. When the story became public, this XYZ Affair infuriated American citizens. Dozens of towns wrote addresses to President Adams, pledging him their support against France. Many people seemed eager for war. “Millions for defense,” toasted South Carolina representative Robert Goodloe Harper, “but not one cent for tribute.”26 By 1798, the people of Charleston watched the ocean’s horizon apprehensively because they feared the arrival of the French navy at any moment. Many people now worried that the same ships that had aided Americans during the Revolutionary War might discharge an invasion force on their shores. Some southerners were sure that this force would consist of Black troops from France’s Caribbean colonies, who would attack the southern states and cause their enslaved laborers to revolt. Many Americans also worried that France had covert agents in the country. In the streets of Charleston, armed bands of young men searched for French disorganizers. Even the little children prepared for the looming conflict by fighting with sticks.27 Meanwhile, during the crisis, New Englanders were some of the most outspoken opponents of France. In 1798, they found a new reason for Francophobia. 
An influential Massachusetts minister, Jedidiah Morse, announced to his congregation that the French Revolution had been hatched in a conspiracy led by a mysterious anti-Christian organization called the Illuminati. The story was a hoax, but rumors of Illuminati infiltration spread throughout New England like wildfire, adding a new dimension to the foreign threat.28 Against this backdrop of fear, the French Quasi-War, as it would come to be known, was fought on the Atlantic, mostly between French naval vessels and American merchant ships. During this crisis, however, anxiety about foreign agents ran high, and members of Congress took action to prevent internal subversion. The most controversial of these steps were the Alien and Sedition Acts. These two laws, passed in 1798, were intended to prevent French agents and sympathizers from compromising America’s resistance, but they also attacked Americans who criticized the president and the Federalist Party. The Alien Act allowed the federal government to deport foreign nationals, or “aliens,” who seemed to pose a national security threat. Even more dramatically, the Sedition Act allowed the government to prosecute anyone found to be speaking or publishing “false, scandalous, and malicious writing” against the government.29 These laws were not simply brought on by war hysteria. They reflected common assumptions about the nature of the American Revolution and the limits of liberty. In fact, most of the advocates for the Constitution and the First Amendment accepted that free speech simply meant a lack of prior censorship or restraint, not a guarantee against punishment. According to this logic, “licentious” or unruly speech made society less free, not more. James Wilson, one of the principal architects of the Constitution, argued that “every author is responsible when he attacks the security or welfare of the government.”30 In 1798, most Federalists were inclined to agree. 
Under the terms of the Sedition Act, they indicted and prosecuted several Republican printers—and even a Republican congressman who had criticized President Adams. Meanwhile, although the Adams administration never enforced the Alien Act, its passage was enough to convince some foreign nationals to leave the country. For the president and most other Federalists, the Alien and Sedition Acts represented a continuation of a conservative rather than radical American Revolution. However, the Alien and Sedition Acts caused a backlash in two ways. First, shocked opponents articulated a new and expansive vision for liberty. The New York lawyer Tunis Wortman, for example, demanded an “absolute independence” of the press.31 Likewise, the Virginia judge George Hay called for “any publication whatever criminal” to be exempt from legal punishment.32 Many Americans began to argue that free speech meant the ability to say virtually anything without fear of prosecution. Second, James Madison and Thomas Jefferson helped organize opposition from state governments. Ironically, both of them had expressed support for the principle behind the Sedition Act in previous years. Jefferson, for example, had written to Madison in 1789 that the nation should punish citizens for speaking “false facts” that injured the country.33 Nevertheless, both men now opposed the Alien and Sedition Acts on constitutional grounds. In 1798, Jefferson made this point in a resolution adopted by the Kentucky state legislature. A short time later, the Virginia legislature adopted a similar document written by Madison. The Kentucky and Virginia Resolutions argued that the national government’s authority was limited to the powers expressly granted by the U.S. Constitution. More importantly, they asserted that the states could declare federal laws unconstitutional. For the time being, these resolutions were simply gestures of defiance. Their bold claim, however, would have important effects in later decades. 
In just a few years, many Americans’ feelings toward France had changed dramatically. Far from rejoicing in the “light of freedom,” many Americans now feared the “contagion” of French-style liberty. Debates over the French Revolution in the 1790s gave Americans some of their earliest opportunities to articulate what it meant to be American. Did American national character rest on a radical and universal vision of human liberty? Or was America supposed to be essentially pious and traditional, an outgrowth of Great Britain? They couldn’t agree. It was on this cracked foundation that many conflicts of the nineteenth century would rest.   IX. Religious Freedom One reason the debates over the French Revolution became so heated was that Americans were unsure about their own religious future. The Illuminati scare of 1798 was just one manifestation of this fear. Across the United States, a slow but profound shift in attitudes toward religion and government began. In 1776, none of the American state governments observed the separation of church and state. On the contrary, all thirteen states either had established, official, and tax-supported state churches, or at least required their officeholders to profess a certain faith. Most officials believed this was necessary to protect morality and social order. Over the next six decades, however, that changed. In 1833, the final state, Massachusetts, stopped supporting an official religious denomination. Historians call that gradual process disestablishment. In many states, the process of disestablishment had started before the creation of the Constitution. South Carolina, for example, had been nominally Anglican before the Revolution, but it had dropped denominational restrictions in its 1778 constitution. Instead, it now allowed any church consisting of at least fifteen adult males to become “incorporated,” or recognized for tax purposes as a state-supported church. 
Churches needed only to agree to a set of basic Christian theological tenets, which were vague enough that most denominations could support them.34 South Carolina tried to balance religious freedom with the religious practice that was supposed to be necessary for social order. Officeholders were still expected to be Christians; their oaths were witnessed by God, they were compelled by their religious beliefs to tell the truth, and they were called to live according to the Bible. This list of minimal requirements came to define acceptable Christianity in many states. As new Christian denominations proliferated between 1780 and 1840, however, more and more Christians fell outside this definition. South Carolina continued its general establishment law until 1790, when a constitutional revision removed the establishment clause and religious restrictions on officeholders. Many other states, though, continued to support an established church well into the nineteenth century. The federal Constitution did not prevent this. The religious freedom clause in the Bill of Rights, during these decades, limited the federal government but not state governments. It was not until 1833 that a state supreme court decision ended Massachusetts’s support for the Congregational Church. Many political leaders, including Thomas Jefferson and James Madison, favored disestablishment because they saw the relationship between church and state as a tool of oppression. Jefferson proposed a Statute for Religious Freedom in the Virginia state assembly in 1779, but his bill failed in the overwhelmingly Anglican legislature. Madison proposed it again in 1785, and it defeated a rival bill that would have given equal revenue to all Protestant churches. Instead Virginia would not use public money to support religion. 
“The Religion then of every man,” Jefferson wrote, “must be left to the conviction and conscience of every man; and it is the right of every man to exercise it as these may dictate.”35 At the federal level, the delegates to the Constitutional Convention of 1787 easily agreed that the national government should not have an official religion. This principle was upheld in 1791 when the First Amendment was ratified, with its guarantee of religious liberty. The limits of federal disestablishment, however, required discussion. The federal government, for example, supported Native American missionaries and congressional chaplains. Well into the nineteenth century, debate raged over whether the postal service should operate on Sundays, and whether non-Christians could act as witnesses in federal courts. Americans continued to struggle to understand what it meant for Congress not to “establish” a religion.

X. The Election of 1800

The year 1800 brought about a host of changes in government, in particular the first successful and peaceful transfer of power from one political party to another. But the year was important for another reason: the U.S. Capitol in Washington, D.C. (pictured here in 1800) was finally opened to be occupied by Congress, the Supreme Court, the Library of Congress, and the courts of the District of Columbia. William Russell Birch, A view of the Capitol of Washington before it was burnt down by the British, c. 1800. Wikimedia.

Meanwhile, the Sedition and Alien Acts expired in 1800 and 1801. They had been relatively ineffective at suppressing dissent. On the contrary, they were much more important for the loud reactions they had inspired. They had helped many Americans decide what they didn’t want from their national government. By 1800, therefore, President Adams had lost the confidence of many Americans. They had let him know it. In 1798, for instance, he had issued a national thanksgiving proclamation.
Instead of enjoying a day of celebration and thankfulness, Adams and his family had been forced by rioters to flee the capital city of Philadelphia until the day was over. Conversely, his prickly independence had also put him at odds with Alexander Hamilton, the leader of his own party, who offered him little support. After four years in office, Adams found himself widely reviled. In the election of 1800, therefore, the Republicans defeated Adams in a bitter and complicated presidential race. During the election, one Federalist newspaper article predicted that a Republican victory would fill America with “murder, robbery, rape, adultery, and incest.”36 A Republican newspaper, on the other hand, flung sexual slurs against President Adams, saying he had “neither the force and firmness of a man, nor the gentleness and sensibility of a woman.” Both sides predicted disaster and possibly war if the other should win.37 In the end, the contest came down to a tie between two Republicans, Thomas Jefferson of Virginia and Aaron Burr of New York, who each had seventy-three electoral votes. (Adams had sixty-five.) Burr was supposed to be a candidate for vice president, not president, but under the Constitution’s original rules, a tie-breaking vote had to take place in the House of Representatives. It was controlled by Federalists bitter at Jefferson. House members voted dozens of times without breaking the tie. On the thirty-sixth ballot, Thomas Jefferson emerged victorious. Republicans believed they had saved the United States from grave danger. An assembly of Republicans in New York City called the election a “bloodless revolution.” They thought of their victory as a revolution in part because the Constitution (and eighteenth-century political theory) made no provision for political parties. The Republicans thought they were fighting to rescue the country from an aristocratic takeover, not just taking part in a normal constitutional process. 
This image attacks Jefferson’s support of the French Revolution and religious freedom. The letter, “To Mazzei,” refers to a 1796 correspondence that criticized the Federalists and, by association, President Washington. Providential Detection, 1797. Courtesy American Antiquarian Society. Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

In his first inaugural address, however, Thomas Jefferson offered an olive branch to the Federalists. He pledged to follow the will of the American majority, who he believed were Republicans, but to respect the rights of the Federalist minority. His election set an important precedent. Adams accepted his electoral defeat and left the White House peacefully. “The revolution of 1800,” Jefferson wrote years later, did for American principles what the Revolution of 1776 had done for its structure. But this time, the revolution was accomplished not “by the sword” but “by the rational and peaceable instrument of reform, the suffrage of the people.”38 Four years later, when the Twelfth Amendment changed the rules for presidential elections to prevent future deadlocks, it was designed to accommodate the way political parties worked. Despite Adams’s and Jefferson’s attempts to tame party politics, though, the tension between federal power and the liberties of states and individuals would exist long into the nineteenth century. And while Jefferson’s administration attempted to decrease federal influence, Chief Justice John Marshall, an Adams appointee, worked to increase the authority of the Supreme Court. These competing agendas clashed most famously in the 1803 case of Marbury v. Madison, which Marshall used to establish a major precedent. The Marbury case seemed insignificant at first. The night before leaving office in early 1801, Adams had appointed several men to serve as justices of the peace in Washington, D.C.
By making these “midnight appointments,” Adams had sought to put Federalists into vacant positions at the last minute. On taking office, however, Jefferson and his secretary of state, James Madison, had refused to deliver the federal commissions to the men Adams had appointed. Several of the appointees, including William Marbury, sued the government, and the case was argued before the Supreme Court. Marshall used Marbury’s case to make a clever ruling. On the issue of the commissions, the Supreme Court ruled in favor of the Jefferson administration. But Chief Justice Marshall went further in his decision, ruling that the Supreme Court reserved the right to decide whether an act of Congress violated the Constitution. In other words, the court assumed the power of judicial review. This was a major (and lasting) blow to the Republican agenda, especially after 1810, when the Supreme Court extended judicial review to state laws. Jefferson was particularly frustrated by the decision, arguing that the power of judicial review “would make the Judiciary a despotic branch.”39

XI. Conclusion

A grand debate over political power engulfed the young United States. The Constitution ensured that there would be a strong federal government capable of taxing, waging war, and making law, but it could never resolve the young nation’s many conflicting constituencies. The Whiskey Rebellion proved that the nation could stifle internal dissent but exposed a new threat to liberty. Hamilton’s banking system provided the nation with credit but also constrained frontier farmers. The Constitution’s guarantee of religious liberty conflicted with many popular prerogatives. Dissension only deepened, and as the 1790s progressed, Americans became bitterly divided over political parties and foreign war. During the ratification debates, Alexander Hamilton had written of the wonders of the Constitution.
“A nation, without a national government,” he wrote, would be “an awful spectacle.” But, he added, “the establishment of a Constitution, in time of profound peace, by the voluntary consent of a whole people, is a prodigy,” a miracle that should be witnessed “with trembling anxiety.”40 Anti-Federalists had grave concerns about the Constitution, but even they could celebrate the idea of national unity. By 1795, even the staunchest critics would have grudgingly agreed with Hamilton’s convictions about the Constitution. Yet these same individuals could also take the cautions in Washington’s 1796 farewell address to heart. “There is an opinion,” Washington wrote, “that parties in free countries are useful checks upon the administration of the government and serve to keep alive the spirit of liberty.” This, he conceded, was probably true, but in a republic, he said, the danger was not too little partisanship, but too much. “A fire not to be quenched,” Washington warned, “it demands a uniform vigilance to prevent its bursting into a flame, lest, instead of warming, it should consume.”41 For every parade, thanksgiving proclamation, or grand procession honoring the unity of the nation, there was also some political controversy reminding American citizens of how fragile their union was. And as party differences and regional quarrels tested the federal government, the new nation increasingly explored the limits of its democracy.

XII. Primary Sources

1. Hector St. John de Crèvecœur describes the American people, 1782
Hector St. John de Crèvecœur was born in France, but relocated to the colony of New York and married a local woman named Mehitable Tippet. For a period of several years, de Crèvecœur wrote about the people he encountered in North America. The resulting work was widely successful in Europe. In this passage, Crèvecœur attempts to reflect on the difference between life in Europe and life in North America.

2. A Confederation of Native peoples seek peace with the United States, 1786
In 1786, half a year before the Constitutional Convention, a collection of Native American leaders gathered on the banks of the Detroit River to offer a unified message to the Congress of the United States. Despite this proposal, American surveyors, settlers, and others continued to cross the Ohio River.

3. Mary Smith Cranch comments on politics, 1786-87
In the aftermath of the Revolution, politics became a sport consumed by both men and women. In a series of letters sent to her sister, Mary Smith Cranch comments on a series of political events including the lack of support for diplomats, the circulation of paper or hard currency, legal reform, tariffs against imported tea tables, Shays’s rebellion, and the role of women in supporting the nation’s interests.

4. James Madison, Memorial and Remonstrance Against Religious Assessments, 1785
Before the American Revolution, Virginia supported local Anglican churches through taxes. After the American Revolution, Virginia had to decide what to do with this policy. Some founding fathers, including Patrick Henry, wanted to equally distribute tax dollars to all churches. In this document, James Madison explains why he did not want any government money to support religious causes in Virginia.

5. George Washington, “Farewell Address,” 1796
George Washington used his final public address as president to warn against what he understood as the two greatest dangers to American prosperity: political parties and foreign wars. Washington urged the American people to avoid political partisanship and entanglements with European wars.

6. Venture Smith, A Narrative of the Life and Adventures of Venture Smith, 1798
Venture Smith’s autobiography is one of the earliest slave narratives to circulate in the Atlantic World. Slave narratives grew into the most important genre of antislavery literature and bore testimony to the injustices of the slave system. Smith was unusually lucky in that he was able to purchase his freedom, but his story nonetheless reveals the hardships faced by even the most fortunate enslaved men and women.

7. Susannah Rowson, Charlotte Temple, 1794
In Charlotte Temple, the first novel written in America, Susannah Rowson offered a cautionary tale of a woman deceived and then abandoned by a roguish man. Americans throughout the new nation read the book with rapt attention and many even traveled to New York City to visit the supposed grave of this fictional character.

8. Constitutional ratification cartoon, 1789
The Massachusetts Centinel ran a series of cartoons depicting the ratification of the Constitution. Each vertical pillar represents a state that has ratified the new government. In this cartoon, North Carolina’s pillar is being guided into place (it would vote for ratification in November 1789). Rhode Island’s pillar, however, is crumbling and shows the uncertainty of the vote there.

9. Anti-Thomas Jefferson Cartoon, 1797
This image attacks Jefferson’s support of the French Revolution and religious freedom. The Altar to “Gallic Despotism” mocks Jefferson’s allegiance to the French. The letter, “To Mazzei,” refers to a 1796 correspondence that criticized the Federalists and, by association, President Washington.

XIII. Reference Material

This chapter was edited by Tara Strauch, with content contributions by Marco Basile, Nathaniel C. Green, Brenden Kennedy, Spencer McBride, Andrea Nero, Cara Rogers, Tara Strauch, Michael Harrison Taylor, Jordan Taylor, Kevin Wisniewski, and Ben Wright.

Recommended citation: Marco Basile et al., “A New Nation,” Tara Strauch, ed., in The American Yawp, eds. Joseph Locke and Ben Wright (Stanford, CA: Stanford University Press, 2018).

Recommended Reading

Allgor, Catherine. Parlor Politics: In Which the Ladies of Washington Help Build a City and a Government. Charlottesville: University of Virginia Press, 2000. Appleby, Joyce. Inheriting the Revolution: The First Generation of Americans. Cambridge, MA: Belknap Press, 2001. Bartoloni-Tuazon, Kathleen. For Fear of an Elective King: George Washington and the Presidential Title Controversy of 1789. Ithaca, NY: Cornell University Press, 2014. Beeman, Richard, Stephen Botein, and Edward C. Carter II, eds. Beyond Confederation: Origins of the Constitution and American National Identity.
Chapel Hill: University of North Carolina Press, 1987. Bilder, Mary Sarah. Madison’s Hand: Revising the Constitutional Convention. Cambridge, MA: Harvard University Press, 2015. Bouton, Terry. “A Road Closed: Rural Insurgency in Post-Independence Pennsylvania.” Journal of American History 87, no. 3 (December 2000): 855–887. Cunningham, Noble E. The Jeffersonian Republicans: The Formation of Party Organization, 1789–1801. Chapel Hill: University of North Carolina Press, 1967. Dunn, Susan. Jefferson’s Second Revolution: The Election of 1800 and the Triumph of Republicanism. Boston: Houghton Mifflin, 2004. Edling, Max. A Revolution in Favor of Government: Origins of the U.S. Constitution and the Making of the American State. New York: Oxford University Press, 2003. Gordon-Reed, Annette. The Hemingses of Monticello: An American Family. New York: Norton, 2008. Halperin, Terri Diane. The Alien and Sedition Acts of 1798: Testing the Constitution. Baltimore: Johns Hopkins University Press, 2016. Holton, Woody. Unruly Americans and the Origins of the Constitution. New York: Hill and Wang, 2007. Kierner, Cynthia A. Martha Jefferson Randolph, Daughter of Monticello: Her Life and Times. Chapel Hill: University of North Carolina Press, 2012. Maier, Pauline. Ratification: The People Debate the Constitution, 1787–1788. New York: Simon and Schuster, 2010. Papenfuse, Eric Robert. “Unleashing the ‘Wildness’: The Mobilization of Grassroots Antifederalism in Maryland.” Journal of the Early Republic 16, no. 1 (Spring 1996): 73–106. Pasley, Jeffrey L. The First Presidential Contest: 1796 and the Founding of American Democracy. Lawrence: University of Kansas Press, 2013. Rakove, Jack N. Original Meanings: Politics and Ideas in the Making of the Constitution. New York: Vintage Books, 1996. Salmon, Marylynn. Women and the Law of Property in Early America. Chapel Hill: University of North Carolina Press, 1989. Sharp, James Roger. 
American Politics in the Early Republic: The New Nation in Crisis. New Haven, CT: Yale University Press, 1993. Slaughter, Thomas P. The Whiskey Rebellion: Frontier Epilogue to the American Revolution. New York: Oxford University Press, 1986. Smith-Rosenberg, Carroll. “Dis-Covering the Subject of the ‘Great Constitutional Discussion,’ 1786–1789.” Journal of American History 79, no. 3 (December 1992): 841–873. Taylor, Alan. William Cooper’s Town: Power and Persuasion on the Frontier of the Early American Republic. New York: Vintage, 1996. Waldstreicher, David. In the Midst of Perpetual Fetes: The Making of American Nationalism, 1776–1820. Chapel Hill: University of North Carolina Press, 1997. Wood, Gordon. Empire of Liberty: A History of the Early Republic, 1789–1815. Oxford, UK: Oxford University Press, 2011. Zagarri, Rosemarie. Revolutionary Backlash: Women and Politics in the Early American Republic. Philadelphia: University of Pennsylvania Press, 2007.

Notes

1. Francis Hopkinson, An Account of the Grand Federal Procession, Philadelphia, July 4, 1788 (Philadelphia: Carey, 1788).
2. George Washington, Thanksgiving Proclamation, October 3, 1789; Fed. Reg., Presidential Proclamations, 1791–1991.
3. Hampshire Gazette (CT), September 13, 1786.
4. James Madison, The Federalist Papers (New York: Signet Classics, 2003), no. 63.
5. Woody Holton, Unruly Americans and the Origins of the Constitution (New York: Hill and Wang, 2007), 8–9.
6. Madison took an active role during the convention. He also did more than anyone else to shape historians’ understandings of the convention by taking meticulous notes. Many of the quotes included here come from Madison’s notes. To learn more about this important document, read Mary Sarah Bilder, Madison’s Hand: Revising the Constitutional Convention (Cambridge, MA: Harvard University Press, 2015).
7. Virginia (Randolph) Plan as Amended (National Archives Microfilm Publication M866, 1 roll); The Official Records of the Constitutional Convention; Records of the Continental and Confederation Congresses and the Constitutional Convention, 1774–1789, Record Group 360; National Archives.
8. Richard Beeman, Plain, Honest Men: The Making of the American Constitution (New York: Random House, 2009), 114.
9. Herbert J. Storing, What the Anti-Federalists Were For: The Political Thought of the Opponents of the Constitution (Chicago: University of Chicago Press, 1981), 16.
10. Ray Raphael, Mr. President: How and Why the Founders Created a Chief Executive (New York: Knopf, 2012), 50. See also Kathleen Bartoloni-Tuazon, For Fear of an Elective King: George Washington and the Presidential Title Controversy of 1789 (Ithaca, NY: Cornell University Press, 2014).
11. David J. Siemers, Ratifying the Republic: Antifederalists and Federalists in Constitutional Time (Stanford, CA: Stanford University Press, 2002).
12. Alexander Hamilton, James Madison, and John Jay, The Federalist Papers, ed. Ian Shapiro (New Haven, CT: Yale University Press, 2009).
13. Pauline Maier, Ratification: The People Debate the Constitution, 1787–1788 (New York: Simon and Schuster, 2010), 225–237.
14. David Waldstreicher, Slavery’s Constitution: From Revolution to Ratification (New York: Hill and Wang, 2009).
15. Carson Holloway, Hamilton Versus Jefferson in the Washington Administration: Completing the Founding or Betraying the Founding? (New York: Cambridge University Press, 2015).
16. Alexander Hamilton, The Works of Alexander Hamilton, Volume 1, ed. Henry Cabot Lodge (New York: Putnam, 1904), 70, 408.
17. Alexander Hamilton, Report on Manufactures (New York: Childs and Swaine, 1791).
18. James H. Hutson, ed., Supplement to Max Farrand’s the Records of the Federal Convention of 1787 (New Haven, CT: Yale University Press, 1987), 119.
19. Hamilton, Report on Manufactures.
20. Richard Sylla, “National Foundations: Public Credit, the National Bank, and Securities Markets,” in Founding Choices: American Economic Policy in the 1790s, ed. Douglas A. Irwin and Richard Sylla (Chicago: University of Chicago Press, 2011), 68.
21. Thomas P. Slaughter, The Whiskey Rebellion: Frontier Epilogue to the American Revolution (New York: Oxford University Press, 1986).
22. “Proclamation of Neutrality, 1793,” in A Compilation of the Messages and Papers of the Presidents Prepared Under the Direction of the Joint Committee on Printing, of the House and Senate Pursuant to an Act of the Fifty-Second Congress of the United States (New York: Bureau of National Literature, 1897).
23. United States, Treaty of Amity, Commerce, and Navigation, signed at London November 19, 1794, Submitted to the Senate June 8, Resolution of Advice and Consent, on condition, June 24, 1795. Ratified by the United States August 14, 1795. Ratified by Great Britain October 28, 1795. Ratifications exchanged at London October 28, 1795. Proclaimed February 29, 1796.
24. Elizabeth Fox-Genovese and Eugene D. Genovese, The Mind of the Master Class: History and Faith in the Southern Slaveholders’ Worldview (New York: Cambridge University Press, 2005), 18.
25. “From Thomas Jefferson to William Short, 3 January 1793,” Founders Online, National Archives. http://founders.archives.gov/documents/Jefferson/01-25-02-0016, last modified June 29, 2015; The Papers of Thomas Jefferson, vol. 25, 1 January–10 May 1793, ed. John Catanzariti (Princeton, NJ: Princeton University Press, 1992), 14–17.
26. Robert Goodloe Harper, June 18, 1798, quoted in American Daily Advertiser (Philadelphia), June 20, 1798.
27. Robert J. Alderson Jr., This Bright Era of Happy Revolutions: French Consul Michel-Ange-Bernard Mangourit and International Republicanism in Charleston, 1792–1794 (Columbia: University of South Carolina Press, 2008).
28. Rachel Hope Cleves, The Reign of Terror in America: Visions of Violence from Anti-Jacobinism to Antislavery (New York: Cambridge University Press, 2012), 47.
29. Alien Act, July 6, 1798, and An Act in Addition to the Act, Entitled “An Act for the Punishment of Certain Crimes Against the United States,” July 14, 1798; Fifth Congress; Enrolled Acts and Resolutions; General Records of the United States Government; Record Group 11; National Archives.
30. James Wilson, Congressional Debate, December 1, 1787, in Jonathan Elliot, ed., The Debates in the Several State Conventions on the Adoption of the Federal Constitution as Recommended by the General Convention at Philadelphia in 1787, Vol. 2 (New York: s.n., 1888), 448–450.
31. Tunis Wortman, A Treatise Concerning Political Enquiry, and the Liberty of the Press (New York: Forman, 1800), 181.
32. George Hay, An Essay on the Liberty of the Press (Philadelphia: s.n., 1799), 43.
33. Thomas Jefferson to James Madison, August 28, 1789, from The Works of Thomas Jefferson in Twelve Volumes, Federal Edition, ed. Paul Leicester Ford. http://www.loc.gov/resource/mtj1.011_0853_0861
34. Francis Newton Thorpe, ed., The Federal and State Constitutions, Colonial Charters, and Other Organic Laws of the States, Territories, and Colonies Now or Heretofore Forming the United States of America Compiled and Edited Under the Act of Congress of June 30, 1906 (Washington, DC: U.S. Government Printing Office, 1909).
35. Thomas Jefferson, An Act for Establishing Religious Freedom, 16 January 1786, Manuscript, Records of the General Assembly, Enrolled Bills, Record Group 78, Library of Virginia.
36. Catherine Allgor, Parlor Politics: In Which the Ladies of Washington Help Build a City and a Government (Charlottesville: University of Virginia Press, 2000), 14.
37. James T. Callender, The Prospect Before Us (Richmond: s.n., 1800).
38. Letter from Thomas Jefferson to Spencer Roane, September 6, 1819, in The Writings of Thomas Jefferson, 20 vols., ed. Albert Ellery Bergh (Washington, DC: Thomas Jefferson Memorial Association of the United States, 1903), 142.
39. Harold H. Bruff, Untrodden Ground: How Presidents Interpret the Constitution (Chicago: University of Chicago Press, 2015), 65.
40. Alexander Hamilton, The Federalist Papers (New York: Signet Classics, 2003), no. 85.
41. George Washington, Farewell Address, Annals of Congress, 4th Congress, 2869–2870.

This entry was posted in Uncategorized on June 7, 2013 by All Chapters.

      The discussion of Shays’s Rebellion reveals how economic struggles and weak national power under the Articles of Confederation created serious unrest among farmers. While some leaders viewed the rebellion as a dangerous threat to order, others believed it represented the same revolutionary spirit that founded the country.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript by Lin et al. presents a timely, technically strong study that builds patient-specific midbrain-like organoids (MLOs) from hiPSCs carrying clinically relevant GBA1 mutations (L444P/P415R and L444P/RecNcil). The authors comprehensively characterize nGD phenotypes (GCase deficiency, GluCer/GluSph accumulation, altered transcriptome, impaired dopaminergic differentiation), perform CRISPR correction to produce an isogenic line, and test three therapeutic modalities (SapC-DOPS-fGCase nanoparticles, AAV9-GBA1, and SRT with GZ452). The model and multi-arm therapeutic evaluation are important advances with clear translational value.

      My overall recommendation is that the work undergo a major revision to address the experimental and interpretive gaps listed below.

      Strengths:

      (1) Human, patient-specific midbrain model: Use of clinically relevant compound heterozygous GBA1 alleles (L444P/P415R and L444P/RecNcil) makes the model highly relevant to human nGD and captures patient genetic context that mouse models often miss.

      (2) Robust multi-level phenotyping: Biochemical (GCase activity), lipidomic (GluCer/GluSph by UHPLC-MS/MS), molecular (bulk RNA-seq), and histological (TH/FOXA2, LAMP1, LC3) characterization are thorough and complementary.

      (3) Use of isogenic CRISPR correction: Generating an isogenic line (WT/P415R) and demonstrating partial rescue strengthens causal inference that the GBA1 mutation drives many observed phenotypes.

      (4) Parallel therapeutic testing in the same human platform: Comparing enzyme delivery (SapC-DOPS-fGCase), gene therapy (AAV9-GBA1), and substrate reduction (GZ452) within the same MLO system is an elegant demonstration of the platform's utility for preclinical evaluation.

      (5) Good methodological transparency: Detailed protocols for MLO generation, editing, lipidomics, and assays allow reproducibility.

      Weaknesses:

      (1) Limited genetic and biological replication

      (a) Single primary disease line for core mechanistic claims. Most mechanistic data derive from GD2-1260 (L444P/P415R); GD2-10-257 (L444P/RecNcil) appears mainly in therapeutic experiments. Relying primarily on one patient line risks conflating patient-specific variation with general nGD mechanisms.

      (b) Unclear biological replicate strategy. It is not always explicit how many independent differentiations and organoid batches were used (biological replicates vs. technical fields of view).

      (c) A significant disadvantage of employing brain organoids is the heterogeneity during induction and potential low reproducibility. In this study, it is unclear how many independent differentiation batches were evaluated and, for each test (for example, immunofluorescent stain and bulk RNA-seq), how many organoids from each group were used. Please add a statement accordingly and show replicates to verify consistency in the supplementary data.

      (d) Isogenic correction is partial. The corrected line is WT/P415R (single-allele correction); residual P415R complicates the interpretation of "full" rescue and leaves open whether the remaining pathology is due to incomplete correction or clonal/epigenetic effects.

      (e) The authors tested week 3, 4, 8, 15, and 28 old organoids in different settings. However, systematic markers of maturation should be analyzed, and different maturation stages should be compared, for example, comparing week 8 organoids to week 28 organoids, with immunofluorescent marker staining and bulk RNAseq.

      (f) The manuscript frequently refers to Wnt signaling dysregulation as a major finding. However, experimental validation is limited to transcriptomic data. Functional tests, such as the use of Wnt agonist/inhibitor, are needed to support this claim (see below).

      (g) Suggested fixes/experiments

      Add at least one more independent disease hiPSC line (or show expanded analysis from GD2-10-257) for key mechanistic endpoints (lipid accumulation, transcriptomics, DA markers)

      Generate and analyze a fully corrected isogenic WT/WT clone (or a P415R-only line) if feasible; at minimum, acknowledge this limitation more explicitly and soften claims.

      Report and increase independent differentiations (N = biological replicates) and present per-differentiation summary statistics.

      (2) Mechanistic validation is insufficient

      (a) RNA-seq pathways (Wnt, mTOR, lysosome) are not functionally probed. The manuscript shows pathway enrichment and some protein markers (p-4E-BP1) but lacks perturbation/rescue experiments to link these pathways causally to the DA phenotype.

      (b) Autophagy analysis lacks flux assays. LC3-II and LAMP1 are informative, but without flux assays (e.g., bafilomycin A1 or chloroquine), one cannot distinguish increased autophagosome formation from decreased clearance.

      (c) Dopaminergic dysfunction is superficially assessed. Dopamine in the medium and TH protein are shown, but no neuronal electrophysiology, synaptic marker co-localization, or viability measures are provided to demonstrate functional recovery after therapy.

      (d) Suggested fixes/experiments

      Perform targeted functional assays:

      (i) Wnt reporter assays (TOP/FOP flash) and/or treat organoids with Wnt agonists/antagonists to test whether Wnt modulation rescues DA differentiation.

      (ii) Test mTOR pathway causality using mTOR inhibitors (e.g., rapamycin) or 4E-BP1 perturbation and assay effects on DA markers and autophagy.

      Include autophagy flux assessment (LC3 turnover with bafilomycin), and measure cathepsin activity where relevant.

      Add at least one functional neuronal readout: calcium imaging, MEA recordings, or synaptic marker quantification (e.g., SYN1, PSD95) together with TH colocalization.

      (3) Therapeutic evaluation needs greater depth and standardization

      (a) Short windows and limited durability data. SapC-DOPS and AAV9 experiments range from 48 hours to 3 weeks; longer follow-up is needed to assess durability and whether biochemical rescue translates into restored neuronal function.

      (b) Dose-response and biodistribution are under-characterized. AAV injection sites/volumes are described, but transduction efficiency, vg copies per organoid, cell-type tropism quantification, and SapC-DOPS penetration/distribution are not rigorously quantified.

      (c) Specificity controls are missing. For SapC-DOPS, inclusion of a non-functional enzyme control (or heat-inactivated fGCase) would rule out non-specific nanoparticle effects. For AAV, assessment of off-target expression and potential cytotoxicity is needed.

      (d) Comparative efficacy lacking. It remains unclear which modality is most effective in the long term and in which cellular compartments.

      (e) Suggested fixes/experiments

      Extend follow-up (e.g., 6+ weeks) after AAV/SapC dosing and evaluate DA markers, electrophysiology, and lipid levels over time.

      Quantify AAV transduction by qPCR for vector genomes and by cell-type quantification of GFP+ cells (neurons vs astrocytes vs progenitors).

      Include SapC-DOPS control nanoparticles loaded with an inert protein and/or fluorescent cargo quantitation to show distribution and uptake kinetics.

      Provide head-to-head comparative graphs (activity, lipid clearance, DA restoration, and durability) with statistical tests.

      (4) Model limitations not fully accounted for in interpretation

      (a) Absence of microglia and vasculature limits recapitulation of neuroinflammatory responses and drug penetration, both of which are important in nGD. These absences could explain incomplete phenotypic rescues and must be emphasized when drawing conclusions about therapeutic translation.

      (b) Developmental vs degenerative phenotype conflation. Many phenotypes appear during differentiation (patterning defects). The manuscript sometimes interprets these as degenerative mechanisms; the distinction must be clarified.

      (c) Suggested fixes

      Tone down the language throughout (Abstract/Results/Discussion) to avoid overstatement that MLOs fully recapitulate nGD neuropathology.

      Add plans or pilot data (if available) for microglia incorporation or vascularization to indicate how future work will address these gaps.

      (5) Statistical and presentation issues

      (a) Missing or unclear sample sizes (n). For organoid-level assays, report the number of organoids and the number of independent differentiations.

      (b) Statistical assumptions not justified. Tests assume normality; where sample sizes are small, consider non-parametric tests and report exact p-values.

      (c) Quantification scope. Many image quantifications appear to be from selected fields of view, which are then averaged across organoids and differentiations.

      (d) RNA-seq QC and deposition. Provide mapping rates, batch correction details, and ensure the GEO accession is active. Include these in Methods/Supplement.

      (e) Suggested fixes

      Add a table summarizing biological replicates, technical replicates, and statistical tests used for each figure panel.

      Recompute statistics where appropriate (non-parametric if N is small) and report effect sizes and confidence intervals.
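
      As an illustration of the effect sizes requested above, the following is a minimal sketch rather than the authors' analysis. It computes Cohen's d (pooled standard deviation) for roughly normal data and Cliff's delta as a non-parametric alternative for small samples. The function names and the example values are hypothetical, and confidence intervals are not computed here.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Standardized mean difference using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def cliffs_delta(a, b):
    """Non-parametric effect size: P(a > b) - P(a < b) over all cross-group pairs."""
    greater = sum(x > y for x in a for y in b)
    less = sum(x < y for x in a for y in b)
    return (greater - less) / (len(a) * len(b))


# Hypothetical organoid-level measurements for two groups (illustrative values only).
group_a = [2, 4, 6]
group_b = [1, 3, 5]
d = cohens_d(group_a, group_b)
delta = cliffs_delta(group_a, group_b)
```

      Cliff's delta ranges from -1 to 1 and depends only on the ordering of values, which makes it a natural companion to a rank-based test when N is small.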

      (6) Minor comments and clarifications

      (a) The authors should validate midbrain identity further with additional regional markers (EN1, OTX2) and show absence/low expression of forebrain markers (FOXG1) across replicates.

      (b) Extracellular dopamine ELISA should be complemented with intracellular dopamine or TH+ neuron counts normalized per organoid or per total neurons.

      (c) For CRISPR editing: the authors should report off-target analysis (GUIDE-seq or targeted sequencing of predicted off-targets) or at least in-silico off-target score and sequencing coverage of the edited locus.

      (d) It should be clarified as to whether lipidomics normalization is to total protein per organoid or per cell, and include representative LC-MS chromatograms or method QC.

      (e) Figure legends should be improved in order to state the number of organoids, the number of differentiations, and the exact statistical tests used (including multiple-comparison corrections).

      (f) In the title, the authors state "reveal disease mechanisms", but the studies mainly exhibit functional changes. They should consider toning down the statement.

      (7) Recommendations

      This reviewer recommends a major revision. The manuscript presents substantial novelty and strong potential impact but requires additional experimental validation and clearer, more conservative interpretation. Key items to address are:

      (a) Strengthening genetic and biological replication (additional lines or replicate differentiations).

      (b) Adding functional mechanistic validation for major pathways (Wnt/mTOR/autophagy) and providing autophagy flux data.

      (c) Including at least one neuronal functional readout (calcium imaging/MEA/patch) to demonstrate functional rescue.

      (d) Deepening therapeutic characterization (dose, biodistribution, durability) and including specificity controls.

      (e) Improving statistical reporting and explicitly stating biological replicate structure.

    2. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript by Lin et al. presents a timely, technically strong study that builds patient-specific midbrain-like organoids (MLOs) from hiPSCs carrying clinically relevant GBA1 mutations (L444P/P415R and L444P/RecNcil). The authors comprehensively characterize nGD phenotypes (GCase deficiency, GluCer/GluSph accumulation, altered transcriptome, impaired dopaminergic differentiation), perform CRISPR correction to produce an isogenic line, and test three therapeutic modalities (SapC-DOPS-fGCase nanoparticles, AAV9-GBA1, and SRT with GZ452). The model and multi-arm therapeutic evaluation are important advances with clear translational value.

      My overall recommendation is that the work undergo a major revision to address the experimental and interpretive gaps listed below.

      Strengths:

      (1) Human, patient-specific midbrain model: Use of clinically relevant compound heterozygous GBA1 alleles (L444P/P415R and L444P/RecNcil) makes the model highly relevant to human nGD and captures patient genetic context that mouse models often miss.

      (2) Robust multi-level phenotyping: Biochemical (GCase activity), lipidomic (GluCer/GluSph by UHPLC-MS/MS), molecular (bulk RNA-seq), and histological (TH/FOXA2, LAMP1, LC3) characterization are thorough and complementary.

      (3) Use of isogenic CRISPR correction: Generating an isogenic line (WT/P415R) and demonstrating partial rescue strengthens causal inference that the GBA1 mutation drives many observed phenotypes.

      (4) Parallel therapeutic testing in the same human platform: Comparing enzyme delivery (SapC-DOPS-fGCase), gene therapy (AAV9-GBA1), and substrate reduction (GZ452) within the same MLO system is an elegant demonstration of the platform's utility for preclinical evaluation.

      (5) Good methodological transparency: Detailed protocols for MLO generation, editing, lipidomics, and assays allow reproducibility.

      Weaknesses:

      (1) Limited genetic and biological replication

      (a) Single primary disease line for core mechanistic claims. Most mechanistic data derive from GD2-1260 (L444P/P415R); GD2-10-257 (L444P/RecNcil) appears mainly in therapeutic experiments. Relying primarily on one patient line risks conflating patient-specific variation with general nGD mechanisms.

      We thank the reviewer for highlighting the importance of genetic and biological replication. An additional patient-derived iPSC line was included in the manuscript; our study therefore includes two independent nGD patient-derived iPSC lines, GD2-1260 (GBA1<sup>L444P/P415R</sup>) and GD2-10-257 (GBA1<sup>L444P/RecNcil</sup>), both of which carry severe mutations associated with nGD. These two lines represent distinct genetic backgrounds and were used to demonstrate the consistency of key disease phenotypes (reduced GCase activity, elevated substrate levels, impaired dopaminergic neuron differentiation, etc.) across MLOs from different patients. Major experiments (e.g., GCase activity assays, substrate measurements, immunoblotting for the DA marker TH, and therapeutic testing with SapC-DOPS-fGCase and AAV9-GBA1) were performed using both patient lines, with results showing consistent phenotypes and therapeutic responses (see Figs. 2-6 and Supplementary Figs. 4-5). To ensure clarity and transparency, a new Supplementary Table 2 summarizes the characterization of both the GD2-1260 and GD2-10-257 lines.

      (b) Unclear biological replicate strategy. It is not always explicit how many independent differentiations and organoid batches were used (biological replicates vs. technical fields of view).

      Biological replication was ensured in our study by conducting experiments in at least 3 independent differentiations per line, and technical replicates (multiple organoids/fields per batch) were averaged accordingly. We have clarified biological replicates and differentiation in the figure legends. 

      (c) A significant disadvantage of employing brain organoids is the heterogeneity during induction and potential low reproducibility. In this study, it is unclear how many independent differentiation batches were evaluated and, for each test (for example, immunofluorescent stain and bulk RNA-seq), how many organoids from each group were used. Please add a statement accordingly and show replicates to verify consistency in the supplementary data.

      In the revision, we have clarified the biological replicates and differentiations in the figure legends for Fig. 1E; Fig. 2B, 2G; Fig. 3F, 3G; Fig. 4B-C, E, H-J, M-N; Fig. 6D; and Fig. 7A-C, I.

      (d) Isogenic correction is partial. The corrected line is WT/P415R (single-allele correction); residual P415R complicates the interpretation of "full" rescue and leaves open whether the remaining pathology is due to incomplete correction or clonal/epigenetic effects.

      We attempted to generate an isogenic iPSC line by correcting both GBA1 mutations (L444P and P415R). However, this was not feasible because GBA1 overlaps with a highly homologous pseudogene (PGBA), which makes precise editing technically challenging. Consequently, only the L444P mutation was successfully corrected, and the resulting isogenic line retains the P415R mutation in a heterozygous state. Because Gaucher disease is an autosomal recessive disorder, individuals carrying a single GBA1 mutation (heterozygous carriers) do not develop clinical symptoms. Therefore, the partially corrected isogenic line, which retains only the P415R allele, represents a clinically relevant carrier model. Consistent with this, our results show that GCase activity was restored to approximately 50% of wild-type levels (Fig.4B-C), supporting the expected heterozygous state. These findings also make it unlikely that the remaining differences observed are due to clonal variation or epigenetic effects.

      (e) The authors tested week 3, 4, 8, 15, and 28 old organoids in different settings. However, systematic markers of maturation should be analyzed, and different maturation stages should be compared, for example, comparing week 8 organoids to week 28 organoids, with immunofluorescent marker staining and bulk RNAseq.

      We agree that a systematic analysis of maturation stages is essential for validating the MLO model. Our data integrate a longitudinal comparison across multiple developmental windows (Weeks 3 to 28) to characterize the transition from progenitors to mature, functional states for nGD phenotyping and for evaluation of therapeutic modalities:

      1) DA differentiation (Wks 3 and 8 in Fig. 3): qPCR analysis demonstrated the progression of DA-specific programs. We observed a steady increase in the mature DA neuron markers TH and ASCL1, accompanied by a gradual decrease in the early floor plate/progenitor markers FOXA2 and PLZF, indicating successful differentiation from progenitors to differentiated/mature DA neurons.

      2) Glycosphingolipid substrate accumulation (Wks 15 and 28 in Fig. 2): To assess late-stage nGD phenotypes, we compared GluCer and GluSph at Week 15 and Week 28. This comparison highlights the progressive accumulation of substrates in nGD MLOs, reflecting the metabolic consequences of the disease at different stages of maturity.

      3) Organoid growth dynamics (Wks 4, 8, and 15 in new Fig. 4): The new Fig. 4 tracks physical maturation through organoid size and growth rates across three key time points, providing a macro-scale verification of consistent development between WT and nGD groups.

      By comparing these early (Wk 3-8) and late (Wk 15-28) stages, we confirmed that our MLOs transition from a proliferative state to a post-mitotic, specialized neuronal state, thereby addressing the request to compare distinct maturation stages.

      (f) The manuscript frequently refers to Wnt signaling dysregulation as a major finding. However, experimental validation is limited to transcriptomic data. Functional tests, such as the use of Wnt agonist/inhibitor, are needed to support this claim (see below).

      We agree that the suggested experiments could provide additional mechanistic insights into this study and will consider them in future work.

      (g) Suggested fixes / experiments

      Add at least one more independent disease hiPSC line (or show expanded analysis from GD2-10-257) for key mechanistic endpoints (lipid accumulation, transcriptomics, DA markers).

      MLOs derived from an additional iPSC line, GD2-10-257, were included in the manuscript. This was addressed above [see response to Weaknesses (1)-a].

      Generate and analyze a fully corrected isogenic WT/WT clone (or a P415R-only line) if feasible; at minimum, acknowledge this limitation more explicitly and soften claims.

      We attempted to generate an isogenic iPSC line by correcting both GBA1 mutations (L444P and P415R). However, this was unsuccessful because precise editing is complicated by a pseudogene (PGBA) located 16 kb downstream of GBA1 that shares 96-98% sequence similarity with it (Refs. #1, #2). The pseudogene (~5.7 kb) is shorter than GBA1 (~7.6 kb), and the primary exonic difference between the two is a 55-bp deletion in exon 9 of the pseudogene. As a result, the isogenic line we obtained carries only the P415R mutation, while L444P was corrected to the normal sequence. We have included this limitation in the Methods as “This gene editing strategy is expected to also target the GBA1 pseudogene due to the identical target sequence, which limits the gene correction on certain mutations (e.g., P415R)”.

      References:

      (1) Horowitz M., Wilder S., Horowitz Z., Reiner O., Gelbart T., Beutler E. The human glucocerebrosidase gene and pseudogene: structure and evolution. Genomics (1989). 4, 87–96. doi:10.1016/0888-7543(89)90319-4

      (2) Woo EG, Tayebi N, Sidransky E. Next-Generation Sequencing Analysis of GBA1: The Challenge of Detecting Complex Recombinant Alleles. Front Genet. (2021). 12:684067. doi:10.3389/fgene.2021.684067. PMCID: PMC8255797.

      Report and increase independent differentiations (N = biological replicates) and present per-differentiation summary statistics.

      This was addressed above [see response to Weaknesses (1)-b, (1)-c]. 

      (2) Mechanistic validation is insufficient

      (a) RNA-seq pathways (Wnt, mTOR, lysosome) are not functionally probed. The manuscript shows pathway enrichment and some protein markers (p-4E-BP1) but lacks perturbation/rescue experiments to link these pathways causally to the DA phenotype.

      (b) Autophagy analysis lacks flux assays. LC3-II and LAMP1 are informative, but without flux assays (e.g., bafilomycin A1 or chloroquine), one cannot distinguish increased autophagosome formation from decreased clearance.

      (c) Dopaminergic dysfunction is superficially assessed. Dopamine in the medium and TH protein are shown, but no neuronal electrophysiology, synaptic marker co-localization, or viability measures are provided to demonstrate functional recovery after therapy.

      (d) Suggested fixes/experiments

      Perform targeted functional assays:

      (i) Wnt reporter assays (TOP/FOP flash) and/or treat organoids with Wnt agonists/antagonists to test whether Wnt modulation rescues DA differentiation.

      (ii) Test mTOR pathway causality using mTOR inhibitors (e.g., rapamycin) or 4E-BP1 perturbation and assay effects on DA markers and autophagy.

      Include autophagy flux assessment (LC3 turnover with bafilomycin), and measure cathepsin activity where relevant.

      Add at least one functional neuronal readout: calcium imaging, MEA recordings, or synaptic marker quantification (e.g., SYN1, PSD95) together with TH colocalization.

      We thank the reviewer for these valuable suggestions. We agree that the suggested experiments could provide additional mechanistic insights into this study and will consider them in future work. Importantly, the primary conclusions of our manuscript, that GBA1 mutations in nGD MLOs resulted in nGD pathologies such as diminished enzymatic function, accumulation of lipid substrates, widespread transcriptomic changes, and impaired dopaminergic neuron differentiation, which can be corrected by several therapeutic strategies in this study, are supported by the evidence presented. The suggested experiments represent an important direction for future research using brain organoids.

      (3) Therapeutic evaluation needs greater depth and standardization

      (a) Short windows and limited durability data. SapC-DOPS and AAV9 experiments range from 48 hours to 3 weeks; longer follow-up is needed to assess durability and whether biochemical rescue translates into restored neuronal function.

      We agree with the reviewer. Because this is a proof-of-principle study, the treatment was designed within a short time window. Long-term studies with more comprehensive outcome assessments will be conducted in future work.

      (b) Dose-response and biodistribution are under-characterized. AAV injection sites/volumes are described, but transduction efficiency, vg copies per organoid, cell-type tropism quantification, and SapC-DOPS penetration/distribution are not rigorously quantified.

      We appreciate the reviewer’s concerns. This study was intended to demonstrate the feasibility and initial response of MLOs to AAV therapy. A comprehensive evaluation of AAV biodistribution will be considered in future studies.

      The penetration and distribution of SapC-DOPS have been extensively characterized in prior studies. In vivo biodistribution of SapC-DOPS coupled to CellVue Maroon, a fluorescent cargo, was examined in mice bearing human tumor xenografts using real-time fluorescence imaging; CellVue Maroon fluorescence persisted in the tumor for 48 hours (Ref. #3: Fig. 4B, mouse 1), 100 hours (Ref. #4: Fig. 5), and up to 216 hours (Ref. #5: Fig. 3). Uptake kinetics were also demonstrated in cells: flow cytometry quantification showed that SapC-DOPS nanovesicles coupled to fluorescent cargo were incorporated into human brain tumor cell membranes within minutes and remained stably incorporated for up to one hour (Ref. #6: Fig. 1a and Fig. 1b). Building on these findings, the present study focuses on evaluating the restoration of GCase function rather than reexamining biodistribution and uptake kinetics.

      References:

      (3) X. Qi, Z. Chu, Y.Y. Mahller, K.F. Stringer, D.P. Witte, T.P. Cripe. Cancer-selective targeting and cytotoxicity by liposomal-coupled lysosomal saposin C protein. Clin. Cancer Res. (2009) 15, 5840-5851. PMID: 19737950.

      (4) Z. Chu, S. Abu-Baker, M.B. Palascak, S.A. Ahmad, R.S. Franco, and X. Qi. Targeting and cytotoxicity of SapC-DOPS nanovesicles in pancreatic cancer. PLOS ONE (2013) 8, e75507. PMID: 24124494.

      (5) Z. Chu, K. LaSance, V.M. Blanco, C.-H. Kwon, B. Kaur, M. Frederick, S. Thornton, L. Lemen, and X. Qi. Multi-angle rotational optical imaging of brain tumors and arthritis using fluorescent SapC-DOPS nanovesicles. J. Vis. Exp. (2014) 87, e51187, 17. PMID: 24837630.

      (6) J. Wojton, Z. Chu, C-H. Kwon, L.M.L. Chow, M. Palascak, R. Franco, T. Bourdeau, S. Thornton, B. Kaur, and X. Qi. Systemic delivery of SapC-DOPS has antiangiogenic and antitumor effects against glioblastoma. Mol. Ther. (2013) 21, 1517-1525. PMID: 23732993.

      (c) Specificity controls are missing. For SapC-DOPS, inclusion of a non-functional enzyme control (or heat-inactivated fGCase) would rule out non-specific nanoparticle effects. For AAV, assessment of off-target expression and potential cytotoxicity is needed.

      Including inactive fGCase would confound the assessment of fGCase in MLOs by immunoblot and immunofluorescence; therefore, saposin C–DOPS was used as the control instead. 

      We agree that assessment of off-target expression and potential cytotoxicity for AAV is important; this will be included in future studies.

      (d) Comparative efficacy lacking. It remains unclear which modality is most effective in the long term and in which cellular compartments.

      To address this comment, we have added a new table (Supplementary Table 2) comparing the four therapeutic modalities and summarizing their respective outcomes. While this study focused on short-term responses as a proof-of-principle, future work will explore long-term therapeutic effects. 

      (e) Suggested fixes/experiments

      Extend follow-up (e.g., 6+ weeks) after AAV/SapC dosing and evaluate DA markers, electrophysiology, and lipid levels over time.

      We appreciate the reviewer’s suggestions. The therapeutic testing in patient-derived MLOs was designed as a proof-of-principle study to demonstrate feasibility and the primary response (rescue of GCase function) to the treatment. A comprehensive, long-term therapeutic evaluation of AAV and SapC-DOPS-fGCase is indeed important for a complete assessment; however, this represents a separate therapeutic study and is beyond the scope of the current work.

      Quantify AAV transduction by qPCR for vector genomes and by cell-type quantification of GFP+ cells (neurons vs astrocytes vs progenitors).

      For the AAV-treated experiments, we agree that measuring AAV copy number and GFP expression would provide additional information. However, the primary goal of this study was to demonstrate the key therapeutic outcome, rescue of GCase function by AAV-delivered normal GCase, which is directly relevant to the treatment objective.

      Include SapC-DOPS control nanoparticles loaded with an inert protein and/or fluorescent cargo quantitation to show distribution and uptake kinetics.

      As noted above [see response to Weakness (3)-c], using inert GCase would confound the assessment of fGCase uptake in MLOs; therefore, it was not suitable for this study. See response above for the distribution and uptake kinetics of SapC-DOPS [see response to Weaknesses (3)-b].

      Provide head-to-head comparative graphs (activity, lipid clearance, DA restoration, and durability) with statistical tests.

      We have added a new table (Supplementary Table 2) providing a head-to-head comparison of the treatment effects. 

      (4) Model limitations not fully accounted for in interpretation

      (a) Absence of microglia and vasculature limits recapitulation of neuroinflammatory responses and drug penetration, both of which are important in nGD. These absences could explain incomplete phenotypic rescues and must be emphasized when drawing conclusions about therapeutic translation.

      We agree that the absence of microglia and vasculature in midbrain-like organoids represents a limitation, as we have discussed in the manuscript. In this revision, we highlighted this limitation in the Discussion section and clarified that it may contribute to incomplete phenotyping and phenotypic rescue observed in our therapeutic experiments. Additionally, we have outlined future directions to incorporate microglia and vascularization into the organoid system to better recapitulate the in vivo environment and improve translational relevance (see 7th paragraph in the Discussion).

      (b) Developmental vs degenerative phenotype conflation. Many phenotypes appear during differentiation (patterning defects). The manuscript sometimes interprets these as degenerative mechanisms; the distinction must be clarified.

      We appreciate the reviewer’s comments. In the revised manuscript, we have clarified that certain abnormalities, such as patterning defects observed during early differentiation, likely reflect developmental consequences of GBA1 mutations rather than degenerative processes. Conversely, phenotypes such as substrate accumulation, lysosomal dysfunction, and impaired dopaminergic maturation at later stages are interpreted as degenerative features. We have updated the Results and Discussion sections to avoid conflating developmental defects with neurodegenerative mechanisms.

      (c) Suggested fixes

      Tone down the language throughout (Abstract/Results/Discussion) to avoid overstatement that MLOs fully recapitulate nGD neuropathology.

      The manuscript has been revised to avoid overstatements.

      Add plans or pilot data (if available) for microglia incorporation or vascularization to indicate how future work will address these gaps.

      The manuscript now includes further plans to address the incorporation of microglia and vascularization, described in the last two paragraphs of the Discussion. A pilot study of microglia incorporation will be reported once it is completed.

      (5) Statistical and presentation issues

      (a) Missing or unclear sample sizes (n). For organoid-level assays, report the number of organoids and the number of independent differentiations.

      We have clarified biological replicates and differentiation in the figure legend [see response to Weaknesses (1)-b, (1)-c]. 

      (b) Statistical assumptions not justified. Tests assume normality; where sample sizes are small, consider non-parametric tests and report exact p-values.

      We have updated the Statistical analysis section of the Methods as follows:

      “For comparisons between two groups, data were analyzed using unpaired two-tailed Student’s t-tests when the sample size was ≥6 per group and normality was confirmed by the Shapiro-Wilk test. When the normality assumption was not met or when sample sizes were small (n < 6), the non-parametric Mann-Whitney U test was used instead. For comparisons involving three or more groups, one-way ANOVA followed by Tukey’s multiple comparison test was applied when data were normally distributed; otherwise, the nonparametric Dunn’s multiple comparison test was used. Exclusion of outliers was made based on cut-offs of the mean ±2 standard deviations. All statistical analyses were performed using GraphPad Prism 10 software. Exact p-values are reported throughout the manuscript and figures where feasible. A p-value < 0.05 was considered statistically significant.”
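
      The two-group decision rule and the outlier cut-off quoted above can be sketched as follows. This is a minimal illustration, not the authors' GraphPad Prism workflow: the function names are hypothetical, the Shapiro-Wilk p-values are assumed to be computed elsewhere, the 0.05 normality threshold is an assumption, and the use of the sample (n - 1) standard deviation for the ±2 SD cut-off is likewise an assumption.

```python
from statistics import mean, stdev


def drop_outliers(values, k=2.0):
    """Keep values within mean ± k sample standard deviations (k = 2 per the quoted Methods)."""
    if len(values) < 2:
        return list(values)
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) <= k * s]


def choose_two_group_test(n_a, n_b, shapiro_p_a, shapiro_p_b, alpha=0.05, min_n=6):
    """Return the test named in the quoted Methods for a two-group comparison.

    The t-test is chosen only when both groups have at least min_n samples and
    neither Shapiro-Wilk test rejects normality; otherwise Mann-Whitney U.
    """
    large_enough = min(n_a, n_b) >= min_n
    looks_normal = shapiro_p_a > alpha and shapiro_p_b > alpha
    return "student_t" if large_enough and looks_normal else "mann_whitney_u"


# Hypothetical example: 6 organoids per group, normality not rejected in either group.
selected = choose_two_group_test(6, 6, 0.31, 0.44)
```

      Writing the rule as a function makes the sample-size and normality thresholds explicit, so the same criteria can be applied uniformly to every figure panel.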

      (c) Quantification scope. Many image quantifications appear to be from selected fields of view, which are then averaged across organoids and differentiations.

      In this work, quantitative immunofluorescence analyses (e.g., cell counts for FOXP1+, FOXG1+, SOX2+ and Ki67+ cells, as well as marker colocalization) were performed on at least 3–5 randomly selected non-overlapping fields of view (FOVs) per organoid section, with a minimum of 3 organoids per differentiation batch. Each FOV was imaged at consistent magnification (60x) and z-stack depth to ensure comparable sampling across conditions. Data from individual FOVs were first averaged within each organoid to obtain an organoid-level mean, and then biological replicates (independent differentiations, n ≥ 3) were averaged to generate the final group mean ± SEM. This multilevel averaging approach minimizes bias from regional heterogeneity within organoids and accounts for variability across differentiations. Representative confocal images shown in the figures were selected to accurately reflect the quantified data. We believe this standardized quantification strategy ensures robust and reproducible results while appropriately representing the 3D architecture of the organoids.

      In the revision, we have clarified the method used for image analysis of sectioned MLOs as below:

      “Quantitative immunofluorescence analyses (e.g., cell counts for FOXP1+, FOXG1+, SOX2+ and Ki67+ cells, as well as marker colocalization) were performed using ImageJ (NIH) on at least 3–5 randomly selected non-overlapping fields of view (FOVs) per organoid section, with a minimum of 3 organoids per differentiation batch. Each FOV was imaged at consistent magnification (60x) and z-stack depth to ensure comparable sampling across conditions. Data from individual FOVs were first averaged within each organoid to obtain an organoid-level mean, and then biological replicates (independent differentiations, n ≥ 3) were averaged to generate the final group mean ± SEM.”
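
      The multilevel averaging described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' ImageJ workflow: the nested-dictionary layout, the function name `group_mean_sem`, and the intermediate step of averaging organoid means within each differentiation are assumptions, and the numbers in the usage example are made up.

```python
from math import sqrt
from statistics import mean, stdev


def group_mean_sem(data):
    """Collapse FOV-level values to a group mean and SEM.

    data: {differentiation_id: {organoid_id: [FOV values]}} (assumed layout).
    FOVs are averaged within each organoid, organoid means within each
    differentiation, and the SEM is taken across differentiations.
    """
    differentiation_means = []
    for organoids in data.values():
        organoid_means = [mean(fovs) for fovs in organoids.values()]
        differentiation_means.append(mean(organoid_means))
    n = len(differentiation_means)
    if n < 2:
        raise ValueError("SEM needs at least two biological replicates")
    sem = stdev(differentiation_means) / sqrt(n)
    return mean(differentiation_means), sem


if __name__ == "__main__":
    # Hypothetical percentages of marker-positive cells per FOV.
    example = {
        "diff1": {"org1": [1, 3], "org2": [2, 2]},
        "diff2": {"org1": [4, 4], "org2": [3, 5]},
        "diff3": {"org1": [6, 6], "org2": [5, 7]},
    }
    print(group_mean_sem(example))
```

      Taking the SEM across differentiations rather than across individual FOVs keeps n equal to the number of independent biological replicates, which avoids inflating significance through pseudoreplication.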

      (d) RNA-seq QC and deposition. Provide mapping rates, batch correction details, and ensure the GEO accession is active. Include these in Methods/Supplement.

      RNA-seq data are from the same batch. The mapping rate is >90%. GEO accession will be active upon publication. These were included in the Methods.

      (e) Suggested fixes

      Add a table summarizing biological replicates, technical replicates, and statistical tests used for each figure panel.

      We have revised the figure legends to include replicates for each figure and statistical tests [see response in weaknesses (1)-b, (1)-c].

      Recompute statistics where appropriate (non-parametric if N is small) and report effect sizes and confidence intervals.

      Statistical analysis method is provided in the revision [see response in Weaknesses (5)-b].

      (6) Minor comments and clarifications

      (a) The authors should validate midbrain identity further with additional regional markers (EN1, OTX2) and show absence/low expression of forebrain markers (FOXG1) across replicates.

      We validated the MLO identity by 1) FOXG1 and 2) EN1. FOXG1 was barely detectable in Wk8 75.1_MLO but highly present in ‘age-matched’ cerebral organoid (CO), suggesting our culturing method is midbrain region-oriented. In nGD MLO, FOXG1 expression is significantly higher than 75.1_MLO, indicating that there was aberrant anterior-posterior brain specification, consistent with the transcriptomic dysregulation observed in our RNA-seq data.

      To further confirm midbrain identity, we examined the expression of EN1, an established midbrain-specific marker. Quantitative RT-PCR analysis demonstrated that EN1 expression increased progressively during differentiation in both WT-75.1 and nGD2-1260 MLOs at weeks 3 and 8 (Author response image 1). EN1 reached 34-fold and 373-fold higher levels than in WT-75.1 iPSCs at weeks 3 and 8, respectively, in WT-75.1 MLOs. In nGD MLOs, although EN1 expression showed a modest reduction at week 8, the levels were not significantly different from those observed in age-matched WT-75.1 MLOs (p > 0.05, ns).

      Author response image 1.

      qRT-PCR quantification of midbrain progenitor marker EN1 expression in WT-75.1 and GD2-1260 MLOs at Wk3 and Wk8. Data were normalized to WT-75.1 hiPSCs and are presented as mean ± SEM (n = 3-4 MLOs per group). ns, not significant.

      (b) Extracellular dopamine ELISA should be complemented with intracellular dopamine or TH+ neuron counts normalized per organoid or per total neurons.

      We quantified TH expression at both the mRNA level (Fig. 3F) and the protein level (Fig. 3G/H) from whole-organoid lysates, which provides a more consistent and integrative measure across samples. These TH expression levels correlated well with the corresponding extracellular (medium) dopamine concentrations for each genotype. In contrast, TH⁺ neuron counts may not reliably reflect total cellular dopamine levels because the number of cells captured on each organoid section varies substantially, making normalization difficult. Measuring intracellular dopamine is an alternative approach that will be considered in future studies.

      (c) For CRISPR editing: the authors should report off-target analysis (GUIDE-seq or targeted sequencing of predicted off-targets) or at least in-silico off-target score and sequencing coverage of the edited locus.

      Off-target effects were assessed during gene editing. The likelihood of editing at off-target sites is low, given the low off-target scores obtained from the MIT Specificity Score analysis. The related method was also updated as stated below:

      “The chance to target other Off-targets is low due to low Off-target scores ranked based on the MIT Specificity Score analysis (Hsu, P., Scott, D., Weinstein, J. et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat Biotechnol 31, 827–832 (2013). https://doi.org/10.1038/nbt.2647).”

      (d) It should be clarified as to whether lipidomics normalization is to total protein per organoid or per cell, and include representative LC-MS chromatograms or method QC.

      The normalization was to the protein of the organoid lysate. This was clarified in the Methods section in the revision as stated below:

      “The GluCer and GluSph levels in MLO were normalized to total MLO protein (mg) that were used for glycosphingolipid analyses. Protein mass was determined by BCA assay and glycosphingolipid was expressed as pmol/mg protein. Additionally, GluSph levels in the culture medium were quantified and normalized to the medium volume (pmol/mL).”

      Representative LC-MS chromatograms for both normal and GD MLOs have been included in a new figure, Supplementary Figure 2.

      (e) Figure legends should be improved in order to state the number of organoids, the number of differentiations, and the exact statistical tests used (including multiple-comparison corrections).

      This was addressed above [see response to Weaknesses (1)-b and (5)-b].

      (f) In the title, the authors state "reveal disease mechanisms", but the studies mainly exhibit functional changes. They should consider toning down the statement.

      The title was revised to: Patient-Specific Midbrain Organoids with CRISPR Correction Recapitulate Neuronopathic Gaucher Disease Phenotypes and Enable Evaluation of Novel Therapies

      (7) Recommendations

      This reviewer recommends a major revision. The manuscript presents substantial novelty and strong potential impact but requires additional experimental validation and clearer, more conservative interpretation. Key items to address are:

      (a) Strengthening genetic and biological replication (additional lines or replicate differentiations).

      This was addressed above [see response to Weaknesses (1)-a, (1)-b, (1)-c].

      (b) Adding functional mechanistic validation for major pathways (Wnt/mTOR/autophagy) and providing autophagy flux data.

      (c) Including at least one neuronal functional readout (calcium imaging/MEA/patch) to demonstrate functional rescue.

      As addressed above [see response to Weaknesses (2)], the suggested experiments in b) and c) would provide additional insights into this study and we will consider them in future work. 

      (d) Deepening therapeutic characterization (dose, biodistribution, durability) and including specificity controls.

      This was addressed above [see response to Weaknesses (3)-a to e].

      (e) Improving statistical reporting and explicitly stating biological replicate structure.

      This was addressed above [see response to Weaknesses (1)-b, (5)-b].

      Reviewer #2 (Public review):

      Sun et al. have developed a midbrain-like organoid (MLO) model for neuronopathic Gaucher disease (nGD). The MLOs recapitulate several features of nGD molecular pathology, including reduced GCase activity, sphingolipid accumulation, and impaired dopaminergic neuron development. They also characterize the transcriptome in the MLO nGD model. CRISPR correction of one of the GBA1 mutant alleles rescues most of the nGD molecular phenotypes. The MLO model was further deployed in proof-of-principle studies of investigational nGD therapies, including SapC-DOPS nanovesicles, AAV9-mediated GBA1 gene delivery, and substrate-reduction therapy (GZ452). This patient-specific 3D model provides a new platform for studying nGD mechanisms and accelerating therapy development. Overall, only modest weaknesses are noted.

      We thank the reviewer for the supportive remarks.

      Reviewer #3 (Public review):

      Summary:

      In this study, the authors describe modeling of neuronopathic Gaucher disease (nGD) using midbrain-like organoids (MLOs) derived from hiPSCs carrying GBA1 L444P/P415R or L444P/RecNciI variants. These MLOs recapitulate several disease features, including GCase deficiency, reduced enzymatic activity, lipid substrate accumulation, and impaired dopaminergic neuron differentiation. Correction of the GBA1 L444P variant restored GCase activity, normalized lipid metabolism, and rescued dopaminergic neuronal defects, confirming its pathogenic role in the MLO model. The authors further leveraged this system to evaluate therapeutic strategies, including: (i) SapC-DOPS nanovesicles for GCase delivery, (ii) AAV9-mediated GBA1 gene therapy, and (iii) GZ452, a glucosylceramide synthase inhibitor. These treatments reduced lipid accumulation and ameliorated autophagic, lysosomal, and neurodevelopmental abnormalities.

      Strengths:

      This manuscript demonstrates that nGD patient-derived MLOs can serve as an additional platform for investigating nGD mechanisms and advancing therapeutic development.

      Comments:

      (1) It is interesting that GBA1 L444P/P415R MLOs show defects in midbrain patterning and dopaminergic neuron differentiation (Figure 3). One might wonder whether these abnormalities are specific to the combination of L444P and P415R variants or represent a general consequence of GBA1 loss. Do GBA1 L444P/RecNciI (GD2-10-257) MLOs also exhibit similar defects?

      We observed reduced dopaminergic neuron marker TH expression in GBA1 L444P/RecNciI (GD2-10-257) MLOs, suggesting that this line also exhibits defects in dopaminergic neuron differentiation. These data are provided in a new Supplementary Fig. 4E, and are summarized in new Supplementary Table 2 in the revision.

      (2) In Supplementary Figure 3, the authors examined GCase localization in SapC-DOPSfGCase-treated nGD MLOs. These data indicate that GCase is delivered to TH⁺ neurons, GFAP⁺ glia, and various other unidentified cell types. In fruit flies, the GBA1 ortholog, Gba1b, is only expressed in glia (PMID: 35857503; 35961319). Neuronally produced GluCer is transferred to glia for GBA1-mediated degradation. These findings raise an important question: in wild-type MLOs, which cell type(s) normally express GBA1? Are they dopaminergic neurons, astrocytes, or other cell types?

      All cell types in wild-type MLOs are expected to express GBA1, as it is a housekeeping gene broadly expressed across neurons, astrocytes, and other brain cell types. Its lysosomal function is essential for cellular homeostasis and is therefore not restricted to any specific lineage. (https://www.proteinatlas.org/ENSG00000177628GBA1/brain/midbrain). 

      (3) The authors may consider switching Figures 2 and 3 so that the differentiation defects observed in nGD MLOs (Figure 3) are presented before the analysis of other phenotypic abnormalities, including the various transcriptional changes (Figure 2).

      We appreciate the reviewer’s suggestion; however, we respectfully prefer to retain the current order of Figures 2 and 3, as we believe this structure provides the clearest narrative flow. Figure 2 establishes the core biochemical hallmarks: reduced GCase activity, substrate accumulation, and global transcriptomic dysregulation (1,429 DEGs enriched in neural development, WNT signaling, and lysosomal pathways), which together provide essential molecular context for studying the specific cellular differentiation defects presented in Figure 3. Presenting the broader disease landscape first creates a coherent mechanistic link to the subsequent analyses of midbrain patterning and dopaminergic neuron impairment.

      To enhance readability, we have added a brief transitional sentence at the start of the Figure 3 paragraph: “Building on the molecular and transcriptomic hallmarks of GCase deficiency observed in nGD MLOs (Figure 2), we next investigated the impact on midbrain patterning and dopaminergic neuron differentiation (Figure 3).”

    1. Though headnotes are extremely important to your project, your thoughts cannot remain in your head. Your observations and ideas must make it to the page in order to become primary data and inform your ethnographic essay. A first step in this process of recording observations is to take a few, brief notes when you are in a site and then translate these notes into complete sentences and ideas at a later time.

      This is 100% important for me to keep in mind. Notes need to be written down in order for me to be able to expand on them later on. Using descriptive notes in the beginning will only distract me and lead to weaker observations.

    1. “I suppose now that he has become if I maybe [sic] excused for using the term the head of a family, will be a very staid and steady boy[,] leastwise I hope so.”

      Authority reframed as responsibility, not power.

    1. Reviewer #2 (Public review):

      Summary:

      The manuscript entitled "Mitochondrial Protein FgDML1 Regulates DON Toxin Biosynthesis and Cyazofamid Sensitivity in Fusarium graminearum by affecting mitochondrial homeostasis" identified the regulatory effect of FgDML1 in DON toxin biosynthesis and sensitivity of Fusarium graminearum to cyazofamid. The manuscript provides a theoretical framework for understanding the regulatory mechanisms of DON toxin biosynthesis in F. graminearum and identifies potential molecular targets for Fusarium head blight control. The paper is innovative, but there are issues in the writing that need to be added and corrected.

      Comments on revisions:

      The author has addressed my questions.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      Summary:

      In their study, the authors investigated the F. graminearum homologue of the Drosophila Misato-Like Protein DML1 for a function in secondary metabolism and sensitivity to fungicides.

      Strengths:

      Generally, the topic of the study is interesting and timely, and the manuscript is well written, albeit in some cases, details on methods or controls are missing.

      Weaknesses:

      However, a major problem I see is with the core result of the study, the decrease in the DON content associated with the deletion of FgDML1. Although some growth data are shown in Figure 6, indicating a severe growth defect, the DON production presented in Figure 3 is not related to biomass. Also, the method and conditions for measuring DON are not described. Consequently, it could well be concluded that the decreased amount of DON detected is simply due to decreased growth, and the specific DON production of the mutant remains more or less the same.

      To alleviate this concern, it is crucial to show the details on the DON measurement and growth conditions and to relate the biomass formation under the same conditions to the DON amount detected. Only then can a conclusion as to an altered production in the mutant strains be drawn.

      We greatly appreciate the time you spent on our paper and your helpful suggestions. We have revised the manuscript accordingly, and our point-by-point responses to the reviewer’s comments are listed below. Our method for DON quantification was based on the amount per unit of mycelium. After obtaining the absorbance value from the ELISA reaction, the concentration of DON was calculated according to a standard curve and a formula, then divided by the dry weight of the mycelium to obtain the DON content per unit of mycelium, with the results finally expressed in µg/g.
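
      The biomass normalization described above is simple arithmetic, sketched here in Python for illustration. The function name `don_per_gram`, the extract volume parameter, and all numeric values are hypothetical and are not the authors' data. The conversion from ELISA absorbance to concentration via the kit's standard curve is assumed to have been done beforehand.

```python
def don_per_gram(conc_ug_per_ml, extract_volume_ml, dry_weight_g):
    """Return DON content in micrograms per gram of dry mycelium.

    conc_ug_per_ml:    DON concentration read off the standard curve.
    extract_volume_ml: volume of the measured extract (assumed parameter).
    dry_weight_g:      dry weight of the mycelium from the same culture.
    """
    if dry_weight_g <= 0:
        raise ValueError("dry weight must be positive")
    total_don_ug = conc_ug_per_ml * extract_volume_ml
    return total_don_ug / dry_weight_g


if __name__ == "__main__":
    # Hypothetical cultures: the mutant makes half the biomass of the wild type.
    wild_type = don_per_gram(2.0, 10.0, 0.50)
    mutant = don_per_gram(0.5, 10.0, 0.25)
    print(wild_type, mutant)
```

      In this made-up example the mutant culture contains four times less total DON but only half the biomass, so the per-gram value still drops by half. A reduction that survives division by dry weight is the kind of evidence needed to separate lower specific production from a pure growth defect.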

      (1) Line 139f

      ... FgDML1 is a critical positive regulator of virulence ....

      Clearly, the deletion of FgDML1 impacts virulence, but it is too much of a general effect to say it is a regulator. DML1 acts high up in the cascade, impacting numerous processes, one of which is virulence. Generally, it has to be considered that deletion of DML1 causes a severe growth defect, which in turn is likely to lead to a plethora of effects. Besides discussing this fact, please also revise the manuscript to avoid references to "direct effects" or "regulator".

      Thank you very much for your advice. Our method for determining DON is based on the amount per unit of mycelium. After obtaining the absorbance value from the ELISA reaction, we calculate the concentration of DON toxin according to the established standard curve and formula. We then divide it by the dry weight of the mycelium to obtain the DON toxin content per unit of mycelium, and present the results in µg/g. In summary, we conclude that the decrease in DON production by ΔFgDML1 is not due to slower hyphal growth, but rather to a reduced ability of each unit of hyphae to produce DON toxin compared with the wild type. Given the decrease in DON toxin synthesis caused by FgDML1 deficiency, we believe that using the term "regulator" is reasonable.

      (2) Line 143

      Please define "toxin-producing conditions".

      Thank you very much for your advice. We have defined the toxin-producing conditions in the manuscript: 'toxin-inducing conditions (28°C, 145 ×g, 7 days incubation)' (in L163-164).

      (3) Line 149

      A brief intro on toxisomes should be provided in the introduction to better integrate this into the manuscript's results.

      Thank you very much for your advice. We have added corresponding content about toxisomes to the Introduction: 'The biosynthesis of DON entails a reorganization of the endoplasmic reticulum into a specialized compartment termed the "toxisome" (Tang et al., 2018). The assembly of the toxisome coincides with the aggregation of key biosynthetic enzymes, which in turn enhances the efficiency of DON production. Concurrently, this compartmentalization serves as a self-defense mechanism, protecting the fungus from the autotoxicity of TRI pathway intermediates (Boenisch et al., 2017). The proteins TRI1, TRI4, TRI14, and Hmr1 are confirmed constituents of this structure (Kistler and Broz, 2015; Menke et al., 2013).' (in L86-93)

      (4) Line 153

      DON production decreases by about 80 %, but not to 0. Consequently, DML1 is important, but NOT essential for DON production.

      Thank you very much for your advice. We have made changes to the wording of the corresponding sections based on your suggestions. 'FgDML1 is essential for the biosynthesis of the DON toxin. '(in L161)

      (5) Line 168ff

      Please provide a reference for FgDnm1 being critical for mitochondrial fission and state whether such an interaction has been shown in other organisms.

      Thank you very much for your advice. We have made changes to the wording of the corresponding sections based on your suggestions. 'FgDnm1 is a key dynamin-related protein mediating mitochondrial fission(Griffin et al., 2005; Kang et al., 2023), suggesting that FgDML1 may form a complex with FgDnm1 to regulate mitochondrial fission and fusion processes. To our knowledge, this is the first report documenting an interaction between DML1 and Dnm in any fungal species, including model organisms such as S. cerevisiae. This novel finding provides new insights into the molecular mechanisms underlying mitochondrial dynamics in filamentous fungi. '(in L277-283)

      (6) Line 178

      Please specify whether Complex III activity was related to biomass and provide a p-value or standard deviation for the value.

      Thank you very much for your question. Complex III activity was determined using a complex III enzyme activity kit (Solarbio, Beijing, China) (Li et al., 2022; Wang et al., 2022), with 0.1 g of standardized mycelium as the sample for each assay. Given that the mycelium was homogenized, we believe that complex III activity is not necessarily correlated with biomass. We have also refined the specific measurement steps in the article: 'Briefly, 0.1 g of mycelia was homogenized with 1 mL of extraction buffer in an ice bath. The homogenate was centrifuged at 600 ×g for 10 min at 4°C. The resulting supernatant was then subjected to a second centrifugation at 11,100 ×g for 10 min at 4°C. The pellet was resuspended in 200 μL of extraction buffer and disrupted by ultrasonication (200 W, 5 s pulses with 10 s intervals, 15 cycles). Complex III enzyme activity was finally measured by adding the working solution as per the manufacturer's protocol. Each treatment group contains three biological replicates and three technical replicates.' (in L511-517)

      Li C, et al. Amino acid catabolism regulates hematopoietic stem cell proteostasis via a GCN2-eIF2 axis. Cell Stem Cell. 2022 Jul 7; 29(7):1119-1134.e7. doi: 10.1016/j.stem.2022.06.004. PMID: 35803229.

      Wang K, et al. Locally organised and activated Fth1hi neutrophils aggravate inflammation of acute lung injury in an IL-10-dependent manner. Nat Commun. 2022 Dec 13;13(1):7703. doi: 10.1038/s41467-022-35492-y. PMID: 36513690; PMCID: PMC9745290

      (7) Line 185

      Albeit this headline is a reasonable hypothesis, you actually did not show that the conformation is altered. Please reword accordingly.

      Please also add references for cyazofamid acting on the QI site versus other fungicides acting on the QO site.

      Thank you very much for your advice. We have changed the wording of the corresponding sections based on your suggestions: 'Overexpression of FgQCR2, FgQCR8, and FgQCR9 may alter the conformation of the Qi site, resulting in reduced sensitivity to cyazofamid.' (in L212-213). For fungicides targeting the Qi and Qo sites, we have added corresponding descriptions in the respective sections: 'Numerous fungicides have been developed to inhibit the Qo site (e.g., pyraclostrobin, azoxystrobin) (Nuwamanya et al., 2022; Peng et al., 2022) and the Qi site (e.g., cyazofamid) (Mitani et al., 2001) of the cytochrome bc1 complex.' (in L327-329)

      (8) Line 200

      This section on growth should be moved up right after introducing the mutant strain.

      Thank you very much for your advice. We have moved the section on vegetative growth and sexual and asexual development ahead of the DON toxin section to make the manuscript easier to read and understand. We had originally arranged the sections in the previous order to emphasize the new finding linking mitochondria and DON toxin. We found a significant decrease in DON toxin in ΔFgDML1, defects in toxisome formation, and downregulation of FgTRIs at both the gene and protein levels. In summary, we believe that the absence of FgDML1 does indeed lead to a decrease in DON toxin content, and that FgDML1 plays a regulatory role in the synthesis of DON toxin. In addition, our measurements of DON toxin, acetyl-CoA, ATP and other indicators are all based on the amount per unit of hyphae, which excludes differences caused by hyphal biomass or growth. We have further refined the Materials and Methods to make them easier to read and understand.

      (9) Line 203

      "... significantly reduced growth rates ..."

      This is not what was measured here. Figure 6A shows a plate assay that can be used to assess hyphal extension. In the figure, it is also visible that the mycelium of the deletion mutant is much denser, maybe due to increased hyphal branching. Please reword.

      Additionally, it is important to include a biomass measurement here under the conditions used for DON assessment. Hyphal extension measurements cannot be used instead of biomass.

      Thank you very much for your advice. We have made changes to the wording of the corresponding sections based on your suggestions. 'The ΔFgDML1 strain displayed a distinct growth phenotype characterized by retardation in radial growth and the formation of more compact, denser hyphal networks on all tested media compared to the PH-1 and ΔFgDML-C strains. '(in L136-138).

      (10) Line 217

      Please include information on how long the cultures were monitored. Given the very slow growth of the mutant, perithecia formation may be considerably delayed beyond 14 days.

      Thank you very much for your advice. Based on your suggestion, we extended the incubation time for sexual reproduction to 21 days to evaluate sexual reproduction ability more accurately. Our results show that even after 21 days, ΔFgDML1 still cannot produce ascospores, which indicates that the absence of FgDML1 does indeed cause sexual reproduction defects in F. graminearum.

      Author response image 1.

      Discussion

      (11) Please mention your summary Figure 8 early on in the discussion, and explain conclusions with this figure in mind. Please avoid repetition of the results section as much as possible.

      Also, please state clearly what was already known from previous research and is in agreement with your results, and what is new (in fungi or generally).

      Thank you very much for your advice. Based on your suggestion, we now mention Fig. 8 early in the discussion to guide the text that follows. We also expanded the discussion by analyzing our results and comparing them with previous studies: 'Our study defines a novel mechanism through which FgDML1 governs mitochondrial homeostasis. We demonstrate that FgDML1 directly interacts with the key mitochondrial fission regulator FgDnm1 and positively modulates cellular bioenergetic metabolism, as evidenced by elevated ATP and acetyl-CoA levels (Fig. 8).' (in L250-253). 'The Misato/DML1 protein family is evolutionarily conserved from yeast to humans and plays a critical role in mitochondrial regulation. In S. cerevisiae, DML1 is an essential gene; its deletion is lethal, while its overexpression results in fragmented mitochondrial networks and aberrant cellular morphology, underscoring its necessity for normal mitochondrial function (Gurvitz et al., 2002). Similarly, in Homo sapiens, the homolog Misato localizes to the mitochondrial outer membrane, and both its depletion and overexpression are sufficient to disrupt mitochondrial morphology and distribution (Kimura and Okano, 2007).' (in L241-244).

      (12) Line 262ff

      Please specify if this interaction was shown previously in other organisms and provide references.

      Thank you very much for your advice. We have clearly stated in the corresponding section that the interaction between FgDML and FgDnm is the first reported, and to our knowledge, no relevant reports have been found in other species so far. ' Notably, FgDML1 was found to interact with FgDnm1 (Fig. 5E), FgDnm1 is a key dynamin-related protein mediating mitochondrial fission(Griffin et al., 2005; Kang et al., 2023), suggesting that FgDML1 may form a complex with FgDnm1 to regulate mitochondrial fission and fusion processes. To our knowledge, this is the first report documenting an interaction between DML1 and Dnm in any fungal species, including model organisms such as S. cerevisiae. This novel finding provides new insights into the molecular mechanisms underlying mitochondrial dynamics in filamentous fungi. '(in L276-283)

      (13) Line 287ff

      There is no result that would justify this speculation. Please remove.

      Thank you very much for your advice. We have modified the corresponding wording in the corresponding section. 'In conclusion, our findings suggest that the overexpression of assembly factors FgQCR2, FgQCR7, and FgQCR8 in ΔFgDML1 potentially modifies the conformation of the Qi site, which specifically modulates the sensitivity of F. graminearum to cyazofamid. '(in L352-355)

      Materials and methods

      (14) A table with all primer sequences used in the study and their purpose is missing. For every experiment, the number of technical and biological replicates needs to be stated.

      Thank you very much for your advice. We have presented all the primers used in this study in Supplementary Table 1 (in Table S1). We added the number of technical and biological replicates in the material and method descriptions for each experiment. 'For each sample, a total of 200 conidia were counted. The experiment included three biological replicates with three technical replicates each.' (in L434-436). 'Each treatment group contains three biological replicates.' (in L444-445). 'Each treatment group contains three biological replicates and three technical replicates.' (in L463-464). 'Each treatment group contains three biological replicates and three technical replicates.' (in L474-475). 'Each treatment group contains three biological replicates.' (in L483). 'Each treatment group contains three biological replicates and three technical replicates.' (in L501-502). 'Each treatment group contains three biological replicates and three technical replicates.' (in L516-517). 'The experiment was independently repeated three times.' (in L533-534).

      (15) Line 369ff

      Please provide final concentrations used for assays here.

      Thank you very much for your advice. The final concentrations are shown in the figures (in Fig. 6A, B) (in Fig. S3). We have also provided Supplementary Table 2 to present the concentrations more clearly (in Table S2).

      (16) Line 383

      Please provide a reference or data on the use of F2du for transformant selection and explain the abbreviation.

      Thank you very much for your advice. Based on your suggestion, we have provided the full name and references of F2du. 'Transformants were selected on PDA plates containing either 100 μg/mL Hygromycin B (Yeasen, Shanghai, China) or 0.2 μmol/mL 5-Fluorouracil 2'-deoxyriboside (F2du) (Solarbio, Beijing, China)(Zhao et al., 2022). '(in L405-407).

      (17) Line 407

      Please provide a reference for the method and at least a brief description.

      Thank you very much for your advice. Based on your suggestion, we have added references and provided a brief introduction to the method. 'As previously described (Tang et al., 2020; Wang et al., 2025), Specifically, coleoptiles were inoculated with conidial suspensions and incubated for 14 days, while leaves were inoculated with fresh mycelial plugs and incubated for 5 days, followed by observation and quantification of disease symptoms. DON toxin was measured using a Wise Science ELISA-based kit (Wise Science, Jiangsu, China) (Li et al., 2019; Zheng et al., 2018). '(in L466-471)

      (18) Line 414ff

      Also, here, the amount of biomass has to be considered for the measurement to be able to distinguish if actually less of the compounds were produced or if the effect seen was merely due to an altered amount of biomass present.

      Thank you very much for your advice. We believe that biomass is not within the scope of our measurement indicators, as we have measured and calculated based on unit hyphae. Therefore, we have ruled out experimental bias caused by a decrease in biomass.

      RNA and RT-qPCR

      (19) Line 461

      When the strains were transferred to AEA medium, was the biomass measured, at least wet weight, and in which culture volume was it done? It makes a big difference if the amount of (wet) biomass dilutes a small amount of fungicide-containing culture or if biomass is added in at least roughly equal amounts in sufficient growth medium to ensure equal conditions.

      Thank you very much for your question. During sample processing, we controlled the wet weight of each sample before dosing, so that every sample contained 0.2 g of mycelium (wet weight). The mycelium was obtained from 100 mL of AEA medium. This ensured that the initial biomass was consistent between groups before dosing and that the drug concentration was accurate.

      (20) Line 466

      Please provide the name and supplier of the kit.

      Thank you very much for your advice. We have added this information at the relevant location. 'Mycelium was collected and total RNA was extracted following the instructions provided by the Total RNA Extraction Kit (Tiangen, Beijing, China).' (in L523-524).

      (21) All primer sequences must be provided in a table.

      Thank you very much for your advice. We have presented all the primers used in this study in Supplementary Table 1. (in Table S1).

      (22) For RT qPCR it is essential to check the RNA quality to be sure that the obtained results are not artifacts due to varying quality, which may exceed differences. Please state how quality control was done and which threshold was applied for high-quality RNA to be used in RTqPCR (like RIN factor, etc).

      Thank you very much for your question. We performed stringent quality control on the extracted total RNA. First, a micro-spectrophotometer was used to measure RNA concentration and purity, confirming that the A260/A280 ratio was between 1.8 and 2.0 and the A260/A230 ratio was greater than 2.0, indicating good RNA purity without significant protein or organic solvent contamination. Subsequently, verification by agarose gel electrophoresis revealed distinct 28S and 18S rRNA bands, demonstrating good RNA integrity and absence of degradation.

      Author response image 2.
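
      The purity thresholds quoted in the response above can be expressed as a simple check. This is only a minimal sketch: the function name and the example readings are hypothetical, and only the two ratio ranges (A260/A280 between 1.8 and 2.0, A260/A230 greater than 2.0) come from the response.

```python
def rna_passes_purity_check(a260_a280: float, a260_a230: float) -> bool:
    """Return True when both absorbance ratios meet the thresholds
    quoted in the response (hypothetical helper, not the authors' code)."""
    protein_ok = 1.8 <= a260_a280 <= 2.0   # low protein contamination
    solvent_ok = a260_a230 > 2.0           # low organic solvent/salt carry-over
    return protein_ok and solvent_ok


# Hypothetical readings for two samples
print(rna_passes_purity_check(1.92, 2.15))  # within both ranges
print(rna_passes_purity_check(1.65, 2.15))  # A260/A280 too low
```

      Such a check covers purity only; integrity still has to be judged separately, for example from the rRNA bands on the gel.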

      (B): Minor Comments:

      (1) Please increase the font size of the labels and annotations of the figures; it is hard to read as it is now.

      Thank you very much for your advice. We have increased the size of annotations or numerical labels in the corresponding images for better reading.

      (2) Throughout the manuscript: Please check that all abbreviations are explained at first use.

      Thank you very much for your advice. We have checked the entire text to ensure that abbreviations have their full names when they first appear.

      (3) I do hope that the authors can clarify all concerns and provide an amended manuscript of this interesting story.

      Thank you very much for your advice. We sincerely thank you for your suggestions and questions, which have been very helpful to us.

      Reviewer #2:

      The manuscript entitled "Mitochondrial Protein FgDML1 Regulates DON Toxin Biosynthesis and Cyazofamid Sensitivity in Fusarium graminearum by affecting mitochondrial homeostasis" identified the regulatory effect of FgDML1 in DON toxin biosynthesis and sensitivity of Fusarium graminearum to cyazofamid. The manuscript provides a theoretical framework for understanding the regulatory mechanisms of DON toxin biosynthesis in F. graminearum and identifies potential molecular targets for Fusarium head blight control. The paper is innovative, but there are issues in the writing that need to be addressed and corrected.

      We greatly appreciate the time you spent on our paper and your helpful suggestions, and we have done our best to revise the manuscript. Revisions made in response to your suggestions are marked in red. In our responses, the positions of the revised passages in the manuscript are indicated by red line numbers. Our point-by-point responses to the reviewer's comments are listed below.

      Weaknesses:

      (1) The authors speculate that cyazofamid treatment caused upregulation of the assembly factors, leading to a change in the conformation of the Qi protein, thus restoring the enzyme activity of complex III. But no speculation was given in the discussion as to why this would lead to the upregulation of assembly factors, and how the upregulation of assembly factors would change the protein conformation, and is there any literature reporting a similar phenomenon? I would suggest adding this to the discussion.

      Thank you very much for your advice. Based on your suggestion, we have added content related to the assembly factor of complex III in the discussion section and made modifications to the corresponding wording. 'Previous studies have reported that mutations in the Complex III assembly factors TTC19, UQCC2, and UQCC3 impair the assembly and activity of Complex III (Feichtinger et al., 2017; Wanschers et al., 2014). '(in L345-347). 'In conclusion, our findings suggest that the overexpression of assembly factors FgQCR2, FgQCR7, and FgQCR8 in ΔFgDML1 potentially modifies the conformation of the Qi site, which specifically modulates the sensitivity of F. graminearum to cyazofamid. '(in L352-355).

      (2) Would increased sensitivity of the mutant to cell wall stress be responsible for the excessive curvature of the mycelium?

      Thank you very much for your question. We believe that the sensitivity of ΔFgDML1 to osmotic stress is reduced, and that this may not be related to hyphal bending, as shown in Author response image 3. At the conidial stage, ΔFgDML1 cannot germinate in YEPD, whereas the addition of 1 M sorbitol promotes its germination. The underlying mechanism remains unknown and will be a focus of our future research.

      Author response image 3.

      (3) The vertical coordinates of Figure 7B need to be modified with positive inhibition rates for the mutants.

      Thank you very much for your advice. Figure 7B accurately reflects the measured inhibition rates. When the ΔFgDML1 mutant is subjected to osmotic stress, its inhibition rate becomes negative, indicating that colony growth under stress is greater than that of the control (CK). Therefore, the negative inhibition rate is shown in Figure 7B.
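
      To illustrate why an inhibition rate can be negative, the sketch below uses the commonly applied growth-inhibition formula, (control − treated) / control × 100. The response does not state the exact formula the authors used, so the formula, the optional plug-diameter correction, and the example diameters are all assumptions.

```python
def inhibition_rate(control_diameter_mm: float,
                    treated_diameter_mm: float,
                    plug_diameter_mm: float = 0.0) -> float:
    """Percent growth inhibition relative to the untreated control (CK).

    Assumed formula: (control - treated) / control * 100, after subtracting
    the inoculum plug diameter from both colony diameters.
    """
    control_growth = control_diameter_mm - plug_diameter_mm
    treated_growth = treated_diameter_mm - plug_diameter_mm
    if control_growth <= 0:
        raise ValueError("control colony must have grown beyond the plug")
    return (control_growth - treated_growth) / control_growth * 100.0


# Hypothetical diameters: the stressed colony is larger than the control,
# so the computed rate is negative.
print(inhibition_rate(control_diameter_mm=40.0, treated_diameter_mm=50.0))
# A colony that grows less under stress gives a positive rate.
print(inhibition_rate(control_diameter_mm=40.0, treated_diameter_mm=30.0))
```

      Under this definition a negative value is not an error; it simply means the treatment promoted growth relative to the control.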

      (1) In Figure 1B, Figure 3C, and Figure 6C, the scale below the picture is not clear. In Figure 5D, the histogram is unclear, and it is recommended to redraw the graph.

      Thank you very much for your advice. The issue with the above images may be due to Word compression. We have changed the settings and enlarged the images as much as possible to better display them.

      (2) The full Latin name of the strain should be used in the title of figures and tables.

      Thank you very much for your advice. Based on your suggestion, we have used the full names of the strains in the titles of figures and tables.

      (3) Proteins in line 117 should be abbreviated.

      Thank you very much for your advice. Based on your suggestion, we have abbreviated the protein names at the corresponding positions. 'The DML1 protein from S. cerevisiae was used as a query for a BLAST search against the Fusarium genome database, resulting in the identification of the putative DML1 gene FgDML1 (FGSG_05390) in F. graminearum.' (in L118-120).

      (4) The sentence in lines 187-189, which is supposed to introduce why the test is sensitive to the three drugs, is currently illogical.

      Thank you very much for your advice. Based on your suggestion, we have revised the corresponding sentence. 'Since Complex III is involved in the action of both cyazofamid (targeting the Qi site) and pyraclostrobin (targeting the Qo site), the sensitivity of ΔFgDML1 to cyazofamid and pyraclostrobin was investigated.' (in L214-216).

      (5) The expression of FgQCR2, FgQCR7, and FgQCR8 was significantly upregulated in ΔFgDML1 at transcription levels. Do FgQCR2, FgQCR8, and FgQCR9 show upregulated expression at the protein level?

      Thank you very much for your question. Based on your suggestion, we evaluated the protein expression levels of FgQCR2, FgQCR7, and FgQCR8 in PH-1 and ΔFgDML1, and we found that the protein expression levels of FgQCR2, FgQCR7, and FgQCR8 in ΔFgDML1 were higher than those in PH-1. (in Fig. 6F).

      (6) In Figure 7B, it is recommended to adjust the position of the horizontal axis labels in the histogram.

      Thank you very much for your advice. Based on your suggestion, we have made modifications to the corresponding sections. (in Fig. 7B)

      (7) There are numerous errors in the writing of gene names in the text. Please check the full text and change the writing of gene names and mutant names to italic.

      Thank you very much for your advice. We have checked the entire text to ensure that all genes have been italicized.

      (8) All acronyms should be spelled out in figure and table captions. e.g., F. graminearum.

      Thank you very much for your advice. Based on your suggestion, we have used the full names of the strains in the titles of figures and tables.

      (9) In line 492, P should be lowercase and italic.

      Thank you very much for your advice. Based on your suggestion, we have made adjustments to the corresponding content.

      Reviewer #3:

      Summary:

      The manuscript "Mitochondrial protein FgDML1 regulates DON toxin biosynthesis and cyazofamid sensitivity in Fusarium graminearum by affecting mitochondrial homeostasis" describes the construction of a null mutant for the FgDML1 gene in F. graminearum and assays characterising the effects of this mutation on the pathogen's infection process and lifecycle. While FgDML1 remains underexplored with an unclear role in the biology of filamentous fungi, and although the authors performed several experiments, there are fundamental issues with the experimental design and execution, and interpretation of the results.

      Strengths:

      FgDML1 is an interesting target, and there are novel aspects in this manuscript. Studies in other organisms have shown that this protein plays important roles in mitochondrial DNA (mtDNA) inheritance, mitochondrial compartmentalisation, chromosome segregation, mitochondrial distribution, mitochondrial fusion, and overall mitochondrial dynamics. Indeed, in Saccharomyces cerevisiae, the mutation is lethal. The authors have carried out multi-faceted experiments to characterise the mutants.

      Weaknesses:

      However, I have concerns about how the study was conceived. Given the fundamental importance of mitochondrial function in eukaryotic cells and how the absence of this protein impacts these processes, it is unsurprising that deletion of this gene in F. graminearum profoundly affects fungal biology. Therefore, it is misleading to claim a direct link between FgDML1 and DON toxin biosynthesis (and virulence), as the observed effects are likely indirect consequences of compromised mitochondrial function. In fact, it is reasonable to assume that the production of all secondary metabolites is affected to some extent in the mutant strains and that such a strain would not be competitive at all under non-laboratory conditions. The order in which the authors present the results can be misleading, too. The results on vegetative growth rate appeared much later in the manuscript, which should have come first, as the FgDML1 mutant exhibited significant growth defects, and subsequent results should be discussed in that context. Moreover, the methodologies are not described properly, making the manuscript hard to follow and difficult to replicate.

      We greatly appreciate the time you spent on our paper and your helpful suggestions, and we have done our best to revise the manuscript. Revisions made in response to your suggestions are marked in red. In our responses, the positions of the revised passages in the manuscript are indicated by red line numbers. Our point-by-point responses to the reviewer's comments are listed below.

      Regarding the weaknesses, we arranged the results in this order to emphasize the novel link between mitochondria and DON toxin. We found a significant decrease in DON toxin in ΔFgDML1, defects in the formation of toxin-producing bodies, and downregulation of FgTris at both the gene and protein levels. In summary, we believe that the absence of FgDML1 does indeed lead to a decrease in DON toxin content, and that FgDML1 plays a regulatory role in DON toxin synthesis. In addition, our measurements of DON toxin, acetyl-CoA, ATP and other indicators are all expressed as amounts per unit of hyphae, which excludes differences caused by hyphal biomass or growth. We have further refined the Materials and Methods to make them easier to read and understand.

      (1) Lines 37-39: The disease itself does not produce toxins; it is the fungus that causes the disease that produces toxins. Moreover, the disease symptoms observed are likely caused by the toxins produced by the fungus.

      Thank you very much for your advice. We have made modifications to the wording of the corresponding sections. 'Studies have shown that increased DON levels are positively correlated with the pathogenicity rate of F. graminearum.'(in L36-37).

      (2) Lines 82-87: While it is challenging to summarise the role of ATP in just a few words, this section needs improvement for clarity and accuracy. Additionally, I do not believe that drawing a direct link between mitochondrial defects and toxin production is an appropriate strategy in this case.

      Thank you very much for your advice. Based on your suggestion, we have added descriptions at the relevant positions to provide more information on the relationship between ATP and toxins and to better set up the text that follows. 'Pathogen-intrinsic ATP homeostasis is recognized as a critical, rate-limiting determinant for toxin biosynthesis. Previous studies indicate that dual-target inhibition of ATP synthase (AtpA) and adenine deaminase (Ade) by a specific small-molecule probe effectively depletes intracellular ATP, consequently suppressing the synthesis of key virulence factors TcdA and TcdB transcriptionally and translationally (Marreddy et al., 2024). The systemic toxicity of Anthrax Edema Toxin (ET) is primarily attributed to its catalytic activity, which depletes the host cell's ATP reservoir, thereby triggering a bioenergetic collapse that culminates in cell lysis and death (Liu et al., 2025).' (in L78-86).

      (3) Lines 125-126: The manuscript does not clearly describe how subcellular localisation was determined. This methodology needs to be properly detailed.

      Thank you very much for your advice. The subcellular localization was validated through co-localization analysis with MitoTracker Red CMXRos, a mitochondrial-specific dye. The observed overlap between the FgDML1-GFP signal and the mitochondrial marker confirmed mitochondrial localization. Based on these results, we determined that FgDML1 is definitively localized to the mitochondria. We have incorporated this description in the appropriate section of the manuscript. 'Furthermore, subcellular localization studies confirmed that FgDML1 localizes to mitochondria, as demonstrated by colocalization with a mitochondria-specific dye MitoTracker Red CMXRos (Fig. 1B).' (in L125-127).

      (4) Regarding the organisation of the Results section, it needs to be revised. While I understand the authors' intention to emphasise the impact on virulence, the results showing how FgDML1 deletion affects vegetative growth, asexual and sexual reproduction, and sensitivity to stressors should be presented before the virulence assays and effects on DON production. Additionally, the authors do not provide any clear evidence that FgDML1 directly interacts with proteins involved in asexual or sexual reproduction, stress responses, or virulence. Therefore, it is misleading to suggest that FgDML1 directly regulates these processes. The observed phenotypes are, rather, a consequence of severely impaired mitochondrial function. Without functional mitochondria, the cell cannot operate properly, leading to widespread physiological defects. In this regard, statements such as those in lines 139-140 and 343-344 are misleading.

      Thank you very much for your advice. We have adjusted the order of the figures based on your suggestion, placing the characterization of ΔFgDML1 in vegetative growth, sexual reproduction, and other aspects before the DON toxin results. We have also adjusted the corresponding statements. 'These findings demonstrate that FgDML1 is a positive regulator of virulence in F. graminearum.' (in L140-141).

      (5) Lines 185-186: The authors do not provide sufficient evidence to support the claim that FgQCR2, FgQCR8, and FgQCR9 overexpression is the main cause of reduced cyazofamid sensitivity. Although expression of these genes is altered, reduced sensitivity may result from changes in other proteins or pathways. To strengthen this claim, overexpression of FgQCR2, 8, and 9 in the wild-type background, followed by assessment of cyazofamid resistance, would be necessary. As it stands, there is no support for the claim presented in lines 329-332.

      Thank you very much for your advice. To establish a causal link between the overexpression of FgQCR2, FgQCR7, and FgQCR8 and the observed reduction in cyazofamid sensitivity, we first quantified the protein levels of these assembly factors. Western blot analysis confirmed their elevated expression in the ΔFgDML1 mutant compared to the wild-type PH-1. We further generated individual overexpression strains for FgQCR2, FgQCR7, and FgQCR8 in the wild-type PH-1 background. Fungicide sensitivity assays revealed that all three overexpression mutants displayed significantly reduced sensitivity to cyazofamid compared to the parental strain. These overexpression experiments confirm that upregulation of FgQCR2, FgQCR7, and FgQCR8 is sufficient to confer reduced cyazofamid sensitivity. We have incorporated these explanations and provided supporting images in the appropriate section of the manuscript. 'To further clarify whether the upregulated expression of FgQCR2, FgQCR7, and FgQCR8 genes affects their protein expression levels, we measured the protein levels. The results showed that the protein expression levels of FgQCR2, FgQCR7, and FgQCR8 in ΔFgDML1 were higher than those in PH-1 (Fig. 6F). Subsequently, we overexpressed FgQCR2, FgQCR7, and FgQCR8 in the wild-type background, and the corresponding overexpression mutants exhibited reduced sensitivity to cyazofamid (Fig. 6E).' (in L205-211) (in Fig. 6E, F)

      (6) Lines 187-190: This segment is confusing and difficult to follow. It requires rewriting for clarity.

      Thank you very much for your advice. Based on your suggestion, we have rewritten the corresponding passage. 'Since Complex III is involved in the action of both cyazofamid (targeting the Qi site) and pyraclostrobin (targeting the Qo site), the sensitivity of ΔFgDML1 to cyazofamid and pyraclostrobin was investigated.' (in L214-216)

      (7) Lines 345-346: The authors state that in this study, FgDML1 is localised in mitochondria, which implies that in other studies, its localisation was different. Is this accurate? Clarification is needed.

      Thank you very much for your question. In previous studies, in both yeast and Drosophila melanogaster, the localization of this protein was not clearly defined; only its functional relationship to mitochondria was emphasized (Miklos et al., 1997; Gurvitz et al., 2002).

      Miklos GLG, Yamamoto M-T, Burns RG, Maleszka R. 1997. An essential cell division gene of Drosophila, absent from Saccharomyces, encodes an unusual protein with tubulin-like and myosin-like peptide motifs. Proc Natl Acad Sci 94:5189–5194. doi:10.1073/pnas.94.10.5189

      Gurvitz A, Hartig A, Ruis H, Hamilton B, de Couet HG. 2002. Preliminary characterisation of DML1, an essential Saccharomyces cerevisiae gene related to misato of Drosophila melanogaster. FEMS Yeast Res 2:123–135. doi:10.1016/S1567-1356(02)00083-1

      Material and Methods Section

      (8) In general, the methods require more detailed descriptions, including the brands and catalog numbers of reagents and kits used. Simply stating that procedures were performed according to manufacturers' instructions is insufficient, particularly when the specific brand or kit is not identified.

      Thank you very much for your advice. Based on your suggestion, we have added the reagent brands and complete product names. 'Transformants were selected on PDA plates containing either 100 μg/mL Hygromycin B (Yeasen, Shanghai, China) or 0.2 μmol/mL 5-Fluorouracil 2'-deoxyriboside (F2du) (Solarbio, Beijing, China) (Zhao et al., 2022).' (in L405-407). 'DON toxin was measured using a Wise Science ELISA-based kit (Wise Science, Jiangsu, China) (Li et al., 2019; Zheng et al., 2018).' (in L469-471)

      (9) Line 364: What do CM and MM stand for? Please define.

      Thank you very much for your advice. Based on your suggestion, we have made modifications in the corresponding locations. 'To evaluate vegetative growth, complete medium (CM), minimal medium (MM), and V8 Juice Agar (V8) media were prepared as described previously (Tang et al., 2020).' (in L385-387)

      Generation of Deletion and Complemented Mutants:

      (10) This section lacks detail. For example, were PCR products used directly for PEG-mediated transformation, or were the fragments cloned into a plasmid?

      Thank you very much for your question. After sequencing confirmation, we used the fused fragments directly for protoplast transformation. We have now stated the form of the fragment used for transformation at the corresponding location. 'The resulting fusion fragment was transformed into the wild-type F. graminearum PH-1 strain via polyethylene glycol (PEG)-mediated protoplast transformation.' (in L403-405).

      (11) PCR and Southern blot validation results should be included as supplementary material, along with clear interpretations of these results.

      Thank you very much for your advice. In the supplementary material we submitted, Supplementary Figure 2 already includes the results of PCR and Southern blot validation. (in Fig. S2)

      (12) There is almost no description of how the mutants mentioned in lines 388-390 were generated.

      Thank you very much for your advice. Based on your suggestions, we have added relevant content in the appropriate sections to more comprehensively and clearly reflect the experimental process. 'Specifically, FgDML1, including its native promoter region and open reading frame (ORF) (excluding the stop codon), was amplified. The PCR product was then fused with the XhoI-digested pYF11 vector. After transformation into E. coli and sequence verification, the plasmid was extracted and subsequently introduced into PH-1 protoplasts. For FgDnm1-3×Flag, the 3×Flag tag was added to the C-terminus of FgDnm1 by PCR, fused with the hygromycin resistance gene and the FgDnm1 downstream arm, and then introduced into PH-1 protoplasts. The overexpression mutant was constructed according to a previously described method. Specifically, the ORF of FgDML1 was amplified and the PCR product was ligated into the SacII-digested pSXS overexpression vector. The resulting plasmid was then transformed into PH-1 protoplasts (Shi et al., 2023). For the construction of PH-1::FgTri1+GFP and ΔFgDML1::FgTri1+GFP, the ORF of FgTri1 was amplified and ligated into the XhoI-digested pYF11 vector as described above. The resulting vectors were then transformed into protoplasts of PH-1 or ΔFgDML1, respectively.' (in L413-426).

      Vegetative Growth and Conidiation Assays:

      (13) There is no information about how long the plates were incubated before photos were taken. Judging by the images, it appears that different incubation times may have been used.

      Thank you very much for your advice. Because ΔFgDML1 grows more slowly, we used different incubation periods, and we have added this information to the corresponding section. 'All strains were incubated at 25°C in darkness; however, owing to its slower growth, the ΔFgDML1 mutant required a 5-day incubation period compared to the 3 days used for PH-1 and ΔFgDML1-C.' (in L490-493).

      (14) There is no description of the MBL medium.

      Thank you very much for your advice. Based on your suggestion, we have added the relevant description at the corresponding position. 'Mung bean liquid (MBL) medium was used for conidial production, while carrot agar (CA) medium was utilized to assess sexual reproduction (Wang et al., 2011).' (in L387-389).

      DON Production and Pathogenicity Assays:

      (15) Were DON levels normalised to mycelial biomass? The vegetative growth assays show that FgDML1 null mutants exhibit reduced growth on all tested media. If mutant and wild-type strains were incubated for the same period under the same conditions, it is reasonable to assume that the mutants accumulated significantly less biomass. Therefore, results related to DON production, as well as acetyl-CoA and ATP levels, must be normalised to biomass.

      Thank you very much for your question. We have taken into account the differences in mycelial biomass. Therefore, when measuring DON, acetyl-CoA, and ATP levels, all data were normalized to mycelial mass and calculated as amounts per unit of mycelium, thereby avoiding discrepancies arising from variations in biomass.
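
      The normalization described in this response amounts to dividing each measured quantity by the mycelial mass of the sample it came from. The sketch below illustrates the arithmetic only; the function name, units, and numbers are hypothetical and are not taken from the manuscript.

```python
def per_unit_mycelium(total_amount: float, mycelial_mass_g: float) -> float:
    """Express a measured amount (e.g. DON, acetyl-CoA, ATP) per gram of mycelium.

    Hypothetical helper: the manuscript's actual units and mass basis
    (wet or dry weight) are not specified here.
    """
    if mycelial_mass_g <= 0:
        raise ValueError("mycelial mass must be positive")
    return total_amount / mycelial_mass_g


# Hypothetical example: the mutant yields less total analyte, but also less
# biomass, so the per-gram values are identical.
wild_type = per_unit_mycelium(total_amount=100.0, mycelial_mass_g=0.5)
mutant = per_unit_mycelium(total_amount=50.0, mycelial_mass_g=0.25)
print(wild_type, mutant)
```

      A difference that persists after this division cannot be attributed to biomass alone, which is the point the reviewer asked the authors to establish.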

      Sensitivity Assays:

      (16) While the authors mention that gradient concentrations were used, the specific concentrations and ranges are not provided. Importantly, have the plates shown in Figure 5 been grown for different periods or lengths? Given the significantly reduced growth rate shown in Figure 6A, the mutants should not have grown to the same size as the WT (PH-1) as shown in Figures 5A and 5B unless the pictures have been taken on different days. This needs to be explained.

      Thank you very much for your question. Because ΔFgDML1 grows more slowly, we used different incubation periods, and we have added this information to the corresponding section. 'All strains were incubated at 25°C in darkness; however, owing to its slower growth, the ΔFgDML1 mutant required a 5-day incubation period compared to the 3 days used for PH-1 and ΔFgDML1-C.' (in L490-493).

      (17) Additionally, was inhibition measured similarly for both stress agents and fungicides? This should be clarified.

      Thank you very much for your question. We have added the specific concentration gradients of the fungicides. 'The concentration gradients for each fungicide in the sensitivity assays were set up according to Supplementary Table S2.' (in L493-494) (in Table S2).

      Complex III Enzyme Activity:

      (18) A more detailed description of how this assay was performed is needed.

      Thank you very much for your advice. We have provided further detailed descriptions of the corresponding sections. 'Briefly, 0.1 g of mycelia was homogenized with 1 mL of extraction buffer in an ice bath. The homogenate was centrifuged at 600 ×g for 10 min at 4°C. The resulting supernatant was then subjected to a second centrifugation at 11,000 ×g for 10 min at 4°C. The pellet was resuspended in 200 μL of extraction buffer and disrupted by ultrasonication (200 W, 5 s pulses with 10 s intervals, 15 cycles). Complex III enzyme activity was finally measured by adding the working solution as per the manufacturer's protocol. '(in L511-517)

      (19) Were protein concentrations standardised prior to the assay?

      Thank you very much for your question. Protein concentrations for all Western blot samples were quantified using a BCA assay kit to ensure equal loading.

      (20) Line 448: Are ΔFgDML1::Tri1+GFP and ΔFgDML1+GFP the same strain? ΔFgDML1::Tri1+GFP has not been previously described.

      Thank you very much for your question. These two strains are not the same strain, and we have supplemented their construction process in the corresponding section. 'For the construction of PH-1::FgTri1+GFP and ΔFgDML1::FgTri1+GFP, the ORF of FgTri1 was amplified and ligated into the XhoI-digested pYF11 vector as described above. The resulting vectors were then transformed into protoplasts of PH-1 or ΔFgDML1, respectively. '(in L423-426)

      (21) Lines 460 and 468: Please adopt a consistent nomenclature, either RT-qPCR or qRT-PCR.

      Thank you very much for your advice. We have unified the nomenclature and modified the corresponding sections accordingly. 'Reverse Transcription Quantitative Polymerase Chain Reaction (RT-qPCR) was carried out using the QuantStudio 6 Flex real-time PCR system (Thermo Fisher Scientific, USA) to assess the relative expression of three subunits of Complex III (FgCytb, FgCytc1, FgISP), five assembly factors (FgQCR2, FgQCR6, FgQCR7, FgQCR8, FgQCR9), and DON biosynthesis-related genes (FgTri5 and FgTri6).' (in L526-531)

      (22) Lines 472-473: Why was FgCox1 used as a reference for FgCytb? Clarification is needed.

      Thank you very much for your question. FgCytb (cytochrome b) and FgCOX1 (cytochrome c oxidase subunit I) are both encoded by the mitochondrial genome and serve as core components of the oxidative phosphorylation system (Complex III and Complex IV, respectively). Their transcription is co-regulated by mitochondrial-specific mechanisms in response to cellular energy status. Consequently, under experimental conditions that perturb energy homeostasis, FgCOX1 expression exhibits relative, context-dependent stability with FgCytb, or at least co-varies directionally, making it a superior reference for normalizing target gene expression. In contrast, FgGapdh operates within a distinct genetic and regulatory system. Using FgCOX1 ensures that both reference and target genes reside within the same mitochondrial compartment and functional module, thereby preventing normalization artifacts arising from independent variation across disparate pathways.
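
      The effect of the reference gene on a relative expression value can be illustrated with the widely used 2^(−ΔΔCt) calculation. The response does not state which quantification method the authors applied, so this method, the assumption of 100% amplification efficiency, and all Ct values below are assumptions for illustration only.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene by the 2^(-ddCt) method.

    Assumes 100% amplification efficiency for both target and reference.
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample      # e.g. mutant
    delta_ct_control = ct_target_control - ct_ref_control   # e.g. wild type
    return 2.0 ** -(delta_ct_sample - delta_ct_control)


# Hypothetical Ct values. With a stable reference, a 2-cycle drop in the
# target Ct corresponds to a 4-fold increase.
print(relative_expression(22.0, 18.0, 24.0, 18.0))
# If the reference itself shifts by 2 cycles in the sample, an unchanged
# target Ct still yields an apparent 4-fold change.
print(relative_expression(24.0, 20.0, 24.0, 18.0))
```

      The second call shows why the choice of reference matters: any change in the reference gene between conditions is carried directly into the reported fold change.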

      (23) Lines 476-477: This step requires a clearer and more detailed explanation.

      Thank you very much for your advice. We have added detailed descriptions at the respective positions. 'For FgDnm1-3×Flag, the 3×Flag tag was added to the C-terminus of FgDnm1 by PCR, fused with the hygromycin resistance gene and the FgDnm1 downstream arm, and then introduced into PH-1 protoplasts.' (in L417-419). 'The FgDnm1-3×Flag fragment was introduced into PH-1 and FgDML1+GFP protoplasts, respectively, to obtain single-tagged and double-tagged strains.' (in L541-543)

      Western blotting:

      (24) Uncropped Western blot images should be provided as supplementary material.

      Thank you very much for your advice. All Western blot images will be submitted to the supplementary material package.

      (25) Lines 485-489: A more thorough description of the antibodies used (including source, catalogue number, and dilution) is necessary.

      Thank you very much for your advice. The antibodies used are clearly stated in terms of brand, catalog number, and dilution. We have added the dilution ratio. 'All antibodies were diluted as follows: primary antibodies at 1:1000 and secondary antibodies at 1:10000. '(in L550-551)

      (26) The Western blot shown in Figure 3D appears problematic, particularly the anti-GAPDH band for FgDML1::FgTri1+GFP. Are both anti-GAPDH bands derived from the same gel?

      Thank you very much for your advice. We are unequivocally certain that these data derive from the same gel. Therefore, we are providing the original image for your inspection.

      Author response image 4.

    1. Reviewer #2 (Public review):

      Summary:

      Xu et al. used fMRI to examine the neural correlates associated with retrieving temporal information from an external compared to internal perspective ('mental time watching' vs. 'mental time travel'). Participants first learned a fictional religious ritual composed of 15 sequential events of varying durations. They were then scanned while they either (1) judged whether a target event happened in the same part of the day as a reference event (external condition); or (2) imagined themselves carrying out the reference event and judged whether the target event occurred in the past or will occur in the future (internal condition). Behavioural data suggested that the perspective manipulation was successful: RT was positively correlated with sequential distance in the external perspective task, while a negative correlation was observed between RT and sequential distance for the internal perspective task. Neurally, the two tasks activated different regions, with the external task associated with greater activity in the supplementary motor area and supramarginal gyrus, and the internal condition with greater activity in default mode network regions. Of particular interest, only a cluster in the posterior parietal cortex demonstrated a significant interaction between perspective and sequential distance, with increased activity in this region for longer sequential distances in the external task but increased activity for shorter sequential distances in the internal task. Only a main effect of sequential distance was observed in the hippocampus head, with activity being positively correlated with sequential distance in both tasks. No regions exhibited a significant interaction between perspective and duration, although there was a main effect of duration in the hippocampus body with greater activity for longer durations, which appeared to be driven by the internal perspective condition. 
      On the basis of these findings, the authors suggest that the hippocampus may represent event sequences allocentrically, whereas the posterior parietal cortex may process event sequences egocentrically.

      Strengths:

      The topic of egocentric vs. allocentric processing has been relatively under-investigated with respect to time, having traditionally been studied in the domain of space. As such, the current study is timely and has the potential to be important for our understanding of how time is represented in the brain in the service of memory. The study is well thought out and the behavioural paradigm is, in my opinion, a creative approach to tackling the authors' research question. A particular strength is the implementation of an imagination phase for the participants while learning the fictional religious ritual. This moves the paradigm beyond semantic/schema learning and is probably the best approach besides asking the participants to arduously enact and learn the different events with their exact timings in person. Importantly, the behavioural data point towards successful manipulation of internal vs. external perspective in participants, which is critical for the interpretation of the fMRI data. The use of syllable length as a sanity check for RT analyses as well as neuroimaging analyses is also much appreciated.

      Suggestions:

      The authors have done a commendable job addressing my previous comments. In particular, the additional analyses elucidating the potential contribution of boundary effects to the behavioural data, the impact of incorporating RT into the fMRI GLMs, and the differential contributions of RT and sequential distance to neural activity (i.e., in PPC) are valuable and strengthen the authors' interpretation of their findings.

      My one remaining suggestion pertains to the potential contribution of boundary effects. While the new analyses suggest that the RT findings are driven by sequential distance and duration independent of a boundary effect (i.e., Same vs. Different factor), I'm wondering whether the same applies to the neural findings? In other words, have the authors run a GLM in which the Same vs. Different factor is incorporated alongside distance and duration?

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this fMRI study, the authors wished to assess neural mechanisms supporting flexible "temporal construals". For this, human participants learned a story consisting of fifteen events. During fMRI, events were shown to them, and they were instructed to consider the event from "an internal" or from "an external" perspective. The authors found opposite patterns of brain activity in the posterior parietal cortex and the anterior hippocampus for the internal and the external viewpoint. They conclude that allocentric sequences are stored in the hippocampus, whereas egocentric sequences are used in the parietal cortex. The claims align with previous fMRI work addressing this question.

      We appreciate the reviewer's concise summary of our research. We would like to offer two clarifications to prevent any potential misunderstandings.

      First, the activity patterns in the parietal cortex and hippocampus are not entirely opposite across internal and external perspectives. Specifically, the activation level in the posterior parietal cortex shows a positive correlation with sequential distance during external-perspective tasks, but a negative correlation during internal-perspective tasks. In contrast, the activation level in the anterior hippocampus positively correlates with sequential distance, irrespective of the observer's perspective. Therefore, our results suggest that the parietal cortex, with its perspective-dependent activity, supports egocentric representation; the hippocampus, with its consistent activity across perspectives, supports allocentric representation.

      Second, while some of our findings align with previous fMRI studies, to our knowledge, no prior research has explicitly investigated how the neural representation of time may vary depending on the observer's viewpoint. This gap in the literature is the primary motivation for our current study.

      Strengths:

      The research topic is fascinating, and very few labs in the world are asking the question of how time is represented in the human brain. Working hypotheses have been recently formulated, and this work seems to want to tackle some of them.

      We appreciate the reviewer's acknowledgment of the theoretical significance of our study.

      Weaknesses:

      The current writing is fuzzy both conceptually and experimentally. I cannot provide a sufficiently well-informed assessment of the quality of the experimental work because there is a paucity of details provided in the report. Any future revisions will likely improve transparency.

      (1) Improving writing and presentation:

      The abstract and the introduction make use of loaded terms such as "construals", "mental timeline", "panoramic views" in very metaphoric and unexplained ways. The authors do not provide a comprehensive and scholarly overview of these terms, which results in verbiage and keywords/name-dropping without a clear general framework being presented. Some of these terms are not metaphors. They do refer to computational concepts that the authors should didactically explain to their readership. This is all the more important because some statements in the Introduction are misattributed or factually incorrect, and some statements lack attributions (uncited published work). Once the theory, the question, and the working hypothesis are clarified, the authors should carefully explain the task.

      We appreciate the reviewer's critique.

      The formulation of the scientific question in the introduction is grounded in the spatial construals of time hypothesis and conceptual metaphor theory (e.g., Traugott, 1978; Lakoff & Johnson, 1980; see recent reviews by Núñez & Cooperrider, 2013; Bender & Beller, 2014). These frameworks were originally developed through analyses of how spatial metaphors are used to describe temporal concepts in natural language. Consequently, it is theoretically motivated and largely unavoidable to introduce the two primary temporal construals, mental time travel and mental time watching, using metaphorical expressions.

      However, we do agree with the reviewer that the introduction in the original manuscript was overly long and that the working hypothesis was not clearly stated. In the revised manuscript, we have streamlined the introduction and substantially revised the following two paragraphs to clarify the formulation of our working hypothesis (Pages 5-6):

      “Recent studies have already begun to investigate the neural representation of the memorized event sequence (e.g., Deuker et al., 2016; Thavabalasingam et al., 2018; Bellmund et al., 2019, 2022; see reviews by Cohn-Sheehy & Ranganath, 2017; Bellmund et al., 2020). Yet, the neural mechanisms that enable the brain to construct distinct construals of an event sequence remain largely unknown. Valuable insights may be drawn from research in the spatial domain, which differentiates the neural representation in allocentric and egocentric reference frames. According to an influential neurocomputational model (Byrne et al., 2007; Bicanski & Burgess, 2018; Bicanski & Burgess, 2020), allocentric and egocentric spatial representations are dissociable in the brain: they are respectively implemented in the medial temporal lobe (MTL), including the hippocampus, and the parietal cortex. Various egocentric representations in the parietal cortex derived from different viewpoints can be transformed and integrated into a unified allocentric representation and stored in the MTL (i.e., bottom-up process). Conversely, the allocentric representation in the MTL can serve as a template for reconstructing diverse egocentric representations across different viewpoints in the parietal cortex (i.e., top-down process).”

      “In line with the spatial construals of time hypothesis, several authors have recently suggested that such mutually engaged egocentric and allocentric reference frames (in the parietal cortex and the medial temporal lobe, respectively) proposed in the spatial domain might also apply to the temporal one (e.g., Gauthier & van Wassenhove, 2016ab; Gauthier et al., 2019, 2020; Bottini & Doeller, 2020). If this hypothesis holds, it could explain how the brain flexibly generates diverse construals of the same event sequence. Specifically, the hippocampus may encode a consistent representation of an event sequence that is independent of whether an individual adopts an internal or external perspective, reflecting an allocentric representation of time. In contrast, parietal cortical representations are expected to vary flexibly with the adopted perspective that is shaped by task demands, reflecting an egocentric representation of time.”

      In the revised manuscript, we also corrected statements in the Introduction that may have been misattributed (see Reviewer 2, comment 4(ii)) and added several relevant and important publications.

      (2) The experimental approach lacks sufficient details to be comprehensible to a general audience. In my opinion, the results are thus currently uninterpretable. I highlight only a couple of specific points (out of many). I recommend revision and clarification.

      (a) No explanation of the narrative is being provided. The authors report a distribution of durations with no clear description of the actual sequence of events. The authors should provide the text that was used, how they controlled for low-level and high-level linguistic confounds.

      We thank the reviewer for the suggestions. The event sequence for the odd-numbered participants is shown in the original Figure 1. In the revised manuscript, we added Figure 1—figure supplement 1 to illustrate the actual sequence of events for both odd- and even-numbered participants. We also added the narratives used in the reading phase of the learning procedure for both groups of participants (Figure 1—source data 1).

      To control for low-level linguistic confounds, we included the number of syllables as a covariate in the first-level general linear model in the fMRI analysis. To address high-level linguistic confounds, such as semantic information (which is difficult to quantify), we randomly assigned event labels to the 15 events twice, creating two counterbalanced versions for participants with even and odd numbers (see Comment 2b below).

      (b) The authors state, "we randomly assigned 15 phrases to the events twice". It is impossible to comprehend what this means. Were these considered stimuli? Controls? It is also not clear which event or stimulus is part of the "learning set" and whether these were indicated to be such to participants.

      We apologize for any confusion in the Results section and the legend of Figure 1. Our motivation was explained in the "Stimuli" section of the Methods. In the revised manuscript, we have clarified this by adding an explanation to the legend of Figure 1 and including the supplementary Figure 1: "To minimize potential confounds between the semantic content of the event phrases and the temporal structure of the events, we randomly assigned the phrases to the events, creating two versions for participants with even and odd ID numbers. Both versions can be seen in Figure 1—figure supplement 1 and Figure 1—source data 1."

      (c) The left/right counterbalancing is not being clearly explained. The authors state that there is counterbalancing, but do not sufficiently explain what it means concretely in the experiment. If a weak correlation exists between sequential position and distance, it also means that the position and the distance have not been equated within. How do the authors control for these?

      We thank the reviewer for highlighting this point and apologize for the lack of clarity in the original manuscript. In the current version (Page 40), we have provided further clarification: “We carefully selected two sets of 20 event pairs from the 210 possible combinations, assigning them to the odd and even runs of the fMRI experiment. Using a brute-force search, we identified 20 pairs in which sequential distance showed only weak correlations with positional information for both reference and target events (ranging from 1 to 15), as well as with behavioral responses (Same vs. Different or Future vs. Past, coded as 0 and 1), with all correlation coefficients below 0.2. At the same time, we balanced the proportion of correct responses across conditions: for the external-perspective task, Same/Different = 11/9 and 12/8; for the internal-perspective task, Future/Past = 12/8 and 8/12. Under these constraints, the sequential distances in both sets ranged from 1 to 5. To further mitigate spatial response biases, we pseudorandomized the left/right on-screen positions of the two response options within each task block, while ensuring an equal number of correct responses mapped to the left and right buttons (i.e., 10 per block).”

      The event pairs we selected already represent the best possible choice given all the criteria we aimed to satisfy. It is impossible to completely eliminate all potential correlations. For instance, if the target event occurs near the beginning of the day, it will tend to fall in the past, whereas if it occurs near the end of the day, it is more likely to fall in the future. To further ensure that the significant results were not driven by these weak confounding factors, we constructed another GLM that included three additional parametric modulators: the sequence position of the target event (ranging from 1 to 15) and the behavioral responses (Future vs. Past in the internal-perspective task; Same vs. Different in the external-perspective task, coded as 0 and 1). The significant findings were unaffected.
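      The pair-selection procedure described in this response can be illustrated with a minimal sketch. It is not the authors' code: it assumes that the 210 combinations are ordered reference-target pairs (15 × 14), replaces the brute-force search with seeded random sampling, and checks only three of the stated constraints (sequential distance versus reference position, target position, and the Future/Past answer). The names `pearson` and `find_pairs` and all default parameter values are hypothetical.

```python
import random
from itertools import permutations


def pearson(x, y):
    """Pearson r; with a 0/1 variable this equals the point-biserial r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when one variable is constant
    return sxy / (sxx * syy) ** 0.5


def find_pairs(n_events=15, n_pairs=20, max_dist=5, r_max=0.2,
               max_tries=100_000, seed=0):
    """Randomly search for pairs whose distance is only weakly correlated
    with reference position, target position, and the Future/Past answer."""
    rng = random.Random(seed)
    # All ordered (reference, target) pairs: 15 * 14 = 210 (assumption).
    all_pairs = list(permutations(range(1, n_events + 1), 2))
    # Keep only pairs within the distance range reported in the response.
    pool = [(r, t) for r, t in all_pairs if abs(r - t) <= max_dist]
    for _ in range(max_tries):
        sample = rng.sample(pool, n_pairs)
        dist = [abs(r - t) for r, t in sample]
        ref = [r for r, _ in sample]
        tgt = [t for _, t in sample]
        future = [1 if t > r else 0 for r, t in sample]  # Future = 1, Past = 0
        rs = [pearson(dist, v) for v in (ref, tgt, future)]
        if all(abs(r) < r_max for r in rs):  # a NaN fails this comparison
            return sample
    return None  # no admissible set found within max_tries
```

      Calling `find_pairs()` returns a list of 20 `(reference, target)` tuples, or `None` if no admissible set is found. The day-part constraint (Same vs. Different) and the balancing of correct responses would require the event-to-day-part mapping, which is not given in this response and is therefore left out.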

      (d) The authors used two tasks. In the "external perspective" one, the authors asked participants to report whether events were part of the same or a different part of the day. In the "internal perspective one", the authors asked participants to project themselves to the reference event and to determine whether the target event occurred before or after the projected viewpoint. The first task is a same/different recognition task. The second task is a temporal order task (e.g., Arzy et al. 2009). These two tasks are radically different and do not require the same operationalization. The authors should minimally provide a comprehensive comparison of task requirements, their operationalization, and, more importantly, assess the behavioral biases inherent to each of these tasks that may confound brain activity observed with fMRI.

      We understand the reviewer’s concern. We agree that there is a substantial difference between the two tasks. However, the primary goal of this study was not to directly compare these tasks to isolate a specific cognitive component. Rather, the neural correlates of temporal distance were first identified as brain regions showing a significant correlation between neural activity and temporal distance using the parametric modulation analysis. We then compared these neural correlates between the two tasks. Therefore, any general differences between the tasks should not be a confound for our main results. Our aim was to examine whether the hippocampal representation of temporal distance remains consistent across different perspectives, and whether the parietal representation of temporal distance varies as a function of the perspective adopted.

      Therefore, the main aim of our task manipulation was to ensure that participants adopted either an external or an internal perspective on the event sequence, depending on the task condition. In the Introduction (Pages 6–7), we clarify this manipulation as follows: “In the external-perspective task, participants localized events with respect to external temporal boundaries, judging whether the target event occurred in the same or a different part of the day as the reference event. In the internal-perspective task, participants were instructed to mentally project themselves into the reference event and localize the target event relative to their own temporal point, judging whether the target event happened in the future or the past of the reference event (see Methods for details of the scanning procedure).”

      We believe this task manipulation was successful. Behaviorally, the two tasks showed opposite correlations between reaction time and temporal distance, resembling the symbolic distance versus mental scanning effect. Neurally, contrasting the internal- and external-perspective tasks revealed activation of the default mode network, which is known to play a central role in self-projection (Buckner et al., 2017).

      (e) The authors systematically report interpreted results, not factual data. For instance, while not showing the results on behavioral outcomes, the authors directly interpret them as symbolic distance effects.

      Thank you for this comment. In the original paper, we reported the relevant statistics before our interpretation: “Sequential Distance was correlated positively with RT in the external-perspective task (z = 3.80, p < 0.001) but negatively in the internal-perspective task (z = -3.71, p < 0.001).” However, they may have been difficult to notice, and we are including a figure for the RT analysis in the revised manuscript.

      Crucially, the authors do not comment on the obvious differences in task difficulty in these two tasks, which demonstrates a substantial lack of control in the experimental design. The same/different task (task 1 called "external perspective") comes with known biases in psychophysics that are not present in the temporal order task (task 2 called "internal perspective"). The authors also did not discuss or try to match the performance level in these two tasks. Accordingly, the authors claim that participants had greater accuracy in the external (same/different) task than in the internal task, although no data are shown and provided to support this report. Further, the behavioral effect is trivialized by the report of a performance accuracy trade-off that further illustrates that there is a difference in the task requirements, preventing accurate comparison of the two tasks.

      As noted in Question 2d, we acknowledge the substantial difference between the two tasks. However, the primary goal of this study was not to directly compare these tasks to isolate a specific cognitive component. Instead, we first identified the neural correlates of temporal distance as brain regions showing a significant correlation between neural activity and temporal distance, independent of task demands. We then compared these neural correlates across the two task conditions, which were designed to engage different temporal perspectives. Therefore, any general differences between the tasks should not be a confound for our main findings and interpretation.

      Our aim was to investigate whether the hippocampal representation of temporal distance remains consistent across different perspectives and whether the parietal representation of temporal distance varies as a function of the perspective adopted. We do not see how this double-dissociation pattern could be explained by differences in task difficulty.

      While we do not consider the overall difference in task difficulty between the two tasks to be a confounding factor, we acknowledge the potential confound posed by variations in task difficulty across temporal distances (1 to 5). This concern arises from the similarity between the activity patterns in the posterior parietal cortex and reaction time across temporal distances. To address this, we conducted control analyses to test this hypothesis (see the second and third points from Reviewer 2 for details).

      On page 8, we present the behavioral accuracy data: “Participants showed significantly higher accuracy in the external-perspective task than in the internal-perspective task (external-perspective task: M = 93.5%, SD = 4.7%; internal-perspective task: M = 89.5%, SD = 8.1%; paired t(31) = 3.33, p = 0.002).”

      All fMRI contrasts are also confounded by this experimental shortcoming, seeing as they are all reported at the interaction level across a task. For instance, in Figure 4, the authors report a significant beta difference between internal and external tasks. It is impossible to disentangle whether this effect is simply due to task difference or to an actual processing of the duration that differs across tasks, or to the nature of the representation (the most difficult to tackle, and the one chosen by the authors).

      We thank the reviewer for pointing out this important issue. Like temporal distance, the neural correlates of duration were not derived from a direct contrast between the two tasks. Instead, they were identified by detecting brain regions showing a significant correlation between neural activity and the implied duration of each event using the parametric modulation analysis. Therefore, what is shown in Figure 4 reflects the significant differences in these neural correlations with duration between the two tasks.

      The observed difference in the neural representation of duration between the two tasks was unexpected. In the original manuscript, we provided a post hoc explanation: “Since the external-perspective task in the current study encouraged the participants to compare the event sequence with the external parallel temporal landmarks, duration representation in the hippocampus may be dampened.”

      However, we agree that this difference might also arise from other factors distinguishing the two tasks. In the revised manuscript, we have clarified this possibility as follows: “The difference in duration representation between the two tasks remains open to interpretation. One possible explanation is that the hippocampus is preferentially involved in memory for durations embedded within event sequences (see review by Lee et al., 2020). In the internal-perspective task, participants indeed localized events within the event sequence itself. In contrast, the external-perspective task encouraged participants to compare the event sequence with external temporal landmarks, which may have attenuated the hippocampal representation of duration.”

      Conclusion:

      In conclusion, the current experimental work is confounded and lacks controls. Any behavioral or fMRI contrasts between the two proposed tasks can be parsimoniously accounted for by difficulty or attentional differences, not the claim of representational differences being argued for here.

      We hope that our explanations and clarifications above adequately address the reviewer’s concerns. We would like to reiterate that we did not directly compare the two tasks. Rather, we first identified the neural representations of sequential distance and duration, and then examined how these representations differed across tasks. It is unclear to us how the overall difference in task difficulty or attentional demands could lead to the observed pattern of results.

      By determining where the neural representations were consistent and where they diverged, we were able to differentiate brain regions that encode temporal information allocentrically from those that represent temporal information in a perspective-dependent manner, modulated by task demands.

      Reviewer #2 (Public review):

      Summary:

      Xu et al. used fMRI to examine the neural correlates associated with retrieving temporal information from an external compared to internal perspective ('mental time watching' vs. 'mental time travel'). Participants first learned a fictional religious ritual composed of 15 sequential events of varying durations. They were then scanned while they either (1) judged whether a target event happened in the same part of the day as a reference event (external condition); or (2) imagined themselves carrying out the reference event and judged whether the target event occurred in the past or will occur in the future (internal condition). Behavioural data suggested that the perspective manipulation was successful: RT was positively correlated with sequential distance in the external perspective task, while a negative correlation was observed between RT and sequential distance for the internal perspective task. Neurally, the two tasks activated different regions, with the external task associated with greater activity in the supplementary motor area and supramarginal gyrus, and the internal condition with greater activity in default mode network regions. Of particular interest, only a cluster in the posterior parietal cortex demonstrated a significant interaction between perspective and sequential distance, with increased activity in this region for longer sequential distances in the external task, but increased activity for shorter sequential distances in the internal task. Only a main effect of sequential distance was observed in the hippocampus head, with activity being positively correlated with sequential distance in both tasks. No regions exhibited a significant interaction between perspective and duration, although there was a main effect of duration in the hippocampus body with greater activity for longer durations, which appeared to be driven by the internal perspective condition. 
      On the basis of these findings, the authors suggest that the hippocampus may represent event sequences allocentrically, whereas the posterior parietal cortex may process event sequences egocentrically.

      We sincerely thank the reviewer for providing an accurate, comprehensive, and objective summary of our study.

      Strengths:

      The topic of egocentric vs. allocentric processing has been relatively under-investigated with respect to time, having traditionally been studied in the domain of space. As such, the current study is timely and has the potential to be important for our understanding of how time is represented in the brain in the service of memory. The study is well thought out, and the behavioural paradigm is, in my opinion, a creative approach to tackling the authors' research question. A particular strength is the implementation of an imagination phase for the participants while learning the fictional religious ritual. This moves the paradigm beyond semantic/schema learning and is probably the best approach besides asking the participants to arduously enact and learn the different events with their exact timings in person. Importantly, the behavioural data point towards successful manipulation of internal vs. external perspective in participants, which is critical for the interpretation of the fMRI data. The use of syllable length as a sanity check for RT analyses, as well as neuroimaging analyses, is also much appreciated.

      We thank the reviewer for the positive and encouraging comments.

      Weaknesses/Suggestions:

      Although the design and analysis choices are generally solid, there are a few finer details/nuances that merit further clarification or consideration in order to strengthen the readers' confidence in the authors' interpretation of their data.

      (1) Given the known behavioural and neural effects of boundaries in sequence memory, I was wondering whether the number of traversed context boundaries (i.e., between morning-afternoon, and afternoon-evening) was controlled for across sequential length in the internal perspective condition? Or, was it the case that reference-target event pairs with higher sequential numbers were more likely to span across two parts of the day compared to lower sequential numbers? Similarly, did the authors examine any potential differences, whether behaviourally or neurally, for day part same vs. day part different external task trials?

      We thank the reviewer for the thoughtful comments. When we designed the experiment, we minimized the correlation between the sequential distance separating the target and reference events and whether the two events occurred within the same or different parts of the day (coded as Same = 0, Different = 1). The point-biserial correlation coefficient between these two variables across all trials within the same run was kept below 0.2.

      To investigate the effect of day-part boundaries on behavior, as well as the contribution of other factors, we conducted a new linear mixed-effects model analysis incorporating four additional variables. They are whether the target and the reference events are within the same or different parts of the day (i.e., Same vs. Different), whether the target event is in the future or the past of the reference event (i.e., Future vs. Past), and the interactions of the two factors with Task Type (i.e., internal- vs. external-perspective task).
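      One way to write down the fixed-effects structure described here is sketched below. This is an illustration under stated assumptions, not the authors' specification: the random intercept per participant $u_j$, the additive main-effect terms, and the abbreviations (Dist for Sequential Distance, Dur for Duration, Syll for Syllable Length, SD for Same vs. Different, FP for Future vs. Past) are all assumed, and the random-effects structure actually used is not stated in the response.

```latex
\begin{aligned}
\mathrm{RT}_{ij} ={}& \beta_0 + \beta_1\,\mathrm{Task}_{ij} + \beta_2\,\mathrm{Dist}_{ij}
  + \beta_3\,\mathrm{Dur}_{ij} + \beta_4\,\mathrm{Syll}_{ij} \\
&+ \beta_5\,\mathrm{Task}_{ij}\times\mathrm{Dist}_{ij}
  + \beta_6\,\mathrm{Task}_{ij}\times\mathrm{Dur}_{ij} \\
&+ \beta_7\,\mathrm{SD}_{ij} + \beta_8\,\mathrm{FP}_{ij}
  + \beta_9\,\mathrm{Task}_{ij}\times\mathrm{SD}_{ij}
  + \beta_{10}\,\mathrm{Task}_{ij}\times\mathrm{FP}_{ij} \\
&+ u_j + \varepsilon_{ij}
\end{aligned}
```

      Here $i$ indexes trials and $j$ participants. The four additional variables mentioned in the response correspond to $\beta_7$ through $\beta_{10}$; the remaining terms stand for the original model.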

      The results are largely the same as the original results reported in the table: there was a significant main effect of Syllable Length, and the interaction effects between Task Type and Sequential Distance and between Task Type and Duration remained significant. In addition, we found a significant interaction effect between Task Type and Same vs. Different.

      As shown in Figure 2—figure supplement 1, this Same vs. Different effect was in line with the effect of Sequential Distance, with two events in the same and different parts of the day corresponding to the short and long sequential distances. Given that Sequential Distance had already been considered in the model, the effect of parts of the day should result from the boundary effect across day parts or the chunking effect within day parts, i.e., the sequential distance across different parts of the day was perceived as longer, while the sequential distance within the same part of the day was perceived as shorter. We have incorporated these findings into the manuscript.

      Neurally, to further verify that the significant effects of sequential distance were not driven by its weak correlation with the Same/Different judgment or other potential confounding factors, we constructed another GLM that incorporated three additional parametric modulators: the sequence position of the target event (ranging from 1 to 15) and the behavioral responses (Future vs. Past in the internal-perspective task; Same vs. Different in the external-perspective task, coded as 0 and 1). The significant findings were unaffected.

      (2) I would appreciate further insight into the authors' decision to model their task trials as stick functions with duration 0 in their GLMs, as opposed to boxcar functions with varying durations, given the potential benefits of the latter (e.g., Grinband et al., 2008). I concur that in certain paradigms, RT is considered a potential confound and is taken into account as a nuisance covariate (as the authors have done here). However, given that RTs appear to be critical to the authors' interpretation of participant behavioural performance, it would imply that variations in RT actually reflect variations in cognitive processes of interest, and hence, it may be worth modelling trials as boxcar functions with varying durations.

      We appreciate the reviewer’s insightful comment on this important issue. Whether to control for RT’s influence on fMRI activation is indeed a long-standing paradox. On the one hand, RT reflects underlying cognitive processes and therefore should not be fully controlled for. On the other hand, RT can independently influence neural activity, as several brain networks vary with RT irrespective of the specific cognitive process involved—a domain-general effect. For example, regions within the multiple-demand network are often positively correlated with RT across different cognitive domains.

      Our strategy in the manuscript is to first present the results without including RT as a control variable and then examine whether the effects are preserved after controlling for RT. In the revised manuscript, we have clarified this approach (Page 13): “Here, changes in activity levels within the PPC were found to align with RT. Whether to control for RT’s influence on fMRI activation represents a well-known paradox. On the one hand, RT reflects underlying cognitive processes and therefore should not be fully controlled for. On the other hand, RT can independently influence neural activity, as several brain networks vary with RT irrespective of the specific cognitive process involved, which is a domain-general effect. For instance, regions within the multiple-demand network are often positively correlated with RT and task difficulty across diverse cognitive domains (e.g., Fedorenko et al., 2013; Mumford et al., 2024). To evaluate the second possibility, we conducted an additional control analysis by including trial-by-trial RT as a parametric modulator in the first-level model (see Methods). Notably, the same PPC region remained the only area in the entire brain showing a significant interaction between Task Type and Sequential Distance (voxel-level p < 0.001, cluster-level FWE-corrected p < 0.05). This finding indicates that PPC activity cannot be fully attributed to RT. Furthermore, we do not interpret the effect as reflecting a domain-general RT influence, as regions within the multiple-demand system (typically sensitive to RT and task difficulty) did not exhibit significant activation in our data.”

      The reason we did not use boxcar functions with varying durations in our original manuscript is that we also applied parametric modulation in the same model. In the parametric modulation, all parametric modulators inherit the onsets and durations of the events being modulated. Consequently, the modulators would also take the form of boxcar functions rather than stick functions—the height of each boxcar reflecting the parameter value and its length reflecting the RT. We were uncertain whether this approach would be appropriate, as we have not encountered other studies implementing parametric modulation in this manner.
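      The difference between the two modelling choices can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the HRF shape, the helper names (`hrf`, `build_regressor`), the onsets, RTs, and distances are all hypothetical, chosen only to show how a parametric modulator inherits the duration of the events it modulates.

      ```python
      import numpy as np
      from math import gamma

      def hrf(t):
          """Illustrative double-gamma-like HRF (an assumption, not any package's exact canonical HRF)."""
          peak = t**5 * np.exp(-t) / gamma(6)
          undershoot = t**15 * np.exp(-t) / gamma(16)
          return peak - undershoot / 6.0

      def build_regressor(onsets, durations, heights, n_scans, tr, dt=0.1):
          """Build a regressor on a fine time grid, convolve it with the HRF, and sample once per TR."""
          n_fine = int(round(n_scans * tr / dt))
          neural = np.zeros(n_fine)
          for onset, dur, height in zip(onsets, durations, heights):
              start = int(round(onset / dt))
              # Duration 0 gives a one-sample "stick"; otherwise a boxcar of that length.
              stop = start + max(1, int(round(dur / dt)))
              neural[start:stop] += height
          kernel = hrf(np.arange(0, 32, dt))
          bold = np.convolve(neural, kernel)[:n_fine] * dt
          return bold[::int(round(tr / dt))]

      # Hypothetical trials: onsets (s), response times (s), and sequential distances.
      onsets = np.array([10.0, 40.0, 70.0])
      rts = np.array([1.2, 2.5, 1.8])
      distance = np.array([1.0, 3.0, 5.0])
      modulator = distance - distance.mean()  # mean-centred parametric modulator

      # Stick model: the modulator scales events of duration 0.
      stick = build_regressor(onsets, np.zeros(3), modulator, n_scans=60, tr=2.0)
      # Boxcar model: the modulator inherits each trial's RT as its duration.
      boxcar = build_regressor(onsets, rts, modulator, n_scans=60, tr=2.0)
      ```

      In the boxcar version, each trial's contribution depends on both the parameter value (height) and the RT (length), which is the coupling the response describes.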

      For exploratory purposes, we also conducted a first-level analysis using boxcar functions with variable durations. The same PPC region remained the area showing the strongest interaction effect between Task Type and Sequential Distance in the entire brain. However, the cluster size was slightly reduced, and the cluster no longer passed the corrected threshold (voxel-level p < 0.001, cluster-level FWE-corrected p = 0.0610; see Author response image 1 below). The cross indicates the MNI coordinates at [38, –69, 35], identical to those shown in the main results (Figure 4A).

      Author response image 1.

      (3) The activity pattern across tasks and sequential distance in the posterior parietal cortex appears to parallel the RT data. Have the authors examined potential relationships between the two (e.g., individual participant slopes for RT across sequential distance vs. activity betas in the posterior parietal cortex)?

      We thank the reviewer for this helpful suggestion. As shown in Author response image 2, the interaction between Task Type and Sequential Distance was a stronger predictor of PPC activation than of RT. Because PPC activation and RT are measured on different scales, we compared their standardized slopes (standardized β), which measure the change in a dependent variable, in standard deviations, for a one-standard-deviation increase in an independent variable. The standardized β for the Task Type × Sequential Distance interaction was −0.30 (95% CI [−0.42, −0.19]) for PPC activation and −0.21 (95% CI [−0.30, −0.13]) for RT. The larger standardized effect for PPC activation indicates that the Task Type × Sequential Distance interaction was a stronger predictor of neural activation than of behavioral RT.
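      As a simplified illustration of why standardized slopes allow comparison across scales, the sketch below z-scores both variables before fitting a straight line. It uses ordinary least squares on synthetic data rather than the mixed-effects models reported here, and the function name `standardized_slope` is hypothetical.

      ```python
      import numpy as np

      def standardized_slope(x, y):
          """Slope of y on x after z-scoring both; for a single predictor this equals Pearson's r."""
          zx = (x - x.mean()) / x.std(ddof=1)
          zy = (y - y.mean()) / y.std(ddof=1)
          slope, _intercept = np.polyfit(zx, zy, 1)
          return slope

      # Synthetic data: the same predictor drives two outcomes measured on very different scales.
      rng = np.random.default_rng(0)
      x = rng.normal(size=200)
      y_small = -0.3 * x + rng.normal(size=200)
      y_large = 1000.0 * y_small  # same relationship, different units

      beta_small = standardized_slope(x, y_small)
      beta_large = standardized_slope(x, y_large)
      ```

      Rescaling the outcome leaves the standardized slope unchanged, which is what makes it usable for comparing an fMRI beta with a response time.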

      Author response image 2.

      A more relevant question is whether PPC activation can be explained by temporal information (i.e., the sequential distance) independently of RT. To test this, we included both Sequential Distance and RT in the same linear mixed-effects model predicting PPC Activation Level. As shown in Author response table 1, although RT independently influenced PPC activation (F(1, 288) = 4.687, p = 0.031), the interaction between Task Type and Sequential Distance was a much stronger independent predictor (F(1, 290) = 19.319, p < 0.001).

      Author response table 1.

      PPC Activation Level Predicted by Sequential Distance and RT

      Linear Mixed Model Formula: PPC Activation Level ~ 1 + Task Type * (Sequential Distance + RT) + (1 | Participant)
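      A model of this form, with a random intercept per participant, could be fitted as in the sketch below. It assumes pandas and statsmodels are available; the column names and the simulated data are hypothetical, and this is not necessarily the software the authors used. statsmodels reports Wald z-tests rather than the F-tests quoted above.

      ```python
      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      # Simulated data (purely illustrative): 30 participants, 10 observations each.
      rng = np.random.default_rng(0)
      n_participants, n_obs = 30, 10
      participant = np.repeat(np.arange(n_participants), n_obs)
      task = rng.integers(0, 2, size=participant.size)   # Task Type coded 0/1
      dist = rng.normal(size=participant.size)           # Sequential Distance (z-scored)
      rt = rng.normal(size=participant.size)             # RT (z-scored)
      intercepts = rng.normal(scale=0.5, size=n_participants)[participant]
      ppc = intercepts - 0.3 * task * dist + 0.1 * rt + rng.normal(size=participant.size)

      data = pd.DataFrame(
          {"ppc": ppc, "task": task, "dist": dist, "rt": rt, "participant": participant}
      )

      # "task * (dist + rt)" expands to main effects plus task:dist and task:rt;
      # groups= adds the random intercept, i.e. the (1 | Participant) term.
      model = smf.mixedlm("ppc ~ task * (dist + rt)", data, groups=data["participant"])
      result = model.fit()
      print(result.params[["task:dist", "task:rt"]])
      ```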

      (4) There were a few places in the manuscript where the writing/discussion of the wider literature could perhaps be tightened or expanded. For instance:

      (i) On page 16, the authors state 'The negative correlation between the activation level in the right PPC and sequential distance has already been observed in a previous fMRI study (Gauthier & van Wassenhove, 2016b). The authors found a similar region (the reported MNI coordinate of the peak voxel was 42, -70, 40, and the MNI coordinate of the peak voxel in the present study was 39, -70, 35), of which the activation level went up when the target event got closer to the self-positioned event. This finding aligns with the evidence suggesting that the posterior parietal cortex implements egocentric representations.' Without providing a little more detail here about the Gauthier & van Wassenhove study and what participants were required to do (i.e., mentally position themselves at a temporal location and make 'occurred before' vs. 'occurred after' judgements of a target event), it could be a little tricky for readers to follow why this convergence in finding supports a role for the posterior parietal cortex in egocentric representations.

      We appreciate the reviewer’s comments. In the revised manuscript, we have provided a more detailed explanation of Gauthier and van Wassenhove’s study (Page 17): “The negative correlation between the activation level in the right PPC and sequential distance has already been observed in a previous fMRI study by Gauthier & van Wassenhove (2016b). In their study, the participants were instructed to mentally position themselves at a specific time point and judge whether a target event occurred before or after that time point. The authors identified a similar brain region (reported MNI coordinates of the peak voxel: 42, −70, 40), closely matching the activation observed in the present study (MNI coordinates of the peak voxel: 39, −70, 35). In both studies, activation in this region increased as the target event approached the self-positioned time point, which aligns with the evidence suggesting that the posterior parietal cortex implements egocentric representations.”

      (ii) Although the authors discuss the Lee et al. (2020) review and related studies with respect to retrospective memory, it is critical to note that this work has also often used prospective paradigms, pointing towards sequential processing being the critical determinant of hippocampal involvement, rather than the distinction between retrospective vs. prospective processing.

      We sincerely thank the reviewer for highlighting these important points. In response, we have revised the section of the Introduction discussing the neural underpinnings of duration (Pages 3-4). “Neurocognitive evidence suggests that the neural representation of duration engages distinct brain systems. The motor system—particularly the supplementary motor area—has been associated with prospective timing (e.g., Protopapa et al., 2019; Nani et al., 2019; De Kock et al., 2021; Robbe, 2023), whereas the hippocampus is considered to support the representation of duration embedded within an event sequence (e.g., Barnett et al., 2014; Thavabalasingam et al., 2018; see also the comprehensive review by Lee et al., 2020).”

      (iii) The authors make an interesting suggestion with respect to hippocampal longitudinal differences in the representation of event sequences, and may wish to relate this to Montagrin et al. (2024), who make an argument for the representation of distant goals in the anterior hippocampus and immediate goals in the posterior hippocampus.

      We thank the reviewer for bringing this intriguing and relevant study to our attention. In the Discussion of the manuscript, we have incorporated it into our discussion (Page 21): “Evidence from the spatial domain has suggested that the anterior hippocampus (or the ventral rodent hippocampus) implements global and gist-like representations (e.g., larger receptive fields), whereas the posterior hippocampus (or the dorsal rodent hippocampus) implements local and detailed ones (e.g., finer receptive fields) (e.g., Jung et al., 1994; Kjelstrup et al., 2008; Collin et al., 2015; see reviews by Poppenk et al., 2013; Robin & Moscovitch, 2017; see Strange et al., 2014 for a different opinion). Recent evidence further shows that the organizational principle observed along the hippocampal long axis may also extend to the temporal domain (Montagrin et al., 2024). In that study, the anterior hippocampus showed greater activation for remote goals, whereas the posterior hippocampus was more strongly engaged for current goals, which are presumed to be represented in finer detail.”

      Reviewing Editor Comments:

      While both reviewers acknowledged the significance of the topic, they raised several important concerns. We believe that providing conceptual clarification, adding important methodological details, as well as addressing potential confounds will further strengthen this paper.

      We thank the editor for the suggestions.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Please, provide the actual ethical approval #.

      We have added the ethical approval number in the revised manuscript (P 36): “The ethical committee of the University of Trento approved the experimental protocol (Approval Number 2019-018),”

      (2) Thirty-two participants were tested. Please report how you estimated the sample size was sufficient to test your working hypothesis.

      We thank the editor for pointing out this omission. In the revised manuscript, we have added an explanation for our choice of sample size (p. 36): “The sample size was chosen to align with the upper range of participant numbers reported in previous fMRI studies that successfully detected sequence or distance effects in the hippocampus (N = 15–34; e.g., Morgan et al., 2011; Howard et al., 2014; Deuker et al., 2016; Garvert et al., 2017; Theves et al., 2019; Park et al., 2021; Cristoforetti et al., 2022).”

      (3) All MRI figures: please orient the reader; left/right should be stated.

      In the revised manuscript, we have added labels to all MRI figures to indicate the left and right hemispheres.

      (4) In Figure 3A-B, the clear lateralization of the activation is not discussed in the Results or in the Discussion. Was it predicted?

      We thank the editors for highlighting this important point regarding hemispheric lateralization. The right-lateralization observed in our findings is indeed consistent with previous literature. In the revised manuscript, we have expanded our discussion to emphasize this aspect more clearly.

      For the parietal cortex, we now note (Pages 17-18): “The negative correlation between activation in the right posterior parietal cortex (PPC) and sequential distance has previously been reported in an fMRI study by Gauthier and van Wassenhove (2016b). In their paradigm, participants were instructed to mentally position themselves at a specific time point and judge whether a target event occurred before or after that point. The authors identified a similar region (peak voxel MNI coordinates: 42, −70, 40), closely corresponding to the activation observed in the present study (peak voxel MNI coordinates: 39, −70, 35). In both studies, activation in this region increased as the target event approached the self-positioned time point, consistent with evidence suggesting that the posterior parietal cortex supports egocentric representations. Neuropsychological studies have further shown that patients with lesions in the bilateral or right PPC exhibit ‘egocentric disorientation’ (Aguirre & D’Esposito, 1999), characterized by an inability to localize objects relative to themselves (e.g., Case 2: Levine et al., 1985; Patient DW: Stark, 1996; Patient MU: Wilson et al., 1997, 2005).”

      For the hippocampus, we have added (Page 19): “Previous research has shown that hippocampal activation correlates with distance (e.g., Morgan et al., 2011; Howard et al., 2014; Garvert et al., 2017; Theves et al., 2019; Viganò et al., 2023), and that distributed hippocampal activity encodes distance information (e.g., Deuker et al., 2016; Park et al., 2021). Most studies have reported hippocampal effects either bilaterally or predominantly in the right hemisphere, whereas only one study (Morgan et al., 2011) found the effect localized to the left hippocampus.”

    1. Emulsions are colloids formed by the dispersion of a hydrophobic liquid in water, thereby bringing two mutually insoluble liquids, such as oil and water, into close contact. Various agents have been developed to stabilize emulsions, the most successful being molecules that combine a relatively long hydrophobic “tail” with a hydrophilic “head”. Soaps are natural emulsifying agents and detergents are synthetic ones. Figure 13.6.4 shows the similarity in structure between the soap sodium stearate [NaCH3(CH2)16CO2] and the detergent sodium dodecyl sulfate [NaCH3(CH2)11OSO3], both of which are salts with a charge on one end (the head) and a long nonpolar hydrocarbon chain (the tail) on the other. The charged head is soluble in polar compounds like water, while the long hydrophobic tail is soluble in nonpolar compounds like fats and oils.

      This passage explains that emulsions are mixtures of two normally immiscible liquids, like oil and water, stabilized by emulsifying agents. Soaps and detergents work because they have both a hydrophilic (water-attracting) charged head and a hydrophobic (oil-attracting) nonpolar tail. This dual structure allows them to interact with both water and oil, helping keep the mixture stable.

    1. PSA. Analogy: “Hi Mr. Mechanic! Here’s my ‘72 Datsun. I’d like you to give it a tune up. Oh, by the way, I’m not a mechanic but I pulled out the steering column and took apart the starter. Here are the parts in this bag” (When a client takes apart their machine, gets in way over their head, and wants me to fix their mistakes. Please… I appreciate you’re experimenting, but this is a huge nightmare for me. I’ve passed on several of these machines.)

      via Todd Young https://www.facebook.com/groups/TypewriterCollectors/posts/10163194812329678/

    1. Reviewer #1 (Public review):

      Summary:

      Hoverflies are known for a striking sexual dimorphism in eye morphology and early visual system physiology. Surprisingly, the male and female flight behaviors show only subtle differences. Nicholas et al. investigate the sensori-motor transformation of sexually dimorphic visual information to flight steering commands via descending neurons. The authors combined intra- and extracellular recordings, neuroanatomy, and behavioral analysis. They convincingly demonstrate that descending neurons show sexual dimorphisms - in particular at high optic flow velocities - while wing steering responses seem relatively monomorphic. The study highlights a very interesting discrepancy between neuronal and behavioral response properties.

      More specifically, the authors focused on two types of descending neurons that receive inputs from well-characterized wide-field sensitive tangential cells: OFS DN1, which receives inputs from so-called HS cells, and OFS DN2, which receives input from a set of VS cells. Their likely counterparts in Drosophila connect to the neck, wing, and haltere neuropils. The authors characterized the visual response properties of these two neuronal classes in both male and female hoverflies and identified several interesting differences. They then presented the same set of stimuli, tracked wing beat amplitude, and analyzed the sum and the difference of right and left wing beat amplitude as a readout of lift or thrust, and yaw turning, respectively. Behavioral responses showed little to no sexual dimorphism, despite the observed neuronal differences.
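      The two behavioral readouts mentioned above can be written down in a few lines. This is a minimal sketch, not the authors' analysis code: the function name `wba_readouts`, the left-minus-right sign convention, and the example amplitudes are assumptions made for illustration.

      ```python
      import numpy as np

      def wba_readouts(left_wba, right_wba):
          """Return (sum, difference) of left and right wing beat amplitudes.

          The sum is taken as a proxy for lift or thrust and the difference as a
          proxy for yaw turning. Left minus right is an assumed sign convention.
          """
          left = np.asarray(left_wba, dtype=float)
          right = np.asarray(right_wba, dtype=float)
          return left + right, left - right

      # Hypothetical wing beat amplitudes in degrees for three video frames.
      wba_sum, wba_diff = wba_readouts([60.0, 62.0, 58.0], [60.0, 55.0, 64.0])
      ```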

      Strengths:

      I find the question very interesting and the results both convincing and intriguing. A fundamental goal in neuroscience is to link neuronal responses and behavior. The current study highlights that the transformations - even at the level of descending neurons to motoneurons - are complex and less straightforward than one might expect.

      Weaknesses:

      The authors investigated two types of descending neurons, but it was not clear to me how many other descending neurons are thought to be involved in wing steering responses to wide-field motion. I would suggest providing a more in-depth overview of what is known about hoverflies and Drosophila, since the conclusions drawn from the study would be different if these two types were the only descending neurons involved, as opposed to representing a subset of the neurons conveying visual information to the wing neuropil.

      Both neuronal classes have counterparts in Drosophila that also innervate neck motor regions. The authors filled the hoverfly DNs in intracellular recordings to characterize their arborization in the ventral nerve cord. In my opinion, these anatomical data could be further exploited and discussed a bit more: is the innervation in hoverflies also consistent with connecting to the neck and haltere motor regions? Are there any obvious differences and similarities to the Drosophila neurons mentioned by the authors? If the arborization also supports a role in neck movements, the authors could discuss whether they would expect any sexual dimorphism in head movements.

    2. Author response:

      eLife Assessment

      Hoverflies are known for their sexually dimorphic visual systems and exquisite flight behaviors. This valuable study reports how two types of visual descending neurons differ between males and females in their motion- and speed-dependent responses, yet surprisingly, the behavior they control lacks any sexual dimorphism. The results convincingly support these findings, which will be of interest for studies of visuomotor transformations and network-level brain organization.

      This statement perfectly recapitulates our findings.

      Public Reviews:

      Reviewer #1 (Public review):  

      Summary: 

      Hoverflies are known for a striking sexual dimorphism in eye morphology and early visual system physiology. Surprisingly, the male and female flight behaviors show only subtle differences. Nicholas et al. investigate the sensori-motor transformation of sexually dimorphic visual information to flight steering commands via descending neurons. The authors combined intra- and extracellular recordings, neuroanatomy, and behavioral analysis. They convincingly demonstrate that descending neurons show sexual dimorphisms - in particular at high optic flow velocities - while wing steering responses seem relatively monomorphic. The study highlights a very interesting discrepancy between neuronal and behavioral response properties.

      Thank you for this summary. Most of the statement perfectly recapitulates the main findings of our paper. However, we want to emphasize that some hoverfly flight behaviors are strongly sexually dimorphic, especially those related to courtship and mating. Indeed, only male hoverflies pursue targets at high speed, chase away territorial intruders, and pursue females for mating. However, other flight behaviours, such as those related to optomotor responses and flights between flowers when feeding, are not sexually dimorphic. We will amend the Introduction to make the difference between flight behaviors clear.

      More specifically, the authors focused on two types of descending neurons that receive inputs from well-characterized wide-field sensitive tangential cells: OFS DN1, which receives inputs from so-called HS cells, and OFS DN2, which receives input from a set of VS cells. Their likely counterparts in Drosophila connect to the neck, wing, and haltere neuropils. The authors characterized the visual response properties of these two neuronal classes in both male and female hoverflies and identified several interesting differences. They then presented the same set of stimuli, tracked wing beat amplitude, and analyzed the sum and the difference of right and left wing beat amplitude as a readout of lift or thrust, and yaw turning, respectively. Behavioral responses showed little to no sexual dimorphism, despite the observed neuronal differences.

      Thank you for this very nice summary of our work. We want to clarify that LPTC input to DN1 and DN2 has not been shown directly in hoverflies using e.g. dye coupling, or dual recordings. Instead, the presumed HS and VS input is inferred from morphological and physiological DN evidence, and comparisons to similar data in Drosophila and blowflies. We will amend the Introduction to clarify this. The rest of the paragraph perfectly recapitulates the main findings of our paper.

      Strengths:

      I find the question very interesting and the results both convincing and intriguing. A fundamental goal in neuroscience is to link neuronal responses and behavior. The current study highlights that the transformations - even at the level of descending neurons to motoneurons - are complex and less straightforward than one might expect.

      Thank you.

      Weaknesses:

      The authors investigated two types of descending neurons, but it was not clear to me how many other descending neurons are thought to be involved in wing steering responses to wide-field motion. I would suggest providing a more in-depth overview of what is known about hoverflies and Drosophila, since the conclusions drawn from the study would be different if these two types were the only descending neurons involved, as opposed to representing a subset of the neurons conveying visual information to the wing neuropil.

      This is a great point. There are around 1000 fly DNs, of which many could respond to widefield motion, without being specifically tuned to widefield motion. For example, many looming sensitive neurons also respond to widefield motion, and could therefore be involved in the WBA movements that we measured here. In addition, there are many multimodal neurons that could be involved in optomotor responses in free flight, but these may not have been stimulated when we only provided visual input. Furthermore, many visual neurons are modulated by proprioceptive feedback, which is lacking in immobilized physiology preps. Finally, in blowflies, up to 5 optic flow sensitive DNs have been identified morphologically, and in Drosophila 3 have been identified morphologically and physiologically. In summary, it is more than likely that other neurons project visual widefield motion information to the wing neuropil. We will amend our Introduction and Discussion to make this important point clear to the readers.

      Both neuronal classes have counterparts in Drosophila that also innervate neck motor regions. The authors filled the hoverfly DNs in intracellular recordings to characterize their arborization in the ventral nerve cord. In my opinion, these anatomical data could be further exploited and discussed a bit more: is the innervation in hoverflies also consistent with connecting to the neck and haltere motor regions? Are there any obvious differences and similarities to the Drosophila neurons mentioned by the authors? If the arborization also supports a role in neck movements, the authors could discuss whether they would expect any sexual dimorphism in head movements.

      These are all great points. We did not see any clear arborizations to the frontal nerve, where we would expect to find the neck motor neurons (NMNs). In addition, while we did see fine arborizations throughout the length of the thoracic ganglion, we saw no strong outputs projecting directly to the haltere nerve (HN). In the revised version of the MS we will modify figure 4 (morphological characterization) to clarify.

      There are important differences between the morphology of DN1 and DN2 in hoverflies and DNHS1 and DNOVS2 in Drosophila, in terms of their projections in the thoracic ganglion. For example, in Drosophila DNOVS2, there are several fine branches along the length of the neuron in the thoracic ganglia. We found similar fine branches in Eristalis tenax DN2. In addition, however, we found a wide branch projecting to the area of the thoracic ganglion where the prothoracic and pterothoracic nerves likely get their inputs (Figure 4), suggesting that the neuron could contribute to controlling the wings and/or the forelegs (which is why we quantified the WBA). In Drosophila DNHS1, there is a similar wide branch to the prothoracic and pterothoracic nerves, which we also found in Eristalis tenax OFS DN1 (Figure 4). Indeed, while Drosophila DNHS1 and DNOVS2 have strikingly different morphology, DN1 and DN2 in Eristalis looked quite similar. We will modify the Results section to make this clear.

      In addition, to investigate this further, in the revised version of the MS we will include analysis of the movement of different body parts (including the head) to investigate the presence of any potential sexual dimorphism. Unfortunately, however, this will not include the halteres, as they cannot be seen well in the videos.

      Reviewer #2 (Public review):

      Summary:

      Many fly species exhibit male-specific visual behaviors during courtship, while little is known about the circuit underlying the dimorphic visuomotor transformations. Nicholas et al focus on two types of visual descending neurons (DNs) in hoverflies, a species in which only males exhibit high-speed pursuit of conspecifics. They combined electrophysiology and behavior analysis to identify these DNs and characterize their response to a variety of visual stimuli in both male and female flies. The results show that the neurons in both sexes have similar receptive fields but exhibit speed-dependent dimorphic responses to different optic flow stimuli.

      This statement perfectly recapitulates the main findings of our paper. However, as mentioned above, while hoverfly flight behaviors related to courtship and mating are strongly sexually dimorphic, other flight behaviours, such as those related to optomotor responses and flights between flowers when feeding, are not. We will amend the Introduction to make the difference between flight behaviors clear.

      Strengths:

      Hoverflies, though not a common model system, show very interesting dimorphic behaviors and provide a unique and valuable entry point to explore the brain organization behind sexual dimorphism. The findings here are not only interesting in their own right but will also likely inspire those working in other systems, particularly Drosophila.

      Thank you.

      The authors employed rigorous morphology, electrophysiology, and behavior methods to deliver a comprehensive characterization of the neurons in question. The precision of the measurements allowed for identifying a subtle and nuanced neuronal dimorphism and set a standard for future work in this area.

      Thank you.

      Weaknesses:

      Cell-typing using receptive field preferred directions (RFPDs): if I understood correctly, this classification method mostly relies on the LPDs near the center of the receptive field (median within the contour in Fig.1). I have two concerns here. First, this method is great if we are certain there are only two types of visual DNs as described in the manuscript. But how certain is this? Given the importance of vision in flight control, I would expect many DNs that transmit optic flow information to the motor center. I'd also like to point out that there are other lobula plate tangential cells (LPTCs) than HS and VS cells, which are much less studied and could potentially contribute to dimorphic behaviors.

      This is very true, and an important point. As mentioned above, in blowflies, up to 5 optic flow sensitive DNs have been identified morphologically; however, whether these correspond to 5 different physiological types remains unclear. In both blowflies and Drosophila, 3 have been identified morphologically and physiologically (DNHS1, DNOVS1, DNOVS2). Importantly, in both blowflies and fruit flies, DNOVS1 gives graded responses and no action potentials, meaning that we would not be able to record from it using extracellular electrophysiology.

      We previously used clustering techniques to show that in Eristalis, we can reliably distinguish two types of optic flow sensitive DNs from extracellular electrophysiological data, based on a range of receptive field parameters, and we think that these correspond to DNHS1 and DNOVS2 in Drosophila (Nicholas et al, J Comp Physiol A, 2020, cited in paper). As mentioned above in response to Reviewer 1, this does not mean that there are no other neurons that could respond to widefield optic flow, and which might be involved in the WBA we recorded in the paper. However, the point of this paper was not to conclusively show that there are only two optic flow sensitive descending neurons. The point was to say that there are two quite distinct optic flow sensitive neurons that have similar receptive fields in males and females, while the responses to widefield motion show differences between males and females.

      We will modify the Introduction and Discussion to make these important points clear to the Reader, including the discussion of the 45-60 LPTCs that exist in the lobula plate, and what their role might be.

      Second, this method feels somewhat impoverished given the richness of the data. The authors have nicely mapped out the directional tuning for almost the entire visual field. Instead of reducing this measurement to 2 values (center and direction), I was wondering if there is a better method to fully utilize the data at hand to get a better characterization of these DNs. As the authors are aware, local features alone can be ambiguous in characterizing optic flows. What's more, taking into account more global features can be useful for discovering potentially new cell types.

      This is a great point, and we did an extensive analysis of other receptive field properties in this study (shown in supp fig 1). In addition, and as mentioned above, we have published a clustering analysis across receptive field properties of these neurons (Nicholas et al, J Comp Physiol A, 2020, cited in paper). The point that we attempted to make in this paper was that by using two strikingly simple metrics, we can reliably distinguish which of the two neuron types we are recording from (if we accept that there are two main types that we are likely to record from) simply based on location and overall directional preference. This makes automated analysis very easy and straightforward. Indeed, we now use this routinely to ID what neuron we are recording from, rather than making a human-based assumption.

      However, we agree that further in-depth analysis is warranted. To address this, we will provide additional receptive field analysis and clustering in the revised version of the MS. In addition, we want to highlight that all data is uploaded to DataDryad for anyone interested in doing additional in-depth analyses.

      Line 131, it wasn't clear to me why full-screen stimuli were used for comparison here, instead of the full receptive field maps. Male flies exhibit sexual dimorphic behaviors only during courtship, which would suggest that small-sized visual stimuli (mimicking an intruder or female conspecific) would be better suited to elicit dimorphic neuronal responses. A similar comment applies to the later results as well. Based on the receptive field mapping in Figure 1, I'm under the impression that these 2 DN types are more suited to detect wide-field optic flows, those induced by self-motion as mentioned in the manuscript. The results are still very interesting, but it's good to make this point clear early on to help set appropriate expectations. Conversely, this would also suggest that there are other visual DN types that are responsible for the courtship-related sexually dimorphic behaviors.

      Thank you for mentioning these important points. Our reasoning for using full-screen stimuli for the analysis on line 131 was that since we used the small sinusoidal gratings for mapping the receptive fields, and to subsequently classify the neurons, it would be unfair to use the same data to investigate potential sexual dimorphism. I.e., we selected neurons that fulfilled certain criteria, and then we cannot rightfully use the same criteria to determine differences. This was not explicitly mentioned in the paper, so we will modify the text to make this clear to the Reader.

      However, in Supp Figure 1d/e we show that there are no striking differences between males and females in terms of receptive field center or directional preference. In Supp Figure 1f we show that there is no difference between male and female receptive field height and width. We will modify the text to draw the Reader’s attention to this figure, and also mention the additional analysis done in response to the comment above.

      As a side note, I personally expected at least DNHS1 to have a smaller receptive field in males, as the hoverfly HSN is strikingly sexually dimorphic (Nordström et al, Curr Biol 2008), and also very sensitive to small objects. However, while optic flow sensitive DNs do respond to small objects (see e.g. the J Comp Physiol paper mentioned above) we did not detect any obvious sexual dimorphism in receptive field properties. Indeed, we think that a different subset of DNs control target pursuit behavior (target selective DNs (TSDNs)). This will be addressed in the modified version of the paper.

    1. eLife Assessment

      This valuable study reports results showing how different neurons in the dysgranular retrosplenial cortex code spatial orientation. Specifically, the paper reports that some neurons maintain tuning for a single head direction across multi-compartmental environments, while other neurons are tuned to different head directions that reflect the geometry within each compartment. The study was viewed as likely to expand the field's understanding of directional tuning of neurons, but incomplete evidence was provided to support the conclusions.

    2. Reviewer #1 (Public review):

      Summary:

      The dysgranular retrosplenial cortex (RSD) and hippocampus both encode information related to an animal's navigation through space. Here, the authors study the different ways in which these two brain regions represent spatial information when animals navigate through interconnected rooms. Most importantly, they find that the RSD contains a small fraction of neurons that encode properties of interconnected rooms by firing in different head directions within each room. This direction is shifted by 180 degrees in 2-room environments, and by 90 degrees in 4-room environments. While it cannot be definitively proven that this encoding is not just related to the presence of exits (doors) in each room, this is a noteworthy finding and will motivate further study in more complex and well-controlled environments to understand this coding scheme in the RSD. The recordings and analyses used to identify these multi-directional cells are mostly solid. Additional conclusions regarding the rotational symmetry across rooms seen in the RSD neurons that do not encode direction (representing the majority of RSD neurons) remain incomplete, given the evidence presented thus far. The differences between RSD and hippocampus encoding of space are clear and consistent with prior observations.

      Strengths:

      (1) Use of tetrode recordings from the RSD to identify multi-direction cells that only encode one direction in each room, but shift the preferred direction by either 180 or 90 degrees depending on the number of rooms in the environment.

      (2) Solid controls to show that this multi-direction encoding is stable over time and across some environmental manipulations.

      (3) Convincing evidence that these multi-direction cells can co-exist with single-direction head direction cells in the RSD (as both cell types can be simultaneously recorded).

      (4) Convincing evidence for clear differences between directional and spatial encoding in the RSD versus hippocampus, consistent with prior observations.

      Weaknesses:

      (1) The paper mostly uses the term "retrosplenial cortex", but it is important to clarify that the study is only focused on the dysgranular retrosplenial cortex (RSD; Brodmann Area 30) and not the granular retrosplenial cortex (Brodmann Area 29). These are two distinct regions (despite the similar names), each with distinct connectivity and distinct behavioral encoding and function, so it is important to clarify in the abstract and title that the present study is solely about the RSD to prevent confusion in the literature.

      (2) The proportion of each observed cell type is not clearly stated, although it is clear that the multi-directional cells are in the minority. Having the proportion of well-isolated neurons in distinct sessions that encode each type of information (e.g., multi vs single direction encoding) would greatly aid the interpretation of the result and help the field know how common each cell type is in the RSD.

      (3) The authors state that "MDCs [multi-directional cells] never exhibited multidirectional activity within a single room" - but many of the single room examples from the 4-room environment (shown in Figures 2E and 2F) reveal multi-peaked directional encoding. This suggests that the multi-direction encoding may be more compatible with encoding some property of the number of exits rather than relative room orientations.

      (4) The spatial rotation analyses of non-directional cells are considered incomplete. This is impacted by the slower speed at the doors and hence altered firing rates (as evidenced in spatial rate plots). The population rate is not relevant as the correlational analyses are done on a single cell level. Since some cells fire more with increasing speed and others fire less, that will necessarily result in a population rate map that minimizes firing rate differences near the doorway, where the animals move more slowly. But on a single cell level, that reduced speed is having a big effect, as evidenced by individual rate map examples, and the rooms will need to be rotated to obtain a higher correlation by overlapping the doorway regions. This does not necessarily say anything about spatial coding across the two or four interconnected rooms being rotationally symmetric, and it would appear difficult to draw any conclusions related to spatial encoding from those analyses.

    3. Reviewer #2 (Public review):

      Summary:

      Laurent et al. perform in vivo electrophysiological recordings in the retrosplenial cortex of rats foraging in multi-compartment environments with either identical or unique visual features. The authors characterize two types of directional signals in the area that they have previously reported: classic head direction cells anchored to the global allocentric reference frame and multi-direction cells (MDCs), which have a rotationally preserved directional field anchored to local compartments. The primary finding of this work is that MDCs seem sensitive to local environmental geometry rather than visual context. They also show that MDC tuning persists in the absence of hippocampal place field repetition, further dissociating the RSC local directional signal from the broader allocentric representation of space. A novel observation is that RSC non-directional spatial signals are anchored to the local environment, which could and should be explored further. While the data is solid and the analyses are mostly appropriate, the primary findings are incremental, and more interesting novel claims are not explored in detail or not explicitly tested.

      Strengths:

      The environmental manipulations clearly demonstrate that tuning is not modulated by complex visual information.

      The finding that RSC two-dimensional spatial responses are stable and anchored to environmental features is novel and can be further explored in future work.

      Weaknesses:

      The observation that BDCs and MDCs are insensitive to visual context builds upon the authors' previous work (and replicates aspects of Zhang et al., 2022) but leaves many open questions that are not addressed with the current set of experiments. Specifically, what exactly are MDCs anchoring to? The primary theory is that they anchor to environmental geometry, but there are no explicit experimental manipulations to test this theory. It is important to note that 2- and 4-compartment environments share many features, including the same cardinal axes, making any differences/similarities in these two conditions difficult to interpret.

      The main finding presented with respect to BDC/MDC tuning is that they are not sensitive to visual context as manipulated by distinct visual patterns on the wall and floor in multicompartment environments. One could argue that the individual rooms are, in actuality, quite similar in low-level visual features - each possesses a large white background square visual feature on a single wall with a fixed relationship to the door(s). How can the authors rule out that i) BDC/MDC responses are modulated by these low-level features rather than geometry and/or ii) that the rats are not paying attention to any visual features at all? There is no task requiring them to indicate which room they are in. Furthermore, the doorways themselves are prominent visual features that are present in each context. It would be interesting to see if MDC/BDC tuning persisted in a square room where the number of doorways was manipulated to rule out this possibility.

      A strong possibility is that the rotational symmetry of both MDCs and non-directional spatial neurons is related to 1) door-related firing, 2) stereotyped movement, and 3) stereotyped directional sampling. In Supplemental Figure 8, the authors begin to address this by comparing a 'population ratemap' to a 'population speed map.' I do not think this is sufficient, and it is difficult to interpret. Instead, the authors should assess whether MDCs and BDCs fire more at doorways and what the overlap is with the speed-modulated cells they report. Moreover, they should assess whether the spatial speed profile itself is rotationally symmetric within each session. It would also be useful to look at the confluence of the variables simultaneously using some form of regression analysis. The authors could generate a directional predictor that captures the main response property of these cells and see if it accounts for greater variability in spiking than speed or x,y position. Finally, rotationally symmetric directional sampling biases could arise from the doors being present on the same two walls in each room. The authors should assess whether MDC tuning is still present if directional sampling is randomly downsampled to match directional observations in each compartment.
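
      The downsampling control requested above could be sketched roughly as follows. This is only one possible reading of the suggestion (equalizing the number of samples per head-direction bin across compartments), not the authors' method. The function name, the 12-bin resolution, and the input layout (a dict mapping a compartment label to a list of head directions in degrees) are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def match_directional_sampling(hd_by_room, n_bins=12, seed=0):
    """Subsample so that every room has the same number of samples
    in each head-direction bin.

    hd_by_room: dict mapping a room label to a list of head directions
    in degrees (hypothetical input format).
    Returns a dict mapping each room label to the sorted indices kept.
    """
    rng = random.Random(seed)
    bin_width = 360.0 / n_bins

    # Group sample indices by direction bin within each room.
    binned = {}
    for room, angles in hd_by_room.items():
        bins = defaultdict(list)
        for idx, angle in enumerate(angles):
            b = min(int((angle % 360.0) // bin_width), n_bins - 1)
            bins[b].append(idx)
        binned[room] = bins

    kept = {room: [] for room in hd_by_room}
    for b in range(n_bins):
        # The least-sampled room sets the quota for this bin.
        quota = min(len(binned[room][b]) for room in binned)
        for room in binned:
            kept[room].extend(rng.sample(binned[room][b], quota))
    return {room: sorted(idxs) for room, idxs in kept.items()}
```

      Tuning curves recomputed from the retained indices only, repeated over several seeds, would indicate whether the multidirectional pattern survives once sampling is matched.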

      Recent work has demonstrated that neurons with egocentric corner or boundary tuning are observed in RSC. The authors do not address whether egocentric tuning contributes to MDC signals. An explicit analysis of the relationship and potential overlap of MDC and egocentric populations is warranted.

      Many of the MDCs presented in the main figures are not especially compelling. This includes alterations to MDC tuning in Figure 2, which is a key datapoint. The authors should show significantly more (if not all) examples of MDCs in each environment. It would similarly be useful to see all/more examples of non-directional spatially tuned neurons with rotationally symmetric firing patterns.

      "One might hypothesize that specific environmental cues, such as door orientation or landmark positioning, drive these tuning shifts. However, our results argue against this interpretation. In four-room environments, each room had multiple entry points, yet MDCs never exhibited multidirectional activity within a single room."

      I do not understand the logic here. Can the authors unpack this? Also, it is clear that some of the example cells have more than one peak in individual compartments. How is this quantified?

    4. Reviewer #3 (Public review):

      Summary:

      The authors examine firing of dysgranular retrosplenial cortex (dRSC) neurons in relation to head orientation and location for rats exploring open-field environments. One environment utilized was a square arena with high walls that is split into two rectangular spaces connected by a doorway. Another environment is a square arena split into quadrants connected by doors near the center. For each, the different sub-spaces of the environments are either identical in terms of visual and tactile cues or different. For head direction neurons, the authors present one population where each neuron maintains a single tuning direction for the two or four sub-compartments of the two environments. A second population exhibits what is termed multi-directional firing, wherein neurons exhibit (overall) two or four head direction peaks in firing. For such neurons, firing in each of the sub-compartments is associated with only a single preferred direction, but the directions across compartments are shown to be at 180-degree (two-compartment environment) or 90-degree offsets. The offsets evidence tuning to the "same" orientation for the sub-compartments that are, in the global reference frame, oriented at 180 or 90 degree offsets. The results are similar whether or not the sub-compartments have the same or different tactile and visual cues. Thus, the first population is said to be global in its head direction tuning, while the second relates to each local environment in a way that is systematic across sub-compartments. Spatially-specific activity of another population of non-direction-tuned RSC neurons is examined, and comparisons of sub-compartment spatial firing maps suggest that spatial tuning in RSC also repeats across compartments when the firing maps for the compartments are rotated to match each other (as in physical space). Finally, a population of hippocampal "place" cells exhibited different location mapping across sub-compartments. 
      The findings are interpreted to indicate that RSC can simultaneously map orientation in both local and global reference frames, possibly forming a mechanism whereby the sub-compartments' shared geometry (given by the boundary shapes and the door locations) can be related to each other and to the global space they share.

      Strengths:

      This paper addresses an interesting problem and expands how the field will think about directional tuning.

      Weaknesses:

      It is not clear that the experimental design allows for a clear interpretation of the data. Rates for preferred tuning are low, as are ratemap correlations for spatially-tuned neurons.

      (1) It is concerning that the neurons with head direction tuning have fairly low peak firing rates (mean close to 5 Hz), where prior studies examining head direction tuning in dRSC found head direction-tuned neurons with peak rates more than an order of magnitude higher (100 Hz or more). Under circumstances where neurons are tuned well to variables other than head direction (for example, angular velocity of movement), weak head direction tuning may be observed if those other variables are not sampled equally across head directions. The manuscript contains no rigorous control for this possibility. One place to start to address this issue would be to map out variables such as angular velocity by head orientation, and to test whether such relationships also carry 90 and 180 degree offsets.

      (2) There is some question as to whether dRSC neurons (spatial or directional) following the sub-compartment "geometry" is appropriate in terms of interpreting the data. In the condition with sub-compartments carrying different tactile and visual cues, it seems that such cues pertain only to the floor of the environments. The distal visual space of the boundaries appears to be identical. One is left to wonder whether distinguishing environments according to boundary wall visual cues would lead to different results. The CA1 data does not help to rule this possibility out. A second reason to doubt the "shared geometry" interpretation is that there is no condition where sub-compartment geometry is varied. It is also the case that the sub-compartment doorways may stand as the only salient distal visual cue linking the environments. Local sensory cues and geometry seem not so disentangled in this study, but this is a major claim in the abstract.

      (3) There is some concern with the interpretation that the spatial tuning of some dRSC neurons repeats in rotated form across sub-compartments. The firing rate map correlations are very low on average (~0.2), and far lower than the population of CA1 having repeating fields across the same vs different visual/tactile cue conditions. The authors should define the chance level of ratemap correlation by shuffling neuron identities. Apologies if this is indeed the current approach, but it seems not to be (I was left a bit lost by the description in the methods). For any population of hippocampal place cells, the cross-neuron correlations of firing rate maps are typically not zero, and correlations at 0.2 would normally be evidence for remapping.

      (4) A somewhat picky point here that is not meant to claim that multi-compartment studies are not useful - the introduction states that real-world environments typically consist of multi-compartment rooms. This is certainly not true for rodents and is only sometimes true in humans.

      (5) The discussion lacks a consideration of how such dRSC output might impact the target structures of dRSC.

      (6) The discussion speaks to the idea that multi-directional neurons may aid in transitioning between contexts (sub-compartments). But it is notable that none of the multidirectional neurons have multi-directional tuning in all sub-compartments, but such firing was seen in the 2017 Nature Neuroscience study by Jacob/Jeffery. The discussion should address this difference and perhaps posit a means by which the firing of global and local head direction neurons can be related to each other to yield navigation that depends on both scales.

      (7) The authors should provide the size of the smoothing function for spatial firing rate maps.

      (8) The authors should devise a measure to define directional tuning in 4 directions (with 90-degree offsets).

      (9) Figures 2D and 2H - The offsets in preferred tuning across sub-compartments are rather variable.
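
      As an illustration of point (3), a chance level obtained by shuffling neuron identities could look roughly like the sketch below. It assumes each neuron's rate map has been flattened to a list of bin rates and that the second compartment's maps have already been rotated into register with the first. The function names, the number of shuffles, and the 95th-percentile criterion are hypothetical choices, not taken from the manuscript.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def shuffled_chance_level(maps_a, maps_b, n_shuffles=1000, percentile=95, seed=0):
    """Null distribution of the mean map correlation under shuffled identities.

    maps_a[i] and maps_b[i] are the flattened rate maps of neuron i in two
    compartments. Each shuffle pairs every map in A with the map of a
    randomly chosen neuron in B (a permutation, so some neurons may
    occasionally be paired with themselves).
    """
    rng = random.Random(seed)
    n = len(maps_a)
    null = []
    for _ in range(n_shuffles):
        perm = list(range(n))
        rng.shuffle(perm)
        rs = [pearson(maps_a[i], maps_b[perm[i]]) for i in range(n)]
        null.append(sum(rs) / n)
    null.sort()
    k = min(len(null) - 1, int(len(null) * percentile / 100))
    return null[k]
```

      The observed mean within-neuron correlation would then be compared against the returned threshold; an observed value of about 0.2 is only informative relative to this null.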

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript uses primarily simulation tools to probe the pathway of cholesterol transport with the smoothened (SMO) protein. The pathway to the protein and within SMO is clearly discovered, and interactions deemed important are tested experimentally to validate the model predictions.

      Strengths:

      The authors have clearly demonstrated how cholesterol might go from the membrane through SMO for the inner and outer leaflets of a symmetrical membrane model. The free energy profiles, structural conformations, and cholesterol-residue interactions are clearly described.

      We thank the reviewer for their kind words.

      (1) Membrane Model: The authors decided to use a rather simple symmetric membrane with just cholesterol, POPC, and PSM at the same concentration for the inner and outer leaflets. This is not representative of asymmetry known to exist in plasma membranes (SM only in the outer leaflet and more cholesterol in this leaflet). This may also be important to the free energy pathway into SMO. Moreover, PE and anionic lipids are present in the inner leaflet and are ignored. While I am not requesting new simulations, I would suggest that the authors should clearly state that their model does not consider lipid concentration leaflet asymmetry, which might play an important role.

      We thank the reviewer for their comment. Membrane asymmetry is inherent in endogenous systems; we acknowledge that as a limitation of our current model. We have addressed the comment by adding this limitation to our discussion in the manuscript.

      Added lines: (End of paragraph 6, Results subsection 2):

      “One possibility that might alter the thermodynamic barriers is native membrane asymmetry, particularly the anionic lipid-rich inner leaflet. This presents as a limitation of our current model.”

      (2) Statistical comparison of barriers: The barriers for pathways 1 and 2 are compared in the text, suggesting that pathway 2 has a slightly higher barrier than pathway 1. However, are these statistically different? If so, the authors should state the p-value. If not, then the text in the manuscript should not state that one pathway is preferred over the other.

      We thank the reviewer for their comment. We have added statistical t-tests for the barriers.

      Changes made: (Paragraph 6, Results subsection 2)

      “However, we also observe that pathway 1 shows a lower thermodynamic barrier (5.8 ± 0.7 kcal/mol v/s 6.5 ± 0.8 kcal/mol, p = 0.0013)”

      (3) Barrier of cholesterol (reasoning): The authors on page 7 argue that there is an enthalpy barrier between the membrane and SMO due to the change in environment. However, cholesterol lies in the membrane with its hydroxyl interacting with the hydrophilic part of the membrane and the other parts in the hydrophobic part. How is the SMO surface any different? It has both characteristics and is likely balanced similarly to uptake cholesterol. Unless this can be better quantified, I would suggest that this logic be removed.

      We thank the reviewer for this suggestion. We have removed the line to avoid confusion.

      Reviewer #2 (Public review):

      Summary:

      In this work, the authors applied a range of computational methods to probe the translocation of cholesterol through the Smoothened receptor. They test whether cholesterol is more likely to enter the receptor straight from the outer leaflet of the membrane or via a binding pathway in the inner leaflet first. Their data reveal that both pathways are plausible but that the free energy barriers of pathway 1 are lower, suggesting this route is preferable. They also probe the pathway of cholesterol transport from the transmembrane region to the cysteine-rich domain (CRD).

      Strengths:

      (1) A wide range of computational techniques is used, including potential of mean force calculations, adaptive sampling, dimensionality reduction using tICA, and MSM modelling. These are all applied rigorously, and the data are very convincing. The computational work is an exemplar of a well-carried out study.

      (2) The computational predictions are experimentally supported using mutagenesis, with an excellent agreement between their PMF and mRNA fold change data.

      (3) The data are described clearly and coherently, with excellent use of figures. They combine their findings into a mechanism for cholesterol transport, which on the whole seems sound.

      (4) The methods are described well, and many of their analysis methods have been made available via GitHub, which is an additional strength.

      Weaknesses:

      (1) Some of the data could be presented a little more clearly. In particular, Figure 7 needs additional annotation to be interpretable. Can the position of the cholesterol be shown on the graph so that we can see the diameter change more clearly?

      We thank the reviewer for this suggestion. We have added the cholesterol positions as requested.

      Changes made: (Caption, Figure 7)

      “The tunnel profile during cholesterol translocation in SMO. (a) Free energy plot of the z-coordinate v/s the tunnel diameter when cholesterol is present in the core TMD. The tunnel shows a spike in the radius in the TMD domain, indicating the presence of a cholesterol-accommodating cavity. (b) Representative figure for the tunnel when a cholesterol molecule is in the TMD. (c) Same as (a), when cholesterol is at the TMD-CRD interface. (d) Same as (b), when cholesterol is at the TMD-CRD interface. (e) Same as (a), when cholesterol is at the CRD binding site. (f) Same as (b), when cholesterol is at the CRD binding site. Tunnel diameters shown as spheres. Cholesterol positions marked on plots using dotted lines. All snapshots presented are frames taken from MD simulations.”

      (2) In Figure 3C, it doesn’t look like the Met is constricting the tunnel at all. What residue is constricting the tunnel here? Can we see the Ala and Met panels from the same angle to compare the landscapes? Or does the mutation significantly change the tunnel? Why not A283 to a bulkier residue? Finally, the legend says that the figure shows that cholesterol can still pass this residue, but it doesn’t really show this. Perhaps if the HOLE graph was plotted, we could see the narrowest point of the tunnel and compare it to the size of cholesterol.

      We thank the reviewer for this suggestion. A283 was mutated to methionine as it presents with a longer heavy tail containing sulfur. We have plotted the tunnel radii for both WT and A283M mutants and added them as a supplemental figure. As shown in the figure, the presence of methionine doesn’t completely block the tunnel, but occludes it, thereby increasing the barrier for cholesterol transport slightly.

      Changes made: (End of Results subsection 1)

      “When we calculated the PMF for cholesterol entry, the A<sup>2.60f</sup>M mutant showed a restricted tunnel, but the mutation did not fully block it (Figure 3—figure Supplement 3).”

      (3) The PMF axis in 3b and d confused me for a bit. Looking at the Supplementary data, it’s clear that, e.g., the F455I change increases the energy barrier for chol entering the receptor. But in 3d this is shown as a -ve change, i.e., favourable. This seems the wrong way around for me. Either switch the sign or make this clearer in the legend, please.

      We thank the reviewer for this suggestion. We measured ∆PMF as PMF<sub>WT</sub> - PMF<sub>mutant</sub>, hence the negative values. We have added additional text to the legend to clarify this.

      Changes made: (Caption, Figure 3)

      “(b) ∆Gli1 mRNA fold change (high SHH vs untreated) and ∆PMF (difference of peak PMF, calculated as PMF<sub>WT</sub> - PMF<sub>mutant</sub>) plotted for the mutants in Pathway 1. (c) Example mutant A<sup>2.60f</sup>M shows that cholesterol can enter SMO through Pathway 1 even on a bulky mutation. (d) Same as (b) but for Pathway 2. (e) Example mutant L<sup>5.62f</sup>A shows that cholesterol can enter SMO through Pathway 2 due to lesser steric hindrance. All snapshots presented are frames taken from MD simulations.”

      Changes made: (Caption, Figure 6)

      “(b) ∆Gli1 mRNA fold change (high SHH vs untreated) and ∆PMF (difference of peak PMF, calculated as PMF<sub>WT</sub> - PMF<sub>mutant</sub>) plotted for mutants along the TMD-CRD pathway. (c, d) Example mutants Y<sup>LD</sup>A and F<sup>5.65f</sup>A show that cholesterol is unable to translocate through this pathway because of the loss of crucial hydrophobic contacts provided by Y207 and F484 along the solvent-exposed pathway.”

      (4) The impact of G280V is put down to a decrease in flexibility, but it could also be a steric hindrance. This should be discussed.

      We thank the reviewer for this suggestion. We have added it as a possible mechanism of the decrease in activity of SMO.

      Changes made: (Paragraph 5, Results subsection 1)

      “We mutated G280<sup>2.57f</sup> to valine (G<sup>2.57f</sup>V) to test whether reducing the flexibility of TM2 prevents cholesterol entry into the TMD. Consequently, the activity of mSMO showed a decrease. However, this decrease could also be attributed to steric hindrance added by the presence of a bulky isopropyl group in valine.”

      (5) Are the reported energy barriers of the two pathways (5.8 ± 0.7 and 6.5 ± 0.8 kcal/mol) significantly and/or substantially different enough to favour one over the other? This could be discussed in the manuscript.

      We thank the reviewer for this suggestion. We have added statistical t-tests for the barriers.

      Changes made: (Paragraph 6, Results subsection 2)

      “However, we also observe that pathway 1 shows a lower thermodynamic barrier (5.8 ± 0.7 kcal/mol v/s 6.5 ± 0.8 kcal/mol, p = 0.001)”

      (6) Are the energy barriers consistent with a passive diffusion-driven process? It feels like, without a source of free energy input (e.g., ion or ATP), these barriers would be difficult to overcome. This could be discussed.

      We thank the reviewer for this suggestion. We have added a discussion to further clarify this point.

      Discussion: (Paragraph 6, Results subsection 2)

      “These values are comparable to ATP-Binding Cassette (ABC) transporters of membrane lipids, which use ATP hydrolysis (-7.54 ± 0.3 kcal/mol) (Meurer et al., 2017) to drive lipid transport from the membrane to an extracellular acceptor. Some of these transporters share the same mechanism as SMO, where the lipid from the inner leaflet is flipped and transported to the extracellular acceptor protein (Tarling et al., 2013). Additionally, for secondary active transporters that do not use ATP for the transport of substrates, a thermodynamic barrier of 5-6 kcal/mol has been reported in the literature (Chan et al., 2022; Selvam et al., 2019; McComas et al., 2023; Thangapandian et al., 2025).”
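
      As a rough worked illustration of what barriers of this size mean for a thermally driven process, the Boltzmann factor exp(-ΔG/RT) can be evaluated for the two reported peak barriers. The temperature of 310 K is an assumption (it is not stated in this response), and the calculation ignores the reported uncertainties, so it is an order-of-magnitude sketch rather than a rate estimate.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def boltzmann_factor(barrier_kcal, temperature_k=310.0):
    """Relative weight exp(-dG/RT) of a state dG above the reference.

    temperature_k defaults to 310 K, an assumed physiological temperature.
    """
    return math.exp(-barrier_kcal / (R_KCAL * temperature_k))


p_pathway1 = boltzmann_factor(5.8)  # peak barrier reported for pathway 1
p_pathway2 = boltzmann_factor(6.5)  # peak barrier reported for pathway 2

# Ratio of the two factors depends only on the 0.7 kcal/mol difference.
ratio = p_pathway1 / p_pathway2
print(f"pathway 1: {p_pathway1:.2e}, pathway 2: {p_pathway2:.2e}, ratio: {ratio:.2f}")
```

      Under these assumptions the 0.7 kcal/mol difference corresponds to roughly a threefold difference in Boltzmann weight, which is modest given error bars of 0.7 to 0.8 kcal/mol on each barrier.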

      (7) Regarding the kinetics from MSM, it is stated that the values seen here are similar to MFS transporters, but this then references another MSM study. A comparison to experimental values would support this section a lot.

      We thank the reviewer for this suggestion. We have added a discussion of the millisecond-scale kinetics measured experimentally for MFS transporters.

      Changes made: (Paragraph 2, Results subsection 5)

      “These timescales are comparable to the substrate transport timescales of Major Facilitator Superfamily (MFS) transporters (Chan et al., 2022). Furthermore, several experimental studies have also resolved the millisecond-scale kinetics of MFS transporters (Blodgett and Carruthers, 2005; Körner et al., 2024; Bazzone et al., 2022; Smirnova et al., 2014; Zhu et al., 2019), further corroborating the results from our study.”

      Reviewer #2 (Recommendations for the authors):

      (1) The heatmaps in Figures 2a and 4a are great. On these, an arrow denotes what looks like a minimum energy path. Is it possible to see this plotted, as this might show the height of the energy barriers more clearly?

      We thank the reviewer for this suggestion. We have computed the minimum energy paths for both pathways and presented them in a supplementary figure.

      Added lines: (Paragraph 4, Results subsection 1):

      For further clarity, we have plotted the minimum energy path taken by cholesterol as it translocates along this pathway (Figure 2—figure Supplement 3a,b).

      Added lines: (Paragraph 4, Results subsection 2):

      For further clarity, we have plotted the minimum energy path taken by cholesterol as it translocates along this pathway (Figure 2—figure Supplement 3c,d).
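To illustrate what a minimum energy path across a two-dimensional free-energy heatmap can look like computationally, here is a minimal sketch. It assumes the surface is available as a grid of free energies and finds the path whose highest point is as low as possible (a "minimax" path) with a Dijkstra-style search. The function name, the toy grid, and the choice of algorithm are assumptions made for illustration; the response does not state how the authors computed their paths.

```python
import heapq
import numpy as np

def min_barrier_path(fes, start, end):
    """Return the grid path from start to end whose highest free energy is minimal."""
    rows, cols = fes.shape
    best = np.full(fes.shape, np.inf)   # lowest achievable peak to reach each cell
    best[start] = fes[start]
    parent = {}
    heap = [(float(fes[start]), start)]
    while heap:
        peak, (r, c) = heapq.heappop(heap)
        if (r, c) == end:
            break
        if peak > best[r, c]:
            continue  # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                new_peak = max(peak, float(fes[nr, nc]))
                if new_peak < best[nr, nc]:
                    best[nr, nc] = new_peak
                    parent[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (new_peak, (nr, nc)))
    # Walk back from the end cell to recover the path.
    path = [end]
    while path[-1] != start:
        path.append(parent[path[-1]])
    path.reverse()
    return path, float(best[end])

# Toy surface in kcal/mol (hypothetical values, not data from the study).
fes = np.array([
    [0.0, 5.0, 9.0, 0.5],
    [1.0, 9.0, 9.0, 1.0],
    [2.0, 3.0, 4.0, 2.0],
])
path, peak = min_barrier_path(fes, (0, 0), (0, 3))
profile = [float(fes[cell]) for cell in path]  # energy along the path
```

Plotting `profile` against the step index gives a one-dimensional view of the barrier heights along the path, which is the kind of view the reviewer asked for.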

      (2) The tICA data in S15 is first referred to on line 137, but the technique isn’t introduced until line 222. This makes understanding the data a little confusing. Reordering this might improve readability.

      We thank the reviewer for this suggestion. We have reordered the text to make it clearer.

      Changes made: (Paragraph 2, Results subsection 1) This provides evidence for multiple stable poses along the pathway as observed in the multiple stable poses of cholesterol in Cryo-EM structures of SMO bound to sterols (Deshpande et al., 2019; Qi et al., 2019b, 2020). A reliable estimate of the barriers comes from using the time-lagged Independent Components (tICs), which project the entire dataset along the slowest kinetic degrees of freedom. Overall, the highest barrier along Pathway 1 is 5.8 ± 0.7 kcal/mol, and it is associated with the entry of cholesterol into the TMD (Figure 2—Figure Supplement 2).

      Changes made: (Paragraph 3, Results subsection 2)

      “On plotting the first two components of tICs, (Figure 2—Figure Supplement 2), we observe that the energetic barrier between η and θ is ∼6.5 ± 0.8 kcal/mol.”
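For readers unfamiliar with tICs, the following is a minimal NumPy sketch of the general technique: build instantaneous and time-lagged covariance matrices, solve the resulting generalized eigenproblem, and project the data onto the slowest components. The function name, the lag time, and the synthetic two-feature trajectory are assumptions for illustration only; this is not the authors' analysis pipeline.

```python
import numpy as np

def tica(X, lag, n_components=2):
    """Project X (n_frames, n_features) onto its slowest linear combinations."""
    X = X - X.mean(axis=0)
    X0, Xt = X[:-lag], X[lag:]
    n = len(X0)
    # Symmetrized instantaneous and time-lagged covariance matrices.
    C0 = (X0.T @ X0 + Xt.T @ Xt) / (2 * n)
    Ct = (X0.T @ Xt + Xt.T @ X0) / (2 * n)
    # Whiten with C0, then diagonalize the lagged covariance.
    s, U = np.linalg.eigh(C0)
    W = U / np.sqrt(s)
    evals, V = np.linalg.eigh(W.T @ Ct @ W)
    order = np.argsort(evals)[::-1][:n_components]
    return evals[order], X @ (W @ V[:, order])

# Synthetic trajectory: one slowly relaxing signal mixed with fast noise.
rng = np.random.default_rng(0)
n_frames = 5000
slow = np.zeros(n_frames)
for t in range(1, n_frames):
    slow[t] = 0.99 * slow[t - 1] + rng.normal()
fast = 5.0 * rng.normal(size=n_frames)
X = np.column_stack([slow + fast, slow - fast])

eigvals, tics = tica(X, lag=10)
# The first tIC should track the slow signal despite the mixing.
corr = abs(np.corrcoef(tics[:, 0], slow)[0, 1])
```

A free-energy surface such as the one described above is then typically obtained by histogramming the first two columns of `tics`, with appropriate reweighting.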

      (3) Missing bracket on line 577.

      We thank the reviewer for this suggestion. The typo has been fixed.

      (4) Line 577: Fig. S2nd?

      We thank the reviewer for this suggestion. This typo has been fixed.

      Reviewer #3 (Public review):

      Summary:

      This manuscript presents a study combining molecular dynamics simulations and Hedgehog (Hh) pathway assays to investigate cholesterol translocation pathways to Smoothened (SMO), a G protein-coupled receptor central to Hedgehog signal transduction. The authors identify and characterize two putative cholesterol access routes to the transmembrane domain (TMD) of SMO and propose a model whereby cholesterol traverses through the TMD to the cysteine-rich domain (CRD), which is presented as the primary site of SMO activation. The MD simulations and biochemical experiments are carefully executed and provide useful data.

      Weaknesses:

      However, the manuscript is significantly weakened by a narrow and selective interpretation of the literature, overstatement of certain conclusions, and a lack of appropriate engagement with alternative models that are well-supported by published data-including data from prior work by several of the coauthors of this manuscript. In its current form, the manuscript gives a biased impression of the field and overemphasizes the role of the CRD in cholesterol-mediated SMO activation. Below, I provide specific points where revisions are needed to ensure a more accurate and comprehensive treatment of the biology.

      (1) Overstatement of the CRD as the Orthosteric Site of SMO Activation

      The manuscript repeatedly implies or states that the CRD is the orthosteric site of SMO activation, without adequate acknowledgment of alternative models. To give just a few examples (of many in this manuscript):

      (a) “PTCH is proposed to modulate the Hh signal by decreasing the ability of membrane cholesterol to access SMO’s extracellular cysteine-rich domain (CRD)” (p. 3).

      (b) “In recent years, there has been a vigorous debate on the orthosteric site of SMO” (p. 3).

      (c) “cholesterol must travel through the SMO TMD to reach the orthosteric site in the CRD” (p. 4).

      (d) “we observe cholesterol moving along TM6 to the TMD-CRD interface (common pathway, Fig. 1d) to access the orthosteric binding site in the CRD” (p. 6).

      While the second quote in this list at least acknowledges a debate, the surrounding text suggests that this debate has been entirely resolved in favor of the CRD model. This is misleading and not reflective of the views of other investigators in the field (see, for example, a recent comprehensive review from Zhang and Beachy, Nature Reviews Molecular and Cell Biology 2023, which makes the point that both the CRD and 7TM sites are critical for cholesterol activation of SMO as well as PTCH-mediated regulation of SMO-cholesterol interactions).

      In contrast, a large body of literature supports a dual-site model in which both the CRD and the TMD are bona fide cholesterol-binding sites essential for SMO activation. Examples include:

      (a) Byrne et al., Nature 2016: point mutation of the CRD cholesterol binding site impairs-but does not abolish-SMO activation by cholesterol (SMO D99A, Y134F, and combination mutants - Fig 3 of the 2016 study).

      (b) Myers et al., Dev Cell 2013 and PNAS 2017: CRD deletion mutants retain responsiveness to PTCH regulation and cholesterol mimetics (similar Hh responsiveness of a CRD deletion mutant is also observed in Fig. 4 Byrne et al, Nature 2016).

      (c) Deshpande et al., Nature 2019: mutation of residues in the TMD cholesterol binding site blocks SMO activation entirely, strongly implicating the TMD as a required site, in contrast to the partial effects of mutating or deleting the CRD site.

      Qi et al., Nature 2019, and Deshpande et al., Nature 2019, both reported cholesterol binding at the TMD site based on high-resolution structural data. Oddly, Deshpande et al., Nature 2019, is not cited in the discussion of TMD binding on p. 3, despite being one of the first papers to describe cholesterol in the TMD site and its necessity for activation (the authors only cite it regarding activation of SMO by synthetic small molecules).

      Kinnebrew et al., Sci Adv 2022 report that CRD deletion abolished PTCH regulation, which is seemingly at odds with several studies above (e.g., Byrne et al, Nature 2016; Myers et al, Dev Cell 2013); but this difference may reflect the use of an N-terminal GFP fusion to SMO in the Kinnebrew et al 2022, which could alter SMO activation properties by sterically hindering activation at the TMD site by cholesterol (but not synthetic SMO agonists like SAG); in contrast, the earlier work by Byrne et al is not subject to this caveat because it used an untagged, unmodified form of SMO.

      Although overexpression of PTCH1 and SMO (wild-type or mutant) has been noted as a caveat in studies of CRD-independent SMO activation by cholesterol, this reviewer points out that several of the studies listed above include experiments with endogenous PTCH1 and low-level SMO expression, demonstrating that SMO can clearly undergo activation by cholesterol (as well as regulation by PTCH1) in a manner that does not require the CRD.

      Recommendation: The authors should revise the manuscript to provide a more balanced overview of the field and explicitly acknowledge that the CRD is not the sole activation site. Instead, a dual-site model is more consistent with available structural, mutational, and functional data. In addition, the authors should reframe their interpretation of their MD studies to reflect this broader and more accurate view of how cholesterol binds and activates SMO.

      We thank the reviewer for this comprehensive overview of the existing literature. We agree that cholesterol binding to both the TMD and CRD sites is required for full activation of SMO. As described below in responses to comments, we have made changes to the manuscript to make this point clear. For instance, in the revised manuscript, we refrain from calling the CRD cholesterol binding site the “orthosteric site”. Instead, we highlight that the goal of the manuscript is not to resolve the debate over whether the TMD or CRD site is more important for SMO regulation by PTCH1 but rather to use molecular dynamics to understand the fascinating question of how cholesterol in the membrane can reach the CRD, located at a significant distance above the outer leaflet of the membrane. We believe that this is an important goal since there is an abundance of evidence that supports the view that PTCH1 inhibits SMO by reducing cholesterol access to the CRD. This evidence is now summarized succinctly in the introduction:

      Changes made: (Paragraph 4, Introduction)

      “While cholesterol binding to both the TMD and CRD sites is required for full SMO activation, our work focuses on how cholesterol gains access to the CRD site, perched above the outer leaflet of the membrane (Luchetti et al., 2016; Kinnebrew et al., 2022). Multiple lines of evidence suggest that PTCH1-regulated cholesterol binding to the CRD plays an instructive role in SMO regulation both in cells and animals. Mutations in residues predicted to make hydrogen bonds with the hydroxyl group of cholesterol bound to the CRD reduced both the potency and efficacy of SHH in cellular signaling assays (Kinnebrew et al., 2022; Byrne et al., 2016) and, more importantly, eliminated HH signaling in mouse embryos (Xiao et al., 2017). Experiments using both covalent and photocrosslinkable sterol probes in live cells directly show that PTCH1 activity reduces sterol access to the CRD (Kinnebrew et al., 2022; Xiao et al., 2017). Notably, our simulations evaluate a path of cholesterol translocation that includes both the TMD and CRD sites: cholesterol first enters the 7-transmembrane domain bundle from the membrane; it then engages the TMD site before continuing along a conduit to the CRD site. Thus, we analyze translocation energetics and residue-level contacts along a path that includes both the TMD and the CRD.”

      However, Reviewer 3 makes several comments below that are biased, inaccurate, or selective. We feel it is important to address these so readers can approach the literature from a balanced perspective. Indeed, the eLife review forum provides an ideal venue to present contrasting views on a scientific model. We encourage the editors to publish both Reviewer 3’s comments and our response in full so readers can read the original papers and reach their own conclusions. It is important to note these issues are not relevant to the quality of the computational and experimental data presented in this paper.

      We have now removed the term “orthosteric” to describe the CRD site throughout the paper and clearly state in the introduction that “both the CRD and TMD sites are required for SMO activation” but that our focus is on how cholesterol moves from the membrane to the CRD site. There is no doubt that cholesterol binding to the CRD plays a key role in SMO activation; our focus on this path is justified and does not devalue the importance of the TMD site. Our prior models (see Figure 7 of Kinnebrew et al., 2022) explicitly include contributions of both sites.

      Now we respond to some of the concerns outlined, individually:

      (1) Byrne et al., Nature 2016: point mutation of the CRD cholesterol binding site impairs-but does not abolish-SMO activation by cholesterol (SMO D99A, Y134F, and combination mutants - Fig 3 of the 2016 study)

      The fact that a point mutation dramatically diminishes (but does not abolish) signaling does not mean that the CRD cholesterol binding site is not important for SMO regulation. Indeed, the reviewer fails to mention that Song et al. (Molecular Cell, 2017) found that a SMO protein carrying a subtle mutation at D99 (D95/99N, a residue that makes a hydrogen bond with the cholesterol hydroxyl) completely abolishes SMO signaling in mouse embryos. Thus, the CRD site is critical for SMO activation in an intact animal, justifying our focus on evaluating the path of cholesterol translocation to the CRD site.

      (2) Myers et al., Dev Cell 2013 and PNAS 2017: CRD deletion mutants retain responsiveness to PTCH regulation and cholesterol mimetics (similar Hh responsiveness of a CRD deletion mutant is also observed in Fig 4 Byrne et al, Nature 2016).

      The reviewer fails to note that CRD-deleted versions of SMO have markedly (>10-fold) higher basal (i.e., ligand-independent) activity compared to full-length SMO. The response to SHH is minimal (∼2-fold), compared to >50-100-fold with full-length SMO. Thus, CRD-deleted SMO is likely in a non-native conformation. Local changes in cholesterol accessibility caused by PTCH1 inactivation or cholesterol loading can cause small fluctuations in delta-CRD activity, but this cannot be used to infer meaningful insights about how native, full-length SMO (with >10-fold lower basal activity) is regulated. We encourage the reviewer to read our previous paper (Kinnebrew et al., 2022), which presents a unified view of how the TMD and CRD sites together regulate SMO activation.

      A more physiological experiment, reported in Kinnebrew et al. (2022), tested mutations in residues that make hydrogen bonds with cholesterol at the CRD and TMD sites in the context of full-length SMO. These mutants were stably expressed at moderate levels in Smo<sup>−/−</sup> cells. Mutations at the CRD site reduced the fold-increase in signaling output in response to SHH, as would be expected for a PTCH1-regulated site. In contrast, analogous mutations in the TMD site reduced the magnitude of both basal and maximal signaling, without affecting the fold-change in response to SHH. In signaling assays, the key parameter in evaluating the impact of a mutation is whether it impacts the change in output in response to a signal (in this case PTCH1 inactivation by SHH). A mutation in SMO that affects PTCH1 regulation is expected to decrease the fold-change in signaling in response to SHH, a criterion that is fulfilled by mutations in the CRD site. Accordingly, mutations in the CRD site abolish SMO signaling in mouse embryos (Xiao et al., 2017).
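The distinction drawn above between a change in fold-induction and a change in overall signaling magnitude can be made concrete with a short sketch. All readout values below are hypothetical numbers in arbitrary units, chosen only to illustrate the arithmetic; they are not measurements from any of the cited studies.

```python
def fold_change(basal, stimulated):
    """Stimulated output divided by basal output."""
    if basal <= 0:
        raise ValueError("basal output must be positive")
    return stimulated / basal

# Hypothetical reporter readouts (basal, +SHH) in arbitrary units; not measured data.
readouts = {
    "wild type": (1.0, 50.0),
    "reduced fold-change": (1.0, 5.0),       # responsiveness to SHH is blunted
    "reduced magnitude only": (0.2, 10.0),   # both outputs scaled down equally
}
folds = {name: fold_change(b, s) for name, (b, s) in readouts.items()}
```

In this toy example the second entry loses most of its fold-induction, whereas the third keeps the wild-type fold-induction even though both its basal and stimulated outputs are five times lower.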

      (3) Deshpande et al., Nature 2019: mutation of residues in the TMD cholesterol binding site blocks SMO activation entirely, strongly implicating the TMD as a required site, in contrast to the partial effects of mutating or deleting the CRD site.

      Bulky mutations at the TMD site (V333F) that abolish SMO activity were first reported by Byrne et al. (2016) and were used to markedly increase the stability of SMO for protein expression. These mutations indeed stabilize the inactive state of SMO, increasing protein abundance and completely preventing its localization at primary cilia. SMO variants carrying such bulky mutations cannot be used to infer the importance of the TMD site since they do not distinguish between the following possibilities: (1) SMO is inactive because the sterol cannot bind, or (2) SMO is inactive because it is locked in an inactive conformation, or (3) SMO is inactive because it cannot localize to primary cilia (where it must be localized to activate downstream signaling).

      As described in Response 3.3, a better evaluation of the importance of the TMD site is the use of mutations in residues that make hydrogen bonds with the hydroxyl group of TMD cholesterol. These mutations do not markedly increase protein stability or prevent ciliary localization (Kinnebrew et al., 2022, Fig. S2). While a TMD site mutation decreases the magnitude of maximal (and basal) SMO signaling, it does not impact the fold-increase in signal output in response to Hh ligands (the key parameter that should be used to evaluate PTCH1 activity).

      (4) Qi et al., Nature 2019, and Deshpande et al., Nature 2019, both reported cholesterol binding at the TMD site based on high-resolution structural data. Oddly, Deshpande et al., Nature 2019, is not cited in the discussion of TMD binding on p. 3, despite being one of the first papers to describe cholesterol in the TMD site and its necessity for activation (the authors only cite it regarding activation of SMO by synthetic small molecules).

      The reference has now been added at this location in the manuscript.

      (5) Kinnebrew et al., Sci Adv 2022 report that CRD deletion abolished PTCH regulation, which is seemingly at odds with several studies above (e.g., Byrne et al, Nature 2016; Myers et al, Dev Cell 2013); but this difference may reflect the use of an N-terminal GFP fusion to SMO in the Kinnebrew et al 2022, which could alter SMO activation properties by sterically hindering activation at the TMD site by cholesterol (but not synthetic SMO agonists like SAG); in contrast, the earlier work by Byrne et al is not subject to this caveat because it used an untagged, unmodified form of SMO.

      The reviewer fails to note that CRD-deleted versions of SMO have markedly (>10-fold) higher basal activity than full-length SMO. The response to SHH is minimal (∼2-fold), compared to >50-fold with full-length SMO. Thus, CRD-deleted SMO is likely in a non-native conformation. Local changes in cholesterol accessibility caused by PTCH1 inactivation or cholesterol loading can cause small fluctuations in delta-CRD activity, but this cannot be used to infer meaningful insights about how native, full-length SMO (with >10-fold lower basal activity) is regulated. Please see Response 3.3 for further details.

      Reviewer 3 presents an incomplete picture of the extensive experiments reported in Kinnebrew et al. to establish the functionality of YFP-tagged delta-CRD SMO. Most importantly, a TMD-selective sterol analog (KK174) can fully activate YFP-tagged delta-CRD, showing conclusively that the YFP fusion does not block sterol access to the TMD site. The fact that this protein is nearly unresponsive to SHH highlights the critical role of the CRD-bound cholesterol in SMO regulation by PTCH1. Indeed, the YFP-tagged, CRD-deleted SMO was made purposefully to test the requirement of the CRD in a construct that had normal basal activity. Again, this data justifies the value of investigating the path of cholesterol movement from the membrane via the TMD site to the CRD.

      (6) Although overexpression of PTCH1 and SMO (wild-type or mutant) has been noted as a caveat in studies of CRD-independent SMO activation by cholesterol, this reviewer points out that several of the studies listed above include experiments with endogenous PTCH1 and low-level SMO expression, demonstrating that SMO can clearly undergo activation by cholesterol (as well as regulation by PTCH1) in a manner that does not require the CRD.

      This comment is inaccurate. The data presented in Deshpande et al. (and prior work in Myers et al.) used transient transfection to overexpress SMO in Smo<sup>−/−</sup> cells. At the individual cell level, transient transfection produces expression levels that are markedly higher (10-1000-fold) than stable expression (in addition to being more variable). Most scientists would agree that stable expression (as used in Kinnebrew et al., 2022) at a moderate expression level is a better system to compare mutant phenotypes, assess basal and activated signaling, and provide an accurate measure of the fold-change in signal output in response to SHH. Notably, introduction of a mutation in the CRD cholesterol binding site at the endogenous mouse Smo locus (an even better experiment than stable expression) leads to complete loss of SMO activity (PMID 28344083). This result again justifies our investigation of the pathway of cholesterol movement from the membrane to the CRD site.

      We have changed the initial discussion to reflect a more general outlook.

      Changes made: (Paragraph 1, Introduction)

      “PTCH modulates the availability of accessible cholesterol at the primary cilium and thereby regulates SMO, with models invoking effects on both the CRD and 7TM pockets.”

      Changes made: (Results subsection 3, paragraph 1)

      “According to the dual-site model, to reach the binding site in the CRD (ζ), cholesterol must translocate along the TMD-CRD interface from the TM binding site (α∗).”

      Added lines: (Paragraph 5, Results subsection 3):

      “The computational investigation shown here covers the dual-site model, where cholesterol reaches the CRD site via binding to the TM binding site first. In comparison to the CRD site, the TM site is more stable by ∼2 kcal/mol (Figure 2—Figure Supplement 3b, d).”

      Added lines: (Paragraph 2, Conclusions):

      “Here we have explored the role the CRD site plays in SMO activation. In addition, through simulating the CRD site-dependent SMO activation hypothesis, we have also simulated the TMD site-dependent activation. We show that the overall stability of cholesterol at the TMD site is higher than at the CRD site by ∼2 kcal/mol.”

      (2) Bias in Presentation of Translocation Pathways

      The manuscript presents the model of cholesterol translocation through SMO to the CRD as the predominant (if not sole) mechanism of activation. Statements such as: "Cholesterol traverses SMO to ultimately reach the CRD binding site" (p. 6) suggest an exclusivity that is not supported by prior literature in the field. Indeed, the authors’ own MD data presented here demonstrate more stable cholesterol binding at the TMD than at the CRD (p 17), and binding of cholesterol to the TMD site is essential for SMO activation. As such, it is appropriate to acknowledge that cholesterol may activate SMO by translocating through the TM5/6 tunnel, then binding to the TMD site, as this is a likely route of SMO activation in addition to the CRD translocation route they highlight in their discussion.

      The authors describe two possible translocation pathways (Pathway 1: TM2/3 entry to TMD; Pathway 2: TM5/6 entry and direct CRD transfer), but do not sufficiently acknowledge that their own empirical data support Pathway 2 as more relevant. Indeed, because their experimental data suggest Pathway 2 is more strongly linked to SMO activation, this pathway should be weighted more heavily in the authors’ discussion. In addition, Pathway 2 is linked to cholesterol binding to both the TMD and CRD sites (the former because the TMD binding site is at the terminus of the hydrophobic tunnel, the latter via the translocation pathway described in the present manuscript), so it is appropriate that Pathway 2 figures more prominently than Pathway 1 in the authors’ discussion.

      The authors also claim that "there is no experimental structure with cholesterol in the inner leaflet region of SMO TMD" (p 16). However, a structural study of apo-SMO from the Manglik and Cheng labs (Zhang et al., Nat Comm, 2022) identified a cholesterol molecule docked at the TM5/6 interface and also proposed a "squeezing" mechanism by which cholesterol could enter the TM5/6 pocket from the membrane. The authors do not consider this SMO conformation in their models, nor do they discuss the possibility that conformational dynamics at the TM5/6 interface could facilitate cholesterol flipping and translocation into the hydrophobic conduit, despite both possibilities having precedent in the 2022 empirical cryoEM structural analysis.

      Recommendation: The authors should avoid oversimplifying the SMO cholesterol activation process, either by tempering these claims or broadening their discussion to better reflect the complexity and multiplicity of cholesterol access and activation routes for SMO. They should also consider the 2022 apo-SMO cryoEM structure in their analysis of the TM5/6 translocation pathway.

      We thank the reviewer for this comprehensive overview of the existing literature and for pointing out the parts we had failed to include in the discussion. We agree with the reviewer, since our data show that both pathways are probable. Throughout the manuscript, we have avoided using a competitive approach (that one pathway dominates over the other). Instead, we have evaluated both pathways independently and presented a comparative rather than competitive overview of both pathways from our observations. While we agree that experimental evidence suggests the inner leaflet pathway is possible, we cannot discount the observations made in previous studies that support the outer leaflet pathway, particularly Hedger et al. (2019), Bansal et al. (2023), and Kinnebrew et al. (2021). Therefore, considering the reviewer’s comments, we have made the following changes:

      (1) Added lines: (Paragraph 3, Conclusions):

      “We show that the barriers associated with the pathway starting from the outer leaflet are lower by ∼0.7 kcal/mol (p = 0.0013). We also provide evidence that cholesterol can enter SMO via both leaflets, considering that multiple computational and experimental studies have found cholesterol entry sites and activation modulation via the outer leaflet, between TM2-TM3. This is countered by evidence from multiple experimental and computational studies corroborating entry via the inner leaflet, between TM5-TM6, including this study. Overall, we posit that cholesterol translocation from either pathway is feasible.”

      (2) Changes made: (Paragraph 6, Results subsection 2)

      “Based on our experimental and computational data, we conclude that cholesterol translocation can happen via either pathway. This is supported on the basis of the following observations: mutations along pathway 2 affect SMO activity more significantly, and the presence of a direct conduit that connects the inner leaflet to the TMD binding site. In addition, a resolved structure of SMO in the presence of cholesterol shows a cholesterol situated at the entry point from the membrane into the protein between TM5 and TM6, in the inner leaflet. However, we also observe that pathway 1 shows a lower thermodynamic barrier (5.8 ± 0.7 kcal/mol vs. 6.5 ± 0.8 kcal/mol, p = 0.0013). Additionally, PTCH1 controls cholesterol accessibility in the outer leaflet. This shows that there is a possibility for transport from both leaflets. One possibility that might alter the thermodynamic barriers is native membrane asymmetry, particularly the anionic lipid-rich inner leaflet. This presents as a limitation of our current model.”
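To give a rough sense of what a 0.7 kcal/mol difference between the two reported barriers means kinetically, the sketch below converts a barrier difference into a ratio of Boltzmann factors. It assumes a simple Arrhenius-type picture with identical prefactors for both pathways and a temperature of 300 K; neither assumption comes from the response, and the estimate ignores the quoted uncertainties on each barrier.

```python
import math

R_KCAL_PER_MOL_K = 0.0019872  # gas constant in kcal/(mol*K)

def rate_ratio(barrier_high, barrier_low, temperature=300.0):
    """Ratio of Boltzmann factors for two barriers (kcal/mol), equal prefactors assumed."""
    ddg = barrier_high - barrier_low
    return math.exp(ddg / (R_KCAL_PER_MOL_K * temperature))

# Barriers quoted in the response: 6.5 (pathway 2) vs. 5.8 (pathway 1) kcal/mol.
ratio = rate_ratio(6.5, 5.8)
```

Under these assumptions the ratio is a little over 3, so the two pathways differ by a modest factor rather than by orders of magnitude, which is consistent with the conclusion that both are feasible.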

      (3) Changes made: (Paragraph 1, Results subsection 2)

      “In a structure resolved in 2022, cholesterol was observed at the interface between the protein and the membrane, in the inner leaflet, between TMs 5 and 6. However, cholesterol in the inner leaflet has a downward orientation, with the polar hydroxyl group pointing intracellularly (η). A striking observation is that this cholesterol binding site pose was never used as a starting point for simulations and was discovered independent of the pose described in Zhang et al. (2022) (Figure 4—Figure Supplement 1).”

      (3) Alternative Possibility: Direct Membrane Access to CRD

      The possibility that the CRD extracts cholesterol directly from the membrane outer leaflet is not considered. While the crystal structures place the CRD in a stable pose above the membrane, multiple cryo-EM studies suggest that the CRD is dynamic and adopts a variety of conformations, raising the possibility that the stability of the CRD in the crystal structures is a result of crystal packing and that the CRD may be far more dynamic under more physiological conditions.

      Recommendation: The authors should explicitly acknowledge and evaluate this potential mechanism and, if feasible, assess its plausibility through MD simulations.

      We thank the reviewer for the suggestion. We have addressed this comment by calculating the distance from the lipid headgroups for each lipid in the membrane to the cholesterol binding site. We show that in our study, we do not observe any bending of the CRD over the membrane, precluding any cholesterol from being extracted from the membrane directly.

      Added lines: (Paragraph 3, Conclusions):

      “An alternative possibility states that the flexibility associated with the CRD would allow it to directly access the membrane, and consequently, cholesterol. In the extensive simulations reported in this study, the binding site of cholesterol in the CRD remains at least 20 Å away from the nearest lipid head group in the membrane, suggesting that such direct extraction and the bending of the CRD do not occur within the timescales sampled (Appendix 2 – Figure 6).

      The mechanistic details of this process are still unexplored and form the basis of future work.”
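The distance check described in this response can be sketched as a per-frame minimum distance between the CRD binding site and all lipid headgroups. The function name, array shapes, and coordinates below are hypothetical, and the sketch ignores periodic boundary conditions, which a real trajectory analysis would need to handle.

```python
import numpy as np

def min_site_headgroup_distance(site_xyz, headgroup_xyz):
    """Per-frame minimum distance from the binding site to any lipid headgroup.

    site_xyz: (n_frames, 3); headgroup_xyz: (n_frames, n_lipids, 3).
    Units follow the input (here Å). Periodic boundaries are ignored.
    """
    deltas = headgroup_xyz - site_xyz[:, None, :]
    return np.linalg.norm(deltas, axis=2).min(axis=1)

# Two hypothetical frames with two headgroups each (coordinates in Å).
site = np.array([[0.0, 0.0, 30.0], [0.0, 0.0, 28.0]])
heads = np.array([
    [[0.0, 0.0, 0.0], [10.0, 0.0, 5.0]],
    [[3.0, 4.0, 0.0], [0.0, 0.0, 4.0]],
])
dmin = min_site_headgroup_distance(site, heads)
stays_away = bool((dmin >= 20.0).all())  # 20 Å threshold taken from the text above
```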

      (4) Inconsistent Framing of Study Scope and Limitations

      The discussion contains some contradictory and misleading language. For example, the authors state that "In this study we only focused on the cholesterol movement from the membrane to the CRD binding site," and then several sentences later state that "We outline the entire translocation mechanism from a kinetic and thermodynamic perspective." These statements are at odds. The former appropriately (albeit briefly) notes the limited scope of the modeling, while the latter overstates the generality of the findings.

      In addition, the authors’ narrow focus on the CRD site constitutes a major caveat to the entire work. It should be acknowledged much earlier in the manuscript, preferably in the introduction, rather than mentioned as an aside in the penultimate paragraph of the conclusion.

      Recommendation: The authors should clarify the scope of the study and expand the discussion of its limitations. They should explicitly acknowledge that the study models one of several cholesterol access routes and that the findings do not rule out alternative pathways.

      We thank the reviewer for the suggestion. We have addressed this comment by explicitly mentioning the scope of the study.

      Changes made: (Paragraph 3, Conclusions)

      “We outline the entire translocation mechanism from a kinetic and thermodynamic perspective for one of the leading hypotheses for the activation mechanism of SMO.”

      (5) Summary:

      This study has the potential to make a useful contribution to our understanding of cholesterol translocation and SMO activation. However, in its current form, the manuscript presents an overly narrow and, at times, misleading view of the literature and biological models; as such, it is not nearly as impactful as it could be. I strongly encourage the authors to revise the manuscript to include:

      (1) A more balanced discussion of the CRD vs. TMD binding sites.

      (2) Acknowledgment of alternative cholesterol access pathways.

      (3) More comprehensive citation of prior structural and functional studies.

      (4) Clarification of assumptions and scope.

      Of note, the above suggestions require little to no additional MD simulations or experimental studies, but would significantly enhance the rigor and impact of the work.

      We thank the reviewer for the suggestions. We have taken into account the literature and diverse viewpoints. We have changed the initial discussion to reflect a more general outlook. In the revised version of the manuscript, we have refrained from referring to the CRD site as the orthosteric site. Instead, we refer to it as the CRD sterol-binding site. To better represent the dual-site model, we have added further discussion in the Introduction. Throughout the manuscript, we have avoided using a competitive approach (that one pathway dominates over the other). Instead, we have evaluated both pathways independently and presented a comparative rather than competitive overview of both pathways from our observations. We explicitly mention the scope of the study.

    1. I was in your shoes and I dove in head first. After reading, owning, and watching countless videos on the matter, here's what I have learned: Don't buy online. Only buy what you can have your hands on before exchanging money. Be picky; don't just get any machine on the belief you'll start fixing them. Do not view them as being "rescued" when you buy another broken machine. Start off with a solid machine with no issues. (I suggest an Olympia brand, sm-3, etc.) Honorable mention: only acquire organically through yard sales, estate sales, antique stores, etc. It imbues your machine with magic 🪄

      via u/Forge_Le_Femme

    1. fresh perspective

      I think the suggestion to set aside your writing for a day is the most underrated part of this chapter. When you're too close to a draft, your brain automatically fills in the logical gaps because you already know what you meant to say, so taking that break is actually a mechanical necessity to help you read the words that are actually on the page rather than the ones in your head.

    1. Many people say they work better with distractions—they prefer to leave the television or the radio on—but the truth is that an environment with too many interruptions is rarely helpful when focus is required. Before deciding that the television or talkative roommates do not bother you when you work, take an honest accounting of the work you produce with interruptions compared to work you do without.

      Something to really consider. People even put their headsets on with music playing when they are reading. I wonder how they take in or even understand what they are reading, let alone retain and remember it. For someone like me, the lyrics would just keep ringing in my head.

    1. Looking closely, we see each word as a picture: the part of our brain that recognizes words has a twin that recognizes faces, and if we never learned to read, both parts would focus on faces.

      This is a good point to begin with. For my Women's Literature class, I recently was assigned to read an essay that I only barely grasped. My head hurt from trying to piece together what the incredibly intellectual author was saying. There were several passages I had to reread out loud. Which brings up the point of how interesting it is that rereading something, especially out loud, can help us "see" what the author wants us to see, or at least our interpretation of it.

    2. the part of our brain that recognizes words has a twin that recognizes faces,

      I've actually never thought about that before! Then again, I cannot visualize anything in my head, so maybe that is why. But hearing words out loud is such an interesting concept. I wonder why, in theory, our brain has to visualize each and every word on the page?

    3. I often use a modified version of CRP as a starting point,

      As someone with little experience going into this reading, I would have appreciated an inclusion of what CRP stands for, as it would help me visualize how to reverse the steps better in my head.

  3. read.amazon.com read.amazon.com
    1. These fears are so real and so relatable. My initial fears were around judgement: judgement that I was too old, or "what's the point of going to college now at 63?" There was a bit of shame that I had not gone to college after high school, which was always a thought in the back of my head. It made me feel less than, even though I had achieved great financial and professional success. I am now loving taking classes, and have a 4.0. My current anxiety is whether I can cut it at UC Berkeley or UC Davis, one of which I plan to transfer to next year.

    1. “But, mother, if I do get asleep, you won’t let him get me?” “No! so may God help me!” said his mother, with a paler cheek, and a brighter light in her large dark eyes. “You’re sure, an’t you, mother?” “Yes, sure!” said the mother, in a voice that startled herself; for it seemed to her to come from a spirit within, that was no part of her; and the boy dropped his little weary head on her shoulder, and was soon asleep. How the touch of those warm arms, the gentle breathings that came in her neck, seemed to add fire and spirit to her movements! It seemed to her as if strength poured into her in electric streams, from every gentle touch and movement of the sleeping, confiding child. Sublime is the dominion of the mind over the body, that, for a time, can make flesh and nerve impregnable, and string the sinews like steel, so that the weak become so mighty.

      This scene shocks through the contrast between a child’s innocent fear and the extreme resolve it awakens in his mother. The question “you won’t let him get me?” reveals how the threat of being sold has invaded even the safety of sleep, underscoring slavery’s reach into the most intimate spaces of family life. This pure and genuine question awakens something in Eliza- a strength to continue the fight to escape and bring Harry to freedom no matter what. Stowe intensifies the emotional impact by showing maternal love as a source of physical power, framing motherhood as both sacred and revolutionary in the face of slavery’s violence.

    2. Rachel never looked so truly and benignly happy as at the head of her table. There was so much motherliness and full-heartedness even in the way she passed a plate of cakes or poured a cup of coffee, that it seemed to put a spirit into the food and drink she offered.

      The importance of this description of Rachel and her treatment of the slaves cannot be overstated, given the juxtaposition it provides between the torture slaves were typically forced to endure and the warm hospitality she offers them instead. It shows the audience that the privileged and those who have been enslaved can live harmoniously as equals rather than as opponents.

    3. What business had his slave to be marching round the country, inventing machines, and holding up his head among gentlemen? He’d soon put a stop to it. He’d take him back, and put him to hoeing and digging, and “see if he’d step about so smart.”

      This sentence reveals the master's jealousy and how threatened he feels by George’s confidence, skill, and intelligence. Seeing George’s accomplishments at the factory makes the master view him as more than just a worker, so he acts quickly to reduce George back to a piece of property by taking him home and forcing him into hard labor.

    1. Montague. Who set this ancient quarrel new abroach? 125Speak, nephew, were you by when it began? Benvolio. Here were the servants of your adversary, And yours, close fighting ere I did approach: I drew to part them: in the instant came The fiery Tybalt, with his sword prepared, 130Which, as he breathed defiance to my ears, He swung about his head and cut the winds, Who nothing hurt withal hiss'd him in scorn: While we were interchanging thrusts and blows, Came more and more and fought on part and part, 135Till the prince came, who parted either part. Lady Montague. O, where is Romeo? saw you him to-day? Right glad I am he was not at this fray. Benvolio. Madam, an hour before the worshipp'd sun Peer'd forth the golden window of the east, 140A troubled mind drave me to walk abroad; Where, underneath the grove of sycamore That westward rooteth from the city's side, So early walking did I see your son: Towards him I made, but he was ware of me 145And stole into the covert of the wood: I, measuring his affections by my own, That most are busied when they're most alone, Pursued my humour not pursuing his, And gladly shunn'd who gladly fled from me. 150 Montague. Many a morning hath he there been seen, With tears augmenting the fresh morning dew. Adding to clouds more clouds with his deep sighs; But all so soon as the all-cheering sun Should in the furthest east begin to draw 155The shady curtains from Aurora's bed, Away from the light steals home my heavy son, And private in his chamber pens himself, Shuts up his windows, locks far daylight out And makes himself an artificial night: 160Black and portentous must this humour prove, Unless good counsel may the cause remove. Benvolio. My noble uncle, do you know the cause? Montague. I neither know it nor can learn of him. Benvolio. Have you importuned him by any means? 165 Montague. 
Both by myself and many other friends: But he, his own affections' counsellor, Is to himself—I will not say how true— But to himself so secret and so close, So far from sounding and discovery, 170As is the bud bit with an envious worm, Ere he can spread his sweet leaves to the air, Or dedicate his beauty to the sun. Could we but learn from whence his sorrows grow. We would as willingly give cure as know.

      The conversation shifts to Romeo: Lady Montague asks Benvolio where he is, and Montague explains that he is always withdrawn and sad.

    2. You Capulet; shall go along with me: And, Montague, come you this afternoon, 120To know our further pleasure in this case, To old Free-town, our common judgment-place. Once more, on pain of death, all men depart. [Exeunt all but MONTAGUE, LADY MONTAGUE, and BENVOLIO] Montague. Who set this ancient quarrel new abroach? 125Speak, nephew, were you by when it began? Benvolio. Here were the servants of your adversary, And yours, close fighting ere I did approach: I drew to part them: in the instant came The fiery Tybalt, with his sword prepared, 130Which, as he breathed defiance to my ears, He swung about his head and cut the winds, Who nothing hurt withal hiss'd him in scorn: While we were interchanging thrusts and blows, Came more and more and fought on part and part, 135Till the prince came, who parted either part. Lady Montague. O, where is Romeo? saw you

      Benvolio explains that the servants of both houses started fighting first. He tried to stop them, but then Tybalt, being the hothead he is, came and attacked him. He is later asked by Lady Montague whether he was involved in the fight, which he wasn't.

    3. Exeunt all but MONTAGUE, LADY MONTAGUE, and BENVOLIO] Montague. Who set this ancient quarrel new abroach? 125Speak, nephew, were you by when it began? Benvolio. Here were the servants of your adversary, And yours, close fighting ere I did approach: I drew to part them: in the instant came The fiery Tybalt, with his sword prepared, 130Which, as he breathed defiance to my ears, He swung about his head and cut the winds, Who nothing hurt withal hiss'd him in scorn: While we were interchanging thrusts and blows, Came more and more and fought on part and part, 135Till the prince came, who parted either part. Lady Montague. O, where is Romeo? saw you him to-day? Right glad I am he was not at this fray. Benvolio. Madam, an hour before the worshipp'd sun Peer'd forth the golden window of the east, 140A troubled mind drave me to walk abroad; Where, underneath the grove of sycamore That westward rooteth from the city's side, So early walking did I see your son: Towards him I made, but he was ware of me 145And stole into the covert of the wood: I, measuring his affections by my own, That most are busied when they're most alone, Pursued my humour not pursuing his, And gladly shunn'd who gladly fled from me.

      Montague is asking who started the fight

    1. Whatif we care about our technologie

      I like this! Our world runs based on what we pay attention to, and what is attention and a willingness to do something about what is noticed but care? If we don't care about something, only the sensational or head turning is revealed to us-- it seems like we lose the everyday

    1. “There was a canoe of men, my husband, Hat, among them. They passed by Spirit Island. Saw the dead. Saw you.” “So it was they who brought me back?” “No,” said Tallow, simply. “They saw me,” said Omakakiins, making sure, “but they didn’t save me.” Old Tallow shook her head in the dusk. Then she shook herself all over, just like one of her dogs. “Hay’! My husband, Hat, was a fearful fool. I was going to put his things out the door, anyway. When he told me that he and the other men had seen you, and gone on! Leaving you!” Old Tallow’s voice took fury. “I made him leave. ‘Don’t show me your face, ever!’ I said to him. And then I took my canoe over to that island.” The wintery trees clacked their branches, ticking and moaning. The wind picked up often, at dusk, on the island. Omakakiins could feel in her heart what it was like for that baby, for herself, all alone with the dead, with her mother, walking from those she loved as though walking stone to stone. Somehow, deep inside, she remembered. “It was spring,” she said softly. “Ziigwan.” “Owah!” said Old Tallow in surprise, peering closely at her. “You remember!” “The birds,” said Omakakiins, “I remember the birds, the songs of the birds.” “Howah!” Tallow was excited. “I had forgotten, myself. There were birds on that island, singing so prettily, so loudly! Too small to eat. The little birds with white throats, those sweet spring cries. Eya’! My girl, you remember them.” “They kept me alive,” said Omakakiins, to herself, not quite understanding her own words. “I remember their song because their song was my comfort, my lullaby. They kept me alive.”

      Now it's 100 percent confirmed that Omakakiins was the girl in the beginning.

    1. The e-books that I published, for example, are years’ worth of blog posts, where I take the best ones, synthesize them, and put them together into a better product. The first book that I published was after 10 years of blogging. I took all these 10 years’ worth of content and launched the Perpetual Beta series. Then, I’ve added several to that. The last e-book was launched in 2024, and that was more of a questioning of what the heck is going on, because that was where AI was starting to rear its ugly head and everything.

      Harold used his blog as a source for creating e-books on different topics. #idea Cf. yearbooks from a blog archive

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1

      Chen et al. engineered and characterized a suite of next-generation GECIs for the Drosophila NMJ that allow for the visualization of calcium dynamics within the presynaptic compartment, at presynaptic active zones, and in the postsynaptic compartment. These GECIs include ratiometric presynaptic Scar8m (targeted to synaptic vesicles), ratiometric active zone localized Bar8f (targeted to the scaffold molecule BRP), and postsynaptic SynapGCaMP8m. The authors demonstrate that these new indicators are a large improvement on the widely used GCaMP6 and GCaMP7 series GECIs, with increased speed and sensitivity. They show that presynaptic Scar8m accurately captures presynaptic calcium dynamics with superior sensitivity to the GCaMP6 and GCaMP7 series and with similar kinetics to chemical dyes. The active-zone targeted Bar8f sensor was assessed for the ability to detect release-site-specific nanodomain changes, but the authors concluded that this sensor is still too slow to accurately do so. Lastly, the use of postsynaptic SynapGCaMP8m was shown to enable the detection of quantal events with similar resolution to electrophysiological recordings. Finally, the authors developed a Python-based analysis software, CaFire, that enables automated quantification of evoked and spontaneous calcium signals. These tools will greatly expand our ability to detect activity at individual synapses without the need for chemical dyes or electrophysiology.

      We thank this Reviewer for the overall positive assessment of our manuscript and for the incisive comments.

      (1) The role of Excel in the pipeline could be more clearly explained. Lines 182-187 could be better worded to indicate that CaFire provides analysis downstream of intensity detection in ImageJ. Moreover, the data type of the exported data, such as .csv or .xlsx, should be indicated instead of 'export to graphical program such as Microsoft Excel'.

      We thank the Reviewer for these comments, many of which were shared by the other reviewers. In response, we have now 1) more clearly explained the role of Excel in the CaFire pipeline (lines 677-681), 2) revised the wording in lines 676-679 to indicate that CaFire provides analysis downstream of intensity detection in ImageJ, and 3) clarified the data type exported to Excel (lines 677-681). These efforts have improved the clarity and readability of the CaFire analysis pipeline.

      (2) In Figure 2A, the 'Excel' step should either be deleted or included as 'data validation' as ImageJ exports don't require MS Excel or any specific software to be analysed. (Also, the graphic used to depict Excel software in Figure 2A is confusing.)

      We thank the reviewer for this helpful suggestion. In Fig. 2A, we have changed the Excel portion, and we have clarified the processing steps in the revised methods. Specifically, we now indicate that ROIs are first selected in Fiji/ImageJ and analyzed to obtain time-series data containing both the time information and the corresponding mean intensity values. These data are then exported to a spreadsheet file (e.g., Excel), which is used to organize the output before it is imported into CaFire for subsequent analysis. These changes can be found in Fig. 2A and the methods (lines 676-681).

      (3) Figure 2B should include the 'Partition Specification' window (as shown on the GitHub) as well as the threshold selection to give the readers a better understanding of how the tool works.

      We absolutely agree with this comment, and have made the suggested changes to Fig. 2B. In particular, we have replaced the software interface panels and now include windows illustrating the Load File, Peak Detection, and Partition functions. These updated screenshots provide a clearer view of how CaFire is used to load the data, detect events, and perform partition specification for subsequent analysis. We agree these changes will give the readers a better understanding of how the tool works, and we thank the reviewer for this comment.

      (4) The presentation of data is well organized throughout the paper. However, in Figure 6C, it is unclear how the heatmaps represent the spatiotemporal fluorescence dynamics of each indicator. Does the signal correspond to a line drawn across the ROI shown in Figure 6B? If so, this should be indicated.

      We apologize that the heatmaps were unclear in Fig. 6C (Fig. 7C in the current revision). Each heatmap is derived from a one-pixel-wide vertical line within a miniature-event ROI. These heatmaps correspond to the fluorescence change in the indicated SynapGCaMP variant for the individual quantal events whose traces are shown in Fig. 7C, with a representative image of the baseline and peak fluorescence shown in Fig. 7B. Specifically, we have added the following to the revised Fig. 7C legend:

      The corresponding heatmaps below were generated from a single vertical line extracted from a representative miniature-event ROI, and visualize the spatiotemporal fluorescence dynamics (ΔF/F) along that line over time.

      (5) In Figure 6D, the addition of non-matched electrophysiology recordings is confusing. Maybe add "at different time points" to the end of the 6D legend, or consider removing the electrophysiology trace from Figure 6D and referring the reader to the traces in Figure 7A for comparison (considering the same point is made more rigorously in Figure 7).

      This is a good point, one shared with another reviewer. We apologize this was not clear, and have now revised this part of the figure to remove the electrophysiological traces in what is now Fig. 7 while keeping the paired ones still in what is now Fig. 8A as suggested by the reviewer. We agree this helps to clarify the quantal calcium transients.

      (6) In GitHub, an example ImageJ Script for analyzing the images and creating the inputs for CaFire would be helpful to ensure formatting compatibility, especially given potential variability when exporting intensity information for two channels. In the Usage Guide, more information would be helpful, such as how to select ∆R/R, ideally with screenshots of the application being used to analyze example data for both single-channel and two-channel images.

      We agree that additional details added to the GitHub would be helpful for users of CaFire. In response, we have now added the following improvements to the GitHub site: 

      - ImageJ operation screenshots

      Step-by-step illustrations of ROI drawing and Multi Measure extraction.

      - Example Excel file with time and intensity values

      Demonstrates the required data format for CaFire import, including proper headers.

      - CaFire loading screenshots for single-channel and dual-channel imaging

      Shows how to import GCaMP into Channel 1 and mScarlet into Channel 2.

      - Peak Detection and Partition setting screenshots

      Visual examples of automatic peak detection, manual correction, and trace partitioning.

      - Instructions for ROI Extraction and CaFire Analysis

      A written guide describing the full workflow from ROI selection to CaFire data export.

      These changes have improved the usability and accessibility of CaFire, and we thank the reviewer for these points.

      Reviewer #2

      Calcium ions play a key role in synaptic transmission and plasticity. To improve calcium measurements at synaptic terminals, previous studies have targeted genetically encoded calcium indicators (GECIs) to pre- and postsynaptic locations. Here, Chen et al. improve these constructs by incorporating the latest GCaMP8 sensors and a stable red fluorescent protein to enable ratiometric measurements. In addition, they develop a new analysis platform, 'CaFire', to facilitate automated quantification. Using these tools, the authors demonstrate favorable properties of their sensors relative to earlier constructs. Impressively, by positioning postsynaptic GCaMP8m near glutamate receptors, they show that their sensors can report miniature synaptic events with speed and sensitivity approaching that of intracellular electrophysiological recordings. These new sensors and the analysis platform provide a valuable tool for resolving synaptic events using all-optical methods.

      We thank the Reviewer for their overall positive evaluation and comments.

      Major comments:

      (1) While the authors rigorously compared the response amplitude, rise, and decay kinetics of several sensors, key parameters like brightness and photobleaching rates are not reported. I feel that including this information is important as synaptically tethered sensors, compared to freely diffusible cytosolic indicators, can be especially prone to photobleaching, particularly under the high-intensity illumination and high-magnification conditions required for synaptic imaging. Quantifying baseline brightness and photobleaching rates would add valuable information for researchers intending to adopt these tools, especially in the context of prolonged or high-speed imaging experiments.

      This is a good point made by the reviewer, and one we agree will be useful for researchers to be aware of. First, it is important to note that the photobleaching and brightness of the sensors will vary depending on the nature of the user’s imaging equipment, which can vary significantly between widefield microscopes (with various LED or halogen light sources for illumination), laser scanning systems (e.g., line scans with confocal systems), or area scanning systems using resonant scanners (as we use in our current study). Under the same imaging settings, GCaMP8f and 8m exhibit comparable baseline fluorescence, whereas GCaMP6f and 6s are noticeably dimmer; because our aim is to assess each reagent’s potential under optimal conditions, we routinely adjust excitation/camera parameters before acquisition to place baseline fluorescence in an appropriate dynamic range. As an important addition to this study, motivated by the reviewer’s comments above, we now directly compare neuronal cytosolic GCaMP8m expression with our Scar8m sensor, showing higher sensitivity with Scar8m (now shown in the new Fig. 3F-H).

      Regarding photobleaching, GCaMP signals are generally stable, while mScarlet is more prone to bleaching: in presynaptic area-scanned confocal recordings, the mScarlet channel drops by ~15% over 15 secs, whereas GCaMP6s/8f/8m show no obvious bleaching over the same window (lines 549-553). In contrast, in presynaptic widefield imaging using an LED system (CCD), GCaMP8f shows ~8% loss over 15 secs (lines 610-611). Similarly, for postsynaptic SynapGCaMP6f/8f/8m, confocal resonant area scans show no obvious bleaching over 60 secs, while widefield imaging shows ~2–5% bleaching over 60 secs (lines 634-638). Finally, in active-zone/BRP calcium imaging (confocal), mScarlet again bleaches by ~15% over 15 secs, while GCaMP8f/8m show no obvious bleaching. The mScarlet-channel bleaching can be corrected in Huygens SVI (via Bleaching correction or the Deconvolution Wizard), whereas we avoid applying bleaching correction to the green GCaMP channel when no clear decay is present, to prevent introducing artifacts. This information is now added to the methods (lines 548-553).

      (2) In several places, the authors compare the performance of their sensors with synthetic calcium dyes, but these comparisons are based on literature values rather than on side-by-side measurements in the same preparation. Given differences in imaging conditions across studies (e.g., illumination, camera sensitivity, and noise), parameters like indicator brightness, SNR, and photobleaching are difficult to compare meaningfully. Additionally, the limited frame rate used in the present study may preclude accurate assessment of rise times relative to fast chemical dyes. These issues weaken the claim made in the abstract that "...a ratiometric presynaptic GCaMP8m sensor accurately captures .. Ca²⁺ changes with superior sensitivity and similar kinetics compared to chemical dyes." The authors should clearly acknowledge these limitations and soften their conclusions. A direct comparison in the same system, if feasible, would greatly strengthen the manuscript.

      We absolutely agree with these points made by the reviewer, and have made a concerted effort to address them through the following:

      We have now directly compared presynaptic calcium responses on the same imaging system using the chemical dye Oregon Green Bapta-1 (OGB-1), one of the primary synthetic calcium indicators used in our field. These experiments reveal that Scar8f exhibits markedly faster kinetics and an improved signal-to-noise ratio compared to OGB-1, with higher peak fluorescence responses (Scar8f: 0.32, OGB-1: 0.23). The rise time constants of the two indicators are comparable (both ~3 msecs), whereas the decay of Scar8f is faster than that of OGB-1 (Scar8f: ~40, OGB-1: ~60), indicating more rapid signal recovery. These results now directly demonstrate the superiority of the new GCaMP8 sensors we have engineered over conventional synthetic dyes, and are now presented in the new Fig. 3A-E of the manuscript.

      We agree with the reviewer that, in the original submission, the relatively slow resonant area scans (~115 fps) limited the temporal resolution of our rise time measurements. To address this, we have re-measured the rise time using higher frame-rate line scans (kHz). For Scar8f, the rise time constant was 6.736 msec with resonant area scans at ~115 fps, but shortened to 2.893 msec when imaged at ~303 fps, indicating that the original protocol underestimated the true kinetics. In addition, for Bar8m, area scans at ~118 fps yielded a rise time constant of 9.019 msec, whereas line scans at ~1085 fps reduced the rise time constant to 3.230 msec. These new measurements are now incorporated into the manuscript (Figs. 3, 4, and 6) to more accurately reflect the fast kinetics of these indicators.

      (3) The authors state that their indicators can now achieve measurements previously attainable with chemical dyes and electrophysiology. I encourage the authors to also consider how their tools might enable new measurements beyond what these traditional techniques allow. For example, while electrophysiology can detect summed mEPSPs across synapses, imaging could go a step further by spatially resolving the synaptic origin of individual mEPSP events. One could, for instance, image MN-Ib and MN-Is simultaneously without silencing either input, and detect mEPSP events specific to each synapse. This would enable synapse-specific mapping of quantal events - something electrophysiology alone cannot provide. Demonstrating even a proof-of-principle along these lines could highlight the unique advantages of the new tools by showing that they not only match previous methods but also enable new types of measurements.

      These are excellent points raised by the reviewer. In response, we have done the following: 

      We have now included a supplemental video as “proof-of-principle” data showing simultaneous imaging of SynapGCaMP8m quantal events at both MN-Is and -Ib, demonstrating that synapse-specific spatial mapping of quantal events can be obtained with this tool (see new Supplemental Video 1). 

      We have also included an additional discussion of the potential and limitations of these tools for new measurements beyond conventional approaches. This discussion is now presented in lines 419-421 in the manuscript.

      (4) For ratiometric measurements, it is important to estimate and subtract background signals in each channel. Without this correction, the computed ratio may be skewed, as background adds an offset to both channels and can distort the ratio. However, it is not clear from the Methods section whether, or how, background fluorescence was measured and subtracted.

      This is a good point, and we agree more clarification about how ratiometric measurements were made is needed. In response, we have now added the following to the Methods section (lines 548-568):

      Time-lapse videos were stabilized and bleach-corrected prior to analysis, which visibly reduced frame-to-frame motion and intensity drift. In the presynaptic and active-zone mScarlet channel, a bleaching factor of ~1.15 was observed during the 15 sec recording. This bleaching can be corrected using the “Bleaching correction” tool in Huygens SVI. For presynaptic and active-zone GCaMP signals, there was minimal bleaching over these short imaging periods. Therefore, the bleaching correction step for GCaMP was skipped. Both GCaMP and mScarlet channels were processed using the default settings in the Huygens SVI “Deconvolution Wizard” (with the exception of the bleaching correction option). Deconvolution was performed using the CMLE algorithm with the Huygens default stopping criterion and a maximum of 30 iterations, such that the algorithm either converged earlier or, if convergence was not reached, was terminated at this 30-iteration limit; no other iteration settings were used across the GCaMP series. ROIs were drawn on the processed images using Fiji ImageJ software, and mean fluorescence time courses were extracted for the GCaMP and mScarlet channels, yielding F<sub>GCaMP</sub>(t) and F<sub>mScarlet</sub>(t). F(t)s were imported into CaFire with GCaMP assigned to Channel #1 (signal; required) and mScarlet to Channel #2 (baseline/reference; optional). If desired, the mScarlet signal could be smoothed in CaFire using a user-specified moving-average window to reduce high-frequency noise. In CaFire’s ΔR/R mode, the per-frame ratio was computed as R(t)=F<sub>GCaMP</sub>(t)/F<sub>mScarlet</sub>(t); a baseline ratio R0 was estimated from the pre-stimulus period, and the final response was reported as ΔR/R(t)=[R(t)−R0]/R0, which normalizes GCaMP signals to the co-expressed mScarlet reference and thereby reduces variability arising from differences in sensor expression level or illumination across AZs.
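The ΔR/R calculation described above can be sketched in a few lines of NumPy. This is only an illustration of the stated formula, not CaFire's actual code: the function name `delta_r_over_r`, the `n_prestim` parameter, and the choice of the mean of the pre-stimulus frames as R0 are all assumptions made for the example.

```python
import numpy as np

def delta_r_over_r(f_gcamp, f_mscarlet, n_prestim):
    """Return dR/R(t) = [R(t) - R0] / R0 with R(t) = F_GCaMP(t) / F_mScarlet(t).

    n_prestim is a hypothetical parameter: the number of frames before the
    stimulus. R0 is taken here as the mean ratio over those frames (an
    assumption; the text only says R0 comes from the pre-stimulus period).
    """
    f_gcamp = np.asarray(f_gcamp, dtype=float)
    f_mscarlet = np.asarray(f_mscarlet, dtype=float)
    ratio = f_gcamp / f_mscarlet          # per-frame ratio R(t)
    r0 = ratio[:n_prestim].mean()         # baseline ratio R0
    return (ratio - r0) / r0

# Synthetic traces: three baseline frames, then a transient in the green channel.
f_green = np.array([100.0, 100.0, 100.0, 150.0, 120.0])
f_red = np.array([50.0, 50.0, 50.0, 50.0, 50.0])
drr = delta_r_over_r(f_green, f_red, n_prestim=3)
```

With these synthetic values the baseline ratio is 2, so the transient frames give ΔR/R of 0.5 and 0.2.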

      (5) At line 212, the authors claim "... GCaMP8m showing 345.7% higher SNR over GCaMP6s....(Fig. 3D and E) ", yet the cited figure panels do not present any SNR quantification. Figures 3D and E only show response amplitudes and kinetics, which are distinct from SNR. The methods section also does not describe details for how SNR was defined or computed.

      This is another good point. We define SNR operationally as the fractional fluorescence change (ΔF/F). Traces were processed with CaFire, which estimates a per-frame baseline F<sub>0</sub>(t) with a user-configurable sliding window and percentile. In the Load File panel, users can specify both the length of the moving baseline window and the desired percentile; the default settings are a 50-point window and the 30th percentile, which corresponds to a 101-point window centered on each time point (previous 50 to next 50 samples) and uses the lower 30% of values within that window to estimate F<sub>0</sub>(t). The signal is then computed as ΔF/F=[F(t)−F<sub>0</sub>(t)]/F<sub>0</sub>(t). This ΔF/F value is what we report as SNR throughout the manuscript and is now discussed explicitly in the revised methods (lines 686-693).
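A minimal sketch of the sliding-window baseline described above, assuming that "the lower 30% of values" means the 30th percentile of each window and that windows are simply truncated at the ends of the trace. Both are assumptions, and the function names are hypothetical rather than CaFire's API.

```python
import numpy as np

def sliding_percentile_baseline(trace, half_window=50, percentile=30.0):
    """Estimate F0(t) as a percentile over a window centered on each frame.

    half_window=50 gives the 101-point window (previous 50 to next 50 samples).
    Windows are truncated at the trace edges (an assumption).
    """
    trace = np.asarray(trace, dtype=float)
    n = trace.size
    baseline = np.empty(n)
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        baseline[i] = np.percentile(trace[lo:hi], percentile)
    return baseline

def delta_f_over_f(trace, half_window=50, percentile=30.0):
    """Return dF/F(t) = [F(t) - F0(t)] / F0(t)."""
    trace = np.asarray(trace, dtype=float)
    f0 = sliding_percentile_baseline(trace, half_window, percentile)
    return (trace - f0) / f0

# Synthetic trace: flat baseline of 100 with one brief transient to 200.
trace = np.full(300, 100.0)
trace[150:155] = 200.0
dff = delta_f_over_f(trace)
```

Because the transient occupies only 5 of the 101 samples in any window, the 30th percentile stays at the resting level, and the transient frames report ΔF/F = 1.0. A low percentile is what keeps brief events from pulling the baseline upward.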

      (6) Lines 285-287 "As expected, summed ΔF values scaled strongly and positively with AZ size (Fig. 5F), reflecting a greater number of Cav2 channels at larger AZs". I am not sure about this conclusion. A positive correlation between summed ΔF values and AZ size could simply reflect more GCaMP molecules in larger AZs, which would give rise to larger total fluorescence change even at a given level of calcium increase.

      The reviewer makes a good point, one that we agree should be clarified. The reviewer is indeed correct that larger active zones should have more abundant BRP protein, which in turn will lead to a higher abundance of the Bar8f sensor, which should lead to a higher GCaMP response simply by having more of this sensor. However, the inclusion of the ratiometric mScarlet protein should normalize the response accurately and correct for this confound, since the higher abundance of GCaMP should be offset (normalized) by the equally (stoichiometric) higher abundance of mScarlet. Therefore, when the ∆R/R is calculated, the differences in GCaMP abundance at each AZ should be corrected by the ratiometric analysis. We now use an improved BRP::mScarlet3::GCaMP8m (Bar8m) and compute ΔR/R with R(t)=F<sub>GCaMP8m</sub>/F<sub>mScarlet3</sub>. ROIs were drawn over individual AZs (Fig. 6B). CaFire estimated R<sub>0</sub> with a sliding 101-point window using the lowest 10% of values, and responses were reported as ΔR/R=[R−R<sub>0</sub>]/R<sub>0</sub>. Area-scan examples (118 fps) show robust ΔR/R transients (peaks ≈1.90 and 3.28; tau rise ≈9.0–9.3 ms; Fig. 6C, middle).

      We have now made these points more clearly in the manuscript (lines 700-704) and moved the Bar8f intensity vs active zone size data to Table S1. Together, these revisions address the indicator-abundance confound (via mScarlet normalization).

      (7) Lines 313-314: "SynapGCaMP quantal signals appeared to qualitatively reflect the same events measured with electrophysiological recordings (Fig. 6D)." This statement is quite confusing. In Figure 6D, the corresponding calcium and ephys traces look completely different and appear to reflect distinct sets of events. It was only after reading Figure 7 that I realized the traces shown in Figure 6D might not have been recorded simultaneously. The authors should clarify this point.

      Yes, we absolutely agree with this point, one shared by Reviewer 1. In response, we have removed the electrophysiological traces in Fig. 6 to clarify that just the calcium responses are shown, and save the direct comparison for the Fig. 7 data (now revised Fig. 8).

      (8) Lines 310-313: "SynapGCaMP8m .... striking an optimal balance between speed and sensitivity", and Lines 314-316: "We conclude that SynapGCaMP8m is an optimal indicator to measure quantal transmission events at the synapse." Statements like these are subjective. In the authors' own comparison, GCaMP8m is significantly slower than GCaMP8f (at least in terms of decay time), despite having a moderately higher response amplitude. It is therefore unclear why GCaMP8m is considered 'optimal'. The authors should clarify this point or explain their rationale for prioritizing response amplitude over speed in the context of their application.

      This is another good point that we agree with, as the “optimal” sensor will of course depend on the user’s objectives. Hence, we used the term “an optimal sensor” to indicate it is what we believed to be the best one for our own uses. However, this point should be clarified and better discussed. In response, we have revised the relevant sections of the manuscript to better define why we chose the 8m sensors to strike an optimal balance of speed and sensitivity for our uses, and go on to discuss situations in which other sensor variants might be better suited. These are now presented in lines 223-236 in the revised manuscript, and we thank the reviewer for making these comments, which have improved our study.

      Minor comments

      (1)  Please include the following information in the Methods section:

      (a) For Figures 3 and 4, specify how action potentials were evoked. What type of electrodes were used, where were they placed, and what amount of current or voltage was applied?

      We apologize for neglecting to include this information in the original submission. We have now added this information to the revised Methods section (lines 537-543).

      (b) For imaging experiments, provide information on the filter sets used for each imaging channel, and describe how acquisition was alternated or synchronized between the green and red channels in ratiometric measurements. Additionally, please report the typical illumination intensity (in mW/mm²) for each experimental condition.

      We thank the reviewer for this helpful comment. We have now added detailed information about the imaging configuration to the Methods (lines 512-528) with the following:

      Ca2+ imaging was conducted using a Nikon A1R resonant scanning confocal microscope equipped with a 60x/1.0 NA water-immersion objective (refractive index 1.33). GCaMP signals were acquired using the FITC/GFP channel (488-nm laser excitation; emission collected with a 525/50-nm band-pass filter), and mScarlet/mCherry signals were acquired using the TRITC/mCherry channel (561-nm laser excitation; emission collected with a 595/50-nm band-pass filter). ROIs focused on terminal boutons of MN-Ib or -Is motor neurons. For both channels, the confocal pinhole was set to a fixed diameter of 117.5 µm (approximately three Airy units under these conditions), which increases signal collection while maintaining adequate optical sectioning. Images were acquired as 256 × 64 pixel frames (two 12-bit channels) using bidirectional resonant scanning at a frame rate of ~118 frames/s; the scan zoom in NIS-Elements was adjusted so that this field of view encompassed the entire neuromuscular junction and was kept constant across experiments. In ratiometric recordings, the 488-nm (GCaMP) and 561-nm (mScarlet) channels were acquired in a sequential dual-channel mode using the same bidirectional resonant scan settings: for each time point, a frame was first collected in the green channel and then immediately in the red channel, introducing a small, fixed frame-to-frame temporal offset while preserving matched spatial sampling of the two channels.

      Directly measuring the absolute laser power at the specimen plane (and thus reporting illumination intensity in mW/mm²) is technically challenging on this resonant-scanning system, because it would require inserting a power sensor into the beam path and perturbing the optical alignment; consequently, we are unable to provide reliable absolute mW/mm² values. Instead, we now report all relevant acquisition parameters (objective, numerical aperture, refractive index, pinhole size, scan format, frame rate, and fixed laser/detector settings) and note that laser powers were kept constant within each experimental series and chosen to minimize bleaching and phototoxicity while maintaining an adequate signal-to-noise ratio. We have now added the details requested in the revised Methods section (lines 512-535), including information about the filter sets, acquisition settings, and typical illumination intensity.

      (2) Please clarify what the thin versus thick traces represent in Figures 3D, 3F, 4C, and 4E. Are the thin traces individual trials from the same experiment, or from different experiments/animals? Does the thick trace represent the mean/median across those trials, a fitted curve, or a representative example?

      We apologize that this was not clearer in the original submission. Thin traces are individual stimulus-evoked trials (“sweeps”) acquired sequentially from the same muscle/NMJ in a single preparation; the panel is shown as a representative example of recordings collected across animals. The thick colored trace is the trial-averaged waveform (arithmetic mean) of those thin traces after alignment to stimulus onset and baseline subtraction (no additional smoothing beyond what is stated in Methods). The thick black curve over the decay phase is a single-exponential fit used to estimate τ. Specifically, we fit the decay segment by linear regression on the natural-log–transformed baseline-subtracted signal, which is equivalent to fitting y = y<sub>peak</sub>·e<sup>−t/τdecay</sup> over the decay window (revised Fig.4D and Fig.5C legends).
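
      The log-linear decay fit described above can be sketched as follows. This is an illustration and not the code used for the figures: the function name `fit_decay_tau` is hypothetical, and discarding non-positive samples before taking the logarithm is an assumption.

```python
import math

def fit_decay_tau(times, signal):
    """Fit y = y_peak * exp(-t / tau) by least squares on ln(y) versus t.

    times: time points of the decay window, with t = 0 at the peak.
    signal: baseline-subtracted values at those time points.
    Returns (tau, y_peak). Non-positive samples are skipped (assumption),
    because their logarithm is undefined.
    """
    pts = [(t, math.log(y)) for t, y in zip(times, signal) if y > 0]
    if len(pts) < 2:
        raise ValueError("need at least two positive samples")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_ln = sum(ln_y for _, ln_y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (ln_y - mean_ln) for t, ln_y in pts)
    slope = sxy / sxx                      # slope = -1 / tau
    intercept = mean_ln - slope * mean_t   # intercept = ln(y_peak)
    return -1.0 / slope, math.exp(intercept)

# Synthetic decay: y_peak = 2.0, tau = 20 ms, sampled every 1 ms
times = [0.001 * i for i in range(50)]
signal = [2.0 * math.exp(-t / 0.020) for t in times]
tau, y_peak = fit_decay_tau(times, signal)
```

      On noise-free data the regression recovers τ and y<sub>peak</sub> essentially exactly. On noisy traces the log transform gives low-amplitude samples relatively more weight, so restricting the fit to the early part of the decay window may be preferable.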

      (3) Please clarify what the reported sample size (n) represents. Does it indicate the number of experimental repeats, the number of boutons or PSDs, or the number of animals?

      Again, we apologize that this was not clear. (n) refers to the number of animals (biological replicates), which is reported in Supplementary Table 1. All imaging was performed at muscle 6, abdominal segment A3. Per preparation, we imaged 1–2 NMJs in total, targeting 2–3 terminal boutons at each NMJ and acquiring 2–3 imaging stacks from different terminal boutons per NMJ. For the standard stimulation protocol, we delivered 1 ms stimuli at 1 Hz and captured 14 stimuli in a 15 s time series (lines 730-736).

      Reviewer #3

      Genetically encoded calcium indicators (GECIs) are essential tools in neurobiology and physiology. Technological constraints in targeting and kinetics of previous versions of GECIs have limited their application at the subcellular level. Chen et al. present a set of novel tools that overcome many of these limitations. Through systematic testing in the Drosophila NMJ, they demonstrate improved targeting of GCaMP variants to synaptic compartments and report enhanced brightness and temporal fidelity using members of the GCaMP8 series. These advancements are likely to facilitate more precise investigation of synaptic physiology.

      This is a comprehensive and detailed manuscript that introduces and validates new GECI tools optimized for the study of neurotransmission and neuronal excitability. These tools are likely to be highly impactful across neuroscience subfields. The authors are commended for publicly sharing their imaging software.

      This manuscript could be improved by further testing the GECIs across physiologically relevant ranges of activity, including at high frequency and over long imaging sessions. The authors provide a custom software package (CaFire) for Ca2+ imaging analysis; however, to improve clarity and utility for future users, we recommend providing references to existing Ca2+ imaging tools for context and elaborating on some conceptual and methodological aspects, with more guidance for broader usability. These enhancements would strengthen this already strong manuscript.

      We thank the Reviewer for their overall positive evaluation and comments. 

      Major comments:

      (1) Evaluation of the performance of new GECI variants using physiologically relevant stimuli and frequency. The authors took initial steps towards this goal, but it would be helpful to determine the performance of the different GECIs at higher electrical stimulation frequencies (at least as high as 20 Hz) and for longer (10 seconds) (Newman et al, 2017). This will help scientists choose the right GECI for studies testing the reliability of synaptic transmission, which generally requires prolonged high-frequency stimulation.

      We appreciate this point by the reviewer and agree it would be of interest to evaluate sensor performance with higher frequency stimulation and for a longer duration. In response, we performed a variety of stimulation protocols at high intensities and times, but found it difficult to separate individual responses in these data given the decay kinetics of all calcium sensors. Hence, we elected not to include these in the revised manuscript. However, we have now included an evaluation of the sensors with 20 Hz electrical stimulation for ~1 sec using a direct comparison of Scar8f with OGB-1. These data are now presented in a new Fig. 3D,E and discussed in the manuscript (lines 396-403).

      (2) CaFire.

      The authors mention, in line 182: 'Current approaches to analyze synaptic Ca2+ imaging data either repurpose software designed to analyze electrophysiological data or use custom software developed by groups for their own specific needs.' References should be provided. CaImAn comes to mind (Giovannucci et al., 2019, eLife), but we think there are other software programs aimed at analyzing Ca2+ imaging data that would permit such analysis.

      Thank you for the thoughtful question. At this stage, we’re unable to provide a direct comparison with existing analysis workflows. In surveying prior studies that analyze Drosophila NMJ Ca²⁺ imaging traces, we found that most groups preprocess images in Fiji/ImageJ and then rely on their own custom-made MATLAB or Python scripts for downstream analysis (see Blum et al. 2021; Xing and Wu 2018). Because these pipelines vary widely across labs, a standardized head-to-head evaluation isn’t currently feasible. With CaFire, our goal is to offer a simple, accessible tool that does not require coding experience and minimizes variability introduced by custom scripts. We designed CaFire to lower the barrier to entry, promote reproducibility, and make quantal event analysis more consistent across users. We have added references to the sentence mentioned above.

      Regarding existing software that the reviewer mentioned – CaImAn (Giovannucci et al. 2019): We evaluated CaImAn, which is a powerful framework designed for large-scale, multicellular calcium imaging (e.g., motion correction, denoising, and automated cell/ROI extraction). However, it is not optimized for the per-event kinetics central to our project, such as extracting rise and decay times for individual quantal events at single synapses. Achieving this level of granularity would typically require additional custom Python scripting and parameter tuning within CaImAn’s code-centric interface. This runs counter to CaFire’s design goals of a no-code, task-focused workflow that enables users to analyze miniature events quickly and consistently without specialized programming expertise.

      Regarding Igor Pro (WaveMetrics), (Müller et al. 2012): Igor Pro is another platform that can be used to analyze calcium imaging signals. However, it is commercial (paid) software and generally requires substantial custom scripting to fit the specific analyses we need. In practice, it does not offer a simple, open-source, point-and-click path to per-event kinetic quantification, which is what CaFire is designed to provide.

      The authors should be commended for making their software publicly available, but there are some questions:

      How does CaFire compare to existing tools?

      As mentioned above, we have not been able to adapt the custom scripts used by various labs for our purposes, including software developed in MATLAB (Blum et al. 2021), Python (Xing and Wu 2018), and Igor (Müller et al. 2012). Some in the field do use semi-publicly available software, including Nikon Elements (Chen and Huang 2017) and CaImAn (Giovannucci et al. 2019). However, these platforms are not optimized for the per-event kinetics central to our project, such as extracting rise and decay times for individual quantal events at single synapses. We have added more details about CaFire, focusing on the workflow and measurements and highlighting its advantages: CaFire provides a no-code, standardized pipeline with automated miniature-event detection and per-event metrics (e.g., amplitude, rise time τ, decay time τ), optional ΔR/R support, and an auto-partition feature. Collectively, these features make CaFire simpler to operate without programming expertise, more transparent and reproducible across users, and better aligned with the event-level kinetics required for this project.

      Very few details about the Huygens deconvolution algorithms and input settings were provided in the methods or text (outside of MLE algorithm used in STED images, which was not Ca2+ imaging). Was it blind deconvolution? Did the team distill the point-spread function for the fluorophores? Were both channels processed for ratiometric imaging? Were the same settings used for each channel? Importantly, please include SVI Huygens in the 'Software and Algorithms' Section of the methods.

      We thank the reviewer for raising this important point. We have now expanded the Methods to describe our use of Huygens in more detail and have added SVI Huygens Professional (Scientific Volume Imaging, Hilversum, The Netherlands) to the “Software and Algorithms” section. For Ca²⁺ imaging data, time-lapse stacks were processed in the Huygens Deconvolution Wizard using the standard estimation algorithm (CMLE). This is not a blind deconvolution procedure. Instead, Huygens computes a theoretical point-spread function (PSF) from the full acquisition metadata (objective NA, refractive index, voxel size/sampling, pinhole, excitation/emission wavelengths, etc.); if refractive index values are provided and there is a mismatch, the PSF is adjusted to account for spherical aberration. We did not experimentally distill PSFs from bead measurements, as Huygens’ theoretical PSFs are sufficient for our data.

      Both green (GCaMP) and red (mScarlet) channels were processed for ratiometric imaging using the same workflow (stabilization, optional bleaching correction, and deconvolution within Huygens). For each channel, the PSF, background, and SNR were estimated automatically by the same built-in algorithms, so the underlying procedures were identical even though the numerical values differ between channels because of their distinct wavelengths and noise characteristics. Importantly, Huygens normalizes each PSF to unit total intensity, such that the deconvolution itself does not add or remove signal and therefore preserves intensity ratios between channels; only background subtraction and bleaching correction can change absolute fluorescence values. For the mScarlet channel, where we observed modest bleaching (~1.10 over 15 sec), we applied Huygens’ bleaching correction and visually verified that similar structures maintained comparable intensities after correction. For presynaptic GCaMP signals, bleaching over these short recordings was negligible, so we omitted the bleaching-correction step to avoid introducing multiplicative artifacts. This workflow ensures that ratiometric ΔR/R measurements are based on consistently processed, intensity-conserving deconvolved images in both channels.

      The number of deconvolution iterations could have had an effect when comparing GCAMP series; please provide an average number of iterations used for at least one experiment. For example, Figure 3, Syt::GCAMP6s, Scar8f & Scar8m, and, if applicable, the maximum number of permissible iterations.

      We thank the reviewer for this comment. For all Ca²⁺ imaging datasets, deconvolution in Huygens was performed using the recommended default settings of the CMLE algorithm with a maximum of 30 iterations. The stopping criterion was left at the Huygens default, so the algorithm either converged earlier or, if convergence was not reached, terminated at this 30-iteration limit. No other iteration settings were used across the GCaMP series (lines 555-559).

      Please clarify if the 'Express' settings in Huygens changed algorithms or shifted input parameters.

      We appreciate the reviewer’s question regarding the Huygens “Express” settings. For clarity, we note that all Ca²⁺ imaging data reported in this manuscript were deconvolved using the “Deconvolution Wizard”, not the “Deconvolution Express” mode. In the Wizard, we explicitly selected the CMLE algorithm (or GMLE in a few STED-related cases as recommended by SVI), using the recommended maximum of 30 iterations and other recommended settings, while allowing Huygens to auto-estimate background and SNR for each channel. Bleaching correction was toggled manually per channel (applied to mScarlet when bleaching was evident, omitted for GCaMP when bleaching was negligible), as described in the revised Methods (lines 553-559).

      By contrast, the Deconvolution Express tool in Huygens is a fully automated front-end that can internally adjust both the choice of deconvolution algorithm (e.g., CMLE vs. GMLE/QMLE) and key input parameters such as SNR, number of iterations, and quality threshold based on the selected “smart profile” and the image metadata. In preliminary tests on our datasets, Express sometimes produced results that were either overly smoothed or showed subtle artifacts, so we did not use it for any data included in this study. Instead, we relied exclusively on the Wizard with explicitly controlled settings to ensure consistency and transparency across all GCaMP series and ratiometric analyses.

      We suggest including a sample data set, perhaps in Excel, so that future users can beta test on and organize their data in a similar fashion.

      We agree that this would be useful, a point shared by R1 above. In response, we have added a sample data set to the GitHub site and included sample ImageJ data along with screenshots to explain the analysis in more detail. These improvements are discussed in the manuscript (lines 705-708).

      (3) While the challenges of AZ imaging are mentioned, it is not discussed how the authors tackled each one. What is defined as an active zone? Active zones are usually identified under electron microscopy. Arguably, the limitation of GCaMP-based sensors targeted to individual AZs, being unable to resolve local Ca2+ changes at individual boutons reliably, might be incorrect. This could be a limitation of the optical setup being used here. Please discuss further. What sensor performance do we need to achieve this performance level, and/or what optical setup would we need to resolve such signals?

      We appreciate the reviewer’s thoughtful comments and agree that the technical challenges of active zone (AZ) Ca²⁺ imaging merit further clarification. We defined AZs, as is the convention in our field, as individual BRP puncta at NMJs. These BRP puncta colocalize with individual puncta of other AZ components, including CAC, RBP, Unc13, etc. ROIs were drawn tightly over individual BRP puncta and only clearly separable spots were included.

      To tackle the specific obstacles of AZ imaging (small signal volume, high AZ density, and limited photon budget at high frame rates), we implemented both improved sensors and optimized analysis (Fig. 6). First, we introduced a ratiometric AZ-targeted indicator, BRP::mScarlet3::GCaMP8m (Bar8m), and computed ΔR/R with R(t)=F<sub>GCaMP8m</sub>/F<sub>mScarlet3</sub>. ROIs were drawn over individual AZs (Fig. 6B). Under our standard resonant area-scan conditions (~118 fps), Bar8m produces robust ΔR/R transients at individual AZs (example peaks ≈ 3.28; τ<sub>rise</sub>≈9.0 ms; Fig. 6C, middle), indicating that single-AZ signals can be detected reproducibly when AZs are optically resolvable.

      Second, we increased temporal resolution using high-speed Galvano line-scan imaging (~1058 fps), which markedly sharpened the apparent kinetics (τ<sub>rise</sub>≈3.23 ms) and revealed greater between-AZ variability (Fig. 6C, right; 6D–E). Population analyses show that line scans yield much faster rise times than area scans (Fig. 6D) and a dramatically higher fraction of significantly different AZ pairs (8.28% and 4.14% in 8f and 8m area-scan vs 78.62% in 8m line-scan, lines 721-725), uncovering pronounced AZ-to-AZ heterogeneity in Ca²⁺ signals. Together, these revisions demonstrate that under our current confocal configuration, AZ-targeted GCaMP8m can indeed resolve local Ca²⁺ changes at individual, optically isolated boutons.

      We have revised the Discussion to clarify that our original statement about the limitations of AZ-targeted GCaMPs refers specifically to this combination of sensor and optical setup, rather than an absolute limitation of AZ-level Ca²⁺ imaging. In our view, further improvements in baseline brightness and dynamic range (ΔF/F or ΔR/R per action potential), combined with sub-millisecond kinetics and minimal buffering, together with optical configurations that provide smaller effective PSFs and higher photon collection (e.g., higher-NA objectives, optimized 2-photon or fast line-scan modalities, and potentially super-resolution approaches applied to AZ-localized indicators), are likely to be required to achieve routine, high-fidelity Ca²⁺ measurements at every individual AZ within a neuromuscular junction.

      (4) In Figure 5: Only GCAMP8f (Bar8f fusion protein) is tested here. Consider including testing with GCAMP8m. This is particularly relevant given that GCAMP8m was a more successful GECI for subcellular post-synaptic imaging in Figure 6.

      We appreciate this point and request by Reviewer 3. The main limitation for detecting local calcium changes at AZs is the speed of the calcium sensor, and hence we used the fastest available (GCaMP8f) to test the Bar8f sensor. While replacing GCaMP8f with GCaMP8m would indeed be predicted to enhance sensitivity (SNR), since GCaMP8m does not have faster kinetics relative to GCaMP8f, it is unlikely to be a more successful GECI for visualizing local calcium differences at AZs. 

      That being said, we agree that the Bar8m tool, including the improved mScarlet3 indicator, would likely be of interest and use to the field. Fortunately, we had engineered the Bar8m sensor while this manuscript was in review, and just recently received transgenic flies. We have evaluated this sensor, as requested by the reviewer, and included our findings in Fig. 1 and 6. In short, while the sensitivity is indeed enhanced in Bar8m compared to Bar8f, the kinetics remain insufficient to capture local AZ signals. These findings are discussed in the revised manuscript (lines 424-442, 719-730), and we appreciate the reviewer for raising these important points.

      In earlier experiments, Bar8f yielded relatively weak fluorescence, so we traded frame rate for image quality during resonant area scans (~60 fps). After switching to Bar8m, the signal was bright enough to restore our standard 118 fps area-scan setting. Nevertheless, even with dual-channel resonant area scans and ratiometric (GCaMP/mScarlet) analysis, AZ-to-AZ heterogeneity remained difficult to resolve. Because Ca²⁺ influx at individual active zones evolves on sub-millisecond timescales, we adopted a high-speed single-channel Galvano line-scan (~1 kHz) to capture these rapid transients. We first acquired a brief area image to localize AZ puncta, then positioned the line-scan ROI through the center of the selected AZ. This configuration provided the temporal resolution needed to uncover heterogeneity that was under-sampled in area-scan data. Consistent with this, Bar8m line-scan data showed markedly higher AZ heterogeneity (significant AZ-pair rate ~79%, vs. ~8% for Bar8f area scans and ~4% for Bar8m area scans), highlighting Bar8m’s suitability for quantifying AZ diversity. We have updated the text, Methods, and figure legend accordingly.

      (5) Figure 5D and associated datasets: Why was Interquartile Range (IQR) testing used instead of ZScoring? Generally, IQR is used when the data is heavily skewed or is not normally distributed. Normality was tested using the D'Agostino & Pearson omnibus normality test and found that normality was not violated. Please explain your reasoning for the approach in statistical testing. Correlation coefficients in Figures 5 E & F should also be reported on the graph, not just the table. In Supplementary Table 1. The sub-table between 4D-F and 5E-F, which describes the IQR, should be labeled as such and contain identifiers in the rows describing which quartile is described. The table description should be below. We would recommend a brief table description for each sub-table.

      Thank you for this helpful suggestion. We have updated the analysis in two complementary ways. First, we now perform paired two-tailed t-tests between every two AZs within the same preparation (pairwise AZ–AZ comparisons of peak responses). At α = 0.05, the fraction of significant AZ pairs is ~79% for Bar8m line-scan data versus ~8% for Bar8f area-scan data, indicating markedly greater AZ-to-AZ diversity when measured at high temporal resolution. Second, to visually mark the outlying AZs, we re-computed the IQR (Q1–Q3) from the individual values collected from each AZ (15 data points per AZ, 30 AZs for each genotype), and marked AZs whose mean response falls above Q3 or below Q1; IQR is used here solely as a robust dispersion reference and not for hypothesis testing. Both analyses support the same observation: Bar8m line-scan data reveal substantially higher AZ heterogeneity than Bar8f and Bar8m area-scan data. We have revised the Methods, figure panels, and legends accordingly (t-test details; explicit “IQR (Q1–Q3)” labeling; significant AZ-pair rates reported on the plots) (lines 719-730).
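
      The two analyses described above can be sketched in plain Python. This is only an illustration under stated assumptions: the function names are hypothetical, peak responses are assumed to be paired by stimulus index across AZs, and instead of computing p-values the sketch compares |t| with the tabulated two-tailed critical value for α = 0.05 and 14 degrees of freedom (15 paired trials).

```python
import math
from itertools import combinations
from statistics import mean, quantiles, stdev

T_CRIT_DF14 = 2.145  # two-tailed critical t, alpha = 0.05, df = 14

def paired_t(a, b):
    """Paired t statistic for two equally long lists of peak responses."""
    diffs = [x - y for x, y in zip(a, b)]
    m, sd = mean(diffs), stdev(diffs)
    if sd == 0:
        return 0.0 if m == 0 else math.copysign(math.inf, m)
    return m / (sd / math.sqrt(len(diffs)))

def significant_pair_fraction(az_peaks, t_crit=T_CRIT_DF14):
    """Fraction of AZ pairs whose paired t statistic exceeds t_crit."""
    pairs = list(combinations(az_peaks, 2))
    if not pairs:
        raise ValueError("need at least two AZs")
    hits = sum(abs(paired_t(az_peaks[a], az_peaks[b])) > t_crit
               for a, b in pairs)
    return hits / len(pairs)

def outlying_azs(az_peaks):
    """Names of AZs whose mean lies below Q1 or above Q3 of all pooled values."""
    pooled = [v for vals in az_peaks.values() for v in vals]
    q1, _, q3 = quantiles(pooled, n=4)
    return [name for name, vals in az_peaks.items()
            if mean(vals) < q1 or mean(vals) > q3]

# Toy data: 15 peaks per AZ; az1 and az2 share a mean, az3 is about 2x larger
example = {
    "az1": [1.0 + 0.01 * (i % 3) for i in range(15)],
    "az2": [1.0 + 0.01 * ((i + 1) % 3) for i in range(15)],
    "az3": [2.0 + 0.01 * (i % 3) for i in range(15)],
}
fraction = significant_pair_fraction(example)
```

      In the toy data only the two pairs involving `az3` exceed the critical value, so the significant-pair fraction is 2/3. With real data, a library routine that returns exact p-values for a paired test would be the more usual choice.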

      (6) Figure 6 and associated data. The authors mention: ' SynapGCaMP quantal signals appeared to qualitatively reflect the same events measured with electrophysiological recordings (Fig. 6D).' If that was the case, shouldn't the ephys and optical signal show some sort of correlation? The data presented in Figure 6D show no such correlation. Where do these signals come from? It is important to show the ROIs on a reference image.

      We apologize that this was not clear, as similar points were raised by R1 and R2. We were just showing separate (uncorrelated) sample traces of electrophysiological and calcium imaging data. Given how confusing this presentation turned out to be, and the fact that we show the correlated ephys and calcium imaging events in Fig. 7, we have elected to remove the uncorrelated electrophysiological events in Fig. 6 to just focus on the calcium imaging events (now Figures 7 and 8).

      Figure 7B: Were Ca2+ transients not associated with mEPSPs ever detected? What is the rate of such events?

      This is an astute question. Yes indeed, during simultaneous calcium imaging and current clamp electrophysiology recordings, we occasionally observed GCaMP transients without a detectable mEPSP in the electrophysiological trace. This may reflect the detection limit of electrophysiology for very small minis; with our noise level and the technical limitation of the recording rig, events < ~0.2 mV cannot be reliably detected, whereas the optical signal from the same quantal event might still be detected. The fraction of calcium-only events was ~1–10% of all optical miniature events, depending on genotype (higher in lines with smaller average minis). These calcium-only detections were low-amplitude and clustered near the optical threshold (lines 361-365).

      Minor comments

      (1) It should be mentioned in the text or figure legend whether images in Figure 1 were deconvolved, particularly since image pre-processing is only discussed in Figure 2 and after.

      We thank the reviewer for pointing this out. Yes, the confocal images shown in Figure 1 were also deconvolved in Huygens using the CMLE-based workflow described in the revised Methods. We applied deconvolution to improve contrast, reduce out-of-focus blur, and better resolve the morphology of presynaptic boutons, active zones, and postsynaptic structures, so that the localization of each sensor is more clearly visualized. We have now explicitly stated in the Fig. 1 legend and Methods (lines 575-577) that these images were deconvolved prior to display. 

      (2) The abbreviation, SNR, signal-to-noise ratio, is not defined in the text.

      We have corrected this error and thank the reviewer for pointing this out.

      (3) Please comment on the availability of fly stocks and molecular constructs.

      We have clarified that all fly stocks and molecular constructs will be shared upon request (lines 747-750). We are also in the process of depositing the new Scar8f/m, Bar8f/m, and SynapGCaMP sensors to the Bloomington Drosophila Stock Center for public dissemination.

      (4) Please add detection wavelengths and filter cube information for live imaging experiments for both confocal and widefield.

      We thank the reviewer for this helpful suggestion. We have now added the detection wavelengths and filter cube configurations for both confocal and widefield live imaging to the Methods.

      For confocal imaging, GCaMP signals were acquired on a Nikon A1R system using the FITC/GFP channel (488-nm laser excitation; emission collected with a 525/50-nm band-pass filter), and mScarlet signals were acquired using the TRITC/mCherry channel (561-nm laser excitation; emission collected with a 595/50-nm band-pass filter). Both channels were detected with GaAsP detectors under the same pinhole and scan settings described above (lines 512-517).

      For widefield imaging, GCaMP was recorded using a GFP filter cube (LED excitation ~470/40 nm; emission ~525/50 nm), which is now explicitly described in the revised Methods section (lines 632-633).

      (5) Please include a mini frequency analysis in Supplemental Figure S1.

      We apologize for not including this information in the original submission. This is now included in the Supplemental Figure S1.

      (6) In Figure S1B, consider flipping the order of EPSP (currently middle) and mEPSP (currently left), to easily guide the reader through the quantification of Figure S1A (EPSPs, top traces & mEPSPs, bottom traces).

      We agree these modifications would improve readability and clarity. We have now re-ordered the electrophysiological quantifications in Fig. S1B as requested by the reviewer.

      (7) Figure 6C: Consider labeling with sensor name instead of GFP.

      We agree here as well, and have removed “GFP” and instead added the GCaMP variant to the heatmap in Fig. 7C.

      (8) Figure 6E, 7B, 7E: Main statistical differences highlighting sensor performance should be represented on the figures for clarity.

      We did not show these differences in the original submission in an effort to keep the figures “clean” and for clarity, putting the detailed statistical significance in Table S1. However, we agree with the reviewer that it would be easier to see these in the Fig. 6E and 7B,E graphs. This information has now been added to Figs. 7 and 8.

      (9) Please report if the significance tested between the ephys mini (WT vs IIB-/-, WT vs IIA-/-, IIB-/- vs IIA-/-) is the same as for Ca2+ mini (WT vs IIB-/-, WT vs IIA-/-, IIB-/- vs IIA-/-). These should also exhibit a very high correlation (mEPSP (mV) vs Ca2+ mini deltaF/F). These tests would significantly strengthen the final statement of "SynapGCaMP8m can capture physiologically relevant differences in quantal events with similar sensitivity as electrophysiology."

      We agree that adding the more detailed statistical analysis requested by the reviewer would strengthen the evidence for the resolution of quantal calcium imaging using SynapGCaMP8m. We have included the statistical significance between the ephys and calcium minis in Fig. 8 and included the following in the revised methods (lines 358-361), the Fig. 8 legend and Table S1:

      Using two-sample Kolmogorov–Smirnov (K–S) tests, we found that SynapGCaMP8m Ca²⁺ minis (ΔF/F, Fig. 8E) differ significantly across all genotype pairs (WT vs IIB<sup>-/-</sup>, WT vs IIA<sup>-/-</sup>, IIB<sup>-/-</sup> vs IIA<sup>-/-</sup>; all p < 0.0001). The genotype rank order of the group means (±SEM) is IIB<sup>-/-</sup> > WT > IIA<sup>-/-</sup> (0.967 ± 0.036; 0.713 ± 0.021; 0.427 ± 0.017; n=69, 65, 59). For electrophysiological minis (mEPSP amplitude, Fig. 8F), K–S tests likewise show significant differences for the same comparisons (all p < 0.0001) with D statistics of 0.1854, 0.3647, and 0.4043 (WT vs IIB<sup>-/-</sup>, WT vs IIA<sup>-/-</sup>, IIB<sup>-/-</sup> vs IIA<sup>-/-</sup>, respectively). Group means (±SEM) again follow IIB<sup>-/-</sup> > WT > IIA<sup>-/-</sup> (0.824 ± 0.017 mV; 0.636 ± 0.015 mV; 0.383 ± 0.007 mV; n=41 each). These K–S results demonstrate identical significance and rank order across modalities, supporting our conclusion that SynapGCaMP8m resolves physiologically relevant quantal differences with sensitivity comparable to electrophysiology.
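
      As a rough illustration of the statistic reported above, the two-sample K–S D value is the largest vertical gap between the two empirical cumulative distributions. The sketch below is not the authors' analysis code; the amplitude lists are hypothetical values chosen for illustration, and no p-value is computed (a statistics package such as SciPy's `ks_2samp` would normally be used for that).

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S D: largest gap between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # The gap can only change at observed values, so check each one.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Hypothetical amplitudes for illustration only (not data from the study).
wt = [0.55, 0.60, 0.64, 0.70, 0.72]
mutant = [0.30, 0.35, 0.40, 0.45, 0.58]
d_value = ks_statistic(wt, mutant)
print(f"D = {d_value:.2f}")
```

      A larger D indicates distributions that are further apart; identical samples give D = 0.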

      References

      Blum, Ian D., Mehmet F. Keleş, El-Sayed Baz, Emily Han, Kristen Park, Skylar Luu, Habon Issa, Matt Brown, Margaret C. W. Ho, Masashi Tabuchi, Sha Liu, and Mark N. Wu. 2021. 'Astroglial Calcium Signaling Encodes Sleep Need in Drosophila', Current Biology, 31: 150-62.e7.

      Chen, Y., and L. M. Huang. 2017. 'A simple and fast method to image calcium activity of neurons from intact dorsal root ganglia using fluorescent chemical Ca(2+) indicators', Mol Pain, 13: 1744806917748051.

      Giovannucci, Andrea, Johannes Friedrich, Pat Gunn, Jérémie Kalfon, Brandon L. Brown, Sue Ann Koay, Jiannis Taxidis, Farzaneh Najafi, Jeffrey L. Gauthier, Pengcheng Zhou, Baljit S. Khakh, David W. Tank, Dmitri B. Chklovskii, and Eftychios A. Pnevmatikakis. 2019. 'CaImAn an open source tool for scalable calcium imaging data analysis', eLife, 8: e38173.

      Müller, M., K. S. Liu, S. J. Sigrist, and G. W. Davis. 2012. 'RIM controls homeostatic plasticity through modulation of the readily-releasable vesicle pool', J Neurosci, 32: 16574-85.

      Wu, Yifan, Keimpe Wierda, Katlijn Vints, Yu-Chun Huang, Valerie Uytterhoeven, Sahil Loomba, Fran Laenen, Marieke Hoekstra, Miranda C. Dyson, Sheng Huang, Chengji Piao, Jiawen Chen, Sambashiva Banala, Chien-Chun Chen, El-Sayed Baz, Luke Lavis, Dion Dickman, Natalia V. Gounko, Stephan Sigrist, Patrik Verstreken, and Sha Liu. 2025. 'Presynaptic Release Probability Determines the Need for Sleep', bioRxiv: 2025.10.16.682770.

      Xing, Xiaomin, and Chun-Fang Wu. 2018. 'Unraveling Synaptic GCaMP Signals: Differential Excitability and Clearance Mechanisms Underlying Distinct Ca<sup>2+</sup> Dynamics in Tonic and Phasic Excitatory, and Aminergic Modulatory Motor Terminals in Drosophila', eneuro, 5: ENEURO.0362-17.2018.

    1. Weren’t you ever as young and dumb as that?’ ‘I’m always in the club drinking martinis,’ he told an interviewer when asked to recall his younger self. ‘What did I know from politics?’ (Richardson doesn’t find in Matthiessen’s letters and journals a coherent politics, but some leftist tendencies emerge in a remark on ‘the startling parallel between communist doctrine and the teachings of Jesus Christ’ and in his sympathy for blacklisted celebrities like Paul Robeson, who ‘got a shitty deal’.) If it were merely a matter of Matthiessen’s reputation as a writer, such explanations might have sufficed, but soon after his arrival in France, he made some new friends, and they started the Paris Review. Since Matthiessen’s employment by the CIA was first reported by the New York Times in 1977, the magazine has had the taint of the association. Given the tendency of its founders, their children and their editorial heirs to memorialise the magazine’s beginnings incessantly, often in the pursuit of fundraising, the issue keeps raising its still un-declassified head, to the extent that many young writers have the impression that since the end of the Second World War American literature has been one big government psyop. That’s why they’re not getting published.

      Nicely acid

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study presents a system for delivering precisely controlled cutaneous stimuli to freely moving mice by coupling markerless real-time tracking to transdermal optogenetic stimulation, using the tracking signal to direct a laser via galvanometer mirrors. The principal claims are that the system achieves sub-mm targeting accuracy with a latency of <100 ms. The nature of mouse gait enables accurate targeting of forepaws even when mice are moving.

      Strengths:

      The study is of high quality and the evidence for the claims is convincing. There is increasing focus in neurobiology in studying neural function in freely moving animals, engaged in natural behaviour. However, a substantial challenge is how to deliver controlled stimuli to sense organs under such conditions. The system presented here constitutes notable progress towards such experiments in the somatosensory system and is, in my view, a highly significant development that will be of interest to a broad readership.

      Weaknesses:

      (1) "laser spot size was set to 2.00 ± 0.08 mm<sup>2</sup> diameter (coefficient of variation = 3.85)" is unclear. Is the 0.08 SD or SEM? (not stated). Also, is this systematic variation across the arena (or something else)? Readers will want to know how much the spot size varies across the arena - ie SD. CV=4 implies that SD~7 mm. ie non-trivial variation in spot size, implying substantial differences in power delivery (and hence stimulus intensity) when the mouse is in different locations. If I misunderstood, perhaps this helps the authors to clarify. Similarly, it would be informative to have mean & SD (or mean & CV) for power and power density. In future refinements of the system, would it be possible/useful to vary laser power according to arena location?

      We thank the reviewer for their comments and for identifying areas needing more clarity. The previous version was ambiguous: 0.08 refers to the standard deviation (SD). We have removed the ambiguity by stating mean ± SD and reporting a unitless coefficient of variation (CV).

      The revised text reads “laser spot size was set to 2.00 ± 0.08 mm<sup>2</sup> (mean ± SD; coefficient of variation = 0.039).” This makes clear that the variability in spot size is minimal: it is 0.08 mm<sup>2</sup> SD (≈0.03 mm SD in diameter). This should help clarify that spot size variability across the arena is minute and unlikely to contribute meaningfully to differences in stimulus intensity across locations. The power was modulated depending on the experiment, so we provide the unitless CV here in “The absolute optical power and power density were uniform across the glass platform (coefficient of variation 0.035 and 0.029, respectively; Figure 2—figure supplement)”. We are grateful to the reviewer for spotting these omissions.
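
      The arithmetic behind these figures can be checked with a short sketch. It assumes a circular spot (area = π·d²/4) and first-order error propagation, neither of which is stated explicitly in the response; the variable names are illustrative. With the rounded values quoted above, the CV comes out at 0.04, close to the reported 0.039, which was presumably computed from unrounded measurements.

```python
import math

area_mean = 2.00  # mm^2, mean spot area quoted in the revised text
area_sd = 0.08    # mm^2, SD quoted in the revised text

# Unitless coefficient of variation: SD divided by mean.
cv = area_sd / area_mean

# Assumption: circular spot, so area = pi * d^2 / 4.
diameter = 2.0 * math.sqrt(area_mean / math.pi)

# First-order propagation: sd_d = sd_A / (dA/dd), with dA/dd = pi * d / 2.
diameter_sd = area_sd / (math.pi * diameter / 2.0)

print(f"CV = {cv:.3f}")
print(f"diameter = {diameter:.2f} mm, SD = {diameter_sd:.3f} mm")
```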

      The reviewer also asks whether, in the future, it is “possible/useful to vary laser power according to arena location”. This is already possible in our system for infrared cutaneous stimulation using analog modulation (Figure 4). We have added the following sentence to make this clearer: “Laser power could be modulated using the analog control.”

      (2) "The video resolution (1920 x 1200) required a processing time higher than the frame interval (33.33 ms), resulting in real-time pose estimation on a sub-sample of all frames recorded". Given this, how was it possible to achieve 84 ms latency? An important issue for closed-loop research will relate to such delays. Therefore please explain in more depth and (in Discussion) comment on how the latency of the current system might be improved/generalised. For example, although the current system works well for paws it would seem to be less suited to body parts such as the snout that do not naturally have a stationary period during the gait cycle.

      We captured and stored video with a frame-to-frame interval of 33.33 ms (30 fps). DeepLabCut-live! was run in a latency-optimization mode, meaning that new frames are not processed while the network is busy; only the most recent frame is processed once the network is free. The processing latency is measured per processed frame, and intermediate frames are thus skipped while the network is busy. Although the wide field of view and high resolution required to capture the large environment increase the per-frame compute time, the processing latency remained small enough to track and stimulate moving mice. This processing latency of 84 ± 12 ms (mean ± SD) was calculated using the timestamps stored in the output files from DeepLabCut-live!: subtracting the frame acquisition timestamp from the frame processing timestamp across 16,000 processed frames recorded across four mice (4,000 each). In addition, there is a small delay to move the galvanometers and trigger the laser, calculated as 3.3 ± 0.5 ms (mean ± SD; 245 trials). This is described in the manuscript, but can be combined with the processing latency to indicate a total closed-loop delay of ≈87 ms, so we have expanded the ‘Optical system characterization’ subsection in the Methods, adding “We estimated a processing latency of 84 ± 12 ms (mean ± SD) by subtracting…” and that “In the current configuration the end-to-end closed-loop delay is ≈87 ms from the combination of the processing latency and other delays”. In the Discussion, we now comment on how this latency can be reduced and how this can allow for generalization to more rapidly moving body parts.
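
      The latency calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the timestamp lists are hypothetical, and the layout of the DeepLabCut-live! output files is not assumed here. Only the 3.3 ms galvanometer and laser delay is taken from the response.

```python
from statistics import mean, stdev

# Galvanometer move + laser trigger delay quoted in the response (ms).
GALVO_LASER_DELAY_MS = 3.3


def processing_latencies_ms(acquired_s, processed_s):
    """Per-frame latency: processing timestamp minus acquisition timestamp."""
    return [(p - a) * 1000.0 for a, p in zip(acquired_s, processed_s)]


# Hypothetical timestamps in seconds, for processed frames only
# (frames skipped while the network was busy never appear here).
acquired = [0.000, 0.100, 0.200, 0.300]
processed = [0.080, 0.190, 0.284, 0.382]

latencies = processing_latencies_ms(acquired, processed)
mean_latency = mean(latencies)
sd_latency = stdev(latencies)

# End-to-end closed-loop delay: processing latency plus actuation delay.
total_delay = mean_latency + GALVO_LASER_DELAY_MS
print(f"processing: {mean_latency:.1f} ± {sd_latency:.1f} ms, total ≈ {total_delay:.1f} ms")
```

      Because latency is computed per processed frame, skipping intermediate frames lowers the number of pose estimates per second but does not inflate the latency of the frames that are processed.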

      Reviewer #2 (Public review):

      Parkes et al. combined real-time keypoint tracking with transdermal activation of sensory neurons to examine the effects of recruitment of sensory neurons in freely moving mice. This builds on the authors' previous investigations involving transdermal stimulation of sensory neurons in stationary mice. They illustrate multiple scenarios in which their engineering improvements enable more sophisticated behavioral assessments, including (1) stimulation of animals in multiple states in large arenas, (2) multi-animal nociceptive behavior screening through thermal and optogenetic activation, and (3) stimulation of animals running through maze corridors. Overall, the experiments and the methodology, in particular, are written clearly. However, there are multiple concerns and opportunities to fully describe their newfound capabilities that, if addressed, would make it more likely for the community to adopt this methodology:

      The characterization of laser spot size and power density is reported as a coefficient of variation, in which a value of ~3 is interpreted as uniform. My interpretation would differ - data spread so that the standard deviation is three times larger than the mean indicates there is substantial variability in the data. The 2D polynomial fit is shown in Figure 2 - Figure Supplement 1A and, if the fit is good, this does support the uniformity claim (range of spot size is 1.97 to 2.08 mm2 and range of power densities is 66.60 to 73.80 mW). The inclusion of the raw data for these measurements and an estimate of the goodness of fit to the polynomials would better help the reader evaluate whether these parameters are uniform across space and how stable the power density is across repeated stimulations of the same location. Even more helpful would be an estimate of whether the variation in the power density is expected to meaningfully affect the responses of ChR2-expressing sensory neurons.

      We thank the reviewer for their comments. As also noted in response to Reviewer 1, the coefficient of variation (CV) is now reported in unitless form (rather than a percentage) to ensure clarity. For avoidance of doubt, the CV is 0.039 (3.9%), so the variation in laser spot size is minimal – there is negligible spot size variability across the system. The ranges are indeed consistent with uniformity. We have included the goodness-of-fit estimates in the appropriate figure legend “fit with a two-dimensional polynomial (area R<sup>2</sup> = 0.91; power R<sup>2</sup> = 0.75)”. This indicates that the polynomials fit well overall.
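
      For readers unfamiliar with the goodness-of-fit value quoted in the legend, R<sup>2</sup> compares the residual scatter around the fitted surface with the total scatter around the mean. The sketch below shows only that final step, under the assumption that fitted values from the polynomial are already available; the numbers are hypothetical and are not the measured spot areas.

```python
def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical spot areas (mm^2) and hypothetical polynomial predictions.
observed = [1.97, 2.00, 2.03, 2.08]
fitted = [1.98, 2.00, 2.04, 2.06]

r2 = r_squared(observed, fitted)
print(f"R^2 = {r2:.2f}")
```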

      The system already allows for control of spot size. Whether variation in power density affects the responses of ChR2-expressing sensory neurons was examined in our previous work, which focused more on input-output relationships and showed a steep relationship between spot size (range of 0.02 mm<sup>2</sup> to 2.30 mm<sup>2</sup>) and the probability of paw response, corresponding to a meaningful change in response probability (Schorscher-Petcu et al. eLife, 2021). In future studies, we aim to use this approach to “titrate” cutaneous inputs as mice move through their environments.

      While the error between the keypoint and laser spot error was reported as ~0.7 to 0.8 mm MAE in Figure 2L, in the methods, the authors report that there is an additional error between predicted keypoints and ground-truth labeling of 1.36 mm MAE during real-time tracking. This suggests that the overall error is not submillimeter, as claimed by the authors, but rather on the order of 1.5 - 2.5 mm, which is considerable given the width of a hind paw is ~5-6 mm and fore paws are even smaller. In my opinion, the claim for submillimeter precision should be softened and the authors should consider that the area of the paw stimulated may differ from trial to trial if, for example, the error is substantial enough that the spot overlaps with the edge of the paw.

      We thank the reviewer for identifying a discrepancy in these reported errors. We clarify this below and in the manuscript.

      The real-time tracking error is the mean absolute Euclidean distance (MAE) between ground truth and DLC on the left hind paw where likelihood was relatively high. More specifically, ground truth was obtained by manual annotation of the left hind paw center. The corresponding DLC keypoint was evaluated in frames with likelihood >0.8 (the stimulation threshold). Across 1,281 frames from five videos of freely exploring mice (30 fps), the MAE was 1.36 mm.
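
      The tracking-error metric described above can be sketched as the mean Euclidean distance between annotated and predicted keypoints, restricted to frames whose likelihood exceeds the stimulation threshold. The function name, argument names, and coordinates below are hypothetical; only the 0.8 threshold comes from the response.

```python
import math


def mean_euclidean_error(ground_truth, predicted, likelihood, threshold=0.8):
    """Mean Euclidean distance over frames with likelihood above threshold."""
    distances = [
        math.dist(gt, pred)
        for gt, pred, lik in zip(ground_truth, predicted, likelihood)
        if lik > threshold
    ]
    if not distances:
        raise ValueError("no frames exceed the likelihood threshold")
    return sum(distances) / len(distances)


# Hypothetical paw-centre coordinates in mm (not data from the study).
ground_truth = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0)]
predicted = [(13.0, 14.0), (20.0, 21.0), (40.0, 40.0)]
likelihood = [0.95, 0.90, 0.50]  # the third frame falls below the threshold

mae = mean_euclidean_error(ground_truth, predicted, likelihood)
print(f"MAE = {mae:.2f} mm")
```

      The low-likelihood frame is excluded even though its error is large, which mirrors the point that the metric describes only the frames on which stimulation would have been triggered.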

      The targeting error is the MAE between ground truth and the laser spot location, so should reflect the real-time tracking error plus errors from targeting the laser. More specifically, this metric was determined by comparing the manually determined ground truth keypoint of the left hind paw and the actual center of the laser spot. Importantly, this metric was calculated using four five-minute high-speed videos recorded at 270 fps of mice freely exploring the open arena (463 frames) and frames were selected with a likelihood threshold >0.8. This allowed us to resolve the brief laser pulses but inadvertently introduced a difference in spatial scaling. After rescaling, the values give a targeting error MAE now in line with the real-time tracking error (see corrected Figure 2L). This is approximately 1.3 mm across all locomotion speed categories. These errors are small and are limited by the spatial resolution of the cameras. We thank the reviewer for noting this discrepancy and prompting us to get to its root cause.

      We have amended the subtitle on Figure 2L as “Ground truth keypoint to laser spot error” and have avoided the use of submillimeter throughout. We have added the following sentence to clarify this point: “As laser targeting relies on real-time tracking to direct the laser to the specified body part, this metric includes any errors introduced by tracking and targeting”.

      As the major advance of this paper is the ability to stimulate animals during ongoing movement, it seems that the Figure 3 experiment misses an opportunity to evaluate state-dependent whole-body reactions to nociceptor activation. How does the behavioral response relate to the animal's activity just prior to stimulation?

      The reviewer suggests analysis of state-dependent responses. In the Figure 3 experiment, mice were stimulated up to five times when stationary. Analysis of whole body reactions in stationary mice has been described in (Schorscher-Petcu et al. eLife, 2021) and doing this here would be redundant, so instead we now analyse the responses of moving mice in Figure 5. This new analysis shows robust state-dependent responses during movement as suggested by the reviewer. We find two behavioral clusters: one that is for faster, direct (coherent) movement and the other that is for slower assessment (incoherent) movement. Stimulation during the former results in robust and consistent slowing and shift towards assessment, whereas stimulation during the latter results in a reduction in assessment. We describe and interpret these new data in the Results and Discussion sections and add information in the Methods and Figure legend, as given below. We believe that demonstrating movement state dependence is a valuable addition to the paper and thank the reviewer for suggesting this.

      Given the characterization of full-body responses to activation of TrpV1 sensory neurons in Figure 4 and in the authors' previous work, stimulation of TrpV1 sensory neurons has surprisingly subtle effects as the mice run through the alternating T maze. The authors indicate that the mice are moving quickly and thus that precise targeting is required, but no evidence is shared about the precision of targeting in this context beyond images of four trials. From the characterization in Figure 2, at max speed (reported at 241 +/- 53 mm/s, which is faster than the high speeds in Figure 2), successful targeting occurs less than 50% of the time. Is the initial characterization consistent with the accuracy in this context? To what extent does inaccuracy in targeting contribute to the subtlety of affecting trajectory coherence and speed? Is there a relationship between animal speed and disruption of the trajectory?

      We thank the reviewer for pointing out the discrepancy in the reported maximum speed. We have corrected the error in the main text: the average maximum speed is 142 ± 26 mm/s (four mice).

      The self-paced T-maze alternation task in Figure 5 demonstrates that mice running in a maze can be stimulated using this method. We did not optimize the particular experimental design to assess the hit accuracy, as this was determined in Figure 2. Instead, we optimized for the pulse frequencies, meaning the galvanometers tracked with processed frames but the laser was triggered whether or not the paw was actually targeted. However, even in this case with the system pulsing in the free-run mode, the laser hit rate was 54 ± 6% (mean ± sem, n = 7 mice). We have weakened references to submillimeter as it was only inferred from other experiments and was not directly measured here. We find in this experiment that stimulation in freely moving mice can cause them to briefly halt and evaluate. In the future, we will use experimental designs to more optimally examine learning.

      The reviewer also asks if there is a relationship between speed and disruption of the trajectory. We find that this is the case as described above with our additional analysis.

      Reviewer #3 (Public review):

      Summary:

      To explore the diverse nature of somatosensation, Parkes et al. established and characterized a system for precise cutaneous stimulation of mice as they walk and run in naturalistic settings. This paper provides a framework for real-time body part tracking and targeted optical stimuli with high precision, ensuring reliable and consistent cutaneous stimulation. It can be adapted in somatosensation labs as a general technique to explore somatosensory stimulation and its impact on behavior, enabling rigorous investigation of behaviors that were previously difficult or impossible to study.

      Strengths:

      The authors characterized the closed-loop system to ensure that it is optically precise and can precisely target moving mice. The integration of accurate and consistent optogenetic stimulation of the cutaneous afferents allows systematic investigation of somatosensory subtypes during a variety of naturalistic behaviors. Although this study focused on nociceptors innervating the skin (Trpv1::ChR2 animals), this setup can be extended to other cutaneous sensory neuron subtypes, such as low-threshold mechanoreceptors and pruriceptors. This system can also be adapted for studying more complex behaviors, such as the maze assay and goal-directed movements.

      Weaknesses:

      Although the paper has strengths, its weakness is that some behavioral outputs could be analyzed in more detail to reveal different types of responses to painful cutaneous stimuli. For example, paw withdrawals were detected after optogenetically stimulating the paw (Figures 3E and 3F). Animals exhibit different types of responses to painful stimuli on the hind paw in standard pain assays, such as paw lifting, biting, and flicking, each indicating a different level of pain. Improving the behavioral readouts from body part tracking would greatly strengthen this system by providing deeper insights into the role of somatosensation in naturalistic behaviors. Additionally, if the laser spot size could be reduced to a diameter of 2 mm², it would allow the activation of a smaller number of cutaneous afferents, or even a single one, across different skin types in the paw, such as glabrous or hairy skin.

      We thank the reviewer for highlighting how our system can be combined with improved readouts of coping behavior to provide deeper insights. Optogenetic and infrared cutaneous stimulation are well established generators of coping behaviors (lifting, flicking, licking, biting, guarding). Detection of these behaviors is an active and evolving field with progress being made regularly (e.g. Jones et al., eLife 2020 [PAWS]; Wotton et al., Mol Pain 2020; Zhang et al., Pain 2022; Oswell et al., bioRxiv 2024 [LUPE]; Barkai et al., Cell Reports Methods 2025 [BAREfoot], along with more general tools like Hsu et al., Nature Communications 2021 [B-SOiD]; Luxem et al., Communications Biology 2022 [VAME]; Weinreb et al., Nature Methods 2024 [Keypoints-MoSeq]). One output of our system is bodypart keypoints, which are the typical input to many of these tools. We will leave the readers and users of the system to decide which tools are appropriate for their experimental designs; the focus of this current manuscript is describing the novel stimulation approach in moving animals.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) It is hard to see how the rig is arranged from the render of Figure 2AB due to the components being black on black. A particularly useful part of Fig2AB is the aerial view in panel B that shows the light paths. I suggest adding the labelling of Figure 2A also to that. The side/rear views could perhaps be deleted, allowing the aerial view to be larger.

      We appreciate this suggestion and have revised Figure 2B to improve the visibility of the optomechanical components. We have enlarged the side and aerial views, removed the rear view, and added further labels to the aerial view.

      (2) MAE - to interpret the 0.54 result, it would be useful to state the arena size in this paragraph.

      Thank you. We have added the arena size in this paragraph and also added scales in the relevant figure (Figure 2).

      (3) "pairwise correlations of R = 0.999 along both x- and y-axes". Is this correlation between hindpaw keypoint and galvo coordinates?

      Yes, we have added the following to clarify: “...between galvanometer coordinates and hind paw keypoints”

      (4) Latency was 84 ms. Is this mainly/entirely the delay between DLC receiving the camera image and outputting key point coordinates?

      Yes, we hope that the additional detail in the Methods and Discussion described above will now clarify the current closed-loop latencies.

      (5) "Mice move at variable speeds": in this sentence, spell out when "speed" refers to mouse and when it refers to hindpaw. Similarly, Fig 2i. The sentence is potentially confusing to general readers (paws stationary although the mouse is moving). Presumably, it's due to gait. I suggest explaining this here.

      The speed values that relate to the mouse body and paws are now clearer in the main text and in the legend for Figure 2I.

      (6) Figure 2k and associated main text. It is not clear what "success/hit rate" means here.

      We have added the following sentence in the main text: “Hit accuracy refers to the percentage of trials in which the laser successfully targeted (‘hit’) the intended hind paw.” and use hit accuracy throughout instead of success rate.

      (7) Figure 2L. All these points are greater than the "average" 0.54 reported in the text. How is this possible?

      The MAE of 0.54 mm refers to the “predicted and actual laser spot locations” (that is, the difference between where the calibration map should place the laser spot and where it actually fell), while the Figure 2L MAE values refer to the error between the ground truth keypoint and the laser spot (that is, the error between the human-observed paw target and where the laser spot fell). The latter error will include the former error so is expected to be larger. We have clarified this point throughout the text, for example, stating “As laser targeting relies on real-time tracking to direct the laser to the specified body part, this metric inherently accounts for any errors introduced by the tracking and targeting.” This is also discussed above in response to Reviewer 2.

      (8) "large circular arena". State the size here

      We have added this to the Figure 2 legend.

      (9) Figure 3c-left. Can the contrast between the mouse and floor be increased here?

      We have improved the contrast in this image.

      (10) Figure 5c. It is unclear what C1, C2, etc refers to. Mice?

      Yes, these refer to mice. We have removed reference to these now as they are not needed.

      (11) Discussion. A comment. There is scope for elaborating on the potential for new research by combining it with new methods for measurements of neural activity in freely moving animals in the somatosensory system.

      Thank you. We agree and have added more detail on this in the discussion stating “The system may be combined with existing tools to record neural activity in freely-moving mice, such as fiber photometry, miniscopes, or large-scale electrophysiology, and manipulations of this neural activity, such as optogenetics and chemogenetics. This can allow mechanistic dissection of cell and circuit biology in the context of naturalistic behaviors.”

      Reviewer #3 (Recommendations for the authors):

      (1) Include the number of animals for behavior assays for the panels (e.g., Figures 4G).

      Where missing, we now state the number of animals in panels.

      (2) If representative responses are shown, such as in Figures 3E and 4F, include the average response with standard deviation so readers can appreciate the variation in the responses.

      We appreciate the suggestion to show variability in the responses. We have made several changes to Figures 3 and 4. Specifically, to illustrate the variability across multiple trials more clearly, Figure 3E now shows representative keypoint traces for each body part from two mice during their 5 trials. For Figure 4, we have re-analyzed the thermal stimulation trials and shown a raster plot of keypoint-based local motion energy (Figure 4E) sorted by response latency for hundreds of trials. Figure 4G now presents the cumulative distribution for all trials and animals for thermal (18 wild-type mice, 315 trials) and optogenetic stimulation trials (9 Trpv1::ChR2 mice, 181 trials). We also now provide means ± SD for the key metrics for optogenetic and thermal stimulation trials in Figure 4 in the Results section. This keeps the manuscript focused on the methodological advances while showing the trial variability.

      (3) "optical targeting of freely-moving mice in a large environments" should be "optical targeting of freely-moving mice in a large environment".

      Corrected

      (4) Define fps when you first mention this in the manuscript.

      Added

      (5) Data needs to be shown for the claim "Mice concurrently turned their heads toward the stimulus location while repositioning their bodies away from it".

      We state this observation to qualify that the stimulation of stationary mice resulted in behavioral responses “consistent with previous studies”. It would be redundant to repeat our full analysis and might distract from the novelty of the current manuscript. We have restricted this sentence to make it clearer: “Consistent with previous studies, we observed the whole-body behaviors like head orienting concurrent with local withdrawal (Browne et al., Cell Reports 2017; Blivis et al., eLife, 2017.)”

    1. This view begins with the clinician abandoning the either–or categories of “physical” and “psychological.” This dichotomous framework leads to interactions in which patients feel that they and their symptoms are being discounted by the clinician and conclude, “The doctor’s saying it’s all in my head.”


    1. Reviewer #2 (Public review):

      Summary:

      The authors aimed to dissect the plasticity of circadian outputs by combining evolutionary biology with chronobiology. By utilizing Drosophila strains selected for "Late" and "Early" adult emergence, they sought to investigate whether selection for developmental timing co-evolves with plasticity in daily locomotor activity. Specifically, they examined how these diverse lines respond to complex, desynchronized environmental cues (temperature and light cycles) and investigated the molecular role of the splicing factor Psi and timeless isoforms in mediating this plasticity.

      Major strengths and weaknesses:

      The primary strength of this work is the novel utilization of long-term selection lines to address fundamental questions about how organisms cope with complex environmental cues. The behavioral data are compelling, clearly demonstrating that "Late" and "Early" flies possess distinct capabilities to track temperature cycles when they are desynchronized from light cycles.

      However, a significant weakness lies in the causal links proposed between the molecular findings and these behavioral phenotypes. The molecular insights (Figures 2, 4, 5, and 6) rely on mRNA extracted from whole heads. As head tissue is dominated by photoreceptor cells and glia rather than the specific pacemaker neurons (LNv, LNd) driving these behaviors, this approach introduces a confound. Differential splicing observed here may reflect the state of the compound eye rather than the central clock circuit, a distinction highlighted by recent studies (e.g., Ma et al., PNAS 2023).

      Furthermore, while the authors report that Psi mRNA loses rhythmicity under out-of-sync conditions, this correlation does not definitively prove that Psi oscillation is required for the observed splicing patterns or behavioral plasticity. The amplitude of the reported Psi rhythm is also low (~1.5 fold) and variable, raising questions about its functional significance in the absence of manipulation experiments (such as constitutive expression) to test causality.

      Appraisal of aims and conclusions:

      The authors successfully demonstrate the co-evolution of emergence timing and activity plasticity, achieving their aim on the behavioral level. However, the conclusion that the specific molecular mechanism involves the loss of Psi rhythmicity driving timeless splicing changes is not yet fully supported by the data. The current evidence is correlative, and without spatial resolution (specific clock neurons) or causal manipulation, the mechanistic model remains speculative.

      This study is likely to be of significant interest to the chronobiology and evolutionary biology communities as it highlights the "enhanced plasticity" of circadian clocks as an adaptive trait. The findings suggest that plasticity to phase lags - common in nature where temperature often lags light - may be a key evolutionary adaptation. Addressing the mechanistic gaps would significantly increase the utility of these findings for understanding the molecular basis of circadian plasticity.

    2. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript addresses an important question: how do circadian clocks adjust to a complex rhythmic environment with multiple daily rhythms? The focus is on the temperature and light cycles (TC and LD) and their phase relationship. In nature, TC usually lags the LD cycle, but the phase delay can vary depending on seasonal and daily weather conditions. The authors present evidence that circadian behavior adjusts to different TC/LD phase relationships, that temperature-sensitive tim splicing patterns might underlie some of these responses, and that artificial selection for preferential evening or morning eclosion behavior impacts how flies respond to different LD/TC phase relationships.

      Strength:

      Experiments are conducted on control strains and strains that have been selected in the laboratory for preferential morning or evening eclosion phenotypes. This study is thus quite unique as it allows us to probe whether this artificial selection impacted how animals respond to different environmental conditions, and thus gives hints on how evolution might shape circadian oscillators and their entrainment. The authors focused on circadian locomotor behavior and timeless (tim) splicing because warm and cold-specific transcripts have been described as playing an important role in determining temperature-dependent circadian behavior. Not surprisingly, the results are complex, but there are interesting observations. In particular, the "late" strain appears to be able to adjust more efficiently its evening peak in response to changes in the phase relationship between temperature and light cycles, but the morning peak seems less responsive in this strain. Differences in the circadian pattern of expression of different tim mRNA isoforms are found under specific LD/TC conditions.

      We sincerely thank the reviewer for this generous assessment and for recognizing several key strengths of our study. We are particularly gratified that the reviewer values our use of long-term laboratory-selected chronotype lines (350+ generations), which provide a unique evolutionary perspective on how artificial selection reshapes circadian responses to complex LD/TC phase relationships—precisely our core research question.

      Weaknesses:

      These observations are interesting, but in the absence of specific genetic manipulations, it is difficult to establish a causative link between tim molecular phenotypes and behavior. The study is thus quite descriptive. It would be worth testing available tim splicing mutants, or mutants for regulators of tim splicing, to understand in more detail and more directly how tim splicing determines behavioral adaptation to different phase relationships between temperature and light cycles. Also, I wonder whether polymorphisms in or around tim splicing sites, or in tim splicing regulators, were selected in the early or late strains.

      We thank the reviewer for this insightful comment. We agree that our current data do not establish a direct causal link between tim splicing (or Psi) and behaviour, and we appreciate that some of our wording (e.g. “linking circadian gene splicing to behavioural plasticity” or describing tim splicing as a “pivotal node”) may have suggested unintended causal links. In the revision, we will (i) explicitly state in the Abstract, Introduction, and early Discussion that the main aim was to test whether selection for timing of eclosion is accompanied by correlated evolution of temperature‑dependent tim splicing patterns and evening activity plasticity under complex LD/TC regimes, and (ii) consistently describe the molecular findings as correlational and hypothesis‑generating rather than causal. We will also add phrases throughout the text to point the reader more clearly to existing passages where we already emphasize “correlated evolution” and explicitly label our mechanistic ideas as “we speculate” / “we hypothesize” and as future experiments.

      We fully agree that studies using tim splicing mutants or manipulations of splicing regulators under in‑sync and out‑of‑sync LD/TC regimes will be essential to ascertain what role tim variants play under such environmental conditions, and we will highlight this as a key future direction. At the same time, we emphasize that the long‑term selection lines provide a complementary perspective to classical mutant analyses by revealing how behavioural and molecular phenotypes can exhibit correlated evolution under a specific, chronobiologically relevant selection pressure (timing of emergence).

      Finally, we appreciate the suggestion regarding polymorphisms. Whole‑genome analyses of these lines in a PhD thesis from our group (Ghosh, 2022, unpublished, doctoral dissertation) reveal significant SNPs in intronic regions of timeless in both Early and Late populations, as well as SNPs in CG7879, a gene implicated in alternative mRNA splicing, in the Late line. Because these analyses are ongoing and not yet peer‑reviewed, we do not present them as main results.

      I also have a major methodological concern. The authors studied how the evening and morning phases are adjusted under different conditions and different strains. They divided the daily cycle into 12h morning and 12h evening periods, and calculated the phase of morning and evening activity using circular statistics. However, the non-circadian "startle" responses to light or temperature transitions should have a very important impact on phase calculation, and thus at least partially obscure actual circadian morning and evening peak phase changes. Moreover, the timing of the temperature-up startle drifts with the temperature cycles, and will even shift from the morning to the evening portion of the divided daily cycle. Its amplitude also varies as a function of the LD/TC phase relationship. Note that the startle responses and their changes under different conditions will also affect SSD quantifications.

      We thank the reviewer for this perceptive methodological concern, which we had anticipated and systematically quantified but had not included in the original submission. The reviewer is absolutely correct that non-circadian startle responses to zeitgeber transitions could confound both circular phase (CoM) calculations and SSD quantifications, particularly as TC drift creates shifting startle locations across morning/evening windows.

      We will be including startle response quantification (previously conducted but unpublished) as a new Supplementary figure, systematically measuring SSD in 1-hour windows immediately following each of the four environmental transitions (lights-ON, lights-OFF, temperature rise and temperature fall) across all six LDTC regimes (2-12hr TC-LD lags) for all 12 selection lines (early<sub>1-4</sub>, control<sub>1-4</sub>, late<sub>1-4</sub>).

      Author response image 1.

      Startle responses in selection lines under LDTC regimes: SSD calculated to assess startle response to each of the transitions (1-hour window after the transition used for calculations). Error bars are 95% Tukey’s confidence intervals for the main effect of selection in a two-factor ANOVA design with block as a random factor. Non-overlapping error bars indicate significant differences among the values. SSD values between in-sync and out-of-sync regimes for a range of phase relationships between LD and TC cycles (A) LDTC 2-hr, (B) LDTC 4-hr, (C) LDTC 6-hr, (D) LDTC 8-hr, (E) LDTC 10-hr, (F) LDTC 12-hr.

      Key findings directly addressing the reviewer's concerns:

      (1) Morning phase advances in LDTC 8-12hr regimes are explained by quantified nocturnal startle activity around temperature rise transitions occurring within morning windows. Critically, these startles show no selection line differences, confirming they represent equivalent non-circadian confounds across lines.

      (2) Early selection lines exhibit significantly heightened startle responses specifically to temperature rise in LDTC 4hr and 6hr regimes (early > control ≥ late), demonstrating that startle responses themselves exhibit correlated evolution with emergence timing—an important novel finding that strengthens our evolutionary story.

      (3) Startle responses differed among selection lines only for the temperature rise transition under two of the regimes used, the LDTC 4 hr and 6 hr regimes. Under LDTC 4 hr, the temperature rise transition falls in the morning window and, despite early having significantly greater startle than late, the overall morning SSD (over the 12-hour morning window) did not differ significantly among the selection lines for this regime. Thus, eliminating the startle window would make the selection lines more similar to one another. On the other hand, under the LDTC 6 hr regime, the startle response to temperature rise falls in the 12-hour evening window. In this case too, early showed higher startle than control and late. A higher startle in early would thus contribute to the observed differences among selection lines. We agree with the reviewer that eliminating this startle peak would lead to a clearer interpretation of the change in circadian evening activity.

      We deliberately preserved all behavioural data without filtering out startle windows, since doing so would require arbitrary cutoffs such as 1 hr, 2 hr or 3 hr post-transition, or until the startle peak declines in different selection lines under different regimes. In the revised version, we will add complementary analyses excluding the startle windows to obtain mean phase and SSD values which are unaffected by the startle responses.

      For the circadian phase, these issues seem, for example, quite obvious for the morning peak in Figure 1. According to the phase quantification on panel D, there is essentially no change in the morning phase when the temperature cycle is shifted by 6 hours compared to the LD cycle, but the behavior trace on panel B clearly shows a phase advance of morning anticipation. Comparison between the graphs on panels C and D also indicates that there are methodological caveats, as they do not correlate well.

      Because of the various masking effects, phase quantification under entrainment is a thorny problem in Drosophila. I would suggest testing other measurements of anticipatory behavior to complement or perhaps supersede the current behavior analysis. For example, the authors could employ the anticipatory index used in many previous studies, measure the onset of morning or evening activity, or, if more reliable, the time at which 50% of anticipatory activity is reached. Termination of activity could also be considered. Interestingly, it seems there are clear effects on evening activity termination in Figure 3. All these methods will be impacted by startle responses under specific LD/TC phase relationships, but their combination might prove informative.

      We agree that phase quantification under entrained conditions in Drosophila is challenging and that anticipatory indices, onset/offset measures, and T50 metrics each have particular strengths and weaknesses. In designing our analysis, we chose to avoid metrics that require arbitrary or subjective criteria (e.g. defining activity thresholds or durations for anticipation, or visually marking onset/offset), because these can substantially affect the estimated phase and reduce comparability across regimes and genotypes. Instead, we used two fully quantitative, parameter-free measures applied to the entire waveform within defined windows: (i) SSD to capture waveform change in shape/amplitude and (ii) circular mean phase of activity (CoM) restricted to the 12 h morning and 12 h evening windows. By integrating over the entire window, these measures are less sensitive to the exact choice of threshold and to short-lived, high-amplitude startles at transitions, and they treat all bins within the window in a consistent, reproducible way across all LDTC regimes and lines. Panels C (SSD) and D (CoM) are intentionally complementary, not redundant: SSD reflects how much the waveform changes in shape and amplitude, whereas CoM reflects the timing of the center of mass of activity. Under conditions where masking alters amplitude and introduces short-lived bouts without a major shift of the main peak, it is expected that SSD and CoM will not correlate linearly across regimes.

      We will be including a detailed calculation of how CoM is obtained in our methods for the revised version.  
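      As an illustration of the two window-based measures discussed in this response, the following is a minimal sketch and not the authors' code. It assumes that SSD denotes the sum of squared differences between two binned activity profiles and that CoM is the circular (vector-averaged) mean of bin times weighted by activity. The function names, the hourly binning, and the way bins are selected for a window are hypothetical choices made for the example.

```python
import math


def circular_mean_phase(counts, bin_mid_h, period_h=24.0):
    """Activity-weighted circular mean of bin times, in hours (0 <= phase < period_h).

    counts     -- activity in each bin (assumed non-negative)
    bin_mid_h  -- clock time at the middle of each bin, in hours
    """
    if len(counts) != len(bin_mid_h):
        raise ValueError("counts and bin_mid_h must have the same length")
    # Map each bin time to an angle and sum the activity-weighted unit vectors.
    x = sum(c * math.cos(2 * math.pi * t / period_h) for c, t in zip(counts, bin_mid_h))
    y = sum(c * math.sin(2 * math.pi * t / period_h) for c, t in zip(counts, bin_mid_h))
    total = sum(counts)
    # A near-zero resultant vector means the phase is undefined (no or uniform activity).
    if total == 0 or math.hypot(x, y) <= 1e-9 * total:
        raise ValueError("phase is undefined for this activity profile")
    return (math.atan2(y, x) * period_h / (2 * math.pi)) % period_h


def ssd(profile_a, profile_b):
    """Sum of squared differences between two equally binned activity profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same number of bins")
    return sum((a - b) ** 2 for a, b in zip(profile_a, profile_b))


def select_bins(counts, bin_mid_h, keep):
    """Return only the bins for which keep(time) is True (e.g. one 12 h window,
    optionally with a post-transition startle window left out)."""
    kept = [(c, t) for c, t in zip(counts, bin_mid_h) if keep(t)]
    return [c for c, _ in kept], [t for _, t in kept]
```

      Restricting the phase estimate to a window, or dropping a startle window after a transition, then amounts to passing a different `keep` predicate to `select_bins` before calling `circular_mean_phase`. The choice of window boundaries is left to the caller because the response does not specify them.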

      Reviewer #2 (Public review):

      Summary:

      The authors aimed to dissect the plasticity of circadian outputs by combining evolutionary biology with chronobiology. By utilizing Drosophila strains selected for "Late" and "Early" adult emergence, they sought to investigate whether selection for developmental timing co-evolves with plasticity in daily locomotor activity. Specifically, they examined how these diverse lines respond to complex, desynchronized environmental cues (temperature and light cycles) and investigated the molecular role of the splicing factor Psi and timeless isoforms in mediating this plasticity.

      Major strengths and weaknesses:

      The primary strength of this work is the novel utilization of long-term selection lines to address fundamental questions about how organisms cope with complex environmental cues. The behavioral data are compelling, clearly demonstrating that "Late" and "Early" flies possess distinct capabilities to track temperature cycles when they are desynchronized from light cycles.

      We sincerely thank the reviewer for this enthusiastic recognition of our study's core strengths. We are particularly gratified that the reviewer highlights our novel use of long-term selection lines (350+ generations) as the primary strength, enabling us to address fundamental evolutionary questions about circadian plasticity under complex environmental cues. We thank them for identifying our behavioral data as compelling (Figs 1, 3), which robustly demonstrate selection-driven divergence in temperature cycle tracking.

      However, a significant weakness lies in the causal links proposed between the molecular findings and these behavioral phenotypes. The molecular insights (Figures 2, 4, 5, and 6) rely on mRNA extracted from whole heads. As head tissue is dominated by photoreceptor cells and glia rather than the specific pacemaker neurons (LNv, LNd) driving these behaviors, this approach introduces a confound. Differential splicing observed here may reflect the state of the compound eye rather than the central clock circuit, a distinction highlighted by recent studies (e.g., Ma et al., PNAS 2023).

      We thank the reviewer for highlighting this important methodological consideration. We fully agree that whole-head extracts do not provide spatial resolution to distinguish central pacemaker neurons (~100-200 total) from compound eyes and glia, and that cell-type-specific profiling represents the critical next experimental step. As mentioned in our response to Reviewer 1, we appreciate the issue with our phrasing and will be revising it accordingly to more clearly describe that we do not claim any causal connections between expression of the tim splice variants in particular circadian neurons and their contribution to the phenotype observed.

      We chose whole-head extracts for practical reasons aligned with our study's specific goals:

      (1) Fly numbers: Our artificially selected populations are maintained at large numbers (~1000s per line). Whole-head extracts enabled sampling ~150 flies per time point = ~600 flies per genotype per environmental regime, providing a means to faithfully sample the variation that may exist in such randomly mating populations.

      (2) Established method for characterizing splicing patterns: The majority of temperature-dependent period/timeless splicing studies have successfully used whole-head extracts (Majercak et al., 1999; Shakhmantsir et al., 2018; Martin Anduaga et al., 2019) to characterize splicing dynamics under novel conditions.

      (3) Novel environmental regimes: Our primary molecular contribution was documenting timeless splicing patterns under previously untested LDTC phase relationships (TC 2-12hr lags relative to LD) and testing whether these exhibit selection-dependent differences consistent with behavioral divergence.

      Furthermore, while the authors report that Psi mRNA loses rhythmicity under out-of-sync conditions, this correlation does not definitively prove that Psi oscillation is required for the observed splicing patterns or behavioral plasticity. The amplitude of the reported Psi rhythm is also low (~1.5 fold) and variable, raising questions about its functional significance in the absence of manipulation experiments (such as constitutive expression) to test causality.

      We thank the reviewer for this insightful comment and appreciate that our phrasing has been misleading. We will especially pay attention to this issue, raised by two reviewers, and clearly highlight our results as correlated evolution and hypothesis-generating.

      We appreciate the reviewer highlighting these points and would like to draw attention to the following points in our Discussion section:

      “Psi and levels of tim-cold and tim-sc (Foley et al., 2019). We observe that this correlation is most clearly upheld under temperature cycles wherein tim-medium and Psi peak in-phase while the cold-induced transcripts start rising when Psi falls (Figure 8A1&2). Under LDTC in-sync conditions this relationship is weaker, even though Psi is rhythmic, potentially due to light-modulated factors influencing timeless splicing (Figure 8B1&2). This is in line with Psi’s established role in regulating activity phasing under TC 12:12 but not LD 12:12 (Foley et al., 2019). This is also supported by the fact that while tim-medium and tim-cold are rhythmic under LD 12:12 (Shakhmantsir et al., 2018), Psi is not (datasets from Kuintzle et al., 2017; Rodriguez et al., 2013). Assuming this to be true across genetic backgrounds and sexes and combined with our similar findings for these three transcripts under LDTC out-of-sync (Figure 2B3, D3&E3), we speculate that Psi rhythmicity may not be essential for tim-medium or tim-cold rhythmicity especially under conditions wherein light cycles are present along with temperature cycles (Figure 8C1&2). Our study opens avenues for future experiments manipulating PSI expression under varying light-temperature regimes to dissect its precise regulatory interactions. We hypothesize that flies with Psi knocked down in the clock neurons should exhibit a less pronounced shift of the evening activity under the range LDTC out-of-sync conditions for which activity is assayed in our study. On the other hand, its overexpression should cause larger delays in response to delayed temperature cycles due to the increased levels of tim-medium translating into delay in TIM protein accumulation.”

      Appraisal of aims and conclusions:

      The authors successfully demonstrate the co-evolution of emergence timing and activity plasticity, achieving their aim on the behavioral level. However, the conclusion that the specific molecular mechanism involves the loss of Psi rhythmicity driving timeless splicing changes is not yet fully supported by the data. The current evidence is correlative, and without spatial resolution (specific clock neurons) or causal manipulation, the mechanistic model remains speculative.

      This study is likely to be of significant interest to the chronobiology and evolutionary biology communities as it highlights the "enhanced plasticity" of circadian clocks as an adaptive trait. The findings suggest that plasticity to phase lags - common in nature where temperature often lags light - may be a key evolutionary adaptation. Addressing the mechanistic gaps would significantly increase the utility of these findings for understanding the molecular basis of circadian plasticity.

      Thank you for this thoughtful appraisal affirming our successful demonstration of co-evolution between emergence timing and circadian activity plasticity.

      Reviewer #3 (Public review):

      Summary:

      This study attempts to mimic in the laboratory changing seasonal phase relationships between light and temperature and determine their effects on Drosophila circadian locomotor behavior and on the underlying splicing patterns of a canonical clock gene, timeless. The results are then extended to strains that have been selected over many years for early or late circadian phase phenotypes.

      Strengths:

      A lot of work, and some results showing that the phasing of behavioural and molecular phenotypes is slightly altered in the predicted directions in the selected strains.

      We thank the reviewer for acknowledging the substantial experimental effort across 7 environmental regimes (6 LDTC phase relationships + LDTC in-phase), 12 replicate populations (early<sub>1-4</sub>, control<sub>1-4</sub>, late<sub>1-4</sub>), and comprehensive behavioural + molecular phenotyping.

      Weaknesses:

      The experimental conditions are extremely artificial, with immediate light and temperature transitions compared to the gradual changes observed in nature. Studies in the wild have shown how the laboratory reveals artifacts that are not observed in nature. The behavioural and molecular effects are very small, and some of the graphs and second-order analyses of the main effects appear contradictory. Consequently, the Discussion is very speculative as it is based on such small laboratory effects.

      We thank the reviewer for these important points regarding ecological validity, effect sizes, and interpretation scope.

      (1) Behavioural effects are robust across population replicates in selection lines (not small/weak)

      Our study assayed 12 populations total (4 replicate populations each of early, control, and late selection lines) under 7 LDTC regimes. Critically, selection effects were consistent across all 4 replicate populations within each selection line for every condition tested. In these randomly mating large populations, the mixed model ANOVA reveals highly significant selection×regime interactions [F(5,45)=4.1, p=0.003; Fig 3E, Table S2], demonstrating strong, replicated evolutionary divergence in evening temperature sensitivity.

      (2) Molecular effects test critical evolutionary hypothesis

      As stated in our Introduction, "selection can shape circadian gene splicing and temperature responsiveness" (Low et al., 2008, 2012). Our laboratory-selected chronotype populations—known to exhibit evolved temperature responsiveness (Abhilash et al., 2019, 2020; Nikhil et al., 2014; Vaze et al., 2012)—provide an apt system to test whether selection for temporal niche leads to divergence in timeless splicing. With ~600 heads per environmental regime per selection line, we detect statistically robust, selection line-specific temporal profiles [early4 advanced timeless phase (Fig 4A4); late4 prolonged tim-cold (Fig 5A4); significant regime×selection×time interactions (Tables S3-S5)], providing initial robust evidence of correlated molecular evolution under novel LDTC regimes.

      (3) Systematic design fills critical field gap

      Artificial conditions like LD/DD have been useful in revealing fundamental zeitgeber principles. Our systematic 2-12hr TC-LD lags directly implement the design validated by Pittendrigh & Bruce (1959) and Oda & Friesen (2011), who discuss how such experimental designs can provide a more comprehensive understanding of zeitgeber integration than studies with only one phase jump between two zeitgebers.

      (4) Ramping regimes as essential next step

      Gradual ramping regimes better mimic nature and represent critical future experiments. New Discussion addition in the revised version: "Ramping LDTC regimes can test whether selection-specific zeitgeber hierarchy persists under naturalistic gradients." While ramping experiments are essential, we would like to emphasize that we aimed to use this experimental design as a tool to test if evening activity exhibits greater temperature sensitivity and if this property of the circadian system can undergo correlated evolution upon selection for timing of eclosion/emergence.

      (5) New startle quantification addresses masking

      Our startle quantification (which will be added as a new supplementary figure) confirms circadian evening tracking persists despite quantified, selection-independent masking in most of the regimes.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The manuscript by Hao Jiang et al. described a systematic approach to identify proline hydroxylation proteins. The authors implemented a proteomic strategy with HILIC-chromatographic separation and reported an identification of 4993 sites from HEK293 cells (4 replicates) and 3247 sites from RCC4 cells (3 replicates) with 1412 sites overlapping between the two cell lines. From the analysis, the authors identified 225 sites and 184 sites respectively from 293 and RCC4 cells with HyPro diagnostic ion. The identifications were validated by analyzing a few synthetic peptides, with a specific focus on Repo-man (CDCA2) through comparing MS/MS spectra, retention time, and diagnostic ions. With SILAC analysis and recombinant enzyme assay, the study showed that Repo-man HyPro604 is a target of the PHD1 enzyme.

      Strengths:

      The study involved extensive LC-MS analysis and was carefully implemented. The identification of over 4000 confident proline hydroxylation sites would be a valuable resource for the community. The characterization of Repo-man proline hydroxylation is a novel finding.

      Weaknesses:

      However, as a study mainly focused on methodology, the findings from the experimental data did not convincingly demonstrate the sensitivity and specificity of the workflow for site-specific identification of proline hydroxylation in global studies.

      Proline hydroxylation is an enzymatic post translational protein modification, catalysed by prolyl hydroxylases (PHDs), which can have profound biological significance, e.g. altering protein half-life and/or the stability of protein-protein interactions. Furthermore, there has been controversy in the field as to the true number of protein targets for PHDs in cells. Thus, there is a clear need for methods to enable the robust identification of genuine PHD targets and to reliably map sites of PHD-catalysed proline hydroxylation in proteins. We believe, therefore, that our methodology, as reported here in Jiang et al., is an important contribution towards this goal. We note that our methodology has already been used successfully by others (https://doi.org/10.1016/j.mcpro.2025.100969). While further improvements in this methodology may of course be developed in future, we are not currently aware of any superior methods that have been reported previously in the literature. The criticism made by the reviewer notably does not include reference to any such alternative published methodology that interested researchers can use which would offer superior results to the approach we document in this study.

      Major concerns:

      (1) The study applied HILIC-based chromatographic separation with a goal of enriching and separating hydroxyproline-containing peptides. However, as the authors mentioned, such an approach is not specific to proline hydroxylation. In addition, many other chromatography techniques can achieve deep proteome fractionation such as high pH reverse phase fractionation, strong-cation exchange etc. There was no data in this study to demonstrate that the strategy offered improved coverage of proline hydroxylation proteins, as the identifications of the HyPro sites could be achieved through deep fractionation and a highly sensitive LCMS setup. The data of Figure 2A and S1A were somewhat confusing without a clear explanation of the heat map representations. 

      The data we present in this study demonstrate clearly that peptides with hydroxylated prolines are enriched in specific HILIC fractions (F10-F18), in comparison with total unfractionated peptides derived from cell extracts. We also refer the reviewer to our previously published study by Bensaddek et al (International Journal of Mass Spectrometry: doi:10.1016/j.ijms.2015.07.029), which was reference 41 in this study, in which we compared directly the performance of both HILIC and strong anionic exchange chromatography, (hSAX). This showed that HILIC provided superior enrichment to hSAX for enrichment of peptides containing hydroxylated proline residues. To clarify this point for readers, we have now included a specific reference to our previous study at the start of the Results section in our current revision. Currently, we use HILIC to provide a degree of enrichment for proline hydroxylated peptides because we are not aware of alternative chromatographic methods that in our hands provide better results.

      We have included descriptions of the information shown in the heatmaps in the associated figure legends and captions.

      (2) The study reported that the HyPro immonium ion is a diagnostic ion for HyPro identification. However, the data showed that only around 5% of the identifications had such a diagnostic ion. In comparison, acetyl-lysine immonium ion was previously reported to be a useful marker for acetyllysine peptides (PMID: 18338905), and the strategy offered a sensitivity of 70% with a specificity of 98%. In this study, the sensitivity of HyPro immonium ion was quite low. The authors also clearly demonstrated that the presence of immonium ion varied significantly due to MS settings, peptide sequence, and abundance. With further complications from L/I immonium ions, it became very challenging to implement this strategy in a global LC-MS analysis to either validate or invalidate HyPro identifications.

      The reviewer appears to have misunderstood the point we make with regard to the identification of the immonium ion and its use as a diagnostic marker for proline hydroxylation in MS analyses. We do not claim that this immonium ion is an essential diagnostic marker for proline hydroxylation. As the reviewer notes, with respect to the acetyl-lysine modification, the corresponding immonium ion is often used in MS studies as a diagnostic for identification of specific post-translational modifications. Previous studies have reported that the immonium ion for hydroxylated proline is detected when the transcription factor HIF is analysed, but is often absent with other putative PHD targets, which has been used as an argument that these targets are not genuine proline hydroxylation sites. We are not, therefore, introducing the idea in this study that the hydroxy-proline immonium ion is a required diagnostic marker for proline hydroxylation, but instead demonstrating that detection of this ion, at least in some peptide sequences, may require the use of higher MS collision energies than are typically required for routine peptide identification. We believe that this is an interesting observation that can help to clear up discussions in the literature regarding the true prevalence of PHD-catalysed proline hydroxylation in different target proteins. Our data suggest that, in future MS studies analysing suspected PHD target proteins, two different collision energies might need to be used, i.e., normal collision energy for the routine identification of a peptide, combined with use of a higher collision energy if the hydroxy-proline immonium ion was not already detected.

      (3) The study aimed to apply the HILIC-based proteomics workflow to identify HyPro proteins regulated by the PHD enzyme. However, the quantification strategy was not rigorous. The study just considered the HyPro proteins not identified by FG-4592 treatment as potential PHD targeted proteins. There are a few issues. First, such an analysis was not quantitative without reproducibility or statistical analysis. Second, it did not take into consideration that data-dependent LC-MS analysis was not comprehensive and some peptide ions may not be identified due to background interferences. Lastly, FG-4592 treatment for 24 hrs could lead to wide changes in gene expressions and protein abundances. Therefore, it is not informative to draw conclusions based on the data for bioinformatic analysis.

      We refer the reviewer to the data we present in this study using SILAC analysis, combined with our MS workflow, to achieve a more accurate quantitative picture of proline hydroxylation levels. While we agree that the point the reviewer makes is valid, regarding our data-dependent LC-MS/MS analysis potentially not being comprehensive, this means, however, that we are potentially underestimating the true prevalence of proline hydroxylated peptides, not overestimating the level of these modified peptides. We also refer the reviewer to the accompanying study by Druker et al. (eLife 2025; doi.org/10.7554/eLife.108131.1), in which we present a detailed follow-on study demonstrating the functional significance of the novel proline hydroxylation site we detected in the protein RepoMan (CDCA2). Therefore, even if we have not achieved a fully comprehensive analysis of all proline hydroxylated peptides catalysed by PHD enzymes, we believe that we have advanced the field by documenting a workflow that is able to identify and validate novel PHD targets.

      (4) The authors performed an in vitro PHD1 enzyme assay to validate that Repo-man can be hydroxylated by PHD1. However, Figure 9 did not show quantitatively PHD1-induced increase in Repo-man HyPro abundance and it is difficult to assess its reaction efficiency to compare with HIF1a HyPro.

      The analysis shown in Figure 9 was not intended to quantify the efficiency of in vitro hydroxylation of RepoMan by PHD1, but rather to answer the question, ‘Can recombinant PHD1 alone hydroxylate P604 on RepoMan in vitro, yes or no?’. The data show that the answer here is ‘yes’. Clearly, the HIF peptide is a more efficient substrate in vitro for recombinant PHD1 than the RepoMan peptide and we have now included a statement in the Discussion that addresses the significance of this observation more directly.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, Jiang et al. developed a robust workflow for identifying proline hydroxylation sites in proteins. They identified proline hydroxylation sites in HEK293 and RCC4 cells, respectively. The authors found that the more hydrophilic HILIC fractions were enriched in peptides containing hydroxylated proline residues. These peptides showed differences in charge and mass distribution compared to unmodified or oxidized peptides. The intensity of the diagnostic hydroxyproline immonium ion depended on parameters including MS collision energy, parent peptide concentration, and the sequence of amino acids adjacent to the modified proline residue. Additionally, they demonstrate that a combination of retention time in LC and optimized MS parameter settings reliably identifies proline hydroxylation sites in peptides, even when multiple proline residues are present.

      Strengths:

      Overall, the manuscript presents an advanced, standardized protocol for identifying proline hydroxylation. The experiments were well designed, and the developed protocol is straightforward, which may help resolve confusion in the field.

      Weaknesses:

      (1) The authors should provide a summary of the standard protocol for identifying proline hydroxylation sites in proteins that can easily be followed by others.

      This is a good suggestion and we have now included a figure (Figure 10) with a summary of our workflow in the current revision.

      (2) Cockman et al. proposed that HIF-α is the only physiologically relevant target for PHDs. Their approach is considered the gold standard for identifying PHD targets. Therefore, the authors should discuss the major progress they made in this manuscript that challenges Cockman's conclusion.

      While we had mentioned the Cockman et al., paper in the Introduction, we had not focussed on this somewhat controversial issue. However, in response to the Reviewer’s request, we have now added a comment in the Discussion section in the current revision of how our new data address the proposal discussed previously by Cockman et al. In brief, we believe that our findings are not consistent with a model in which PHDs have no protein targets other than HIFs.

      Reviewer #3 (Public review): 

      Summary:

      The authors present a new method for detecting and identifying proline hydroxylation sites within the proteome. This tool utilizes traditional LC-MS technology with optimized parameters, combined with HILIC-based separation techniques. The authors show that they pick up known hydroxy-proline sites and also validate a new site discovered through their pipeline.

      Strengths:

      The manuscript utilizes state-of-the-art mass spectrometric techniques with optimized collision parameters to ensure proper detection of the immonium ions, which is an advance compared to other similar approaches before. The use of synthetic control peptides on the HILIC separation step clearly demonstrates the ability of the method to reliably distinguish hydroxy-proline from oxidized methionine - containing peptides. Using this method, they identify a site on CDCA2, which they go on to validate in vitro and also study its role in regulation of mitotic progression in an associated manuscript.

      Weaknesses:

      Despite the authors' claim about the specificity of this method in picking up the intended peptides, there is a good amount of potential false positives that also happen to get picked (owing to the limitations of MS-based readout), and the authors' criteria for downstream filtering of such peptides require further clarification. In the same vein, greater and more diverse cell-based validation approach will be helpful to substantiate the claims regarding enrichment of peptides in the described pathway analyses.

      We of course agree that false positives may arise, as is true for essentially all PTM studies. There are two issues here: first, are the identified sites technically correct (i.e. not misidentifications from the MS data), and second, are the identified modifications of biological significance? We have addressed the first issue using the popular MaxQuant software suite to evaluate the modifications identified and to control the false discovery rate (FDR) at both the precursor and protein level, as described in the manuscript. We are aware that false positives could arise from confusing oxidation of methionine with hydroxylation of proline. Therefore, to address the issue as to whether we could identify bona fide PHD protein targets outside of the HIF family, we adopted a conservative approach by simply filtering out peptides where there was a methionine residue within three amino acids of the predicted proline hydroxylation site. This was a pragmatic decision made to reduce the likelihood of false positives in our dataset and we recognise that this likely results in us overlooking some genuine proline hydroxylation sites that occur near methionine residues. To address the potential biological relevance of the proline hydroxylation sites identified, we analysed extracts from cells treated with FG inhibitors. Of course, a detailed understanding of biological significance relies upon follow-on experimental analyses for each site, which we have performed for P604 on RepoMan in the accompanying study by Druker et al. (eLife 2025; doi.org/10.7554/eLife.108131.1).
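
      The methionine-proximity filter described in this response can be illustrated with a short sketch. This is not the authors' code: the function names, the 0-based site index, and the choice to check three residues on both sides of the proline are assumptions made here for illustration only.

```python
# Illustrative sketch only: names and the symmetric window are assumptions,
# not taken from the manuscript or from MaxQuant.

def has_nearby_methionine(sequence: str, pro_index: int, window: int = 3) -> bool:
    """Return True if a methionine (M) lies within `window` residues
    of the proline at 0-based position `pro_index`."""
    if sequence[pro_index] != "P":
        raise ValueError("pro_index does not point at a proline residue")
    start = max(0, pro_index - window)
    end = min(len(sequence), pro_index + window + 1)
    return "M" in sequence[start:end]


def filter_candidates(candidates, window: int = 3):
    """Keep only (sequence, pro_index) pairs with no methionine near the site."""
    return [
        (seq, idx)
        for seq, idx in candidates
        if not has_nearby_methionine(seq, idx, window)
    ]


# Hypothetical peptide sequences, invented for this example.
candidates = [("AAPGMK", 2), ("MAAAAPGK", 5), ("GPPGAK", 1)]
kept = filter_candidates(candidates)
```

      In this toy input the first peptide is discarded because a methionine sits two residues after the proline, while the second is kept because its methionine is five residues away. As the response notes, a filter of this kind trades sensitivity for specificity, so genuine sites close to a methionine are lost.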

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) The finding that the immonium ion intensities of L/I did not increase with increasing collision energy was surprising. Was this specific to this synthetic peptide?

      We agree this is an interesting and unexpected finding. We have no reason to believe that it is specific to synthetic peptides per se, but rather think this reflects an effect of amino acid composition in the peptides analysed. It will be interesting to explore this phenomenon in more detail in future.

      (2) The sequence logos in Figure 4 seemed to lack any amino acid enrichment in most positions except for collagen peptides. Have these findings been tested with statistical analysis?

      The results we show for sequence logo analysis were generated using WebLogo (10.1101/gr.849004) and correspond to an analysis of all proline hydroxylated peptides we detected across all cell lines and replicates analysed. The fact that collagens are highly abundant proteins with very high levels of proline hydroxylation likely explains why collagen peptides dominated the outcome of the sequence logo analysis. There is clearly scope for more detailed follow-up analysis in future of the sequence specificity of proline hydroxylation sites in non-collagen proteins that are validated PHD targets.

      (3) Overall figure quality was not ideal. The resolution and font sizes of figures should be carefully evaluated and adjusted. The figure legend should contain a title for the figure. Annotations of the figures were somewhat confusing. 

      We agree with the criticism of the figure resolution in the review copies - the lower resolution appears to have been generated after we had uploaded higher resolution original images. We are providing again higher resolution versions of all figures for the current revision.

      Reviewer #3 (Recommendations for the authors):

      Certain concerns regarding portions of the manuscript that need addressing:

      (1) "These data show that two different cell lines show unique profiles of proteins with hydroxylated peptides." - It is difficult to make this statement conclusively after profiling the prolyl hydroxy proteome from just two cell lines, especially since the amino acids with the highest frequency in the most enriched peptides are similar in both cell lines.

      We agree with this point and have changed the current revision to state instead, “This shows that each of the two cell lines analysed have distinct profiles.”

      (2) "We noted that there was a high frequency of a methionine residues appearing either at the first, second, or even third positions after the HyPro site.." - according to the authors' claim, the advantage of their method was that they were able to overcome the limitation of older methods that couldn't separate methionine oxidation from proline hydroxylation. However, in this statement, they say that the high frequency of methionine residues may be because of the similar mass shift. These statements are contradictory. The authors should either tone down the claim or prove that those are indeed hydroxyproline sites. Is it possible that, in the filtering step that excludes these high-frequency methionine-containing peptides, we are losing potential positive hits for hydroxy-proline sites? What is the authors' take on this?

      We respectfully do not agree that our, “statements are contradictory”, with respect to the potential confusion between identification of methionine oxidation and proline hydroxylation, but acknowledge that we have not explained this issue clearly enough. It is a fact that the similar mass shift resulting from proline hydroxylation and methionine oxidation is a technical challenge that can potentially lead to misidentifications in MS studies and that is what we state clearly in the manuscript. We have addressed this issue head on experimentally in this study and show using synthetic peptides how detailed analysis of specific proline hydroxylation sites in target proteins can be distinguished from methionine oxidation, based upon differential chromatographic behaviour of peptides with either hydroxylated proline or oxidised methionine, as well as by detailed analysis of fragmentation spectra. However, in the case of our global analysis, as we were not able to perform synthetic peptide comparisons for every putative site identified, we took the pragmatic approach of filtering out examples of peptides where a methionine residue was present within three residues of a potential proline hydroxylation site. This was done simply to reduce the possibility of misidentification in the set of novel proline hydroxylated peptides identified and we accept that as a consequence we are likely filtering out peptides that include bona fide proline hydroxylation sites. We have clarified this point in the current revision and hope to be able to address this issue more comprehensively in future studies.

      (3) "Accordingly, a score cut-off of 40 for hydroxylated peptides and a localisation probability cut-off of more than 0.5 for hydroxylated peptides was performed." Could the authors shed more light and clarify what was the basis for this value of cut-off to be used in this filtering step? Is this sample dependent? What should be the criteria to determine this value?

      We used MaxQuant software (10.1016/j.cell.2006.09.026) for PTM analysis, in which a localization probability score of 0.75 and a score cut-off of 40 are commonly used thresholds to define high confidence. The reason that we used 0.5 at the first step was to investigate how likely it might be that the misassignment of delta m/z +16 Da (oxidation) on methionine would affect the identification of hydroxylation on proline. However, we note that in the final results set used for analysis, all putative proline hydroxylated peptides that had a methionine residue near to the hydroxylated proline were disregarded as a pragmatic step to reduce the probability of false identifications.
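
      As a minimal sketch of the two-threshold filtering discussed here, the snippet below applies a score cut-off and a localisation probability cut-off to a list of site identifications. The record type, its field names, and the choice of an inclusive score comparison are hypothetical; they do not reproduce MaxQuant's output format or the authors' scripts.

```python
# Illustrative sketch only: the record type and field names are hypothetical.
from dataclasses import dataclass


@dataclass
class SiteIdentification:
    peptide: str
    score: float               # identification score of the modified peptide
    localization_prob: float   # probability that the modification is on this residue


def passes_thresholds(site: SiteIdentification,
                      min_score: float = 40.0,
                      min_loc_prob: float = 0.5) -> bool:
    """Score must reach min_score (assumed inclusive); localisation
    probability must be strictly greater than min_loc_prob."""
    return site.score >= min_score and site.localization_prob > min_loc_prob


# Hypothetical values, invented for this example.
sites = [
    SiteIdentification("PEPTIDEA", score=45.0, localization_prob=0.60),
    SiteIdentification("PEPTIDEB", score=39.0, localization_prob=0.90),
    SiteIdentification("PEPTIDEC", score=80.0, localization_prob=0.95),
]

exploratory = [s.peptide for s in sites if passes_thresholds(s)]
high_confidence = [s.peptide for s in sites if passes_thresholds(s, min_loc_prob=0.75)]
```

      Running the same records through both settings makes the effect of the looser first-pass threshold visible: the site with a localisation probability of 0.60 survives the exploratory pass but not the stricter 0.75 cut-off.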

      (4) The authors are requested to kindly make the HPLC and MS traces more legible and use high-resolution images, with clearly labeled values on the peaks. Kindly extract coordinates from the underlying data files to plot the curves if needed to make it clearer.

      We have reviewed the clarity of all images and figures in the current revision.

      (5) There seems to be no error bars in Figure 3, Figure 7E, and panels of Figure 8 with bar graphs. Are those single replicate data?

      These specific figures are from single replicate data.

      (6) For Figure 9C, the control with only PHD1 (no peptide) is missing. 

      The ‘no peptide control’ was not included in the figure because it is simply a blank lane and there is nothing to see.

    1. except for a few damp kimonos flapping on balcony clotheslines.

      I like this visual. Having wet clothes drying on a line outside is something everyone recognizes, whether it is something they grew up doing, or they simply saw in multiple pieces of media. Thus it is very easy to picture that scene in your head. At the same time, the author manages to make it hint at the location by specifying that kimonos are drying as opposed to just leaving it at a general statement of clothes are drying.

    2. “I believe I should learn to be more patient.”

      The descriptions of the scenes and the people the author talks to are so well done. The dialogue adds to it as well. When I read this part, I could picture Mr. Watanabe in my head and could almost hear him talking.

    3. but anyway then a man went climbing Mount Fuji

      I'm hoping the long sentences are just done on purpose now that I'm reading more. I'm reading this and feeling the same way I do whenever a child tries to tell me what they did on the weekend. A few times feels purposeful, but the lack of a break makes my head spin.

    1. from a financial perspective you should do all you can to graduate as soon as you can.

      With so much weighing on a college student, I think it is not fair to have this thought of finances lingering over your head. Focusing on classes, studying and just surviving is enough.

    1. A typical sound recording device would not be able to record the microwave sound because it is generated inside the subject’s head. This includes any commercially available cellular mobile telephone. On the contrary, if it was easy to record the sound with a typical sound recording device, most persons in the same room or environment should be able to hear the loud sound.

      Illustrates the feasibility of the phenomenon

    2. The symptoms include headaches, severe hearing loss, ringing in the ears, nausea, and problems with balance or vertigo, and are suggestive of a connection to the inner ear apparatus within the human head

      Symptoms

    1. Moreour on the third day of Aprill wee heard that after theis Rogues had gotten the Pynnace, and had taken all furnitures as peeces, sword, armour, Coat of male. Powder, shot and all the thinges that they had to trade withall, they killed the Captaine, and Cut of his head, and rowing with the taile of the boat formost they set vp a pole and put the Captaines head vpon it, and so rowed home, then the Deuill set them on againe, so that they furnished about 200 Canoes with aboue 1000 Indians, and came and thought to haue taken the shipp, but sheewas too quicke for them wch thing was very much talked of, for they alwayes feared a ship, but now the Rogues growe verie bold, and can vse peeces, some of them, as well or better then an Englishman, ffor an Indian did shoote with Mr Charles my Mrs Kindsman at a m

      April 3 letter - Indian victory and accelerating catastrophe: the Indians capture the pinnace along with its arms and ammunition, kill the captain, and muster 200 canoes with 1000 warriors. The English are outnumbered and outgunned. This escalation shows the colony is losing control and facing near-total annihilation.

    1. Reviewer #1 (Public review):

      Summary:

      Renard, Ukrow et al. applied their recently published computational pipeline (CHROMAS) to the skin of Euprymna berryi and Sepia officinalis to track the dynamics of cephalopod chromatophore expansion. By segmenting each chromatophore into radial slices and analyzing the co-expansion of slices across regions of the skin, they inferred the motor control underlying chromatophore groups.

      Strengths:

      The authors demonstrate that most motor units of cephalopod skin include a subregion of multiple chromatophores, creating "virtual chromatophores" in between the fixed chromatophores. This is an interesting concept that challenges prevailing models of chromatophore organization, and raises interesting possibilities for how chromatophore arrays may be patterned during development.

      This study introduces new analyses of cephalopod skin that will be valuable for the quantitative study of cephalopod behavior.

      Weaknesses:

      The authors chose to image spontaneous skin changes in sedated animals, rather than visually-evoked skin changes in awake, freely-moving animals. Spontaneous chromatophore changes tend to be small shimmers of expansion and contraction, rather than obvious, sizable expansions. This may make it more challenging to distinguish truly co-occurring expansions from background activity. The authors don't provide any raw data (videos) of the skin, so it is difficult to independently assess the robustness of the inferred chromatophore groupings.

      The patch-clamp experiments in E. berryi are used to test the validity of their approach for inferring motor units. The stimulations evoke expansions of sub-regions of each chromatophore, creating "virtual chromatophores" as predicted from the behavioral analysis. However, the authors were not able to predict these specific motor units from behavioral analysis before confirming them with patch-clamp, limiting the strength of the validation. It would be informative to quantify the results of the patch-clamp experiments - are the inferred motor units of similar sizes to those predicted from behavior?

      The authors report testing multiple experimental conditions (e.g., age, size, behavioral stimuli, sedation, head-fixation, and lighting), but only a small subset of these data are presented. It is difficult to determine which conditions were used for which experiments, and the manuscript would benefit from pooling data from multiple experiments to draw general conclusions about the motor control of cephalopod skin.

      The authors use a different clustering algorithm for E. berryi and S. officinalis, but do not discuss why different clustering approaches were required for the two species.

      Impact:

      The authors use their computational pipeline to generate a number of interesting predictions about chromatophore control, including motor unit size, their spatial distribution within the skin, and the independent control of subregions within individual chromatophores by putatively distinct motor neurons. While these observations are interesting, the current data do not yet fully support them.

      The CHROMAS tool is likely to be valuable to the field, given the need for quantitative frameworks in cephalopod biology. The predictions outlined here provide a useful foundation for future experimental investigation.

    1. I remember my father saying, “Que bueno, mi’ha, that’s good.” That meant a lot to me, especially since my brothers thought the idea hilarious. What I didn’t realize was that my father thought college was good for girls—good for finding a husband. After four years in college and two more in graduate school, and still no husband, my father shakes his head even now and says I wasted all that education.

      This part really played with my emotions because at first I thought her father was supportive, but later I realized he only saw college as a tool for her to find someone to marry instead of getting an education.

    1. I’ve been crossing things out every time I take a moment to think.

      This shows that this kind of behavior from others does affect her: she crosses out some of her own dialogue, even in her own head, so that it makes sense to others.

  4. Jan 2026
    1. Calendar Planners and To-Do Lists Calendar planners and to-do lists are effective ways to organize your time. Many types of academic planners are commercially available (check your college bookstore), or you can make your own. Some people like a page for each day, and some like a week at a time. Some use computer calendars and planners. Almost any system will work well if you use it consistently. Some college students think they don’t need to actually write down their schedule and daily to-do lists. They’ve always kept it in their head before, so why write it down in a planner now? Some first-year students were talking about this one day in a study group, and one bragged that she had never had to write down her calendar because she never forgot dates. Another student reminded her how she’d forgotten a preregistration date and missed taking a course she really wanted because the class was full by the time she went online to register. “Well,” she said, “except for that time, I never forget anything!” Of course, none of us ever forgets anything—until we do. Calendars and planners help you look ahead and write in important dates and deadlines so you don’t forget. But it’s just as important to use the planner to schedule your own time, not just deadlines. For example, you’ll learn later that the most effective way to study for an exam is to study in several short periods over several days. You can easily do this by choosing time slots in your weekly planner over several days that you will commit to studying for this test. You don’t need to fill every time slot, or to schedule every single thing that you do, but the more carefully and consistently you use your planner, the more successfully will you manage your time. But a planner cannot contain every single thing that may occur in a day. We’d go crazy if we tried to schedule every telephone call, every e-mail, every bill to pay, every trip to the grocery store. 
For these items, we use a to-do list, which may be kept on a separate page in the planner. Check the example of a weekly planner form in Figure 2.5 “Weekly Planner”. (You can copy this page and use it to begin your schedule planning. By using this first, you will find out whether these time slots are big enough for you or whether you’d prefer a separate planner page for each day.) Fill in this planner form for next week. First write in all your class meeting times; your work or volunteer schedule; and your usual hours for sleep, family activities, and any other activities at fixed times. Don’t forget time needed for transportation, meals, and so on. Your first goal is to find all the blocks of “free time” that are left over. Remember that this is an academic planner. Don’t try to schedule in everything in your life—this is to plan ahead to use your study time most effectively. Next, check the syllabus for each of your courses and write important dates in the planner. If your planner has pages for the whole term, write in all exams and deadlines. Use red ink or a highlighter for these key dates. Write them in the hour slot for the class when the test occurs or when the paper is due, for example. (If you don’t yet have a planner large enough for the whole term, use Figure 2.5 “Weekly Planner” and write any deadlines for your second week in the margin to the right. You need to know what’s coming next week to help schedule how you’re studying this week.)

      Calendar planners and to-do lists help students organize their time and avoid forgetting important dates. Writing schedules down is more reliable than keeping everything in your head, because everyone forgets things sometimes. Planners are not only for deadlines but also for scheduling study time in advance so work is spread out and less stressful. To-do lists are useful for smaller daily tasks that don’t fit into a planner, helping you stay organized without feeling overwhelmed.

    2. Procrastination is a way of thinking that lets one put off doing something that should be done now. This can happen to anyone at any time. It’s like a voice inside your head keeps coming up with these brilliant ideas for things to do right now other than studying: “I really ought to get this room cleaned up before I study” or “I can study anytime, but tonight’s the only chance I have to do X.” That voice is also very good at rationalizing: “I really don’t need to read that chapter now; I’ll have plenty of time tomorrow at lunch.…” Procrastination is very powerful. Some people battle it daily, others only occasionally. Most college students procrastinate often, and about half say they need help avoiding procrastination. Procrastination can threaten one’s ability to do well on an assignment or test. People procrastinate for different reasons. Some people are too relaxed in their priorities, seldom worry, and easily put off responsibilities. Others worry constantly, and that stress keeps them from focusing on the task at hand. Some procrastinate because they fear failure; others procrastinate because they fear success or are so perfectionistic that they don’t want to let themselves down. Some are dreamers. Many different factors are involved, and there are different styles of procrastinating. Just as there are different causes, there are different possible solutions for procrastination. Different strategies work for different people. The time management strategies described earlier can help you avoid procrastination. Because this is a psychological issue, some additional psychological strategies can also help: Since procrastination is usually a habit, accept that and work on breaking it as you would any other bad habit: one day at a time. Know that every time you overcome feelings of procrastination, the habit becomes weaker—and eventually you’ll have a new habit of being able to start studying right away. 
Schedule times for studying using a daily or weekly planner. Carry it with you and look at it often. Just being aware of the time and what you need to do today can help you get organized and stay on track. If you keep thinking of something else you might forget to do later (making you feel like you “must” do it now), write yourself a note about it for later and get it out of your mind. Counter a negative with a positive. If you’re procrastinating because you’re not looking forward to a certain task, try to think of the positive future results of doing the work. Counter a negative with a worse negative. If thinking about the positive results of completing the task doesn’t motivate you to get started, think about what could happen if you keep procrastinating. You’ll have to study tomorrow instead of doing something fun you had planned. Or you could fail the test. Some people can jolt themselves right out of procrastination. On the other hand, fear causes procrastination in some people—so don’t dwell on the thought of failing. If you’re studying for a test, and you’re so afraid of failing it that you can’t focus on studying and you start procrastinating, try to put things in perspective. Even if it’s your most difficult class and you don’t understand everything about the topic, that doesn’t mean you’ll fail, even if you may not receive an A or a B. Study with a motivated friend. Form a study group with other students who are motivated and won’t procrastinate along with you. You’ll learn good habits from them while getting the work done now. Keep a study journal. At least once a day write an entry about how you have used your time and whether you succeeded with your schedule for the day. If not, identify what factors kept you from doing your work. (Use the form at the end of this chapter.) This journal will help you see your own habits and distractions so that you can avoid things that lead to procrastination. Get help. 
If you really can’t stay on track with your study schedule, or if you’re always putting things off until the last minute, see a college counselor. They have lots of experience with this common student problem and can help you find ways to overcome this habit.

      Procrastination is a common habit where people delay important tasks by making excuses to do something else. It affects many students for different reasons, such as stress, fear of failure, or poor time management, and can hurt academic performance. However, with planning, positive thinking, and the right strategies, procrastination can be reduced and overcome over time.

    1. In 1449, one out of every three immigrant New Yorkers was Caribbean born, with Dominicans heading the list (see tables 2 and 6).7 In fact, they are the largest new immigrant group in the city, accounting for just over 299,999, or about 12 percent

      Why did New York City attract such a high Caribbean population when Miami is much closer geographically and seemingly more convenient? Perhaps it is because NYC is well-known as an immigrant city/hub, with a much more diverse scene and also many economic opportunities.

    1. For instance, a computationally sophisticated MIT student who is also an expert gamer instanced a particular dramatic moment from the text-based Zork II as among his lifetime favorites: The story involves a dragon that is slow to rouse but always lethal if you persist in fighting him. Elsewhere in the dungeon is a wall of ice that is impossible to pass. What you must do is attack the dragon enough to get his attention—but not so much that he “toasts” you—and then run and head for the wall of ice. The dragon follows, sees his reflection in the ice, and thinks it is another dragon. He rears up and breathes fire at his enemy; as he does so, the ice melts, drowning the dragon and eliminating the obstructing wall

      Through the medium of a game, agency is limited, but the environment is engineered so that helplessness becomes just another step or obstacle that adds to the enjoyment. Much like Odysseus, you're placed in a situation where there's not much to work with, but through your agency you can make the most of what is there to get out of it.

    1. Taliban gunman shot her in the head because she had spoken publicly in favor of girls’ right to education.

      This just goes to show that freedom of speech isn't found all over the world, and people die every day for what they say.

    2. Taliban gunman shot her in the head because she had spoken publicly in favor of girls’ right to education.

      I find Malala to be a huge inspiration to me, and I feel that she is strong for what she has done and I look up to her.

    1. Poor Germans that we are--we have been deceived! We were born to be masters, and we have been compelled to bow the head beneath the yoke of our tyrants, and to become slaves.

      I think this is Luther trying to express the corruption and authority of the Catholic Church by comparing it to slavery.

    1. e need fast thinking to get through the day—emotions, gut instincts, reflexes, intuitions serve us w

      I appreciate how the importance of both fast and slow thinking is mentioned. Fast thinking is important in busy environments and settings that require you to have common sense or knowledge off the top of your head, while slow thinking is needed when you encounter a situation that is high-risk, gives you time to think, and requires research.

    1. “And the only reason for making honey is so as I can eat it.

      Children have difficulty imagining the world outside of their perspective, e.g., a child nodding or shaking their head while on the phone with someone who cannot see them.

    1. He can have no true regard for me, or he would not have listened to her; and she, with her little rebellious heart and indelicate feelings, to throw herself into the protection of a young man with whom she has scarcely ever exchanged two words before! I am equally confounded at her impudence and his credulity. How dared he believe what she told him in my disfavour! Ought he not to have felt assured that I must have unanswerable motives for all that I had done? Where was his reliance on my sense and goodness then? Where the resentment which true love would have dictated against the person defaming me—that person, too, a chit, a child, without talent or education, whom he had been always taught to despise?

      Lady Susan is an absolutely intriguing character, albeit heavily dislikeable to me personally. That being said, she's spectacularly written; the characterization within this passage alone is outstanding! It goes without saying that Susan does not like being played with, and that is exactly what is being described here. She already has a large amount of unwarranted distaste for her own daughter, which she has convinced Reginald to share, and is now infuriated by the mere thought of her daughter going behind her back and, in her head, twisting what she has convinced Reginald of. In short, she does not like being made out to be the bad guy. Though this is a very heated letter, I did sense some interesting conflicting feelings towards her daughter, Frederica. Though Susan's dislike of her child is palpable, there still seems to be a sense of caring. After all, she is knowledgeable enough to realize Frederica is romantically pursuing Reginald, but Susan is angered by Reginald toying with this likely one-sided romantic interest from her daughter. It is an incredibly interesting family dynamic between mother and daughter, especially with Reginald involved.

    1. Where do you see parasocial relationships on social media? In what ways are you in parasocial relationships? What are the ways in which a parasocial relationship can be authentic or inauthentic? both for the celebrity and for the viewer/follower

      I am very deep into fandom spaces, and I see very frequently parasocial relationships festering. Typically, I see parasocial relationships forming in children and teenagers the most in these spaces, typically also consisting of fandoms with a large neurodivergent population. Usually this happens with creators that have their face on screen or a lot of time to communicate with their viewers, the main example being streamers and their chats. Being able to receive a reply in real time is like a conversation, which can lead to someone believing they are friends with the streamer they are typing to. I'm sure I have some parasocial relationships currently and don't really realize it. I know in the past I had a few with streamers I watched, however I always kept a degree of separation in my head: I do not truly know these people.

    1. Reviewer #3 (Public review):

      The goal of the work is to establish the linkage between the spatial transcription factors (STFs) that function transiently to establish the identities of the individual NBs and the terminal selector genes (typically homeodomain genes) that appear in the new-born post-mitotic neurons. How is the identity of the NB maintained and carried forward after the spatial genes have faded away? Focusing on a single neuroblast (NB 7-1), the authors present evidence that the fork-head transcription factor, fd4, provides a bridge linking the transient spatial cues that initially specified neuroblast identity with the terminal selector genes that establish and maintain the identity of the stem cell's progeny.

      The study is systematic, concise and takes full advantage of 40+ years of work on the molecular players that establish neuronal identities in the Drosophila CNS. In the embryonic VNC, fd4 is expressed only in the NB 7-1 and its lineage. They show that Fd4 appears in the NB while the latter is still expressing the Spatial Transcription Factors and continues after the expression of the latter fades out. Fd4 is maintained through the early life of the neuronal progeny but then declines as the neurons turn on their terminal selector genes. Hence, fd4 expression is compatible with it being a bridging factor between the two sets of genes.

      Experimental support for the "bridging" role of Fd4 comes from a set of loss-of-function and gain-of-function manipulations. The loss of function of fd4, and the partially redundant gene fd5, from lineage 7-1 does not affect the size of the lineage, but terminal markers of late-born neuronal phenotypes, like Eve and Dbx, are reduced or missing. By contrast, ectopic expression of fd4, but not fd5, results in ectopic expression of the terminal markers eve and dbx throughout diverse VNC lineages.

      A detailed test of fd4's expression was then carried out using lineages 7-3 and 5-6, two well-characterized lineages in Drosophila. Lineage 7-3 is much smaller than 7-1 and continues to be so when subjected to fd4 misexpression. However, under the influence of ectopic fd4 expression, the lineage 7-3 neurons lost their expected serotonin and corazonin expression and showed Eve expression as well as motoneuron phenotypes that partially mimic the U motoneurons of lineage 7-1.

      Ectopic expression of Fd4 also produced changes in the 5-6 lineage. Expression of apterous, a feature of lineage 5-6, was suppressed, and expression of the 7-1 marker, Eve, was evident. Dbx expression was also evident in the transformed 5-6 lineages but extremely restricted as compared to a normal 7-1 lineage. Considering the partial redundancy of fd4 and fd5, it would have been interesting to express both genes in the 5-6 lineage. The anatomical changes that are exhibited by motoneurons in response to fd4 expression confirm that these cells do, indeed, show a shift in their cellular identity.

      Comments on revisions:

      The authors adequately addressed all of the issues that I had with the original submission.

      Their responses to the other reviewers are also well-reasoned.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Lai and Doe address the integration of spatial information with temporal patterning and genes that specify cell fate. They identify the Forkhead transcription factor Fd4 as a lineage-restricted cell fate regulator that bridges transient spatial transcription factors to terminal selector genes in the developing Drosophila ventral nerve cord. The experimental evidence convincingly demonstrates that Fd4 is both necessary for late-born NB7-1 neurons and sufficient to transform other neural stem cell lineages toward the NB7-1 identity. This work addresses an important question that will be of interest to developmental neurobiologists: How can cell identities defined by initial transient developmental cues be maintained in the progeny cells, even if the molecular mechanism remains to be investigated? In addition, the study proposes a broader concept of lineage identity genes that could be utilized in other lineages and regions in the Drosophila nervous system and in other species.

      Thanks for the accurate summary and positive comments!

      While the spatial factors patterning the neuroepithelium to define the neuroblast lineages in the Drosophila ventral nerve cord are known, these factors are sometimes absent or not required during neurogenesis. In the current work, Lai and Doe identified Fd4 in the NB7-1 lineage that bridges this gap and explains how NB7-1 neurons are specified after Engrailed (En) and Vnd cease their expression. They show that Fd4 is transiently co-expressed with En and Vnd and is present in all nascent NB7-1 progenies. They further demonstrate that Fd4 is required for later-born NB7-1 progenies and sufficient for the induction of NB7-1 markers (Eve and Dbx) while repressing markers of other lineages when force-expressed in neural progenitors, e.g., in the NB5-6 lineage and in the NB7-3 lineage. They also demonstrate that, when Fd4 is ectopically expressed in NB7-3 and NB5-6 lineages, this leads to the ectopic generation of dorsal muscle-innervating neurons. The inclusion of functional validation using axon projections demonstrates that the transformed neurons acquire appropriate NB7-1 characteristics beyond just molecular markers. Quantitative analyses are thorough and well-presented for all experiments.

      Thanks for the positive comments!

      (1) While Fd4 is required and sufficient for several later-born NB7-1 progeny features, a comparison between early-born (Hb/Eve) and later-born (Run/Eve) appears missing for pan-progenitor gain of Fd4 (with sca-Gal4; Figure 4) and for the NB7-3 lineage (Figure 6). Having a quantification for both could make it clearer whether Fd4 preferentially induces later-born neurons or is sufficient for NB7-1 features without temporal restriction.

      We quantified the percentage of Hb+ and Runt+ cells among Eve+ cells with sca-gal4, and the results are shown in Figure 4-figure supplement 1. We found that the proportion of early-born cells is slightly reduced but the proportion of later-born cells remains similar. Interestingly, we also found a subset of Eve+ cells with a mixed fate (Hb+Runt+) but the reason remains unclear.

      (2) Fd4 and Fd5 are shown to be partially redundant, as Fd4 loss of function alone does not alter the number of Eve+ and Dbx+ neurons. This information is critical and should be included in Figure 3.

      Because every hemisegment in an fd4 single mutant is normal, we just added it as the following text: “In fd4 mutants, we observe no change in the number of Eve+ neurons or Dbx+ neurons (n=40 hemisegments).”

      (3) Several observations suggest that lineage identity maintenance involves both Fd4-dependent and Fd4-independent mechanisms. In particular, the fact that fd4-Gal4 reporter remains active in fd4/fd5 mutants even after Vnd and En disappear indicates that Fd4's own expression, a key feature of NB7-1 identity, is maintained independently of Fd4 protein. This raises questions about what proportion of lineage identity features require Fd4 versus other maintenance mechanisms, which deserves discussion.

      We agree, thanks for raising this point. We add the following text to the Discussion. “Interestingly, the fd4 fd5 mutant maintains expression of fd4:gal4, suggesting that the fd4/fd5 locus may have established a chromatin state that allows “permanent” expression in the absence of Vnd, En, and Fd4/Fd5 proteins.”

      (4) Similarly, while gain of Fd4 induces NB7-1 lineage markers and dorsal muscle innervation in NB5-6 and NB7-3 lineages, drivers for the two lineages remain active despite the loss of molecular markers, indicating some regulatory elements retain activity consistent with their original lineage identity. It is therefore important to understand the degree of functional conversion in the gain-of-function experiments. Sparse labeling of Fd4 overexpressing NB5-6 and NB7-3 progenies, as was done in Seroka and Doe (2019), would be an option.

      We agree it is interesting that the NB7-3 and NB5-6 drivers remain on following Fd4 misexpression. To explore this, we used sca-gal4 to overexpress Fd4 and observed that Lbe expression persisted while Eg was largely repressed (Author response image 1). The results show that Lbe and Eg respond differently to Fd4. A non-mutually exclusive possibility is that the continued expression of lbe-Gal4 UAS-GFP or eg-Gal4 UAS-GFP may be due to the lengthy perdurance of both Gal4 and GFP.

      Author response image 1.

      (5) The less-penetrant induction of Dbx+ neurons in NB5-6 with Fd4-overexpression is interesting. It might be worth the authors discussing whether it is an Fd4 feature or an NB5-6 feature by examining Dbx+ neuron number in NB7-3 with Fd4-overexpression.

      In the NB7-3 lineages misexpressing Fd4, only 5 lineages generated Dbx+ cells (0.1±0.4, n=64 hemisegments), suggesting that the low penetrance of Dbx+ induction is an intrinsic feature of Fd4 rather than lineage context. We have added this information in the results section.

      (6) It is logical to hypothesize that spatial factors specify early-born neurons directly, so only late-born neurons require Fd4, but it was not tested. The model would be strengthened by examining whether Fd4-Gal4-driven Vnd rescues the generation of later-born neurons in fd4/fd5 mutants.

      When we used the en-gal4 driver to express UAS-vnd in the fd4/fd5 mutant background, we found an average of 7.4±2.2 Eve+ cells per hemisegment (n=36), significantly higher than in the fd4/fd5 mutant alone (3.9±0.8 cells, n=52, p=2.6×10⁻¹¹) (Figure 3J). In addition, 0.2±0.5 Eve+ cells were ectopic Hb+ (excluding U1/U2), indicating that Vnd-En integration is sufficient to generate both early-born and late-born Eve+ cells in the fd4/fd5 mutants. We have added the results to the text.

      (7) It is mentioned that Fd5 is not sufficient for the NB7-1 lineage identity. The observation is intriguing in how similar regulators serve distinct roles, but the data are not shown. The analysis in Figure 4 should be performed for Fd5 as supplemental information.

      Thanks for the suggestion. Because the results are exactly the same as the wild type, we don’t think it is necessary to provide additional images or analysis as supplemental information.

      Reviewer #2 (Public review):

      Via a detailed expression analysis, they find that Fd4 is selectively expressed in embryonic NB7-1 and newly born neurons within this lineage. They also undertake a comprehensive genetic analysis to provide evidence that fd4 is necessary and sufficient for the identity of NB7-1 progeny.

      Thanks for the accurate summary!

      The analysis is both careful and rigorous, and the findings are of interest to developmental neurobiologists interested in molecular mechanisms underlying the generation of neuronal diversity. Great care was taken to make the figures clear and accessible. This work takes great advantage of years of painstaking descriptive work that has mapped embryonic neuroblast lineages in Drosophila.

      Thanks for the positive comments!

      The argument that Fd4 is necessary for NB7-1 lineage identity is based on a Fd4/Fd5 double mutant. Loss of fd4 alone did not alter the number of NB7-1-derived Eve+ or Dbx+ neurons. The authors clearly demonstrate redundancy between fd4 and fd5, and the fact that the LOF analysis is based on a double mutant should be better woven through the text. The authors generated an Fd5 mutant. I assume that Fd5 single mutants do not display NB7-1 lineage defects, but this is not stated. The focus on Fd4 over Fd5 is based on its highly specific expression profile and the dramatic misexpression phenotypes. But the LOF analysis demonstrates redundancy, and the conclusions in the abstract and through the results should reflect the existence of Fd5 in the conclusions of this manuscript.

      We agree, and have added new text to clarify the single mutant phenotypes (there are none) and the double mutant phenotype (loss of NB7-1 molecular and morphological features). The following text is added to the manuscript: “Not surprisingly, we found that fd4 single mutants or fd5 single mutants had no phenotype (Eve+ neurons were all normal). Thus, to assess their roles, we generated a fd4 and fd5 double mutant. Because many Eve+ and Dbx+ cells are generated outside of NB7-1 lineage, it was also essential to identify the Eve+ or Dbx+ cells within NB7-1 lineage in wild type and fd4 mutant embryos. To achieve this, we replaced the open reading frame of fd4 with gal4 (called fd4-gal4) (see Methods); this stock simultaneously knocked out both fd4 and fd5 (called fd4/fd5 mutant hereafter) while specifically labeling the NB7-1 lineage. For the remainder of this paper we use the fd4/fd5 double mutant to assay for loss of function phenotypes.”

      It is notable that Fd4 overexpression can rewire motor circuits. This analysis adds another dimension to the changes in transcription factor expression and, importantly, demonstrates functional consequences. Could the authors test whether U4 and U5 motor axon targeting changes in the fd4/fd5 double mutant? To strengthen claims regarding the importance of fd4/fd5 for lineage identity, it would help to address terminal features of U motorneuron identity in the LOF condition.

      Thanks for raising this important point. We examined the axon targeting on body wall muscles in both wild type and in fd4/fd5 mutant background and added the results in Figure 3-figure supplement 2. We found that the axon targeting in the late-born neuron region (LL1) is significantly reduced, suggesting that the loss of late-born neurons in fd4/fd5 mutant leads to the absence of innervation of corresponding muscle targets.

      Reviewer #3 (Public review):

      The goal of the work is to establish the linkage between the spatial transcription factors (STFs) that function transiently to establish the identities of the individual NBs and the terminal selector genes (typically homeodomain genes) that appear in the newborn postmitotic neurons. How is the identity of the NB maintained and carried forward after the spatial genes have faded away? Focusing on a single neuroblast (NB 7-1), the authors present evidence that the fork-head transcription factor, fd4, provides a bridge linking the transient spatial cues that initially specified neuroblast identity with the terminal selector genes that establish and maintain the identity of the stem cell's progeny.

      Thanks for the positive comments!

      The study is systematic, concise, and takes full advantage of 40+ years of work on the molecular players that establish neuronal identities in the Drosophila CNS. In the embryonic VNC, fd4 is expressed only in the NB 7-1 and its lineage. They show that Fd4 appears in the NB while the latter is still expressing the Spatial Transcription Factors and continues after the expression of the latter fades out. Fd4 is maintained through the early life of the neuronal progeny but then declines as the neurons turn on their terminal selector genes. Hence, fd4 expression is compatible with it being a bridging factor between the two sets of genes.

      Thanks for the accurate summary!

      Experimental support for the "bridging" role of Fd4 comes from a set of loss-of-function and gain-of-function manipulations. The loss of function of Fd4, and the partially redundant gene Fd5, from lineage 7-1 does not affect the size of the lineage, but terminal markers of late-born neuronal phenotypes, like Eve and Dbx, are reduced or missing. By contrast, ectopic expression of fd4, but not fd5, results in ectopic expression of the terminal markers eve and Dbx throughout diverse VNC lineages.

      Thanks for the accurate summary!

      A detailed test of fd4's expression was then carried out using lineages 7-3 and 5-6, two well-characterized lineages in Drosophila. Lineage 7-3 is much smaller than 7-1 and continues to be so when subjected to fd4 misexpression. However, under the influence of ectopic Fd4 expression, the lineage 7-3 neurons lost their expected serotonin and corazonin expression and showed Eve expression as well as motoneuron phenotypes that partially mimic the U motoneurons of lineage 7-1.

      Thanks for the positive comments!

      Ectopic expression of Fd4 also produced changes in the 5-6 lineage. Expression of apterous, a feature of lineage 5-6, was suppressed, and expression of the 7-1 marker, Eve, was evident. Dbx expression was also evident in the transformed 5-6 lineages, but extremely restricted as compared to a normal 7-1 lineage. Considering the partial redundancy of fd4 and fd5, it would have been interesting to express both genes in the 5-6 lineage. The anatomical changes that are exhibited by motoneurons in response to Fd4 expression confirm that these cells do, indeed, show a shift in their cellular identity.

      We appreciate the positive comments. We agree double misexpression of Fd4 and Fd5 might give a stronger phenotype (as the reviewer says) but the lack of this experiment does not change the conclusions that Fd4 can promote NB7-1 molecular and morphological aspects at the expense of NB5-6 molecular markers.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      The title of Figure 4 may be intended to include the term "Widespread", not "Wild spread". (Though the expansion of the Eve and Dbx with Fd4 is quite remarkable…).

      Done!

      Reviewer #3 (Recommendations for the authors):

      (1) Line 138. Is part of the sentence missing? Did the authors mean to say "that fd5 is coexpressed with fd4 in NB7-1 and its .....".

      Done!

      (2) ln 237: In trying to explain the "U-like" phenotype of the transformed motoneurons in lineage 7-3, the authors speculate that "perhaps their late birth did not give them time to extend to the most distant dorsal muscles ". It is very difficult to convince a motoneuron to stop growing in the absence of a target! An alternate possibility is that since there is only one or two U neurons made instead of the normal five, the growing motoneuron has enough information to direct them to the dorsal domain, but they lack the specification that allows them to recognize a specific muscle target.

      We agree there are additional possibilities, and now update the text to say: “We observed that these transformed neurons did not innervate the dorsal muscles, perhaps their late birth did not give them time to extend to the most distant dorsal muscles, or they were incompletely specified.”

      (3) In the References, I think that the Anderson et al. reference should also include "BioRxiv" before the DOI.

      Done!

      (4) Figure 6A for wild-type 7-3 lineage. The corazonin expression appears to be expressed in EW2 as well as EW3. This should be explained.

      We agree it looks that way, due to the 3D rotation used; we now replace it with a more representative image. Note that our quantification always shows a single Cor+ neuron per hemisegment.

      (5) Figure 7: Issues of terminology. The designation of "longitudinal" for muscles is traditionally in reference to the body axis, such as the Dorsal Longitudinal Muscles (DLM) of the adult thorax. The "longitudinal" muscles in the figure are really "transverse" muscles. I also suggest using "axon" or "neurites" rather than "filament". For the middle and bottom parts of E and F, are these lateral and ventral views? They should be designated as such.

      Thanks, we agree and have made the changes, using Axon instead of Filament, and labeling the views (lateral and ventro-lateral).

    1. How does your message impact your receiver? Or, in other words, what’s in it for them?

      My message may impact the way the receiver views me; however, it gives them a head start on finding an alternative way of getting the meeting set up, as well as handling the other responsibilities that were assigned to me.

    1. Ronald Reagan was a actor, not at all a factor / Just an employee of the country’s real masters / Just like the Bushes, Clinton and Obama / Just another talking head telling lies on teleprompters / If you don’t believe the theory, then argue with this logic / Why did Reagan and Obama both go after Gaddafi? / We invaded sovereign soil, going after oil / Taking countries is a hobby paid for by the oil lobby / Same as in Iraq and Afghanistan / And Ahmadinejad say they coming for Iran

      Moving from the theme of equality in the United States to its foreign policy, Killer Mike directly accuses the Presidents of the United States, from Reagan to Obama, of being actors and puppets of the country's real masters.

      His argument in support of this is the common political position shared by presidents on opposite sides of the political spectrum, such as Reagan and Obama, toward North Africa and the Middle East, and especially Gaddafi.

      Gaddafi was the authoritarian leader of Libya for 42 years, from 1969 to 2011, and was bombed by the United States both in 1986 and in 2011 in attempts to kill him, with the latter being successful.

      In his arguments against U.S. foreign policy, he also refers to the wars in the Middle East. He argues that the wars in Iraq and Afghanistan during the 2000s and 2010s were actually motivated by a colonial quest for oil and natural resources, adding that Ahmadinejad, then the 6th President of Iran, warned that "they coming for Iran".

    2. The end of the Reagan era, I’m like eleven or twelve or / Old enough to understand that shit had changed forever / They declared the war on drugs, like a war on terror / But what it really did was let the police terrorize whoever / But mostly black boys, but they would call us niggers / And lay us on our belly, while they fingers on they triggers / Boots was on our head, they dogs was on our crotches / And they would beat us up if we had diamonds on our watches / And they would take our and moneys as they pick our pockets / I guess that that’s the privilege of policing for some profits / But thanks to Reaganomics, prison turned to profits / ‘Cause free labor’s the cornerstone of US economics / ‘Cause slavery was abolished, unless you are in prison / You think I am bulshittin’, then read the 13th Amendment / Involuntary servitude and slavery it prohibits / That’s why they givin’ drug offenders time in double digits

      In his second verse, Killer Mike essentially sets out the same theory as Michelle Alexander in "The New Jim Crow". According to him, the War on Drugs and the War on Terror were instruments to control the population and "let the police terrorize whoever".

      The main targets, though, were Black boys, many of whom ended up as victims of police brutality or, in the worst case, as "new slaves" in the prison system boosted by Reaganomics. Under Reagan, in fact, many prisons ended up privatized and became a profitable business; and, according to the 13th Amendment:

      Neither slavery nor involuntary servitude, except as a punishment for crime whereof the party shall have been duly convicted, shall exist within the United States, or any place subject to their jurisdiction.

      In the words of Killer Mike, this is the reason why "they givin' drug offenders time in double digits". Harsh punishments for drug trafficking and possession allow the 13th Amendment to become a way to enforce a new form of slavery through mass incarceration, of which Black boys are the main and disproportionate targets.

      In the America that Killer Mike and Alexander are painting, equality is prevented at every turn by a system designed to keep a racial caste order in place while, at the same time, profiting from it in an indirect and subtle way.

    1. If the truth is told, the youth can grow / They learn to survive until they gain control / Nobody says you have to be gangstas, hoes / Read more, learn more, change the globe / Ghetto children, do your thing / Hold your head up, little man, you’re a king / Young princess, when you get your wedding ring / Your man will sing, “She’s my Queeeeeen”

      In the closing lines of the third verse, Nas is expressing his hope for the future, in which he envisions the possibility of ghetto children changing the world.

      Nas is arguing for the freedom of African-American people from the darker aspects of their history by remembering the parts of it that have been forgotten. In doing so, their future can shine brightly through education and the ambition to reclaim control of their lives.

    1. With sloping masts and dipping prow, As who pursued with yell and blow Still treads the shadow of his foe, And forward bends his head, The ship drove fast, loud roared the blast, And southward aye we fled.

      I enjoy how they describe what I assume to be a dangerous storm.

    1. One of the early ways of social communication across the internet was with Email [e5], which originated in the 1960s and 1970s. These allowed people to send messages to each other, and look up if any new messages had been sent to them.

      The invention of email can be seen as one of the great steps in how people communicate. The time needed to transmit information was reduced to a few minutes. This reminds me of a fun fact: Her Majesty Queen Elizabeth II, the Queen of Great Britain, was one of the earliest adopters of email, and hers was probably the first ever email from a head of state!

    1. Richard, the son of Martha Carrier, having some difference with him, pulled him down by the hair of the head. When he rose again, he was going to strike at Richard Carrier, but fell down flat on his back to the ground and had not power to stir hand or foot until he told Carrier he yielded: and then he saw the shape of Martha Carrier go off his breast.

      Could have just been scared and/or lying.


    1. Maybe the hairs of my head could be counted,” she said, “but no one could ever count my love for you

      The author highlights how Della's love was greater than her hair (to which the author gives great importance).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Significance:

      While most MAVEs measure overall function (which is a complex integration of biochemical properties, including stability), VAMP-seq type measurements more strongly isolate stability effects in a cellular context. This work seeks to create a simple model for predicting the effect of a mutation on the "abundance" measurement of VAMP-seq.

      We thank the reviewer for their evaluation of our work and for their comments and feedback below.

      Of course, there is always another layer of the onion: VAMP-seq measures contributions from isolated thermodynamic stability, stability conferred by binding partners (small molecule and protein), synthesis/degradation balance (especially important in "degron" motifs), etc. Here the authors' goal is to create simple models that can act as a baseline for two main reasons:

      (1) how to tell when adding more information would be helpful for a global model;

      (2) how to detect when a residue/mutation has an unusual profile indicative of an unbalanced contribution from one of the factors listed above.

      As such, the authors state that this manuscript is not intended to be a state-of-the-art method in variant effect prediction, but rather a direction towards considering static structural information for the VAMP-seq effects. At its core, the method is a fairly traditional asymmetric substitution matrix (I was surprised not to see a comparison to BLOSUM in the manuscript) and shows that a subdivision by burial makes the model much more predictive. Despite only having 6 datasets, they show predictive power even when the matrices are based on a smaller number. Another success is rationalizing the VAMP-seq results on relevant oligomeric states.

      We thank the reviewer for their summary of the main points of our work. Based on the suggestion by the reviewer, we have added a comparison to predictions with BLOSUM62 to our revised manuscript, noting that we have previously compared the BLOSUM62 matrix to a broader and more heterogeneous set of scores generated by MAVEs (Høie et al, 2022).

      Specific Feedback:

      Major points:

      The authors spend a good amount of space discussing how the six datasets have different distributions in abundance scores. After the development of their model is there more to say about why? Is there something that can be leveraged here to design maximally informative experiments?

      We believe that these effects arise from a combination of intrinsic differences between the systems and assay-specific effects. For example, biophysical differences between the systems, such as differences in absolute folding stabilities or melting temperatures, will play a role, as will the fact that some proteins contain multiple domains.

      Also, the sequencing-based score for an individual variant in a sort-seq experiment (such as VAMP-seq) depends both on the properties of that variant and on the composition of the entire FACS-sorted cell library. This is because cells are sorted into bins depending on the composition of the entire library, which means that library-to-library composition differences can contribute to the differences between VAMP-seq score distributions. 

      From our developed models and outliers in predictions from these, it is difficult to tell which of the several possible underlying reasons cause the differences. We have briefly expanded the discussion of these points in the manuscript, and we have moreover elaborated on this in subsequent work (Schulze et al., 2025).

      They compare to one more "sophisticated model" - RosettaddG - which should be more correlated with thermodynamic stability than other factors measured by VAMP-seq. However, the direct head-to-head comparison between their matrices and ddG is underdeveloped. How can this be used to dissect cases where thermodynamics are not contributing to specific substitution patterns OR in specific residues/regions that are predicted by one method better than the other? This would naturally dovetail into whether there is orthogonal information between these two that could be leveraged to create better predictions.

      We thank the reviewer for this suggestion and indeed had spent substantial effort trying to gain additional biological insights from variants for which MAVE scores or MAVE predictions do not match predicted ∆∆G values. One major caveat in this analysis is that the experimental MAVE scores, MAVE predictions and the predicted ∆∆G values are rather noisy, making it difficult to draw conclusions based on individual variants or even small subsets of variants.

      In our revised manuscript, we have added an analysis to discover residue substitution profiles that are predicted most accurately either by a ∆∆G model or by our substitution matrix model, thereby avoiding analysis of individual variant effect scores. 

      We find that many substitution profiles are predicted equally well by the two model types, but also that there are residues for which one method predicts substitution effects better than the other method. We have added an analysis of the characteristics of the residues and variants for which either the ∆∆G model or the substitution matrix model is most useful to rank variants. Since we only find relatively few residues for which this is the case, we do not expect a model that leverages predicted scores from both methods to perform better than ThermoMPNN across variants. 

      Perhaps beyond the scope of this baseline method, there is also ThermoMPNN and the work from Gabe Rocklin to consider as other approaches that should be more correlated only with thermodynamics.

      We acknowledge that there are other approaches to predict ∆∆G beyond Rosetta, including for example ThermoMPNN and our own method called RaSP (Blaabjerg et al, eLife, 2023), and we have added comparisons to ThermoMPNN and RaSP in the revised manuscript. We are unsure how one would use the data from Rocklin and colleagues directly, but we note that e.g. RaSP has been benchmarked on this data and other methods have been trained on this data. We originally used Rosetta since the Rosetta model is known to be relatively robust and because it has never seen large databases during training (though we do not think that training of ThermoMPNN and RaSP would be biased towards the VAMP-seq data). We note also that we have previously compared both Rosetta calculations and RaSP with VAMP-seq data for TPMT, PTEN and NUDT15 (Blaabjerg et al, eLife, 2023).

      I find myself drawn to the hints of a larger idea that outliers to this model can be helpful in identifying specific aspects of proteostasis. The discussion of S109 is great in this respect, but I can't help but feel there is more to be mined from Figure S9 or other analyses of outlier higher than predicted abundance along linear or tertiary motifs.

      We agree with these points and have previously spent substantial time trying to make sense of outliers in Figure S9 and Figure S18 (Figure S8 and Figure S18 of revised manuscript). The outlier analysis was challenging, in part due to the relatively high noise levels in both experimental data and predictions, and we did not find any clear signals. Some outliers in e.g. Figure S9 are very likely the result of dataset-specific abundance score distributions, which further complicates the outlier analysis. We now note this in the revised paper and hope others will use the data to gain additional insights on proteostasis-specific effects.  

      Reviewer #2 (Public review):

      Summary:

      This study analyzes protein abundance data from six VAMP-seq experiments, comprising over 31,000 single amino acid substitutions, to understand how different amino acids contribute to maintaining cellular protein levels. The authors develop substitution matrices that capture the average effect of amino acid changes on protein abundance in different structural contexts (buried vs. exposed residues). Their key finding is that these simple structure-based matrices can predict mutational effects on abundance with accuracy comparable to more complex physics-based stability calculations (ΔΔG).

      Major strengths:

      (1) The analysis focuses on a single molecular phenotype (abundance) measured using the same experimental approach (VAMP-seq), avoiding confounding factors present when combining data from different phenotypes (e.g., mixing stability, activity, and fitness data) or different experimental methods.

      (2) The demonstration that simple structural features (particularly solvent accessibility) can capture a significant portion of mutational effects on abundance.

      (3) The practical utility of the matrices for analyzing protein interfaces and identifying functionally important surface residues.

      We thank the reviewer for the comments above and the detailed assessment of our work.

      Major weaknesses:

      (1) The statistical rigor of the analysis could be improved. For example, when comparing exposed vs. buried classification of interface residues, or when assessing whether differences between prediction methods are significant.

      We agree with the reviewer that it is useful to determine if interface residues (or any of the residues in the six proteins) can confidently be classified as buried- or exposed-like in terms of their substitution profiles. Thus, we have expanded our approach to compare individual substitution profiles to the average profiles of buried and exposed residues to now account for the noise in the VAMP-seq data. In our updated approach, we resample the abundance score substitution profile for every residue several thousand times based on the experimental VAMP-seq scores and score standard deviations, and we then compare every resampled profile to the average profiles for buried and exposed residues, thereby obtaining residue-specific distributions of RMSD<sub>buried</sub> and RMSD<sub>exposed</sub> values. These RMSD distributions are typically narrow, since many variants in several datasets have small standard deviations. In the revised manuscript, we report a residue to have e.g. a buried-like substitution profile if RMSD<sub>buried</sub> < RMSD<sub>exposed</sub> for at least 95% of the resampled profiles. We do not recalculate average scores in substitution matrices for this analysis.
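
      As an illustration, the resampling test described above could be sketched as follows. This is a minimal sketch and not the authors' implementation: the function names, the Gaussian noise model, the default of 1000 resamples and the assumption of complete profiles without missing scores are all choices made here for illustration. The response only states that profiles are resampled from the scores and their standard deviations and that a 95% criterion is applied.

```python
import numpy as np


def rmsd(a, b):
    """Root-mean-square deviation between two equally long score profiles."""
    return np.sqrt(np.mean((a - b) ** 2))


def classify_profile(scores, sds, avg_buried, avg_exposed,
                     n_resamples=1000, frac=0.95, seed=0):
    """Label one residue's substitution profile as buried-like or exposed-like.

    scores, sds  : measured abundance scores and their standard deviations
    avg_buried   : average profile of buried residues (same ordering as scores)
    avg_exposed  : average profile of exposed residues
    """
    rng = np.random.default_rng(seed)
    # Assumption: independent Gaussian noise on every variant score.
    samples = rng.normal(loc=scores, scale=sds,
                         size=(n_resamples, len(scores)))
    d_buried = np.array([rmsd(s, avg_buried) for s in samples])
    d_exposed = np.array([rmsd(s, avg_exposed) for s in samples])

    # Fraction of resampled profiles that sit closer to each average profile.
    if np.mean(d_buried < d_exposed) >= frac:
        return "buried-like"
    if np.mean(d_exposed < d_buried) >= frac:
        return "exposed-like"
    return "unassigned"


if __name__ == "__main__":
    avg_buried = np.full(19, 0.2)   # hypothetical average profiles
    avg_exposed = np.full(19, 0.9)
    scores = np.full(19, 0.25)      # hypothetical residue with low scores
    sds = np.full(19, 0.05)
    print(classify_profile(scores, sds, avg_buried, avg_exposed))
```

      A residue whose resampled profiles fall on either side of the comparison in roughly equal proportion stays unassigned, which is how noisy measurements are kept out of the classification.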

      Moreover, to illustrate potential overlap in predictive performance between prediction methods more clearly than in our preprint, we have added confidence intervals in Fig. 2 and Fig. 3 of the revised manuscript. We note that the analysis in Fig. 2 is performed using a leave-one-protein-out approach, which we believe provides the cleanest assessment of how well the different models perform.

      (2) The mechanistic connection between stability and abundance is assumed rather than explained or investigated. For instance, destabilizing mutations might decrease abundance through protein quality control, but other mechanisms like degron exposure could also be at play.

      We agree that we have not provided much description of the relation between stability and abundance in our original preprint. In the revised manuscript, we provide some more detail as well as references to previous literature explaining the ways in which destabilising mutations can cause degradation. We have moreover performed and added additional analyses of the relationship between thermodynamic stability and abundance through comparisons of stability predictions and predictions performed with our substitution matrix models.

      (3) The similar performance of simple matrix-based and complex physics-based predictions calls for deeper analysis. A systematic comparison of where these approaches agree or differ could illuminate the relationship between stability and abundance. For instance, buried sites showing exposed-like behavior might indicate regions of structural plasticity, while the link between destabilization and degradation might involve partial unfolding exposing typically buried residues. The authors have all the necessary data for such analysis but don't fully exploit this opportunity.

      This is similar to a point made by reviewer 1, and our answer is similar. We were indeed hoping that our analyses would have revealed clearer differences between effects on thermodynamic protein stability and cellular abundance and have tried to find clear signals. One major caveat in performing the suggested analysis is that the experimental MAVE scores, ∆∆G predictions and our simple matrix-based predictions are all rather noisy, making it difficult to draw conclusions based on individual variants or even small subsets of variants.

      To address this point, we have added an analysis to discover residue substitution profiles that are predicted most accurately either by a ∆∆G model or by our substitution matrix model, thereby avoiding analysis of individual variant effect scores. We find that many substitution profiles are predicted equally well by the two model types, but we also, in particular, find solvent-exposed residues for which the substitution matrix model is the better predictor. These residues are often aspartate, glutamate and proline, suggesting that surface-level substitutions of these amino acid types often can have effects that are not captured well by a thermodynamic model, either because this model does not describe thermodynamic effects perfectly, or because in-cell effects must be accounted for to provide an accurate description.

      (4) The pooling of data across proteins to construct the matrices needs better justification, given the observed differences in score distributions between proteins (for example, PTEN's distribution is shifted towards high abundance scores while ASPA and PRKN show more binary distributions).

      We agree with the reviewer that the differences between the score distributions are important to investigate further and keep in mind when analysing e.g. prediction outliers. However, our results show that the pooling of VAMP-seq scores across proteins does result in substitution matrices that make sense biochemically and can identify outlier residues with proteostatic functions. As we also respond to a related point by reviewer 1, the differences in score distributions likely have complex origins. In that sense, we also hope that our results can inspire experimentalists to design methods to generate data that are more comparable across proteins.

      For example, biophysical differences between the systems, such as differences in absolute folding stabilities or melting temperatures, will play a role, as will the fact that some proteins contain multiple domains. Also, the sequencing-based score for an individual variant in a sort-seq experiment (such as VAMP-seq) depends both on the properties of that variant and on the composition of the entire FACS-sorted cell library. This is because cells are sorted into bins depending on the composition of the entire library, which means that library-to-library composition differences can contribute to the differences between VAMP-seq score distributions. From our developed models and outliers in predictions from these, it is difficult to tell which of the several possible underlying reasons cause the differences.

      Thus, even when experiments on different proteins are performed using the same technique (VAMP-seq), quantify the same phenomenon (cellular abundance) and are done in similar ways (saturation mutagenesis, sort-seq using four FACS bins), there can still be substantial differences in the results across different systems. An interesting side result of our work is to highlight this, including how such variation makes it difficult to learn across experiments. We now elaborate on these points in the revised manuscript.

      (5) Some key methodological choices require better justification. For example, combining "to" and "from" mutation profiles for PCA despite their different behaviors, or using arbitrary thresholds (like 0.05) for residue classification.

      We hope we have explained our methodological choices more clearly in the revised paper.

      We removed the dependency on the threshold of 0.05 used for residue classification in Fig. S19 of the original manuscript; in the revised manuscript we only report a residue to have e.g. a buried-like substitution profile if RMSD<sub>buried</sub> < RMSD<sub>exposed</sub> for at least 95% of the abundance score profiles that we resampled according to VAMP-seq score noise levels, as explained above.

      With respect to combining “to” and “from” mutational profiles for PCA, we could have also chosen to analyse these two sets of profiles separately to take potentially different behaviours along the two mutational axes into account. We do not think that there should be anything wrong with concatenating the two sets of profiles in a single analysis, since the analysis on the concatenated profiles simply expresses amino acid similarities and differences in a more general manner.

      The authors largely achieve their primary aim of showing that simple structural features can predict abundance changes. However, their secondary goal of using the matrices to identify functionally important residues would benefit from more rigorous statistical validation. While the matrices provide a useful baseline for abundance prediction, the paper could offer deeper biological insights by investigating cases where simple structure-based predictions differ from physics-based stability calculations.

      This work provides a valuable resource for the protein science community in the form of easily applicable substitution matrices. The finding that such simple features can match more complex calculations is significant for the field. However, the work's impact would be enhanced by a deeper investigation of the mechanistic implications of the observed patterns, particularly in cases where abundance changes appear decoupled from stability effects.

      We agree that disentangling stability and other effects on cellular abundance is one of the goals of this work. As discussed above, it has been difficult to find clear cases where amino acid substitutions affect abundance without stability beyond for example the (rare) effects of creating surface exposed degrons. Our new analysis, in which we compare substitution matrix-based predictions to stability predictions, does offer deeper insight into the relationship between the two predictor types and hence possibly between folding stability and abundance. 

      Reviewer #3 (Public review): 

      "Effects of residue substitutions on the cellular abundance of proteins" by Schulze and Lindorff-Larsen revisits the classical concept of structure-aware protein substitution matrices through the scope of modern protein structure modelling approaches and comprehensive phenotypic readouts from multiplex assays of variant effects (MAVEs). The authors explore 6 unique protein MAVE datasets based on protein abundance (and thus stability) by utilizing structural information, specifically residue solvent accessibility and secondary structure type, to derive combinations of context-specific substitution matrices predicting variant abundance. They are clear to outline that the aim of the study is not to produce a new best abundance predictor but to showcase the degree of prediction afforded simply by utilizing information on residue accessibility. The performance of their matrices is robustly evaluated using a leave-one-out approach, where the abundance effects for a single protein are predicted using the remaining datasets. Using a simple classification of buried and solvent-exposed residues, and substitution matrices derived respectively for each residue group, the authors convincingly demonstrate that taking structural solvent accessibility contexts into account leads to more accurate performance than either a structureunaware matrix, secondary structure-based matrix, or matrices combining both solvent accessibility or secondary structure. Interestingly, it is shown that the performance of the simple buried and exposed residue substitution matrices for predicting protein abundance is on par with Rosetta, an established and specialized protein variant stability predictor. 
      More importantly, the authors finish off the paper by demonstrating the utility of the two matrices to identify surface residues that have buried-like substitution profiles, that are shown to correspond to protein interface residues, posttranslational modification sites, functional residues, or putative degrons.

      Strengths:

      The paper makes a strong and well-supported main point, demonstrating the utility of the authors' approach through performance comparisons with alternative substitution matrices and specialized methods alike. The matrices are rigorously evaluated without introducing bias, exploring various combinations of protein datasets. Supplemental analyses are extremely comprehensive and detailed. The applicability of the substitution matrices is explored beyond abundance prediction and could have important implications in the future for identifying functionally relevant sites.

      We thank the reviewer for the supportive comments on our work. 

      Comments:

      (1) A wider discussion of the possible reasons why matrices for certain proteins seem to correlate better than others would be extremely interesting, touching upon possible points like differences or similarities in local environments, degradation pathways, posttranslational modifications, and regulation. While the initial data structure differences provide a possible explanation, Figure S17A, B correlations show a more complicated picture.

      We agree with the reviewer that biochemical and biophysical differences between the proteins might contribute to the fact that some matrices correlate better than others. We also agree that it would be very interesting to understand these differences better. While it might be possible to examine some of the suggested causes of the differences, like differences or similarities in local environments, we have generally found that noise and differences in score distributions make such analyses difficult (see also responses to reviewers 1 and 2). For now, we will defer additional analyses to future work.

      (2) The performance analysis in Figure 2D seems to show that for particular proteins "less is more" when it comes to which datasets are best to derive the matrix from (CYP2C9, ASPA, PRKN). Are there any features (direct or proxy) that would allow proteins to be grouped to maximize accuracy? Do the authors think that, on top of the buried vs exposed paradigm, another grouping dimension at the protein/domain level could improve performance?

      We don’t currently know if any protein- or domain-level features could be used to further split residues into useful categories for constructing new substitution matrices, but it is an interesting suggestion. We note that every substitution matrix consists of 380 averages, and creating too many residue groupings will cause some matrix entries to be averaged over very few abundance scores, at least with the current number of scores in the pooled VAMP-seq dataset. For example, while previous work has shown different mutational effects e.g. in helices and sheets (as one would expect), we find that a model with six matrices ({buried,exposed}x{helix,sheet,other}) does not lead to improved predictions (Fig. 2C), presumably because of an unfavourable balance between parameters and data.
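
      The balance between parameters and data mentioned above can be made concrete with a short calculation. The sketch below is illustrative only: it takes the figure of "over 31,000" substitutions from reviewer 2's summary as a lower bound and assumes, unrealistically, that scores are spread evenly over matrix entries. The function name is hypothetical.

```python
N_AMINO_ACIDS = 20

# Each matrix holds one average per ordered (wild type, variant) pair,
# excluding the 20 identity pairs: 20 * 19 = 380 entries.
ENTRIES_PER_MATRIX = N_AMINO_ACIDS * (N_AMINO_ACIDS - 1)


def mean_scores_per_entry(n_scores, n_matrices):
    """Average number of scores behind each matrix entry, assuming an even spread."""
    return n_scores / (n_matrices * ENTRIES_PER_MATRIX)


if __name__ == "__main__":
    n_scores = 31_000  # lower bound taken from the reviewer's summary
    # 1 = single matrix, 2 = buried/exposed, 6 = {buried,exposed}x{helix,sheet,other}
    for n_matrices in (1, 2, 6):
        total = n_matrices * ENTRIES_PER_MATRIX
        per_entry = mean_scores_per_entry(n_scores, n_matrices)
        print(f"{n_matrices} matrices: {total} entries, "
              f"~{per_entry:.1f} scores per entry")
```

      Under these assumptions a six-matrix model would average over roughly 14 scores per entry, compared with roughly 41 for the two-matrix model, and the real counts for rare substitution types in small structural classes would be lower still.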

      (3) While the matrices and Rosetta seem to show similar degrees of correlation, do the methods both fail and succeed on the same variants? Or do they show a degree of orthogonality and could potentially be synergistic?

      These are good questions and are related to similar questions from reviewers 1 and 2. In the revised manuscript, we have added additional analyses of differences between predictions from our substitution matrix model and a stability model, and we indeed find that the two methods show a degree of orthogonality. However, since we identify only relatively few residues for which one method performs better than the other, we don’t expect a synergistic model to outperform the stability predictor across all variants in any of the six proteins.  

      Overall, this work presents a valuable contribution by creatively utilizing a simple concept through cutting-edge datasets, which could be useful in various.

      Reviewing Editor:

      As discussed in more detail below, to strengthen the assessment, the authors are encouraged to:

      (1) Include more thorough statistical analyses, such as confidence intervals or standard errors, to better validate key claims (e.g., RMSD comparisons).

      (2) Perform a deeper comparison between substitution response matrices and ΔΔG-based predictions to uncover areas of agreement or orthogonality.

      (3) Clarify the relationship between structural features, stability, and abundance to provide more mechanistic insights.

      As discussed above and below, we have added new analyses and clarifications to the revised manuscript.

      Reviewer #1 (Recommendations for the authors):

      Minor points:

      Why is a continuous version of the contact number used here, instead of a discrete count of neighbouring residues? WCN values of the residues in the core domain can be affected by residues far away (small contribution but not strictly zero; if there are many of them, it adds up).

      We have previously found WCN, which quantifies residue contact numbers in a continuous manner, to be a useful input feature for a classifier that determines whether individual residues are important for maintaining protein abundance or function (Cagiada et al, 2023). We have also found WCN and the cellular abundance of single substitution variants to correlate well in individual analyses of different proteins (Grønbæk-Thygesen et al., 2024; Gersing et al., 2024; Clausen et al., 2024).

      We have calculated the WCN as well as a contact number based on discrete counts of neighbouring residues for the six proteins in our dataset. When distances between residues are evaluated in the same way (i.e. using the shortest distance between any pair of heavy atoms in the side chains), and when the cutoff value used for the discrete count is equal to the r<sub>0</sub> of the WCN function, the continuous and discrete evaluations of residue contact numbers are highly and linearly correlated, and their rank correlations with the VAMP-seq data are very similar. We only observe minor contributions from residues far away in the structure on the WCN.
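
      The comparison between continuous and discrete contact numbers can be illustrated with a small sketch. The functional form of the WCN is not given in this response, so the code assumes a rational switching function s(r) = (1 - (r/r0)^6) / (1 - (r/r0)^12), which simplifies to 1 / (1 + (r/r0)^6). The value r0 = 7 Å, the function name and the random coordinates are likewise assumptions made for illustration.

```python
import numpy as np


def contact_numbers(dist, r0=7.0):
    """Continuous (WCN-like) and discrete contact numbers per residue.

    dist : symmetric (n, n) array of inter-residue distances in angstrom
    r0   : switching distance, also used as the cutoff for the discrete count
    """
    n = dist.shape[0]
    off_diag = ~np.eye(n, dtype=bool)          # exclude self-contacts
    # Assumed switching function: (1 - x^6) / (1 - x^12) == 1 / (1 + x^6)
    weights = 1.0 / (1.0 + (dist / r0) ** 6)
    wcn = np.where(off_diag, weights, 0.0).sum(axis=1)
    discrete = np.where(off_diag, dist <= r0, False).sum(axis=1)
    return wcn, discrete


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    coords = rng.uniform(0.0, 30.0, size=(200, 3))   # toy "structure"
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    wcn, discrete = contact_numbers(dist)
    print("Pearson r:", np.corrcoef(wcn, discrete)[0, 1])
```

      With this form, a neighbour exactly at r0 contributes 0.5 to the continuous number and 1 to the discrete count, while a neighbour at 3·r0 contributes about 0.001, which is consistent with distant residues adding only minor contributions.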

      Typos in SI figure captions e.g. Figure S8-11 "All predictions were performed using using...."

      Thank you for pointing this out. We have corrected the typos in Figure S8-11 (Figure S7-S10 in the revised manuscript).

      Personally, I'd appreciate a definition of these new substitution matrices under the constraints of rASA/WCN values. It was unclear to me until I read the code, but I think that the definition is averaging the substitution matrix based on the clusters they are assigned to. If so, this could be straightforwardly defined in the method section with a Heaviside step function.

      We have added a definition of the “buried” and “exposed” substitution matrices as a function of rASA in the methods section (“Definitions of buried and exposed residues” and “Definition of substitution matrices”) of the manuscript, as well as a definition of how we classified residues as either buried or exposed using both rASA and WCN as input. Our final substitution matrices, as shown in e.g. Fig. 2, do not depend on the WCN; only the substitution matrix results in Figure S6 (Figure S20 in the revised manuscript) depend on both WCN and rASA.
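
      To make the idea of class-specific average matrices concrete, the sketch below builds one "buried" and one "exposed" matrix by averaging scores over all variants that share a wild-type residue, a variant residue and a burial class. It is a minimal sketch under stated assumptions: the rASA cutoff of 0.2, the input tuple layout and the function name are hypothetical, and the actual definitions are those given in the methods section of the revised manuscript.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def build_matrices(variants, rasa_cutoff=0.2):
    """Average abundance scores per (wild type, variant) pair and burial class.

    variants    : iterable of (wt, mut, rasa, score) tuples
    rasa_cutoff : hypothetical step-function threshold on rASA
    Returns a dict with 20x20 'buried' and 'exposed' arrays (NaN where no data).
    """
    labels = ("buried", "exposed")
    sums = {k: np.zeros((20, 20)) for k in labels}
    counts = {k: np.zeros((20, 20)) for k in labels}
    for wt, mut, rasa, score in variants:
        # Step function on rASA: above the cutoff counts as exposed.
        label = "exposed" if rasa > rasa_cutoff else "buried"
        i, j = AA_INDEX[wt], AA_INDEX[mut]
        sums[label][i, j] += score
        counts[label][i, j] += 1
    return {k: np.where(counts[k] > 0,
                        sums[k] / np.maximum(counts[k], 1),
                        np.nan)
            for k in labels}


if __name__ == "__main__":
    toy = [("A", "V", 0.05, 0.8), ("A", "V", 0.10, 0.6), ("A", "V", 0.90, 1.0)]
    m = build_matrices(toy)
    print(m["buried"][AA_INDEX["A"], AA_INDEX["V"]],
          m["exposed"][AA_INDEX["A"], AA_INDEX["V"]])
```

      Entries without any observations are left as NaN rather than zero, so that missing data cannot be mistaken for a low abundance score.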

      Reviewer #2 (Recommendations for the authors):

      The following suggestions aim to strengthen the analysis and clarify the presentation of your findings:

      (1) Specific analyses to consider:

      (1.1) Analyze buried positions where the exposed matrix performs better. Understanding these cases might reveal properties of protein core regions that show unexpected mutational tolerance.

      We agree with the reviewer that a more detailed analysis of buried residues with exposed-like substitution profiles would be very interesting.

      We note that for proteins where the VAMP-seq score distribution is shifted towards high values (as is the case for PTEN, TPMT and CYP2C9), our identification of such residues may be a result of the score distribution differences between the six datasets. To confidently identify mutationally tolerant core regions, it would be best to (a) correct for the distribution differences prior to the analysis or (b) focus the analysis on residues that fall far below the diagonal in Figure S18.

      In additional data (which can be found at https://github.com/KULL-Centre/_2024_Schulze_abundance-analysis), we provide, for each of the proteins, a list of buried residues for which RMSD<sub>exposed</sub> < RMSD<sub>buried</sub> (for more than 95% of resampled substitution profiles, as described under 1.6). We have not analysed these residues further.

      (1.2) A systematic comparison of matrix-based vs. ΔΔG-based predictions could help understand both exposed sites that behave as buried (as analyzed in the paper) and buried sites that behave as exposed (1.1), potentially revealing mechanisms underlying abundance changes.

      In our revised manuscript, we have added additional analyses to compare matrix-based and ΔΔG-based predictions, focusing on exposed sites for which one prediction method captures variant effects on abundance considerably better than the other prediction method. We have not investigated buried sites with exposed-like behaviour any further in this work.

      (1.3) Explore different normalization approaches when pooling data across proteins. In particular, consider using log(abundance score): if the experimental error in abundance measurements is multiplicative (which can be checked from the reported standard errors), then log transformation would convert this into a constant additive error, making the analysis more statistically sound.

      As we explain below in response to point 2.2, the abundance scores are, within each dataset, min-max normalised to nonsense and synonymous variant scores, and the score scale is thus in this way consistent across the six datasets. We have explained above and in the revised manuscript that abundance score distribution differences across datasets are likely partially a result of the FACS binning of assay-specific variant libraries. Using only the VAMP-seq scores (that is, without further information about the individual experiments), we cannot correct for the influence of the sorting strategy on the reported scores. A score normalisation across datasets that places all data points on a single scale would require inter-dataset reference variant scores, which we do not have. We note that in a subsequent manuscript (Schulze et al, bioRxiv, 2025) we have attempted to take system- and experiment-specific score distributions into account. We now refer to this work in the revised manuscript.
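
      The within-dataset normalisation mentioned above can be sketched as a simple rescaling in which nonsense variants map to 0 and synonymous variants map to 1. The sketch is illustrative: whether the reference levels are means, medians or another summary is not stated in this response, so the use of medians and the function name are assumptions.

```python
import numpy as np


def normalise_scores(raw, nonsense_raw, synonymous_raw):
    """Rescale raw scores so nonsense variants sit near 0 and synonymous near 1.

    Assumption: the reference levels are the medians of each control class.
    """
    low = np.median(nonsense_raw)
    high = np.median(synonymous_raw)
    return (np.asarray(raw, dtype=float) - low) / (high - low)


if __name__ == "__main__":
    raw = [1.0, 2.0, 3.0]            # hypothetical raw sort-seq scores
    print(normalise_scores(raw, nonsense_raw=[1.0, 1.0],
                           synonymous_raw=[3.0, 3.0]))
```

      This rescaling fixes the two end points within each dataset but leaves the shape of the distribution between them untouched, which is why it cannot remove differences that arise from how each library was binned during sorting.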

      (1.4) Consider using correlation coefficients between predicted and observed abundance profiles as an alternative to RMSD, which is sensitive to the absolute values of the scores.

      We agree with the reviewer that using correlation coefficients to compare substitution profiles might also be useful, in particular for datasets with relatively unique VAMP-seq score distributions, such as the ASPA dataset. To explore this idea, we have repeated the analysis presented in Fig. S18 using the Pearson correlation coefficient r rather than the RMSD.

      As in Fig. S18, we derive r<sub>buried</sub> and r<sub>exposed</sub> for every residue in the six proteins, specifically by calculating r between the abundance score substitution profile of every individual residue and the average abundance score substitution profiles of buried and exposed residues. VAMP-seq data for the protein for which r<sub>buried</sub> and r<sub>exposed</sub> are evaluated is omitted from the calculation of average abundance score substitution profiles, and we use only monomer structures to determine whether residues are buried or exposed. 

      We show the results of this analysis in Author response image 1 below. In each panel of the figure, r<sub>buried</sub> and r<sub>exposed</sub> are shown for individual residues of a single protein. Blue datapoints indicate residues that are solvent-exposed in the wild-type protein structures, and yellow datapoints indicate residues that are buried in the wild-type structures. Residues for which neither r<sub>buried</sub> < r<sub>exposed</sub> nor r<sub>exposed</sub> < r<sub>buried</sub> holds in more than 95% of 1000 resampled residue substitution profiles (see explanation of resampling method above) are coloured grey. “Acc.” is the balanced classification accuracy, calculated using all non-grey datapoints, indicating how many buried residues have buried-like substitution profiles (r<sub>exposed</sub> < r<sub>buried</sub>) and how many solvent-exposed residues have exposed-like substitution profiles (r<sub>buried</sub> < r<sub>exposed</sub>). The classification accuracy per protein in this figure cannot be compared to the classification accuracy of the same protein in Fig. S18, since the number of datapoints used in the accuracy calculation differs between the r- and RMSD-based analyses.

      Author response image 1.

      Comparing the r-based approach to the RMSD-based approach (Fig. S18), it is clear that the r-based method is less robust than the RMSD-based method for noisy and incomplete datasets. For the noisiest and most mutationally incomplete VAMP-seq datasets (i.e., PTEN, TPMT and CYP2C9) (Fig. 1), there are relatively few residues for which we can determine with high confidence whether the substitution profile is more buried-like or more exposed-like. When the VAMP-seq data is less noisy and has high mutational completeness, the r-based method becomes more robust and may thus be relevant in potential future work on new VAMP-seq data with small error bars.

      In conclusion, we find that the RMSD-based approach to comparing substitution profiles is more robust than an r-based approach for several of the VAMP-seq datasets that are included in our analysis. We do believe that an approach based on the correlation coefficient, or potentially several metrics, could be relevant to use, since abundance score distributions from VAMP-seq datasets can differ significantly across datasets. So as not to increase the length of the main text of our manuscript, we have not added this analysis to the revised manuscript.

      (1.5) Consider treating missing abundance scores as zero values, as they might indicate variants with very low abundance, rather than omitting them from the analysis.

      This suggestion would be most relevant for the PTEN, TPMT and CYP2C9 datasets, which all have a relatively small average mutational depth and completeness, as shown in Fig. 1B and 1C. To assess if setting missing abundance scores as zero values would be reasonable, we have compared the distributions of predicted ΔΔG values (from RaSP and ThermoMPNN) and of predicted abundance scores (from our exposure-based substitution matrices) for variants with reported and missing VAMP-seq data. We show the result in Author response image 2, with data aggregated across the six protein systems:

      Author response image 2.

      We find that variants with and without VAMP-seq data have similar ΔΔG score distributions and similar predicted abundance score distributions, and there is thus no clear enrichment of predicted loss of abundance for variants with missing VAMP-seq scores. This suggests that missing abundance scores do not necessarily indicate very low abundance. One cause of missing data might instead be problems with library generation (Matreyek et al, 2018, 2021).

      We show in Fig. S9 (Fig. S8 of the revised manuscript) that predicted scores for variants with experimental abundance scores of 0 are often overestimated for NUDT15, ASPA and PRKN, but this is not so much a problem for PTEN, TPMT and CYP2C9, the datasets with the most missing scores. The lack of an enrichment of low abundance variants from the various predictors would thus still support that missing scores do not necessarily indicate low abundance.

      (1.6) Develop a proper statistical framework for comparing buried vs exposed predictions (whether using RMSD or correlations), including confidence intervals, rather than using arbitrary thresholds.

      As explained above and in the methods section of our revised manuscript, we have expanded our approach to compare the substitution profile of a residue to the average profiles of buried and exposed residues, and our method now accounts for the noise in the VAMP-seq data, making the analysis more statistically rigorous. In our expanded approach, we compare the substitution profiles of individual residues to the average profiles for buried and exposed residues 10,000 times per residue to get a residue-specific distribution of RMSD<sub>buried</sub> and RMSD<sub>exposed</sub> values. Individual RMSD<sub>buried</sub> and RMSD<sub>exposed</sub> values are calculated by resampling abundance scores from a Gaussian distribution defined by the experimentally reported abundance score and abundance score standard deviation per variant. We now only report a residue to have e.g. a buried-like substitution profile if RMSD<sub>buried</sub> < RMSD<sub>exposed</sub> in at least 95% of our samples. We do not recalculate average scores in substitution matrices for this analysis. We have updated the plots in our manuscript, e.g. in Fig. S18 and S19 of the revised version, to indicate which residues are confidently classified as buried- or exposed-like.
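
      The resampling procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name, argument names, the handling of missing scores via NaN, and the fixed random seed are all hypothetical. The 10,000 samples, the Gaussian noise model and the 95% criterion are taken from the text.

```python
import numpy as np

def classify_profile(scores, sds, avg_buried, avg_exposed,
                     n_samples=10_000, threshold=0.95, seed=0):
    """Label one residue's substitution profile (hypothetical helper).

    scores, sds: reported abundance score and standard deviation per variant
    (NaN where no score was measured; this convention is an assumption).
    avg_buried, avg_exposed: average substitution profiles, kept fixed
    during resampling, as stated in the response.
    """
    scores = np.asarray(scores, dtype=float)
    sds = np.asarray(sds, dtype=float)
    avg_buried = np.asarray(avg_buried, dtype=float)
    avg_exposed = np.asarray(avg_exposed, dtype=float)

    # Use only variants that have both a score and an error estimate.
    mask = ~np.isnan(scores) & ~np.isnan(sds)
    rng = np.random.default_rng(seed)

    # Each row is one resampled copy of the residue's profile.
    samples = rng.normal(scores[mask], sds[mask],
                         size=(n_samples, int(mask.sum())))

    rmsd_buried = np.sqrt(np.mean((samples - avg_buried[mask]) ** 2, axis=1))
    rmsd_exposed = np.sqrt(np.mean((samples - avg_exposed[mask]) ** 2, axis=1))

    # Confident only if one RMSD is smaller in at least `threshold` of samples.
    if np.mean(rmsd_buried < rmsd_exposed) >= threshold:
        return "buried-like"
    if np.mean(rmsd_exposed < rmsd_buried) >= threshold:
        return "exposed-like"
    return "uncertain"
```

      Residues returned as "uncertain" correspond to those that are not confidently classified in the updated plots.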

      (2) Presentation improvements:

      (2.1) In Figure 4, consider removing the average abundance scores, which are not directly related to the RMSD comparison being shown.

      We have decided to keep the average abundance scores in Fig. 4 (now Fig. 5), as we find the average abundance scores useful for guiding interpretation of the RMSD values. For example, an unusually small average abundance score with a relatively small standard deviation may explain a case where RMSD<sub>buried</sub> and RMSD<sub>exposed</sub> are both large. This is for example the case for residue G185 in ASPA. 

      In our preprint, the error bars on the average abundance scores in Fig. 4 (now Fig. 5) indicated the standard deviation across the abundance scores that were used to calculate the average per position. We have removed these error bars in the revised manuscript, as we realised that these were not necessarily helpful to the reader.

      (2.2) I am assuming that abundance scores are defined as the ratio abundance_variant/abundance_wt throughout the analysis, but I don't think this has been explicitly defined. If this is correct, please state it explicitly. In such case, log(abundance_score) would have a simple interpretation as the difference in abundance between variant and wild-type.

      Abundance scores are defined throughout the manuscript as sequence-based scores that have been min-max normalised to the abundance of nonsense and synonymous variants, i.e. abundance_score = (abundance_variant – abundance_nonsense)/(abundance_wt – abundance_nonsense). We have described the normalisation of scores to wild-type and nonsense variant abundance in lines 164-166 of the original manuscript. We have now added additional information about the normalisation scheme in the methods section. We note that we did not ourselves apply this normalisation to the data; the scores were reported in this manner in the original publications that reported the VAMP-seq experiments for the six proteins.
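
      As a small worked illustration of this normalisation, the sketch below maps a raw abundance value onto the score scale. The function name and the idea of passing raw abundance values directly are assumptions made for illustration; as noted above, the published scores were already normalised by the original VAMP-seq studies.

```python
def abundance_score(abundance_variant, abundance_wt, abundance_nonsense):
    """Min-max normalise a raw abundance value (hypothetical helper).

    A variant as abundant as nonsense variants scores 0, and a variant as
    abundant as wild-type scores 1.
    """
    span = abundance_wt - abundance_nonsense
    if span == 0:
        raise ValueError("wild-type and nonsense abundance must differ")
    return (abundance_variant - abundance_nonsense) / span
```

      Because the score can be zero or negative for variants at or below nonsense-like abundance, a log transform of the score is not defined for every variant.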

      (2.3) Consider renaming "rASA" to the more commonly used "RSA" for relative solvent accessibility.

      We have decided to keep using “rASA” throughout the manuscript.

      (2.4) The weighted contact number function used differs from the established WCN measure (Σ1/rij²) introduced by Lin et al. (2008, Proteins). This should be acknowledged and the choice of alternative weighting scheme justified.

      As we have also responded to the first minor point of reviewer 1, we have previously found WCN, as it is defined in our manuscript, to be a useful input feature for a classifier that determines whether individual residues are important for maintaining protein abundance or function (Cagiada et al, 2023). We have also previously found this type of WCN to correlate well with variant abundance of individual proteins, as measured with VAMP-seq or protein fragment complementation assays (Grønbæk-Thygesen et al., 2024; Clausen et al., 2024; Gersing et al., 2024). We acknowledge that residue contact numbers or weighted contact numbers could also be expressed in other ways and that alternative contact number definitions would likely also produce values that correlate well with VAMP-seq data. Since the WCN, as defined in our manuscript, already correlates relatively well with abundance scores, we have not explored whether alternative definitions produce better correlations.  
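
      For reference, the inverse-square form that the reviewer cites (Σ1/rij²) can be sketched as below. This is explicitly not the weighting function used in our manuscript, which is not reproduced in this response; the function name and the choice of one coordinate per residue (for example a Cα position) are assumptions for illustration.

```python
import numpy as np

def wcn_inverse_square(coords):
    """Inverse-square weighted contact number per residue (illustrative).

    coords: array of shape (n_residues, 3), one representative coordinate
    per residue. Residue coordinates are assumed to be distinct.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise squared distances r_ij^2 between all residues.
    diff = coords[:, None, :] - coords[None, :, :]
    dist_sq = np.sum(diff ** 2, axis=-1)
    # Exclude the i == j term: 1/inf contributes zero to the sum.
    np.fill_diagonal(dist_sq, np.inf)
    return np.sum(1.0 / dist_sq, axis=1)
```

      Swapping in a different distance weighting only changes the final line, which is why alternative definitions are expected to rank residues similarly.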

      (2.5) Replace the phrase "in the above" with specific references to sections or simply "above" where appropriate. Also, consider replacing many instances of "moreover" with simpler alternatives such as "also" or "in addition" to improve readability.

      We have changed several sentences according to this suggestion and hope that we have improved the readability of our manuscript.

      Reviewer #3 (Recommendations for the authors):

      (1) It should be explicitly confirmed earlier that complex structures are used for NUDT15 and ASPA when assessing rASA/WCN. Additionally, it would be interesting to see the effect that deriving the matrices using NUDT15 and ASPA monomers would have.

      We have commented on the use of NUDT15 and ASPA homodimer structures earlier in the revised manuscript (specifically, already in the subsection “Abundance scores correlate with the degree of residue solvent-exposure”).

      When residues are classified using monomer rather than dimer structures of NUDT15 and ASPA, there is a small effect on the resulting “buried” and “exposed” substitution matrices. Entries in this set of substitution matrices calculated using either monomer or dimer structures typically differ by less than 0.05, and only a single entry differs by more than 0.1. As expected, the “exposed” matrix tends to contain slightly larger numbers when derived from dimer structures than when derived from monomer structures, meaning that when the interface residues are included in the exposed residue category, the average abundance scores of the “exposed” matrix are lowered. For buried residues, the picture is more mixed, although the overall tendency is that the interface residues make the “buried” matrix contain smaller average abundance scores for dimer compared to monomer structures. These results generally support the use of dimer structures for the residue classification.

      We here show the differences between the substitution matrices calculated with dimer or monomer structures of NUDT15 and ASPA and using data for all six proteins in our combined VAMP-seq dataset (average_abundance_score_difference = average_abundance_score_dimers – average_abundance_score_monomers):

      Author response image 3.

      We have not explored these alternative matrices further.

      (2) While the supplemental analyses are rigorous, the abundance of various metrics being presented can be confusing, especially when they seem to differ in their result. For instance, the discussion of Figure S17 (paragraph starting 428) contains mentions of mean differences but then switches to correlations, while both are presented for all panels. The claim "The datasets thus mainly differ due to differences in substitution effects in buried environments." is well supported by the observed mean differences, but for Pearson's correlations the average panel A, B values of buried 0.421 vs exposed 0.427 are hardly different. Which of the metrics is more meaningful, and are both needed?

      We agree with the reviewer that the claim that “The datasets thus mainly differ due to differences in substitution effects in buried environments” is not well-supported by the r between the substitution matrices, and we have removed this claim from the text.

      Since some datasets share VAMP-seq score distribution features, while others do not, the absolute difference between scores or matrices may be relevant to check for some dataset pairs, while the r may be more relevant to check for other dataset pairs. Hence, we have included both metrics in Fig S17 (Fig S11 in the revised manuscript).

      (3) Lines 337-340 - does not feel like S7 is the topic, perhaps the authors meant Figure 2A, B? In general, the supplemental figure references are out of order and panel combinations are sometimes confusing.

      We have corrected the figure references and rearranged the supplemental figures so that they now occur in the correct order. We have looked through the panel combinations with clarity in mind, and hope that the current set of main and supplementary figures balances overview and detail.

      (4) Line 363 "are also are also".

      We have corrected this typo.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This is an excellent study by a superb investigator who discovered and is championing the field of migrasomes. This study contains a hidden "gem" - the induction of migrasomes by hypotonicity and how that happens. In summary, an outstanding fundamental phenomenon (migrasomes) en route to becoming translationally highly significant.

      Strengths:

      Innovative approach at several levels. Migrasomes - discovered by Dr Yu's group - are an outstanding biological phenomenon of fundamental interest and now of potentially practical value.

      Weaknesses:

      I feel that the overemphasis on practical aspects (vaccine), however important, eclipses some of the fundamental aspects that may be just as important and actually more interesting. If this can be expanded, the study would be outstanding.

      We sincerely thank the reviewer for the encouraging and insightful comments. We fully agree that the fundamental aspects of migrasome biology are of great importance and deserve deeper exploration.

      In line with the reviewer’s suggestion, we have expanded our discussion on the basic biology of engineered migrasomes (eMigs). A recent study by the Okochi group at the Tokyo Institute of Technology demonstrated that hypoosmotic stress induces the formation of migrasome-like vesicles, involving cytoplasmic influx and requiring cholesterol for their formation (DOI: 10.1002/1873-3468.14816, February 2024). Building on this, our study provides a detailed characterization of hypoosmotic stress-induced eMig formation, and further compares the biophysical properties of natural migrasomes and eMigs. Notably, the inherent stability of eMigs makes them particularly promising as a vaccine platform.

      Finally, we would like to note that our laboratory continues to investigate multiple aspects of migrasome biology. In collaboration with our colleagues, we recently completed a study elucidating the mechanical forces involved in migrasome formation (DOI: 10.1016/j.bpj.2024.12.029), which further complements the findings presented here.

      Reviewer #2 (Public review):

      Summary:

      The authors' report describes a novel vaccine platform derived from a newly discovered organelle called a migrasome. First, the authors address a technical hurdle in using migrasomes as a vaccine platform. Natural migrasome formation occurs at low levels and is labor intensive; however, by understanding the molecular underpinning of migrasome formation, the authors have designed a method to make engineered migrasomes from cultured cells at higher yields utilizing a robust process. These engineered migrasomes behave like natural migrasomes. Next, the authors immunized mice with migrasomes that either expressed a model peptide or the SARS-CoV-2 spike protein. Antibodies against the spike protein were raised that could be boosted by a 2nd vaccination and these antibodies were functional as assessed by an in vitro pseudoviral assay. This new vaccine platform has the potential to overcome obstacles such as cold chain issues for vaccines like messenger RNA that require very stringent storage conditions.

      Strengths:

      The authors present very robust studies detailing the biology behind migrasome formation and this fundamental understanding was used to form engineered migrasomes, which makes it possible to utilize migrasomes as a vaccine platform. The characterization of engineered migrasomes is thorough and establishes comparability with naturally occurring migrasomes. The biophysical characterization of the migrasomes is well done including thermal stability and characterization of the particle size (important characterizations for a good vaccine).

      Weaknesses:

      With a new vaccine platform technology, it would be nice to compare them head-to-head against a proven technology. The authors would improve the manuscript if they made some comparisons to other vaccine platforms such as a SARS-CoV-2 mRNA vaccine or even an adjuvanted recombinant spike protein. This would demonstrate a migrasome-based vaccine could elicit responses comparable to a proven vaccine technology.

      We thank the reviewer for the thoughtful evaluation and constructive suggestions, which have helped us strengthen the manuscript. 

      Comparison with proven vaccine technologies:

      In response to the reviewer’s comment, we now include a direct comparison of the antibody responses elicited by eMig-Spike and a conventional recombinant S1 protein vaccine formulated with Alum. As shown in the revised manuscript (Author response image 1), the levels of S1-specific IgG induced by the eMig-based platform were comparable to those induced by the S1+Alum formulation. This comparison supports the potential of eMigs as a competitive alternative to established vaccine platforms. 

      Author response image 1.

      eMigrasome-based vaccination showed similar efficacy compared with adjuvanted recombinant spike protein. The amount of S1-specific IgG in mouse serum was quantified by ELISA on day 14 after immunization. Mice were either intraperitoneally (i.p.) immunized with recombinant Alum/S1 or intravenously (i.v.) immunized with eM-NC, eM-S or recombinant S1. The administered doses were 20 µg/mouse for eMigrasomes, 10 µg/mouse (i.v.) or 50 µg/mouse (i.p.) for recombinant S1 and 50 µl/mouse for Aluminium adjuvant.

      Assessment of antigen integrity on migrasomes:

      To address the reviewer’s suggestion regarding antigen integrity, we performed immunoblotting using antibodies against both S1 and mCherry. Two distinct bands were observed: one at the expected molecular weight of the S-mCherry fusion protein, and a higher molecular weight band that may represent oligomerized or higher-order forms of the Spike protein (Figure 5b in the revised manuscript).

      Furthermore, we performed confocal microscopy using a monoclonal antibody against Spike (anti-S). Co-localization analysis revealed strong overlap between the mCherry fluorescence and anti-Spike staining, confirming the proper presentation and surface localization of intact S-mCherry fusion protein on eMigs (Figure 5c in the revised manuscript). These results confirm the structural integrity and antigenic fidelity of the Spike protein expressed on eMigs.

      Recommendations for the authors

      Reviewer #1 (Recommendations For The Authors):

      I feel that the overemphasis on practical aspects (vaccine), however important, eclipses some of the fundamental aspects that may be just as important and actually more interesting. If this can be expanded, the study would be outstanding.

      I know that the reviewers always ask for more, and this is not the case here. Can the abstract and title be changed to emphasize the science behind migrasome formation, and possibly add a few more fundamental aspects on how hypotonic shock induces migrasomes?

      Alternatively, if the authors desire to maintain the emphasis on vaccines, can immunological mechanisms be somewhat expanded in order to - at least to some extent - explain why migrasomes are a better vaccine vehicle?

      One way or another, this reviewer is highly supportive of this study and it is really up to the authors and the editor to decide whether my comments are of use or not.

      My recommendation is to go ahead with publishing after some adjustments as per above.

      We’d like to thank the reviewer for the suggestion. We have changed the title of the manuscript and modified the abstract, emphasizing the fundamental science behind the development of eMigrasomes. To gain some immunological information on eMig-elicited antibody responses, we characterized the type of IgG induced by eM-OVA in mice, and compared it to that induced by Alum/OVA. The IgG response to Alum/OVA was dominated by IgG1. In contrast, eM-OVA induced an even distribution of IgG subtypes, including IgG1, IgG2b, IgG2c, and IgG3 (Figure 4i in the revised manuscript). The ratio between IgG1 and IgG2a/c indicates a Th1 or Th2 type humoral immune response. Thus, eM-OVA immunization induces a balance of Th1/Th2 immune responses.

      Reviewer #2 (Recommendations For The Authors):

      The study is a very nice exploration of a new vaccine platform. This reviewer believes that a more head-to-head comparison to the current SARS-CoV-2 vaccine platform would improve the manuscript. This comparison is done with OVA antigen, but this model antigen is not as exciting as a functional head-to-head with a SARS-CoV-2 vaccine.

      I think that two other discussion points should be included in the manuscript. First, was the host-cell protein evaluated? If not, I would include that point on how issues of host cell contamination of the migrasome could play a role in the responses and safety of a vaccine. Second, I would discuss antigen incorporation and localization into the platform. For example, the full-length spike being expressed has a native signal peptide and transmembrane domain. The authors point out that a transmembrane domain can be added to display an antigen that does not have one natively expressed, however, without a signal peptide this would not be secreted and localized properly. I would suggest adding a discussion of how a non-native signal peptide would be necessary in addition to a transmembrane domain.

      We thank the reviewer for these thoughtful suggestions and fully agree that the points raised are important for the translational development of eMig-based vaccines.

      (1) Host cell proteins and potential immunogenicity:

      We appreciate the reviewer’s suggestion to consider host cell protein contamination. Considering potential clinical application of eMigrasomes in the future, we will use human cells with low immunogenicity such as HEK-293 or embryonic stem cells (ESCs) to generate eMigrasomes. Also, we will follow a quality-control (QC) procedure that meets the standard of validated EV-based vaccination techniques.

      (2) Antigen incorporation and localization—signal peptide and transmembrane domain:

      We also agree with the reviewer’s point that proper surface display of antigens on eMigs requires both a transmembrane domain and a signal peptide for correct trafficking and membrane anchoring. For instance, in the case of full-length Spike protein, the native signal peptide and transmembrane domain ensure proper localization to the plasma membrane and subsequent incorporation into eMigs. In the case of OVA, a secretory protein that contains a native signal peptide yet lacks a transmembrane domain, an engineered transmembrane domain is required. For antigens that do not naturally contain these features, both a non-native signal peptide and an artificial transmembrane domain are necessary. We have clarified this point in the revised discussion and explicitly noted the requirement for a signal peptide when engineering antigens for surface display on migrasomes.

    1. "fall line;"

      The Atlantic Seaboard Fall Line is a 900-mile escarpment where the Piedmont and Atlantic coastal plain meet in the eastern United States. Before navigation improvements, such as locks, the fall line was generally the head of navigation on rivers due to their rapids or waterfalls, and the necessary portage around them.

    1. A common way to solve this problem is to use interrupt chaining, in which each element in the interrupt vector points to the head of a list of interrupt handlers.

      Interrupt chaining is a mechanism for sharing one interrupt vector entry among several interrupt handlers. In this scheme, each entry in the interrupt vector points to a list of handlers, and each handler services a particular type of interrupt. When an interrupt is received, the CPU walks the chain and the handlers are called one after another. This keeps interrupts organized and prioritized, so that the appropriate handlers run in the right sequence without overlapping or skipping a needed service routine.
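
      A minimal simulation of the idea, assuming a software model and not real hardware: the class name, method names and the two example handlers are hypothetical. Following the usual textbook description, the chain is walked in registration order until one handler reports that it can service the request.

```python
class InterruptVector:
    """Toy interrupt vector in which each entry heads a chain of handlers."""

    def __init__(self):
        self._chains = {}  # vector number -> list of handlers, in order

    def register(self, vector, handler):
        # Append the handler to the chain for this vector entry.
        self._chains.setdefault(vector, []).append(handler)

    def dispatch(self, vector, source):
        """Call handlers in order until one claims the interrupt.

        Returns the name of the handler that serviced it, or None.
        """
        for handler in self._chains.get(vector, []):
            if handler(source):
                return handler.__name__
        return None


def disk_handler(source):
    # A handler returns True only if the interrupt came from its device.
    return source == "disk"


def nic_handler(source):
    return source == "nic"


vector_table = InterruptVector()
vector_table.register(5, disk_handler)
vector_table.register(5, nic_handler)
```

      With both handlers chained on entry 5, `vector_table.dispatch(5, "nic")` first asks `disk_handler`, which declines, and then `nic_handler`, which services the request.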

    1. Even the best students, however, need to make big adjustments to learn the conventions of academic writing. College-level writing obeys different rules, and learning them will help you hone your writing skills. Think of it as ascending another step up the writing ladder.

      this is so true because maybe in high school the writing prompts weren't so long and take so much time and effort, college has a different level of showing, I could say I was a better writer in elementary school, I used to write pages and pages teachers would tell me I was going to be a great novel writer, but that wasn't my plan to become one, writing was and is one of the things I like to do. I was reading the chapter 5 of the AI article about how I could even go to bed thinking about one thing I also related to that because sometimes I can't even go to bed because my head is always running and thinking what I could do tomorrow. sounds crazy but when I was reading this article i could just imagine everything, interesting article by the way :)

    1. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #1 (Public review):

      One possible remaining conceptual concern that might require future work is determining whether STN primarily mediates higher-level cognitive avoidance or if its activation primarily modulates motor tone.

      Our results using viral and electrolytic lesions (Fig. 11) and optogenetic inhibition of STN neurons (Fig. 10) show that signaled active avoidance is virtually abolished, and this effect is reproduced when we selectively inhibit STN fibers in the midbrain (Fig. 12). Inhibition of STN projections in either the substantia nigra pars reticulata (SNr) or the midbrain reticular tegmentum (mRt) eliminates cued avoidance responses while leaving escape responses intact. Importantly, mice continue to escape during US presentation after lesions or during photoinhibition, demonstrating that basic motor capabilities and the ability to generate rapid defensive actions are preserved.

      These findings argue against the idea that STN’s role in avoidance reflects a nonspecific suppression or facilitation of motor tone, even if the STN also contributes to general movement control. Instead, they show that STN output is required for generating “cognitively” guided cued actions that depend on interpreting sensory information and applying learned contingencies to decide when to act. Thus, while STN activity can modulate movement parameters, the loss-of-function results point to a more selective role in supporting cued, goal-directed avoidance behavior rather than a general adjustment of motor tone.

      Reviewer #2 (Public review):

      All previous weaknesses have been addressed. The authors should explain how inhibition of the STN impairing active avoidance is consistent with the STN encoding cautious action. If 'caution' is related to avoid latency, why does STN lesion or inhibition increase avoid latency, and therefore increase caution? Wouldn't the opposite be more consistent with the statement that the STN 'encodes cautious action'?

      The reviewer’s interpretation treats any increase in avoidance latency as evidence of “more caution,” but this holds only when animals are performing the avoidance behavior normally. In our intact animals, avoidance rates remain high across AA1 → AA2 → AA3, and the active avoidance trials (CS1) used to measure latency are identical across tasks (e.g., in AA2 the only change is that intertrial crossings are punished). Under these conditions, changes in latency genuinely reflect adjustments in caution, because the behavior itself is intact, actions remain tightly coupled to the cue, and the trials are identical.

      This logic does not apply when STN function is disrupted. STN inhibition or lesions reduce avoidance to near chance levels; the few crossings that do occur are poorly aligned to the CS and many likely reflect random movement rather than a cued avoidance response. Once performance collapses, latency can no longer be assumed to reflect the same cognitive process. Thus, interpreting longer latencies during STN inactivation as “more caution” would be erroneous, and we never make that claim.

      A simple analogy may help clarify this distinction. Consider a pedestrian deciding when to cross the street after a green light. If the road is deserted (like AA1), the person may step off the curb quickly. If the road is busy with many cars that could cause harm (like AA2), they may wait longer to ensure that all cars have stopped. This extra hesitation reflects caution, not an inability to cross. However, if the pedestrian is impaired (e.g., cannot clearly see the light, struggles to coordinate movements, or cannot reliably make decisions), a delayed crossing would not indicate greater caution—it would reflect a breakdown in the ability to perform the behavior itself. The same principle applies to our data: we interpret latency as “caution” only when animals are performing the active avoidance behavior normally, success rates remain high, and the trial rules are identical. Under STN inhibition or lesion, when active avoidance collapses, the latency of the few crossings that still occur can no longer be interpreted as reflecting caution. We have added these points to the Discussion.

      Reviewer #3 (Public review):

      Original Weaknesses:

      I found the experimental design and presentation convoluted and some of the results over-interpreted.

      We appreciate the reviewer’s comment, but the concern as stated is too general for us to address in a concrete way. The revised manuscript has been substantially reorganized, with simplified terminology, streamlined figures, and removal of an entire set of experiments to avoid over-interpretation. We are confident that the experimental design and results are now presented clearly and without extrapolation beyond the data. If there are specific points the reviewer finds convoluted or over-interpreted, we would be happy to address them directly.

      As presented, I don't understand this idea that delayed movement is necessarily indicative of cautious movements. Is the distribution of responses multi-modal in a way that might support this idea; or do the authors simply take a normal distribution and assert that the slower responses represent 'caution'? Even if responses are multi-modal and clearly distinguished by 'type', why should readers think that delayed responses imply cautious responding instead of, say, habituation or sensitization to cue/shock, variability in attention, motivation, or stress; or merely uncertainty, which seems plausible given what I understand of the task design where the same mice are repeatedly tested in changing conditions. This relates to a major claim (i.e., in the title).

      We appreciate the reviewer’s question and address each component directly.

      (1) What we mean by “caution” and how it is operationalized

      In our study, caution is defined operationally as a systematic increase in avoidance latency when the behavioral demand becomes higher, while the trial structure and required response remain unchanged. Specifically, CS1 trials are identical in AA1, AA2, and AA3. Thus, when mice take longer to initiate the same action under more demanding contexts, the added time reflects additional evaluation before acting, consistent with long-established interpretations of latency shifts in cognitive psychology (see papers by Donders, Sternberg, Posner) and interpretations of deliberation time in the speed-accuracy tradeoff literature.

      (2) Why this interpretation does not rely on multi-modal response distributions

      We do not claim that “cautious” responses form a separate mode in the latency distribution. The distributions are unimodal, and caution is inferred from condition-dependent shifts in these distributions across identical trials, not from the existence of multiple peaks (see Zhou et al., 2022). Latency shifts across conditions with identical trial structure are widely used as behavioral indices of deliberation or caution.

      (3) Why alternative explanations (habituation/sensitization, motivation, attention, stress, uncertainty) do not account for these latency changes

      Importantly, nothing changes in CS1 trials between AA1 and AA2 with respect to the cue, shock, or required response. Therefore:

      - Habituation/sensitization to the cue or shock cannot explain the latency shift (the stimuli and trial type are unchanged). We have previously examined cue-evoked orienting responses and their habituation in detail (Zhou et al., 2023), and those measurements are dissociable from the latency effects described here.

      - Motivation and attention are unlikely to change selectively for identical CS1 trials when the task manipulation only adds a contingency to intertrial crossings.

      - Uncertainty also does not increase for CS1 trials; they remain fully predictable and unchanged between conditions.

      - Stress is too broad a construct to be meaningful unless clearly operationalized; moreover, any stress differences that arise from task structure would covary with caution rather than replace the interpretation.

      (4) Clarifying “types” of responses

      The reviewer’s question about “response types” appears to conflate behavioral latencies with the neuronal response “types” defined in the manuscript. The term “type” in this paper refers to neuronal activation derived from movement-based clustering, not to distinct behavioral categories of avoidance, which we term modes.

      In sum, we interpret increased CS1 latency as “caution” only when performance remains intact and trial structure is identical between conditions; under those criteria, latency reliably reflects additional cognitive evaluation before acting, rather than nonspecific changes in sensory processing, motivation, etc.

      Related to the last, I'm struggling to understand the rationale for dividing cells into 'types' based on their physiological responses in some experiments.

      There is longstanding precedent in systems neuroscience for classifying neurons by their physiological response patterns, because neurons that respond similarly often play similar functional roles. For example, place cells, grid cells, and direction cells in vivo, and regular-spiking, burst-firing, and tonic-firing neurons in vitro, are all defined by characteristic activity patterns in response to stimuli rather than by anatomy or genetics alone. In the same spirit, our classifications simply reflect clusters of neurons that exhibit similar ΔF/F dynamics around behaviorally relevant events, such as movement sensitivity or avoidance modes. This is a standard analytic approach used in many studies. Thus, our rationale is not arbitrary: the “classes” and “types” arise from data-driven clustering of physiological responses, consistent with widespread practice, and they help reveal functional distinctions within the STN that would otherwise remain obscured.

      In several figures the number of subjects used was not described. This is necessary. Also necessary is some assessment of the variability across subjects.

      All the results described include the number of animals. To eliminate uncertainty, we now also include this information in figure legends.

      The only measure of error shown in many figures relates trial-to-trial or event variability, which is minimal because in many cases it appears that hundreds of trials may have been averaged per animal, but this doesn't provide a strong view of biological variability (i.e., are results consistent across animals?).

      The concern appears to stem from a misunderstanding of what the mixed-effects models quantify. Although the figure panels often show session-averaged traces for clarity, all statistical inferences in the paper are made at the level of animals, not trials. Mixed-effects modeling is explicitly designed for hierarchical datasets such as ours, where many trials are nested within sessions, which are themselves nested within animals.

      In our models, animal is the clustering (random) factor, and sessions are nested within animals, so variability across animals is directly estimated and used to compute the population-level effects. This approach is not only appropriate but is the most stringent and widely recommended method for analyzing behavioral and neural data with repeated measures. In other words, the significance tests and confidence intervals already fully incorporate biological variability across animals.

      Thus, although hundreds of trials per animal may be illustrated for visualization, the inferences reflect between-animal consistency, not within-animal trial repetition. The fact that the mixed-effects results are robust across animals supports the biological reliability of the findings.
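
      The difference between trial-level and animal-level variability described above can be illustrated with a minimal Python sketch. It is not the mixed-effects model used in the paper; it only aggregates hypothetical latencies from trials to sessions to animals and compares the SEM computed across pooled trials with the SEM computed across animals. All names and numbers are invented for illustration.

```python
import math
import statistics

# Hypothetical latencies (s): animal -> session -> trials. Values are invented.
latencies = {
    "mouse_a": {"s1": [2.0, 2.2, 1.8, 2.1], "s2": [2.1, 1.9, 2.0, 2.2]},
    "mouse_b": {"s1": [3.0, 3.1, 2.9, 3.2], "s2": [3.1, 2.8, 3.0, 3.0]},
    "mouse_c": {"s1": [2.5, 2.4, 2.6, 2.5], "s2": [2.6, 2.4, 2.5, 2.7]},
}


def sem(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def animal_means(data):
    """Average trials within each session, then sessions within each animal."""
    return {
        animal: statistics.mean(
            statistics.mean(trials) for trials in sessions.values()
        )
        for animal, sessions in data.items()
    }


# Pooling every trial treats trials as independent observations.
all_trials = [
    t
    for sessions in latencies.values()
    for trials in sessions.values()
    for t in trials
]
trial_level_sem = sem(all_trials)

# One value per animal: here the unit of inference is the animal.
per_animal = animal_means(latencies)
animal_level_sem = sem(list(per_animal.values()))

print(f"trial-level SEM  (n={len(all_trials)} trials):  {trial_level_sem:.3f}")
print(f"animal-level SEM (n={len(per_animal)} animals): {animal_level_sem:.3f}")
```

      In this toy dataset the animals differ from one another more than trials differ within an animal, so the animal-level SEM is the larger of the two. A mixed-effects model estimates that between-animal variance explicitly rather than collapsing the data to one value per animal.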

      It is not clear if or how spread of expression outside of target STN was evaluated, and if or how or how many mice were excluded due to spread or fiber placements. Inadequate histological validation is presented and neighboring regions that would be difficult to completely avoid, such as paraSTN may be contributing to some of the effects.

      The STN is a compact structure with clear anatomical boundaries, and our injections were rigorously validated to ensure targeting specificity. As detailed in the Methods, every mouse underwent histological verification, and injections were quantified using the Brain Atlas Analyzer app (available on OriginLab), which we developed to align serial sections to the Allen Brain Atlas. This approach provides precise, slice-by-slice confirmation of viral spread. We have performed thousands of AAV injections and probe implants in our lab and, over the years, have established highly reliable stereotaxic procedures that use multiple depth and angle checks and tools. For this study specifically, fewer than 10% of mice were excluded due to off-target expression or fiber/lesion placement. None of the included cases showed spread into adjacent structures.

      Regarding paraSTN: anatomically, paraSTN is a very small extension contiguous with STN. Our study did not attempt to dissociate subregions within STN, and the viral expression patterns we report fall within the accepted boundaries of STN. Importantly, none of our photometry probes or miniscope lenses sampled paraSTN, so contributions from that region are extremely unlikely to account for any of our neural activity results.

      Finally, our paper employs five independent loss-of-function approaches—optogenetic inhibition of STN neurons, selective inhibition of STN projections to the midbrain (in two sites: SNr and mRt), and STN lesions (electrolytic and viral). All methods converge on the same conclusion, providing strong evidence that the effects we report arise from manipulation of STN itself rather than from neighboring regions.

      Raw example traces are not provided.

      We do not think raw traces are useful here. All figures contain average traces to reflect the average activity of the estimated populations, which are already clustered per classes and types.

      The timeline of the spontaneous movement and avoidance sessions was not clear, nor was the number of events or sessions per animal and how this was set. It is not clear if there was pre-training or habituation, if many or variable sessions were combined per animal, or what the time gaps between sessions were, or if or how any of these parameters might influence interpretation of the results.

      As noted, we have enhanced the description of the sessions, including the number of animals and sessions. Sessions were run daily, and the number of sessions was always equal across animals within each group of experiments. The sessions are part of the random effects in the model. In addition, we now include schematics to facilitate understanding of the procedures.

      Comments on revised version:

      The authors removed the optogenetic stimulation experiments, but then also added a lot of new analyses. Overall the scope of their conclusions is essentially unchanged. Part of the eLife model is to leave it to the authors' discretion how they choose to present their work. But my overall view of it is unchanged. There are elements that I found clear, well executed, and compelling. But there are other elements that I found difficult to understand and where I could not follow or concur with their conclusions.

      We respectfully disagree with the assertion that the scope of our conclusions remains unchanged. The revised manuscript differs in several fundamental ways:

      (1) Removal of all optogenetic excitation experiments

      These experiments were a substantial portion of the original manuscript, and their removal eliminated an entire set of claims regarding the causal control of cautious responding by STN excitation. The revised manuscript no longer makes these claims.

      (2) Addition of analyses that directly address the reviewers’ central concerns

      The new analyses using mixed-effects modeling, window-specific covariates, and movement/baseline controls were added precisely because reviewers requested clearer dissociation of sensory, motor, and task-related contributions. These additions changed not only the presentation but the interpretation of the neural signals. We now conclude that STN encodes movement, caution, and aversive signals in separable ways, not that it exclusively or causally regulates caution.

      (3) Clear narrowing of conclusions

      Our current conclusions are more circumscribed and data-driven than in the original submission. For example, we removed all claims that STN activation “controls caution,” relying instead on loss-of-function data showing that STN is necessary for performing cued avoidance—not for generating cautious latency shifts. This is a substantial conceptual refinement resulting directly from the review process.

      (4) Reorganization to improve clarity

      Nearly every section has been restructured, including terminology (mode/type/class), figure organization, and explanations of behavioral windows. These revisions were implemented to ensure that readers can follow the logic of the analyses.

      We appreciate the reviewer’s recognition that several elements were clear and compelling. For the remaining points they found difficult to understand, we have addressed each one in detail in the response and revised the manuscript accordingly. If there are still aspects that remain unclear, we would welcome explicit identification of those points so that we can clarify them further.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      (1) Show individual data points on bar plots

      - partially addressed. Individual data points are still not shown.

      Wherever feasible, we display individual data points (e.g., Figures 1 and 2) to convey variability directly. However, in cases where figures depict hundreds of paired (repeated-measures) data points, showing all points without connecting them would not be appropriate, while linking them would make the figures visually cluttered and uninterpretable. All plots and traces include measures of variability (SEM), and the raw data will be shared on Dryad. When error bars are not visible, they are smaller than the trace thickness or bar line; for example, in Figure 5B, the black circles and orange triangles include error bars, but they are smaller than the symbol size.

      Also, to minimize visual clutter, only a subset of relevant comparisons is highlighted with asterisks, whereas all relevant statistical results, comparisons, and mouse/session numbers are fully reported in the Results section, with statistical analyses accounting for the clustering of data within subjects and sessions.

      (2) The active avoidance experiments are confusing when they are introduced in the results section. More explanation of what paradigms were used and what each CS means at the time these are introduced would add clarity. For example AA1, AA2 etc are explained only with references to other papers, but a brief description of each protocol and a schematic figure would really help.

      - partially addressed. A schematic figure showing the timeline would still be helpful.

      As suggested, we have added an additional panel to Fig. 5A with a schematic describing the AA1-3 tasks. In addition, the avoidance protocols are described briefly but clearly in the Results section (second paragraph of “STN neurons activate during goal-directed avoidance contingencies”) and in greater detail in the Methods section. As stated, these tasks were conducted sequentially, and mice underwent the same number of sessions per procedure, which are indicated. All relevant procedural information has been included in these sections. Mice underwent daily sessions and learnt these tasks within 1-2 sessions, progressing sequentially across tasks with an equal number of sessions per task (7 per task), and the resulting data were combined and clustered by mouse/session in the statistical models.

      (3) How do the Class 1, 2, 3 avoids relate to Class 1, 2, 3 neural types established in Figure 3? It seems like they are not related, and if that is the case they should be named something different from each other to avoid confusion.

      - not sufficiently addressed. The new naming system of neural 'classes' and 'types' helps with understanding that these are completely different ways of separating subpopulations within the STN. However, it is still unclear why the authors re-type the neurons based on their relation to avoids, when they classify the neurons based on their relationship to speed earlier. And it is unclear whether these neural classes and neural types have anything to do with each other. Are the neural types related to the neural classes in any way? And what is the overlap between neural types vs classes? Which separation method is more useful for functionally defining STN populations?

      The remaining confusion stems from treating several independent analyses as if they were different versions of the same classification. In reality, each analysis asks a distinct question, and the resulting groupings are not expected to overlap or correspond. We clarify this explicitly below.

      - Movement onset neuron classes (Class A, B, C; Fig. 3):

      These classes categorize neurons based on how their ΔF/F changes around spontaneous movement onset. This analysis identifies which neurons encode the initiation and direction of movement. For instance, Class B neurons (15.9%) were inhibited as movement slowed before onset but did not show sharp activation at onset, whereas Class C neurons (27.6%) displayed a pronounced activation time-locked to movement initiation. Directional analyses revealed that Class C neurons discharged strongly during contraversive turns, while Class B neurons showed a weaker ipsiversive bias. Because neurons were defined per session and many of these recordings did not include avoidance-task sessions, these movement-onset classes were not used in the avoidance analyses.

      - Movement-sensitivity neuron classes (Class 1, 2, 3, 4; Fig. 7):

      These classes categorize neurons based on the cross-correlation between ΔF/F and head speed, capturing how each neuron’s activity scales with movement features across the entire recording session. This analysis identifies neurons that are strongly speed-modulated, weakly speed-modulated, or largely insensitive to movement. These movement-sensitivity classes were then carried forward into the avoidance analyses to ask how neurons with different kinematic relationships participate during task performance; for example, whether neurons that are insensitive to movement nonetheless show strong activation during avoidance actions.

      - Avoidance modes (Mode 1, 2, 3; Fig. 8):

      Here we classify actions, not neurons. K-means clustering is applied to the movement-speed time series during CS1 active avoidance trials only, which allows us to identify distinct action modes or variants (fast-onset versus delayed avoidance responses). This action-based classification ensures that we compare neural activity across identical movements, eliminating a major confound in studies that do not explicitly separate action variants. First, we examine how population activity differs across these avoidance modes, reflecting neural encoding of the distinct actions themselves. Second, within each mode, we then classify neurons into “types,” which simply describe how different neurons activate during that specific avoidance action (as noted next).

      - Neuron activation types within each mode (Type a, b, c; Fig. 9):

      This analysis extends the mode-based approach by classifying neuronal activation patterns only within each specific avoidance mode. For each mode, we apply k-means clustering to the ΔF/F time series to identify three activation types (e.g., neurons showing little or no response, neurons showing moderate activation, and neurons showing strong or sharply timed activation). Because all trials within a mode have identical movement profiles, these activation types capture the variability of neural responses to the same avoidance behavior. Importantly, these activation “types” (a, b, c) are not global neuron categories. They do not correspond to, nor are they intended to map onto, the movement-based neuron classes defined earlier. Instead, they describe how neurons differ in their activation during a particular behavioral mode, that is, within a specific set of behaviorally matched trials. Because modes are defined at the trial level, the neurons contributing to each mode can differ: some neurons have trials belonging to one mode, others to two or all three. Thus, Type a/b/c groupings are not fixed properties of neurons. To prevent confusion, we refer to them explicitly as neuronal activation types, emphasizing that they characterize mode-specific response patterns rather than global cell identities.
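
      The k-means step used for both avoidance modes (clustering movement-speed time series) and activation types (clustering ΔF/F time series) can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the six-sample traces, the three hand-made trace shapes, the Euclidean distance, the farthest-first initialization, and all function names are hypothetical choices made here so that the example is deterministic.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length traces."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_trace(traces):
    """Sample-by-sample mean of a list of traces."""
    return [sum(col) / len(traces) for col in zip(*traces)]


def farthest_first(traces, k):
    """Deterministic initialization (an assumption of this sketch):
    start from the first trace, then repeatedly add the trace that is
    farthest from all centroids chosen so far."""
    centroids = [list(traces[0])]
    while len(centroids) < k:
        nxt = max(traces, key=lambda t: min(sq_dist(t, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(traces, k, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = farthest_first(traces, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(t, centroids[j])) for t in traces
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [t for t, lab in zip(traces, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean_trace(members)
    return labels, centroids


# Hypothetical speed traces (arbitrary units) for three kinds of avoidance trial.
fast_onset = [0, 8, 10, 6, 2, 0]
delayed = [0, 0, 2, 8, 10, 6]
slow = [0, 1, 3, 4, 3, 1]

# Four slightly offset copies of each shape stand in for individual trials.
traces = [
    [v + 0.1 * i for v in base]
    for base in (fast_onset, delayed, slow)
    for i in range(4)
]

labels, centroids = kmeans(traces, k=3)
print(labels)
```

      Each trial receives a mode label, and the same routine could be applied again to ΔF/F traces restricted to the trials of one mode, which is why the resulting "types" are defined per mode rather than per neuron.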

      In conclusion, the categorizations serve entirely different analytical purposes and should not be interpreted as competing classifications. The mode-specific “types” do not reclassify or replace the movement-sensitivity classes; they capture how neurons differ within a single, well-defined avoidance action, while the movement classes reflect how neurons relate to movements in general. Each classification addresses a different set of questions, and overlap between them is not expected.

      To make this as clear as possible we added the following paragraph to the Results:  

      “To avoid confusion between analyses, it is important to note that the movement-sensitivity classes defined here (Class 1–4; Fig. 7) are conceptually distinct from both the movement-onset classes (Class A–C; Fig. 3) and the neuronal activation “types” introduced later in the avoidance-mode analysis. The Class 1–4 grouping reflects how neurons relate to movement across the entire session, based on their cross-correlation with speed. The onset classes A–C capture neural activity specifically around spontaneous movement initiation during general exploration. In contrast, the later activation “types” are derived within each avoidance mode and describe how neurons differ in their activation patterns during identical CS1 avoidance responses. These classifications answer different questions about STN function and are not intended to correspond to one another.”

      (4) Similarly, having 3 different cell types (a,b,c) in the active avoidance seems unrelated to the original classification of cell types (1,2,3), and these are different for each class of avoid. This is very confusing and it is unclear how any of these types relate to each other. Presumably the same mouse has all three classes of avoids, so there are recordings from each cell during each type of avoid. So the authors could compare one cell during each avoid and determine whether it relates to movement or sound or something else. It is interesting that types a,b,c have the exact same proportions in each class of avoid, and this really makes it important to investigate if these are the exact same cells or not. Also, these mice could be recorded during open field so the original neural classification (class 1, 2, 3) could be applied to these same cells and then the authors can see whether each cell type defined in the open field has a different response to the different avoid types. As it stands, the paper simply finds that during movement and during avoidance behaviors different cells in the STN do different things.

      - Similarly, the authors somewhat addressed the neural types issue, but figure 9 still has 9 different neural types and it is unclear whether the same cells that are type 'a' in mode 1 avoids are also type 'a' in mode 2 avoids, or do some switch to type b? Is there consistency between cell types across avoid modes? The authors show that type 'c' neurons are differentially elevated in mode 3 vs 2, but also describe neurons as type '2c' and statistically compare them to type '1c' neurons. Are these the same neurons? Or are type 2c neurons different cells vs type 1c neurons? This is still unclear and requires clarification to be interpretable.

      We believe the remaining confusion arises from treating the different classification schemes as if they were alternative labels applied to the same neurons, when in fact they serve entirely separate analytical purposes and may not include the same neurons (see previous point). Because these classifications answer different questions, they are not expected to overlap, nor is overlap required for the interpretations we draw. It is therefore not appropriate to compare a neuron’s “type” in one avoidance mode to its movement class, or to ask whether types a/b/c across different modes are “the same cells,” since modes are defined by trial-level movement clustering rather than by neuron identity. Importantly, Types a/b/c are not intended as a new global classification of neurons; they simply summarize the variability of neuronal responses within each behaviorally matched mode. We agree that future studies could expand our findings, but that is beyond the already wide scope of the present paper. Our current analyses demonstrate a key conceptual point: when movement is held constant (via modes), STN neurons still show heterogeneous, outcome- and caution-related patterns, indicating encoding that cannot be reduced to movement alone.

      Relatedly, was the association with speed used to define each neural "class" done in the active avoidance context or in a separate (e.g. open field) experiment? This is not clear in the text.

      The cross-correlation classes were derived from the entire recording session, which included open-field and avoidance-task recordings. The tasks include long intertrial periods with spontaneous movements. We found no difference in classes when we included only a portion of the session, such as the open field, or when we excluded the avoidance interval where actions occur.
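
      A normalized cross-correlation between a neuron's ΔF/F and head speed, of the kind referred to above, can be sketched in plain Python as follows. This is a hedged illustration only: the synthetic signals, the 3-sample delay, the lag range, and the function names are assumptions made for the example, and the response does not specify how the resulting correlations were grouped into classes.

```python
import math
import random


def zscore(x):
    """Subtract the mean and divide by the (population) standard deviation."""
    m = sum(x) / len(x)
    sd = math.sqrt(sum((v - m) ** 2 for v in x) / len(x))
    return [(v - m) / sd for v in x]


def cross_correlation(a, b, max_lag):
    """Normalized cross-correlation for lags -max_lag..max_lag.
    At lag L, a[i] is paired with b[i + L], so a positive peak lag
    means that b follows a in time."""
    za, zb = zscore(a), zscore(b)
    n = len(za)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        pairs = [(za[i], zb[i + lag]) for i in range(n) if 0 <= i + lag < n]
        result[lag] = sum(x * y for x, y in pairs) / len(pairs)
    return result


def peak(xcorr):
    """Return (lag, value) at the largest absolute correlation."""
    lag = max(xcorr, key=lambda k: abs(xcorr[k]))
    return lag, xcorr[lag]


# Synthetic signals (invented): one neuron tracks speed with a 3-sample
# delay, the other is unrelated noise.
rng = random.Random(0)
n, delay = 300, 3
speed = [abs(rng.gauss(0.0, 1.0)) for _ in range(n)]
dff_following = [
    speed[max(i - delay, 0)] + 0.1 * rng.gauss(0.0, 1.0) for i in range(n)
]
dff_unrelated = [rng.gauss(0.0, 1.0) for _ in range(n)]

lag_f, r_f = peak(cross_correlation(speed, dff_following, max_lag=10))
lag_u, r_u = peak(cross_correlation(speed, dff_unrelated, max_lag=10))
print(f"speed-following neuron: peak r = {r_f:.2f} at lag {lag_f}")
print(f"unrelated neuron:       peak r = {r_u:.2f} at lag {lag_u}")
```

      Because the correlation is computed over whatever samples are supplied, restricting the input to a portion of the session (for example, open field only) is a matter of slicing both signals before the call.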

      Finally, in figure 7, why is there a separate avoid trace for each neural class? With the GRIN lens, the authors are presumably getting a sample of all cell types during each avoid, so why do the avoids differ depending on the cell type recorded?

      The entire STN population is not recorded within a single session; each session contributes only a subset of neurons to the dataset. Consequently, each neural class is composed of neurons drawn from partially non-overlapping sets of sessions, each with its own movement traces. For this reason, we plot avoidance traces separately for each neural class to maintain strict within-session correspondence between neural activity and the behavior collected in the same sessions. This prevents mixing behavioral data across sessions that did not contribute neurons to that class and ensures that all neural-behavioral comparisons remain appropriately matched. We have clarified this rationale in the revised manuscript. We note that averaging movement across classes, as is often done, would obscure these distinctions and would not preserve the necessary correspondence between neural activity and behavior. This is also clarified in Results.

      (5) The use of the same colors to mean two different things in figure 9 is confusing. AA1 vs AA2 shouldn't be the same colors as light-naïve vs light signaling CS.

      - addressed, but the authors still sometimes use the same colors to mean different things in adjacent figures (e.g. the red, blue, black colors in figure 1 and figure 2 mean totally different things) and use different colors within the same figure to represent the same thing (Figure 9AB vs Figure 9CD). This is suboptimal.

      Following the reviewer’s suggestion, we changed the colors in Figure 2 so that readers do not assume they are related to those in Fig. 1.

      In Figure 9, we changed the colors in C,D to match the colors in A,B.

      (6) The exact timeline of the optogenetics experiments should be presented as a schematic for understandability. It is not clear which conditions each mouse experienced in which order. This is critical to the interpretation of figure 9 and the reduction of passive avoids during STN stimulation. Did these mice have the CS1+STN stimulation pairing or the STN+US pairing prior to this experiment? If they did, the stimulation of the STN could be strongly associated with either punishment or with the CS1 that predicts punishment. If that is the case, stimulating the STN during CS2 could be like presenting CS1+CS2 at the same time and could be confusing. The authors should make it clear whether the mice were naïve during this passive avoid experiment or whether they had experienced STN stimulation paired with anything prior to this experiment.

      - addressed

      (7) Similarly, the duration of the STN stimulation should be made clear on the plots that show behavior over time (e.g. Figure 9E).

      - addressed

      (8) There is just so much data and so many conditions for each experiment here. The paper is dense and difficult to read. It would really benefit readability if the authors put only the key experiments and key figure panels in the main text and moved much of the repetitive figure panels to supplemental figures. The addition of schematic drawings for behavioral experiment timing and for the different AA1, AA2, AA3 conditions would also really improve clarity.

      - partially addressed. The paper is still dense and difficult to read. No experimental schematics were added.

      As suggested, we have now added the schematic to Fig. 5A.

      New Comments:

      (9) Description of the animals used and institutional approval are missing from the methods.

      The information on animal strains and institutional approval is already included in the manuscript. The first paragraph of the Methods section states:

      “… All procedures were reviewed and approved by the institutional animal care and use committee and conducted in adult (>8 weeks) male and female mice. …”

      Additionally, the next subsection, “Strains and Adeno-Associated Viruses (AAVs),” fully specifies all mouse lines used. We therefore believe that the required descriptions of animals and institutional approval are already present and meet standard reporting.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors assess the role of map3k1 in adult Planaria through whole body RNAi for various periods of time. The authors' prior work has shown that neoblasts (stem cells that can regenerate the entire body) for various tissues are intermingled in the body. Neoblasts divide to produce progenitors that migrate within a "target zone" to the "differentiated target tissues" where they differentiate into a specific cell type. Here the authors show that map3k1-i animals have ectopic eyes that form along the "normal" migration path of eye progenitors (Fig. 1), ectopic neurons and glands along the AP axis (Fig. 2) and pharynx in ectopic anterior positions (Fig. 3). The rest of the study shows that positional information is largely unaffected by loss of map3k1 (Fig. 4,5). However, loss of map3k1 leads to premature differentiation of progenitors along their normal migratory route (Fig. 6). They also show that an ill-defined "long-term" whole body depletion of map3k1 results in mis-specified organs and teratomas.

      Strengths:

      (1) The study has appropriate controls, sample sizes and statistics.

      (2) The work appears to be high-quality.

      (3) The conclusions are supported by the data.

      (4) Planaria is a good system to analyze the function of map3k1, which exists in mammals but not in other invertebrates.

      Weaknesses:

      (1) The paper is largely descriptive with no mechanistic insights. 

      The mechanistic insights we aim to address are primarily at the cellular systems level – how adult progenitor cells produce pattern. Specifically, we uncovered strong evidence that regulation of differentiation is an active process occurring in migratory progenitors and that this regulation is a major component of pattern formation during the adult processes of tissue turnover and regeneration. The map3k1 phenotype provided a tool used to reveal the existence of this regulation, and to understand the patterning abnormalities prevented by this regulatory mechanism. We updated the text in several places to make clearer some of this emphasis. For example, in the Discussion: "We suggest that differentiation is restricted during migratory targeting as an essential component of pattern formation, with the map3k1 RNAi phenotype indicating the existence and purpose of this element of patterning." 

      Naturally, identifying a particular molecule involved in this process is of interest for understanding molecular mechanism; this would allow for comparison to other cellular systems in other organisms and would focus future molecular inquiry. Future molecular studies into the mechanism of Map3k1 regulation and its downstream signaling will be fascinating as next steps towards understanding the process at the molecular level more deeply. We also added some discussion considering the types of upstream activation cues that could potentially be associated with Map3k1 regulation to suppress differentiation. 

      (2) Given the severe phenotypes of long-term depletion of map3k1, it is important that this exact timepoint is provided in the methods, figures, figure legends and results. 

      We removed the term “long-term” and instead added the timepoints used to all figure legends. We also added a summary of the timepoints used to the methods section and included RNAi timepoint labels in figures where a phenotype progression over time is relevant to interpretation. For timecourses, we also added suitable time information to the text in the results.

      (3) Figure 1C, the ectopic eyes are difficult to see, please add arrows. 

      To improve visualization, we replaced the example animal in the original Figure 1C with one that has a stronger phenotype, including arrows pointing to every ectopic event. Additionally, we included magnified images of optic cup cells and photoreceptor neurons in the trunk and tail region. This is now Figure 1B.

      (4) line 217 - why does the n=2/12 animals not match the values in Figure 3B, which is 11/12 and 12/12. The numbers don't add up. Please correct/explain. 

      Figure 3B in the submitted version (3/18 had cells in the tail) included more animals scored (6 animals from a replicate experiment, of which 1/6 showed cells in the tail) than the total reported in the text (2/12 had cells in the tail), because the animals from the replicate had not been added to the text during writing. The results are the same; only the sample sizes noted in those two locations differed, and we have fixed this issue. In the updated Figure 3, the order of presentation has shifted (e.g., prior 3B is now in 3C and Figure 3_figure supplement 1). We made sure to include numbers in all figure panels.

      (5) Figure panels do not match what is written in the results section. There is no Figure 6E. Please correct.

      Thank you for catching this. We have gone through figures and text after editing to make sure that text callouts are appropriately matched to the figures. 

      Reviewer #2 (Public review):

      Summary:

      The flatworm planarian Schmidtea mediterranea is an excellent model for understanding cell fate specification during tissue regeneration and adult tissue maintenance. Planarian stem cells, known as neoblasts, are continuously deployed to support cellular turnover and repair tissues damaged or lost due to injury. This reparative process requires great precision to recognize the location, timing, and cellular fate of a defined number of neoblast progeny. Understanding the molecular mechanisms driving this process could have important implications for regenerative medicine and enhance our understanding of how form and function are maintained in long-lived organisms such as humans. Unfortunately, the molecular basis guiding cell fate and differentiation remains poorly understood.

      In this manuscript, Canales et al. identified the role of the map3k1 gene in mediating the differentiation of progenitor cells at the proper target tissue. The map3k1 function in planarians appears evolutionarily conserved as it has been implicated in regulating cell proliferation, differentiation, and cell death in mammals. The results show that the downregulation of map3k1 with RNAi leads to spatial patterning defects in different tissue types, including the eye, pharynx, and the nervous system. Intriguingly, long-term map3k1-RNAi resulted in ectopic outgrowths consistent with teratomas in planarians. The findings suggest that map3k1 mediates signaling, regulating the timing and location of cellular progenitors to maintain correct patterning during adult tissue maintenance.

      Strengths:

      The authors provide an entry point to understanding molecular mechanisms regulating progenitor cell differentiation and patterning during adult tissue maintenance.

      The diverse set of approaches and methods applied to characterize map3k1 function strengthens the case for conserved evolutionary mechanisms in a selected number of tissue types. The creativity using transplantation experiments is commendable, and the findings with the teratoma phenotype are intriguing and worth characterizing.

      We thank the reviewer for the positive feedback.

      Weaknesses:

      The article presents a provocative idea related to the importance of positional control for organs and cells, which is at least in part regulated by map3k1. Nonetheless, the role of map3k1 or its potential interaction with regulators of the anterior-posterior, mediolateral axes, and PCGs is somewhat superficial. The authors could elaborate or even speculate more in the discussion section on the different scenarios incorporating these axial modulators into the map3k1 model presented in Figure 8.

      First, to strengthen the support for our finding that positional information is largely unaffected in map3k1 RNAi animals, we added data regarding the expression of additional relevant position control genes (PCGs) –ndl-4, ptk7, sp5, and wnt11-1 – to the PCG panel in Figure 5. The expression domain of ndl-4, an FGF receptor-like protein family member that contributes to head patterning and anterior pole maintenance, was normal in map3k1 RNAi. wnt11-1, a PCG with expression concentrated in the posterior end of the animal and with expression dependent on general Wnt activity, was also normal in map3k1 RNAi animals. ptk7, RNAi of which can result in supernumerary pharynges, also showed normal expression in map3k1 RNAi animals. Finally, sp5, a Wnt-activated gene with expression in the tail, also showed normal expression in map3k1 RNAi animals. 

      Second, to further support the conclusion that cells are not suitably responding to positional information after map3k1 RNAi, which we argue normally dictates where differentiation should occur, we added examples of differentiated cell types that are ectopically positioned within an atypical PCG expression domain for that cell type (Figure 5C). This underscores that following map3k1 RNAi the PCG expression domains do not change, but the pattern of differentiated cell types relative to these domains does shift. We also added data showing that regenerating tails had a proper wntP-2 gradient, but an anterior regenerating pharynx appeared outside of this wntP-2+ zone and inside of an ndl-5+ zone (Figure 5- figure supplement 1E). We added some discussion of these new data in the Figure 5 results section. We also noted, regarding independent recent map3k1 work (Lo, 2025), some evidence exists that a minor posterior shift in ndl-5 expression can occur after map3k1 RNAi.

      Next, we added a new element to the model figure (Figure 8B) depicting that PCG expression domains remain normal after map3k1 RNAi, with ectopic differentiation occurring in an incorrect positional information environment. We refer to this new panel in the discussion: "We suggest that map3k1 is not required for the spatial distribution of progenitor-extrinsic differentiation-promoting cues themselves, but for progenitors to be restricted from differentiating until these cues are received (Figure 8B)."; we then follow this statement with a summary in the Discussion of six pieces of evidence that support this model.

      Finally, we added some additional text to the discussion section about candidate mechanisms by which extrinsic cues could potentially regulate Map3k1, pointing to potential future inquiry directions. We suggest that map3k1 is not involved in regulating PCG activity domains themselves, but instead acts as a brake on differentiation within migratory progenitors through active signaling. This brake is then lifted when the progenitors hit their correct PCG-based migratory target, or when they hit their target tissue. How that occurs mechanistically is unknown. One scenario is that each progenitor is tuned to respond to a particular PCG-regulated environment (such as a particular ECM or signaling environment) to generate a molecular change that inactivates Map3K1 signaling, such as by inactivating or disengaging an RTK signal. Alternatively, the migratory process in progenitors could engage the Map3K1 signal, enabling signal cessation with arrival at a target location. When Map3K1 is active it could result in a transcriptional state that prevents full expression of differentiated factors required for maturation, tissue incorporation, and cessation of migration. These considerations are now added to the discussion.

      The article can be improved by addressing inconsistencies and adding details to the results, including the main figures and supplements. This represents one of the most significant weaknesses of this otherwise intriguing manuscript. Below are some examples of a few figures, but the authors are expected to pay close attention to the remaining figures in the paper.

      Details associated with the number of animals per experiment, statistical methods used, and detailed descriptions of figures appear inconsistent or lacking in almost all figures. In some instances, the percentage of animals affected by the phenotype is shown without detailing the number of animals in the experiment or the number of repeats. Figures and their legends throughout the paper lack details on what is represented and sometimes are mislabeled or unrelated. 

      We endeavored to ensure that these noted details are present throughout the legends and figures for all figure panels.

      Specifically, the arrows in Figure 1A are different colors. Still, no reasoning is given for this, and in the exact figure, the top side (1A) shows the percentages and the number of animals below. 

      The only reason for the different colored arrows was visibility. To avoid confusion, we now use white arrows for all FISH images in Figure 1, and wherever else possible. We also replaced the percentages with n numbers in the bottom left corner of the live images in Figure 1A.

      Conversely, in Figures 1B, C, and D, no details on the number of animals or percentages are shown, nor an explanation of why opsin was used in Figure 1A but not 1B. 

      The original Figure 1B represented a few different examples of ectopic eye/eye cell patterns in the map3k1 RNAi animals to demonstrate the variable and disorganized nature of the phenotype. To address this, we added further explanation in the legend. We also merged 1A and 1B for simplicity of interpretation. opsin was used in Figure 1A to label cell bodies of photoreceptors. anti-Arrestin was used in the example FISH images to see if these cells were interconnected via projections, which we now clarify in the legend and in the text. 

      Is Figure 1B missing an image for the respective control? Figure 1C needs details regarding what the two smaller boxes underneath are. 

      The control for Figure 1B was in Figure 1A; the merger of Figures 1A/B should address this. Boxes in Figure 1C were labelled with numbers corresponding to the image above them.

      Figure 1C could use an AP labeling map in 10 days (e.g., AP6 has one optic cup present). Figure 1C and F counts do not match. 

      We added a cartoon to 1C to show the region imaged. Note that the 36d trunk image (now Fig. 1B) has now been replaced with a full animal image and magnified boxes that show locations of example ectopic cells. That cell in 1C was categorized as in AP5. Note that additional animals were analyzed and data added to the distribution (now Fig. 1D). 

      In Figure 1C, we do not know the number of animals tested, controls used, the scale bar sizes in the first two images, nor the degree of magnification used despite the pharynx region appearing magnified in the second image. Figure 1C is also shown out of chronological order; 36 days post RNAi is shown before 10 days post RNAi. Moreover, the legends for Figures 1C and 1D are swapped.

      We have endeavored to ensure sample numbers, control images, and appropriate scale bar notation in legends are present for all images. Figure 1C has now been split into two panels: Figure 1B and Figure 1C. It does not follow a chronological order in presentation for the following logic flow: we uncover and describe the phenotype; then, with knowledge of the defect, we walk back to see how early the phenotype starts after RNAi and what the pattern of ectopic cell distribution is when the phenotype starts to emerge (using the knowledge of which cells are affected from the overt phenotype described in 1A/B). 

      Additionally, Figure 1F and many other figures throughout the paper lack overall statistical considerations. Furthermore, Figure 1F has three components, but only one is labeled. Labeling each of them individually and describing them in the corresponding figure legend may be more appropriate.

      The main point of the graphs in 1F (now 1D) was the overt overall pattern difference with the wild-type, which never has ectopic eye cells in the midbody or tail, and that the ectopic eye cells progress throughout the entire AP axis. However, we concur that a statistical test is a reasonable thing to show here and that is now included in the legend. The 3 components (in Figure 1F, now Figure 1D) were kept together with one figure label (D) for simplicity, but were rearranged (top and bottom) with a cartoon to the side and with modified labeling for extra clarity.

      Figure 2C shows images of gene expression for two genes, but the counts are shown for only one in Figure 2D. It is challenging to follow the author's conclusions without apparent reasoning and by only displaying quantitative considerations for one case but not the other. These inconsistencies are also observed in different figures. 

      In Figure 2C, FISH images of cintillo+ and dd_17258+ neurons are shown to display the specificity of this effect to some neurons and not others. Because cintillo+ cells did not expand at all (n=24/24 animals), the counts for them would all be zero values. We only counted data for dd_17258 cells because it was the neuron that expanded compared to the control animals. We have now added a note in the legend explaining this.

      In Figure 2D, 24/24 animals were reported to show the phenotype, but only eight were counted (is there a reason for this?).

      8 animals were used to quantitatively characterize the spread of cells along the AP axis, as it was deemed an adequate sample size to capture the change in distribution of 17258+ cells from being head restricted to being present throughout the body. Through multiple cohorts of animals in replicates, a total of 24/24 examined animals showed this expansion phenotype. Double FISH experiments were additionally carried out using dd_17258 and various PCGs; these data are now included in Figure 5C, and these animals were added to the total counts regarding quantitative analysis of the phenotype in Figure 2D. 

      In Figure 2E, the expression for three genes is shown, with some displaying anterior and posterior regions while others only show the anterior picture. Is there a particular reason for this? 

      The original first panel in Figure 2E showed an example of a non-expanding gland cell type, dd_9223, which is very restricted to the head in both control and map3k1 RNAi animals. Because we did not observe a phenotype for this cell type (no cells in all control and map3k1 RNAi animal tails), we only included tail images of cell types that showed an abnormal phenotype with clear expansion to the posterior (dd_8476 and dd_7131). However, we have now included tail images of dd_9223 cells and added data for dd_9223 to the graph in Figure 2E.

      Also, in Figure 2F, the counts are shown for only the posterior region of two genes out of the three displayed in Figure 2E. It is unclear why the authors do not show counts for the anterior areas considered in Figure 2E. Furthermore, the legend for Figure 2D is missing, and the legend for 2F is mislabeled as a description for Figure 2D.

      We now include tail images for dd_9223 in Figure 2E to show that there are no ectopic cells in tails. We did not originally include counts of dd_9223 because there was no phenotype observed. dd_7131 and dd_8476 cell types appeared in the posterior of even control animals at a low frequency, unlike dd_9223 cells. However, we did now add counts for dd_9223 tail regions in the graph. We did not count the anterior regions of the animal because our goal was to show data for the visible phenotype (ectopic cells in the tail) not only with an example image, but also by showing the number of cells in the tail with a graph and statistical test. Legends have been updated with correct details.

      Supplement Figure 1 B reports data up to 6 weeks, but no text in the manuscript or supplement mentions any experiment going up to 6 weeks. There are no statistics for data in Supplement Figure 1E. Any significance between groups is unclear.

      More details about the RNAi feeding schedules have been added in the methods section. All RNAi timepoints are now specified explicitly in the legends. The legends for Figure 1F and Figure 1- figure supplement 1E (additional data: ovo+; smedwi-1- cell counts) now mention the statistical tests performed, and annotations (ns, not significant) or p values have been added to the graphs. For simplicity, we decided to include all smedwi-1+ counts together rather than splitting them into low and high smedwi-1+ cells, because we were not making any claims about low and high cells.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      It would be good to acknowledge in the discussion the recent paper from the Petersen lab on map3k1, published in PLoS Genet 2025, especially if the results differ between the two labs.

      We added reference/discussion regarding the recent PLoS Genetics Lo, 2025 map3k1 paper at several suitable points in the manuscript.

      Reviewer #2 (Recommendations for the authors):

      Please pay close attention to the description of experimental details and the consistency throughout the paper. It seems like the reader has to assume or come across information that is not readily available from the text or the legends in the paper. This is an interesting paper with intriguing findings. However, the version presented here appears rushed or put together on the fly.

      Thank you for your thorough feedback. We have endeavored to ensure all appropriate details are present in figures and/or figure legends.

    1. Sita's purity departs from Vildi's version (can't remember his name off the top of my head, supposedly the original version)

      *found it... Valmiki is the name

    1. As this chapter noted at the start, the Democratic Party’s embrace of civil rights in the latter decades of the twentieth century is key to understanding party transformation in New England.

      Again progressivism rears its beautiful head


    1. Reviewer #2 (Public review):

      Summary:

      In this work, Gupta & Murphy present several parallel efforts. On one side, they present the hardware and software they use to build a head-fixed mouse experimental setup that they use to track in "real-time" the calcium activity in one or two spots at the surface of the cortex. On the other side, they present another setup that they use to take advantage of the "real-time" version of DeepLabCut with their mice. The hardware and software that they used/developed are described at length, both in the article and in a companion GitHub repository. Next, they present experimental work that they have done with these two setups, training mice to max out a virtual cursor to obtain a reward, by taking advantage of auditory tone feedback that is provided to the mice as they modulate either (1) their local cortical calcium activity, or (2) their limb position.

      Strengths:

      This work illustrates that body movement tracking and calcium imaging can be carried out using readily available components, including imaging the brain using an incredibly cheap consumer electronics RGB camera (RGB Raspberry Pi Camera). It is a useful source of information for researchers who may be interested in building a similar setup, given the highly detailed overview of the system. Finally, it further confirms previous findings regarding the operant conditioning of the calcium dynamics at the surface of the cortex (Clancy et al. 2020) and suggests an alternative based on DeepLabCut to the motor tasks that aim to image the brain at the mesoscale during forelimb movements (Quarta et al. 2022).

      Weaknesses:

      This work covers 3 separate research endeavors: (1) The development of two separate setups and their corresponding software. (2) A study that is highly inspired by the Clancy et al. 2021 paper on the modulation of the local cortical activity measured through a mesoscale calcium imaging setup. (3) A study of the mesoscale dynamics of the cortex during forelimb movement learning. Sadly, the analyses of the physiological data appear incomplete, and more generally, the paper shows weaknesses regarding several points:

      The behavioral setups that are presented are representative of the state of the art in the field of mesoscale imaging/head fixed behavior community, rather than a highly innovative design. Still, they definitely have value as a starting point for laboratories interested in implementing such approaches.

      Throughout the paper, there are several statements that point out how important it is to carry out this work in a closed-loop setting with an auditory feedback, but sadly there is no "no feedback" control in cortical conditioning experiments, while there is a no-feedback condition in the forelimb movement study, which shows that learning of the task can be achieved in the absence of feedback.

      The analysis of the closed-loop neuronal and behavioral data lacks controls. Increased performance can be achieved by actively modulating only one of the two ROIs; this is not really analyzed, although this finding, which does not match previous reports (Clancy et al. 2020), would be important to examine further.

    2. Reviewer #3 (Public review):

      Summary:

      The study demonstrates the effectiveness of a cost-effective closed-loop feedback system for modulating brain activity and behavior in head-fixed mice. The authors tested a real-time closed-loop feedback system in head-fixed mice using two types of graded feedback: 1) closed-loop neurofeedback (CLNF), where feedback is derived from neuronal activity (calcium imaging), and 2) closed-loop movement feedback (CLMF), where feedback is based on observed body movement. It is a Python-based open-source system, and the authors call it CLoPy. The authors also claim to provide all software, hardware schematics, and protocols to adapt it to various experimental scenarios. This system is capable and can be adapted to a wide range of use cases.

      The authors have shown that their system can control both positive (water drop) and negative reinforcement (buzzer-vibrator). This study also shows that, using the closed-loop system, mice performed better, learned arbitrary tasks, and could adapt to changes in the rules. By integrating real-time feedback based on cortical GCaMP imaging and behavior tracking, the authors have provided strong evidence that such closed-loop systems can be instrumental in exploring the dynamic interplay between brain activity and behavior.

      Strengths:

      Simplicity of feedback systems design. Simplicity of implementation and potential adoption.

      Weaknesses:

      Long latencies, due to slow Ca2+ dynamics and slow imaging (15 FPS), may limit the application of the system.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public reviews:

      Reviewer #1 (Public review):

      Summary: 

      The authors provide a resource to the systems neuroscience community, by offering their Python-based CLoPy platform for closed-loop feedback training. In addition to using neural feedback, as is common in these experiments, they include a capability to use real-time movement extracted from DeepLabCut as the control signal. The methods and repository are detailed for those who wish to use this resource. Furthermore, they demonstrate the efficacy of their system through a series of mesoscale calcium imaging experiments. These experiments use a large number of cortical regions for the control signal in the neural feedback setup, while the movement feedback experiments are analyzed more extensively.

      Strengths:

      The primary strength of the paper is the availability of their CLoPy platform. Currently, most closed-loop operant conditioning experiments are custom built by each lab and carry a relatively large startup cost to get running. This platform lowers the barrier to entry for closed-loop operant conditioning experiments, in addition to making the experiments more accessible to those with less technical expertise.

      Another strength of the paper is the use of many different cortical regions as control signals for the neurofeedback experiments. Rodent operant conditioning experiments typically record from the motor cortex and maybe one other region. Here, the authors demonstrate that mice can volitionally control many different cortical regions not limited to those previously studied, recording across many regions in the same experiment. This demonstrates the relative flexibility of modulating neural dynamics, including in non-motor regions.

      Finally, adapting the closed-loop platform to use real-time movement as a control signal is a nice addition. Incorporating movement kinematics into operant conditioning experiments has been a challenge due to the increased technical difficulties of extracting real-time kinematic data from video data at a latency where it can be used as a control signal for operant conditioning. In this paper they demonstrate that the mice can learn the task using their forelimb position, at a rate that is quicker than the neurofeedback experiments.

      Weaknesses:

      There are several weaknesses in the paper that diminish the impact of its strengths. First, the value of the CLoPy platform is not clearly articulated to the systems neuroscience community. Similarly, the resource could be better positioned within the context of the broader open-source neuroscience community. For an example of how to better frame this resource in these contexts, I recommend consulting the pyControl paper. Improving this framing will likely increase the accessibility and interest of this paper to a less technical neuroscience audience, for instance by highlighting the types of experimental questions CLoPy can enable.

      We appreciate the editor’s feedback regarding the clarity of the CLoPy platform's value and its positioning within the broader neuroscience community. We agree and understand the importance of effectively communicating the utility of CLoPy to both the systems neuroscience field and the wider open-source neuroscience community.

      To address this, we have revised the introduction and discussion sections of the manuscript to more clearly articulate the unique contributions of the CLoPy platform. Specifically:

      (1) We have emphasized how CLoPy can address experimental questions in systems neuroscience by highlighting its ability to enable real-time closed-loop experiments, such as investigating neural dynamics during behavior or studying adaptive cortical reorganization after injury. These examples are aimed at demonstrating its practical utility to the neuroscience audience.

      (2) We have positioned CLoPy within the broader open-source neuroscience ecosystem, drawing comparisons to similar resources like pyControl. We describe how CLoPy complements existing tools by focusing on real-time optical feedback and integration with genetically encoded indicators, which are becoming increasingly popular in systems neuroscience. We also emphasize its modularity and ease of adoption in experimental settings with limited resources.

      (3) To make the manuscript more accessible to a less technically inclined audience, we have restructured certain sections to focus on the types of experiments CLoPy enables, rather than the technical details of the implementation.

      We have consulted the pyControl paper, as suggested, and have used it as a reference point to improve the framing of our resource. We believe these changes will increase the accessibility and appeal of the paper to a broader neuroscience audience.

      While the dataset contains an impressive number of animals and cortical regions for the neurofeedback experiment, and an analysis of the movement-feedback experiments, my excitement for these experiments is tempered by the relative incompleteness of the dataset, as well as its description and analysis in the text. For instance, in the neurofeedback experiment, many of these regions only have data from a single mouse, limiting the conclusions that can be drawn. Additionally, there is a lack of reporting of the quantitative results in the text of the document, which is needed to better understand the degree of the results. Finally, the writing of the results section could use some work, as it currently reads more like a methods section.

      Thank you for your thoughtful and constructive feedback on our manuscript. We appreciate the time and effort you took to review our work and provide detailed suggestions for improvement. Below, we address the key points raised in your review:

      (1) Dataset Completeness: We acknowledge that the neurofeedback experiments include data from only a single mouse for some cortical regions, while for other cortical regions there are several animals. This was due to practical constraints during the study, and we understand the limitations this poses for drawing broad conclusions. We felt it was still important to include these data sets with smaller sample sizes as they might be useful for others pursuing this direction in the future. To address this, we have revised the text to explicitly acknowledge these limitations and clarify that the results for some regions are exploratory in nature. We believe our flexible tool will provide a means for our lab and others to include more animals representing additional cortical regions in future studies. Importantly, we have included all raw and processed data as well as code for future analysis.

      (2) Quantitative Results: We recognize the importance of reporting quantitative results in the text for better clarity and interpretation. In response, we have added a more detailed description of the quantitative findings from both the neurofeedback and movement-feedback experiments. This includes effect sizes, statistical measures, and key numerical results to provide a clearer understanding of the degree and significance of the observed effects.

      (3) Results Section Writing: We appreciate your observation that parts of the results section read more like a methods section. To improve clarity and focus, we have restructured the results section to present the findings in a more concise and interpretative manner, while moving overly detailed descriptions of experimental procedures to the methods section.

      Suggestions for improved or additional experiments, data or analyses:

      Not necessary for this paper, but it would be interesting to see if the CLNF group could learn without auditory feedback.

      This is a great suggestion and certainly something that could be done in the future.

      There are no quantitative results in the results section. I would add important results to help the reader better interpret the data. For example, in: "Our results indicated that both training paradigms were able to lead mice to obtain a significantly larger number of rewards over time," You could show a number, with an appropriate comparison or statistical test, to demonstrate that learning was observed.

      Thank you for pointing this out. We now report quantitative values in the Results text as well as in the figure legends, and quote the relevant sentences here: “A ΔF/F0 threshold value was calculated from a baseline session on day 0 that would have allowed 25% performance. Starting from this basal performance of around 25% on day 1, mice (CLNF No-rule-change, N=23, n=60 and CLNF Rule-change, N=17, n=60) were able to discover the task rule and perform above 80% over ten days of training (Figure 4A, RM ANOVA p=2.83e-5), and Rule-change mice even learned a change in ROIs or rule reversal (Figure 4A, RM ANOVA p=8.3e-10, Table 5 for different rule changes). There were no significant differences between male and female mice (Supplementary Figure 3A).”
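
      One way to read the threshold calibration quoted above is as an empirical quantile: pick the ΔF/F0 value that only about 25% of baseline trials would have exceeded. The sketch below is a minimal illustration under that assumption. The function names, the use of per-trial peak values, and the example numbers are all hypothetical and are not taken from the authors' code.

```python
def threshold_for_target_rate(baseline_peaks, target_rate=0.25):
    """Return a threshold that roughly `target_rate` of baseline trials exceed.

    Assumption: each trial is summarised by its peak dF/F0 value, and a trial
    counts as a success when that peak is strictly above the threshold.
    """
    if not baseline_peaks:
        raise ValueError("need at least one baseline trial")
    ordered = sorted(baseline_peaks)
    n = len(ordered)
    # Number of baseline trials that should count as successes.
    n_success = min(max(round(target_rate * n), 0), n - 1)
    # The largest value that is NOT among the top `n_success` trials.
    return ordered[n - n_success - 1]


def success_rate(peaks, threshold):
    """Fraction of trials whose peak is strictly above the threshold."""
    return sum(p > threshold for p in peaks) / len(peaks)


# Hypothetical day-0 baseline peaks (arbitrary units).
day0_peaks = [0.8, 1.9, 0.4, 2.6, 1.2, 3.1, 0.6, 1.5]
day0_threshold = threshold_for_target_rate(day0_peaks, target_rate=0.25)
day0_rate = success_rate(day0_peaks, day0_threshold)
```

      With tied peak values the achieved rate can fall below the target, so a real calibration would need to decide how to handle ties.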

      For: "Performing this analysis indicated that the Raspberry Pi system could provide reliable graded feedback within ~63 ± 15 ms for CLNF experiments." The LED test shows the sending of the signal, but the actual delay for the audio generation might be longer. This is also longer than the 50 ms mentioned in the abstract.

      We appreciate the reviewer’s insightful comment. The latency reported (~63 ms) was measured using the LED test, which captures the time from signal detection to output triggering on the Raspberry Pi GPIO. We agree that the total delay for auditory feedback generation could include an additional latency component related to the digital-to-analog conversion and speaker response. In our setup, we employ a fast Audiostream library written in C to generate the audio signal and expect its contribution to the delay to be negligible compared to the GPIO latency. Although we did not measure it, this additional delay could be quantified with an oscilloscope-based pilot measurement. We have updated the manuscript to clarify that the 63 ± 15 ms value reflects the GPIO-triggered output latency, and we have revised the abstract to accurately state the delay as “~63 ms” rather than 50 ms. This ensures consistency and avoids underestimation of the latency. We have corrected the LED latency for CLNF and CLMF experiments in the abstract as well.

      It could be helpful to visualize an individual trial for each experiment type, for instance how the audio frequency changes as movement speed / calcium activity changes.

      We have added Supplementary Figure 8 that contains this data where you can see the target cortical activity trace, target paw speed, rewards, along with the audio frequency generated.

      The sample sizes are small (n=1) for a few groups. I am excited by the variety of regions recorded, so it could be beneficial for the authors to collect a few more animals to beef up the sample sizes.

      We've acknowledged that some of the sample sizes are small. Importantly, we have included raw and processed data as well as code for future analysis. We felt it was still important to include these data sets with smaller sample sizes as they might be useful for others pursuing this direction in the future.

      I am curious as to why 60 trials sessions were used. Was it mostly for the convenience of a 30 min session, or were the animals getting satiated? If the former, would learning have occurred more rapidly with longer sessions?

      This is a great observation, and the answer is that it was mostly due to logistical reasons. We tried not to keep animals head-fixed for more than 45 minutes per session, as they become less engaged during long head-fixed sessions. After head-fixing, it takes about 15 minutes to get the experiment going, so recorded sessions of 30-40 minutes seemed appropriate before the animals stop being engaged or become satiated. We provided supplemental water after the sessions and observed that the animals consumed it, so they were not fully satiated during the sessions even when they performed well in the task and obtained the maximum number of rewards. We also had inter-trial rest periods of 10 s that lengthened the sessions. We think it would be interesting to explore the relationship between session duration (number of trials) and the progression of task learning over days in a separate study.

      Figure 4E is interesting, it seems like the changes in the distribution of deltaF was in both positive and negative directions, instead of just positive. I'd be curious as to the author's thoughts as to why this is the case. Relatedly, I don't see Figure 4E, and a few other subplots, mentioned in the text. As a general comment, I would address each subplot in the text.

      We have split Figure 4 into two to keep the figures more readable. Previous Figure 4E-H are now Figure 5A-D in the revised manuscript. The online real-time CLNF sessions used a moving-window average to calculate ΔF/F<sub>0</sub>, whereas the figures were generated by averaging over the whole recorded sessions. We have added text in Methods under the “Online ΔF/F<sub>0</sub> calculation” and “Offline ΔF/F<sub>0</sub> calculation” sections to make clear how we perform ΔF/F<sub>0</sub> normalization based on average fluorescence over the entire session. Using this method of normalization does increase the baseline so that some peaks appear to be below zero. Additionally, it is unclear what strategy animals are employing to achieve the rule-specific target activity. The task did not constrain them to a specific strategy for cortical activation: they were rewarded as long as they crossed the threshold in the target ROI(s). For example, in 2-ROI experiments, to increase ROI1-ROI2 target activity, they could increase the activity of ROI1 relative to ROI2 or decrease the activity of ROI2 relative to ROI1; both would have led to a reward as long as the result crossed the threshold.

      We have now addressed and added reference to the figures in the text in Results under “Mice can explore and learn an arbitrary task, rule, and target conditions” and “Mice can rapidly adapt to changes in the task rule” sections - thanks for pointing this out.

      For: "In general, all ROIs assessed that encompassed sensory, pre-motor, and motor areas were capable of supporting increased reward rates over time," I would provide a visual summary showing the learning curves for the different types of regions.

      We have rewritten this section to emphasize that these conclusions were based on pooled data from multiple regions of interest. The sample sizes for each type of region are different and some are missing. We believe it would be incomplete and not comparable to present this as a regular analysis since the sample sizes were not balanced. We would be happy to dive deeper into this and point to the raw and processed dataset if anyone would like to explore this further by GitHub or other queries.

      Relatedly, I would further explain the fast vs slow learners, and if they mapped onto certain regions.

      Mice were categorized into fast or slow learners based on the slope of learning over days (reward progression over the days) as shown in Supplementary Figure 3C,D. Our initial aim was not to probe cortical regions that led to fast vs slow learning but this was a grouping we did afterwards. Based on the analysis we did, the fast learners included the sensory (V1), somatosensory (BC, HL), and motor (M1, M2) areas, while the slow learners included the motor (M1, M2), and higher order (TR, RL) cortical areas. Testing all dorsal cortical areas would be prudent to establish their role in fast or slow learning and it is an interesting future direction.
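
      The slope-based grouping described above can be illustrated with a least-squares fit of rewards against training day. This is only a sketch: the median split used as the fast/slow cutoff, the function names, and the reward counts are assumptions made for illustration, since the response does not state the actual criterion.

```python
from statistics import mean, median


def learning_slope(rewards_per_day):
    """Least-squares slope of rewards versus training day (day 1, 2, ...)."""
    if len(rewards_per_day) < 2:
        raise ValueError("need at least two training days")
    days = list(range(1, len(rewards_per_day) + 1))
    day_mean = mean(days)
    reward_mean = mean(rewards_per_day)
    numerator = sum((d - day_mean) * (r - reward_mean)
                    for d, r in zip(days, rewards_per_day))
    denominator = sum((d - day_mean) ** 2 for d in days)
    return numerator / denominator


def split_learners(rewards_by_mouse):
    """Label each mouse 'fast' or 'slow' relative to the median slope.

    The median cutoff is a hypothetical choice, not the authors' criterion.
    """
    slopes = {m: learning_slope(r) for m, r in rewards_by_mouse.items()}
    cutoff = median(slopes.values())
    return {m: ("fast" if s > cutoff else "slow") for m, s in slopes.items()}


# Hypothetical rewards per session for four mice over four days.
example_rewards = {
    "mouse_a": [10, 20, 30, 40],
    "mouse_b": [10, 12, 14, 16],
    "mouse_c": [5, 10, 15, 20],
    "mouse_d": [20, 21, 22, 23],
}
learner_groups = split_learners(example_rewards)
```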

      Also I would make the labels for these plots (e.g. Supp Fig3) more intuitive, versus the acronyms currently used.

      We have made more expressive labels and explained the acronyms below the Supplementary Figure 3.

      The CLMF animals showed a decrease in latency across learning, what about the CLNF animals? There is currently no mention in the text or figures.

      We have now incorporated the CLNF task latency data into both the Results text and Figure 4C. Briefly, task latency decreased as performance improved, increased following a rule change, and then decreased again as the animals relearned the task. The previous Figure 4C has been updated to Figure 4D, and the former Figure 4D has been moved to Supplementary Figure 4E.

      Reviewer #2 (Public review):

      Summary:

      In this work, Gupta & Murphy present several parallel efforts. On one side, they present the hardware and software they use to build a head-fixed mouse experimental setup that they use to track in "real-time" the calcium activity in one or two spots at the surface of the cortex. On the other side, they present another setup that they use to take advantage of the "real-time" version of DeepLabCut with their mice. The hardware and software that they used/developed are described at length, both in the article and in a companion GitHub repository. Next, they present experimental work that they have done with these two setups, training mice to max out a virtual cursor to obtain a reward, by taking advantage of auditory tone feedback that is provided to the mice as they modulate either (1) their local cortical calcium activity, or (2) their limb position.

      Strengths:

      This work illustrates the fact that body movement tracking and calcium imaging can be carried out using readily available components, including imaging the brain using an incredibly cheap consumer electronics RGB camera (RGB Raspberry Pi Camera). It is a useful source of information for researchers who may be interested in building a similar setup, given the highly detailed overview of the system. Finally, it further confirms previous findings regarding the operant conditioning of the calcium dynamics at the surface of the cortex (Clancy et al. 2020) and suggests a DeepLabCut-based alternative to the motor tasks that aim to image the brain at the mesoscale during forelimb movements (Quarta et al. 2022).

      Weaknesses:

      This work covers 3 separate research endeavors: (1) The development of two separate setups and their corresponding software. (2) A study that is highly inspired by the Clancy et al. 2020 paper on the modulation of the local cortical activity measured through a mesoscale calcium imaging setup. (3) A study of the mesoscale dynamics of the cortex during forelimb movement learning. Sadly, the analyses of the physiological data appear incomplete, and more generally the paper tends to offer overstatements regarding several points:

      In contrast to the introductory statements of the article, closed-loop physiology in rodents is a well-established research topic. Beyond auditory feedback, this includes optogenetic feedback (O'Connor et al. 2013, Abbasi et al. 2018, 2023), electrical feedback in hippocampus (Girardeau et al. 2009), and much more.

      We have included and referenced these papers in our introduction section (quoted below) and rephrased the part where our previous text indicated there are fewer studies involving closed-loop physiology.

      “Some related studies have demonstrated the feasibility of closed-loop feedback in rodents, including hippocampal electrical feedback to disrupt memory consolidation (Girardeau et al. 2009), optogenetic perturbations of somatosensory circuits during behavior (O'Connor et al. 2013), and more recent advances employing targeted optogenetic interventions to guide behavior (Abbasi et al. 2023).”

      The behavioral setups that are presented are representative of the state of the art in the field of mesoscale imaging/head fixed behavior community, rather than a highly innovative design. In particular, the closed-loop latency that they achieve (>60 ms) may be perceived by the mice. This is in contrast with other available closed-loop setups.

      We thank the reviewer for this thoughtful comment and fully agree that our closed-loop latency is larger than that achieved in some other contemporary setups. Our primary aim in presenting this work, however, is not to compete with the lowest possible latencies, but to provide an open-source, accessible, and flexible platform that can be readily adopted by a broad range of laboratories. By building on widely available and lower-cost components, our design lowers the barrier of entry for groups that wish to implement closed-loop imaging and behavioral experiments, while still achieving latencies well within the range that can support many biologically meaningful applications.

      For example, our latency (~60 ms) remains compatible with experimental paradigms such as:

      Motor learning and skill acquisition, where sensorimotor feedback on the scale of tens to hundreds of milliseconds is sufficient to modulate performance.

      Operant conditioning and reward-based learning, in which reinforcement timing windows are typically broader and not critically dependent on sub-20 ms latencies.

      Cortical state dependent modulation, where feedback linked to slower fluctuations in brain activity (hundreds of milliseconds to seconds) can provide valuable insight.

      Studies of perception and decision-making, in which stimulus response associations often unfold on behavioral timescales longer than tens of milliseconds.

      We believe that emphasizing openness, affordability, and flexibility will encourage widespread adoption and adaptation of our setup across laboratories with different research foci. In this way, our contribution complements rather than competes with ultra-low-latency closed-loop systems, providing a practical option for diverse experimental needs.

      Through the paper, there are several statements that point out how important it is to carry out this work in a closed-loop setting with an auditory feedback, but sadly there is no "no feedback" control in cortical conditioning experiments, while there is a no-feedback condition in the forelimb movement study, which shows that learning of the task can be achieved in the absence of feedback.

      We fully agree that such a control would provide valuable insight into the contribution of feedback to learning in the CLNF paradigm. In designing our initial experiments, we envisioned multiple potential control conditions, including No-feedback and Random-feedback. However, our first and primary objective was to establish whether mice could indeed learn to modulate cortical ROI activation through auditory feedback, and to further investigate this across multiple cortical regions. For this reason, we focused on implementing the CLNF paradigm directly, without the inclusion of these additional control groups. To broaden the applicability of the system, we subsequently adapted the platform to the CLMF experiments, where we did incorporate a No-feedback group. These results, as the reviewer notes, strengthen the evidence for the role of feedback in shaping task performance. We agree that the inclusion of a No-feedback control group in the CLNF paradigm will be crucial in future studies to further dissect the specific contribution of feedback to cortical conditioning.

      The analysis of the closed-loop neuronal data lacks controls. Increased performance can be achieved by actively modulating only one of the two ROIs; this is not clearly analyzed (for instance, by looking at the timing of the calcium signal modulation across the two ROIs). It seems that, overall, ROIs 1 and 2 covary, in contrast to Clancy et al. 2020. How can this be explained?

      We agree that the possibility of increased performance being driven by modulation of a single ROI is an important consideration. Our study indeed began with 1-ROI closed-loop experiments. In those early experiments, while we did observe animals improving performance across days, we realized that daily variability in ongoing cortical GCaMP activity could lead to fluctuations in threshold-crossing events. The 2-ROI design was subsequently introduced to reduce this variability, as the target activity was defined as the relative activity between the two ROIs (e.g., ROI1 – ROI2). This approach offered a more stable signal by normalizing ongoing fluctuations. In our analysis of the early 2-ROI experiments, we observed that animals adopted diverging strategies to achieve threshold crossings. Specifically, some animals increased activity in ROI1 relative to ROI2, while others decreased activity in ROI2 to accomplish the same effect. Once discovered, each animal consistently adhered to its chosen strategy throughout subsequent training sessions. This was an early and intriguing observation, but as the experiments were not originally designed to systematically test this effect, we limited our presentation to the analysis of a small number of animals (shown in Figure 11). We have added details about this observation to our Results section as well, quoted below:

      “In the 2-ROI experiment where the task rule required “ROI1 - ROI2” activity to cross a threshold for reward delivery, mice displayed divergent strategies. Some animals predominantly increased ROI1 activity, whereas others reduced ROI2 activity, both approaches leading to successful threshold crossing (Figure 11)”.
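
      The two routes to reward under the “ROI1 - ROI2” rule can be made concrete with a small sketch. It assumes the rule compares the difference of the two ΔF/F0 values against a fixed threshold. The function names, the strategy-labelling heuristic, and the example numbers are hypothetical and only illustrate why either strategy satisfies the same rule.

```python
def target_activity(roi1_dff, roi2_dff):
    """Target signal for the 2-ROI rule: ROI1 minus ROI2 (both as dF/F0)."""
    return roi1_dff - roi2_dff


def is_rewarded(roi1_dff, roi2_dff, threshold):
    """True when the ROI1 - ROI2 difference is above the reward threshold."""
    return target_activity(roi1_dff, roi2_dff) > threshold


def dominant_strategy(roi1_dff, roi2_dff):
    """Hypothetical heuristic: which ROI moved further from baseline (zero)."""
    return "raise ROI1" if abs(roi1_dff) >= abs(roi2_dff) else "lower ROI2"


threshold = 0.05  # hypothetical dF/F0 threshold

# Strategy A: ROI1 rises while ROI2 stays near baseline.
rewarded_a = is_rewarded(0.08, 0.01, threshold)
# Strategy B: ROI1 stays near baseline while ROI2 drops.
rewarded_b = is_rewarded(0.01, -0.06, threshold)
# Both ROIs rise together (covarying signals): no reward.
rewarded_c = is_rewarded(0.03, 0.03, threshold)
```

      Because the rule depends only on the difference, activity that rises in both ROIs together cancels out, which is the normalizing effect described in the response.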

      We hope this clarifies how the use of two ROIs helps explain the apparent covariation of the signals, and why some divergence from the observations of Clancy et al. (2020) may be expected.

      Reviewer #3 (Public review):

      Summary:

      The study demonstrates the effectiveness of a cost-effective closed-loop feedback system for modulating brain activity and behavior in head-fixed mice. The authors tested a real-time closed-loop feedback system in head-fixed mice with two types of graded feedback: 1) closed-loop neurofeedback (CLNF), where feedback is derived from neuronal activity (calcium imaging), and 2) closed-loop movement feedback (CLMF), where feedback is based on observed body movement. It is a Python-based open-source system, which the authors call CLoPy. The authors also claim to provide all software, hardware schematics, and protocols to adapt it to various experimental scenarios. The system is capable and can be adapted to a wide range of use cases.

      The authors have shown that their system can control both positive (water drop) and negative (buzzer-vibrator) reinforcement. The study also shows that, using the closed-loop system, mice performed better, learned an arbitrary task, and adapted to changes in the rule. By integrating real-time feedback based on cortical GCaMP imaging and behavior tracking, the authors have provided strong evidence that such closed-loop systems can be instrumental in exploring the dynamic interplay between brain activity and behavior.

      Strengths:

      Simplicity of feedback systems designed. Simplicity of implementation and potential adoption.

      Weaknesses:

      Long latencies, due to slow Ca2+ dynamics and slow imaging (15 FPS), may limit the application of the system.

      We appreciate the reviewer’s comment and agree that latency is an important factor in our setup. The latency arises partly from the inherent slow kinetics of calcium signaling and GCaMP6s, and partly from the imaging rate of 15 FPS (one frame every ~67 ms). These limitations can be addressed in several ways: for example, using faster calcium indicators such as GCaMP8f, or adapting the system to electrophysiological signals, which would require additional processing capacity. In our implementation, image acquisition was fixed at 15 FPS to enable real-time frame processing (256 × 256 resolution) on Raspberry Pi 4B devices. With newer hardware, such as the Raspberry Pi 5, substantially higher acquisition and processing rates are feasible (although we have not yet benchmarked this extensively). More powerful platforms such as Nvidia Jetson or conventional PCs would further support much faster data acquisition and processing.

      Major comments:

      (1) Page 5 paragraph 1: "We tested our CLNF system on Raspberry Pi for its compactness, general-purpose input/output (GPIO) programmability, and wide community support, while the CLMF system was tested on an Nvidia Jetson GPU device." Can these programs and hardware be integrated with a Windows-based system and a microcontroller (Arduino/Teensy)? For broad adaptability, that is what many labs would already have (please comment/discuss).

      While we tested our CLNF system on a Raspberry Pi (chosen for its compactness, GPIO programmability, and large user community) and our CLMF system on an Nvidia Jetson GPU device (to leverage real-time GPU-based inference), the underlying software is fully written in Python. This design choice makes the system broadly adaptable: it can be run on any device capable of executing Python scripts, including Windows-based PCs, Linux machines, and macOS systems. For hardware integration, we have confirmed that the framework works seamlessly with microcontrollers such as Arduino or Teensy, requiring only minor modifications to the main script to enable sending and receiving of GPIO signals through those boards. In fact, we are already using the same system in an in-house project on a Linux-based PC where an Arduino is connected to the computer to provide GPIO functionality. Furthermore, the system is not limited to Raspberry Pi or Arduino boards; it can be interfaced with any GPIO-capable devices, including those from Adafruit and other microcontroller platforms, depending on what is readily available in individual labs. Since many neuroscience and engineering laboratories already possess such hardware, we believe this design ensures broad accessibility and ease of integration across diverse experimental setups.

      (2) Hardware Constraints: The reliance on Raspberry Pi and Nvidia Jetson (which is expensive) for real-time processing could introduce latency issues (~63 ms for CLNF and ~67 ms for CLMF). This latency might limit precision for faster or more complex behaviors, which the authors should address in the discussion section.

      In our system, we measured latencies of approximately 63 ms for CLNF and 67 ms for CLMF. While such latencies indeed limit applications requiring millisecond precision, such as fast whisker movements, saccades, or fine-reaching kinematics, we emphasize that many relevant behaviors, including postural adjustments, limb movements, locomotion, and sustained cortical state changes, occur on timescales that are well within the capture range of our system. Thus, our platform is appropriate for a range of mesoscale behavioral studies, a point we now discuss in more detail. It is also important to note that these latencies are not solely dictated by hardware constraints. A significant component arises from the inherent biological dynamics of the calcium indicator (GCaMP6s) and calcium signaling itself, which introduce slower temporal kinetics independent of processing delays. Newer variants, such as GCaMP8f, offer faster response times and could further reduce effective biological latency in future implementations.

      With respect to hardware, we acknowledge that Raspberry Pi provides a low-cost solution but contributes to modest computational delays, while Nvidia Jetson offers faster inference at higher cost. Our choice reflects a balance between accessibility, cost-effectiveness, and performance, making the system deployable in many laboratories. Importantly, the modular and open-source design means the pipeline can readily be adapted to higher-performance GPUs or integrated with electrophysiological recordings, which provide higher temporal resolution. Finally, we agree with the reviewer that the issue of latency highlights deeper and interesting questions regarding the temporal requirements of behavior classification. Specifically, how much data (in time) is required to reliably identify a behavior, and what is the minimum feedback delay necessary to alter neural or behavioral trajectories? These are critical questions for the design of future closed-loop systems and ones that our work helps frame.

      We have added a slightly modified version of our response above in the discussion section under “Experimental applications and implications”.

      (3) Neurofeedback Specificity: The task focuses on mesoscale imaging and ignores finer spatiotemporal details. Sub-second events might be significant in more nuanced behaviors. Can this be discussed in the discussion section?

      This is a great point and we have added the following to the discussion section. “In the case of CLNF we have focused on regional cortical GCaMP signals that are relatively slow in kinetics. While such changes are well suited for transcranial mesoscale imaging assessment, it is possible that cellular 2-photon imaging (Yu et al. 2021) or preparations that employ cleared crystal skulls (Kim et al. 2016) could resolve more localized and higher frequency kinetic signatures.”

      (4) The activity over 6s is being averaged to determine if the threshold is being crossed before the reward is delivered. This is a rather long duration of time during which the mice may be exhibiting stereotyped behaviors that may result in the changes in DFF that are being observed. It would be interesting for the authors to compare (if data is available) the behavior of the mice in trials where they successfully crossed the threshold for reward delivery and in those trials where the threshold was not breached. How is this different from spontaneous behavior and behaviors exhibited when they are performing the test with CLNF? 

      We would like to emphasize that we are not directly averaging activity over 6 s to compare against the reward threshold. Instead, the preceding 6 s of activity is used solely to compute a dynamic baseline for ΔF/F<sub>0</sub> (ΔF/F<sub>0</sub> = (F – F<sub>0</sub>)/F<sub>0</sub>). Here, F<sub>0</sub> is calculated as the mean fluorescence intensity over the prior 6 s window and is updated continuously throughout the session. This baseline is then subtracted from the instantaneous fluorescence signal to detect relative changes in activity. The reward threshold is therefore evaluated against these baseline-corrected ΔF/F<sub>0</sub> values at the current time point, not against an average over 6 s. This moving-window baseline correction is a standard approach in calcium imaging analyses, as it helps control for slow drifts in signal intensity, bleaching effects, or ongoing fluctuations unrelated to the behavior of interest. Thus, the 6-s window is not introducing a temporal lag in reward assignment but is instead providing a reference to detect rapid increases in cortical activity. We have added the term dynamic baseline to the Methods to clarify.
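
      The dynamic-baseline calculation described in this response can be sketched as follows. The sketch assumes one fluorescence value per frame for a single ROI, a 15 FPS frame rate (so a 6 s window holds 90 frames), and a baseline built only from frames before the current one. The function name and the handling of the first frame are illustrative assumptions, not the authors' implementation.

```python
from collections import deque


def rolling_dff(fluorescence, fps=15, window_s=6.0):
    """Compute dF/F0 = (F - F0) / F0 with F0 as the mean of the prior window.

    Assumes strictly positive fluorescence values, so F0 is never zero.
    """
    window = deque(maxlen=int(round(fps * window_s)))  # prior frames only
    dff = []
    for f in fluorescence:
        if window:
            f0 = sum(window) / len(window)  # dynamic baseline
            dff.append((f - f0) / f0)
        else:
            dff.append(0.0)  # no history yet for the very first frame
        window.append(f)  # the oldest frame drops out once the window is full
    return dff


# Hypothetical trace: 6 s of flat signal followed by a 10% increase.
trace = [100.0] * 90 + [110.0]
dff_trace = rolling_dff(trace)
```

      In this example the last frame is compared with the mean of the preceding 90 frames, so the reward decision rests on the current value and not on a 6 s average of the signal itself.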

      Recommendations for the authors

      Reviewer #1 (Recommendations for the authors):

      Additional suggestions for improved or additional experiments, data or analyses.

      For: "Looking closely at their reward rate on day 5 (day of rule change), they had a higher reward rate in the second half of the session as compared to the first half, indicating they were adapting to the rule change within one session." It would be helpful to see this data, and it would be good to see within-session learning on the rule change day.

      Thank you for pointing this out. We had missed referencing the figure in the text, and have now added a citation to Supplementary Figure 4A, which shows the cumulative rewards for each day of training. As seen in the plot for day 5, the cumulative rewards are comparable to those on day 1, with most rewards occurring during the second half of the session.
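
      The within-session comparison discussed here can be illustrated by splitting one session's trial outcomes into halves and tracking cumulative rewards. The sketch assumes each trial is recorded as 1 (rewarded) or 0 (not rewarded). The function names and the 60-trial example are hypothetical and are not the authors' data.

```python
from itertools import accumulate


def cumulative_rewards(outcomes):
    """Running total of rewards across the trials of one session."""
    return list(accumulate(int(o) for o in outcomes))


def half_session_rates(outcomes):
    """Reward rate in the first and second half of a session."""
    if len(outcomes) < 2:
        raise ValueError("need at least two trials")
    midpoint = len(outcomes) // 2
    first, second = outcomes[:midpoint], outcomes[midpoint:]
    return sum(first) / len(first), sum(second) / len(second)


# Hypothetical 60-trial session with more rewards late in the session.
session = [0] * 20 + [1] * 10 + [1] * 20 + [0] * 10
running_total = cumulative_rewards(session)
first_rate, second_rate = half_session_rates(session)
```

      A higher rate in the second half than in the first is the pattern the reviewer asks to see for the rule-change day.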

      For: "These results suggest that motor learning led to less cortical activation across multiple regions, which may reflect more efficient processing of movement-related activity," it could also be the case that the behaviour became more stereotyped over learning, which would lead to more concentrated, correlated activity. To test this, it would be good to look at the limb variability across sessions. Similarly, if it is movement-related, there should be good decoding of limb kinematics.

      Indeed, we observed that behavior became more stereotyped over the course of learning, as shown in Supplementary Figure 4C, 4D. One plausible explanation for the reduction in cortical activation across multiple regions is that behavior itself became more stereotyped, a possibility we have explored in the manuscript. Specifically, forelimb movements during the trial became increasingly correlated as mice improved on the task, particularly in the groups that received auditory feedback (Rule-change and No-rule-change groups; Figure 8). As movements became more correlated, overall body movements during trials decreased and aligned more closely with the task rule (Figure 9D). This suggests that reduced cortical activity may in part reflect changes in behavior. Importantly, however, in the Rule-change group, we observed that on the day of the rule switch (day 5), when the target shifted from the left to the right forelimb, cortical activity increased bilaterally (Figure 9A–C). This finding highlights our central point: groups that received feedback (Rule-change and No-rule-change) were able to identify the task rule more effectively, and both their behavior and cortical activity became more specifically aligned with the rule compared to the No-feedback group. We agree with the reviewers that additional analyses along these lines would be valuable future directions. To facilitate this, we have included the movement data for readers who may wish to pursue further analyses; details can be found under “Data and code availability” in the Methods section. However, given the limited sample sizes in our dataset and the need to keep the manuscript focused on the central message, we felt that including these additional analyses here would risk obscuring the main findings.

      For: "We believe the decrease in ΔF/F0peak is unlikely to be driven by changes in movement, as movement amplitudes did not decrease significantly during these periods (Figure 7D CLMF Rule-change)." I would formally compare the two conditions. This is an important control. Also, another way to see if the change in deltaF is related to movement would be to see if you can predict movement from the deltaF.

      Figure 7D in the previous version is Figure 9D in the current revision of the manuscript. We assessed this for the examples shown by graphing the movement data. Unfortunately, there is not enough of that data to do a group analysis of movement magnitude. We suggest that this would be an excellent future direction that would take advantage of the flexible, open-source nature of our tool.

      Recommendations for improving the writing and presentation.

      In the abstract there is no mention of the rationale for the project, or the resulting significance. I would modify this to increase readership by the behavioral neuroscience community. Similarly, the introduction also doesn't highlight the value of this resource for the field. Again, I think the pyControl paper does a good job of this. For readability, I would add more subheadings earlier in the results, to separate the different technical aspects of the system.

      We have revised the introduction to include the rationale for the project, its potential implications, and its relevance for translational research. We have also framed the work within the broader context of the behavioral and systems neuroscience community. We greatly appreciate this suggestion, as we believe it enhances the clarity and accessibility of the manuscript for the community.

      For: "While brain activity can be controlled through feedback, other variables such as movements have been less studied, in part because their analysis in real time is more challenging." I would highlight research that has studied the control of behavior through feedback, such as the Mathis paper where mice learn to pull a joystick to a virtual box, and adapt this motion to a force perturbation.

      We have added a citation to the Mathis paper and describe this as an additional form of feedback. The text is quoted below:

      “Opportunities also exist in extending real time pose classification (Forys et al. 2020; Kane et al. 2020) and movement perturbation (Mathis et al. 2017) to shape aspects of an animal’s motor repertoire.”

      Some of the results content would be better suited for the methods, one example: "A previous version of the CLNF system was found to have non-linear audio generation above 10 kHz, partly due to problems in the audio generation library and partly due to the consumer-grade speaker hardware we were employing. This was fixed by switching to the Audiostream (https://github.com/kivy/audiostream) library for audio generation and testing the speakers to make sure they could output the commanded frequencies"

      This is now moved to the Methods section.

      For: "There are reports of cortical plasticity during motor learning tasks, both at cellular and mesoscopic scales (17-19), supporting the idea that neural efficiency could improve with learning," not sure I agree with this, the studies on cortical plasticity are usually to show a neural basis for the learning observed, efficiency is separate from this.

      We have modified this statement to remove the concept of efficiency: “There are reports of cortical plasticity during motor learning tasks, both at cellular and mesoscopic scales (17-19).”

      The paragraph that opens "Distinct task- and reward-related cortical dynamics" that describes the experiment should appear in the previous section, as the data is introduced there.

      We have moved the mentioned paragraphs to the previous section, where we present the data and other experimental details. This makes the text more readable and better contextualized.

      I would present the different ROI rules with better descriptors and visualization to improve the readability.

      We have added Supplementary Figure 7, which provides visualizations of the ROIs across all task rules used in the CLNF experiments.

      Minor corrections to the text and figures.

      Figure 1 is a little crowded, combining the CLNF and CLMF experiments, I would turn this into a 2 panel figure, one for each, similar to how you did figure 2.

      We have revised Figure 1 to include two panels, one for CLNF and one for CLMF. The colored components indicate elements specific to each setup, while the uncolored components represent elements shared between CLNF and CLMF. Relevant text in the manuscript is updated to refer to these figures.

      For Figure 2, the organization of the CLMF section is not intuitive for the reader. I would reorder it so it has a similar flow as the CLNF experiment.

      We have revised the figure by updating the layout of panel B (CLMF) to align with panel A (CLNF), thereby creating a more intuitive and consistent flow between the panels. We appreciate this helpful suggestion, which we believe has substantially improved the clarity of the figure. The corresponding text in the manuscript has also been updated to reflect these changes.

      For Figure 3, highlight that C and E are examples. They also seem a little out of place, so they could even be removed.

      We have now explicitly labeled Figures 3C and 3E as representative examples (figure legend and on figure itself). We believe including these panels provides helpful context for readers: Figure 3C illustrates how the ROIs align on the dorsal cortical brain map with segmented cortical regions, while Figure 3E shows example paw trajectories in three dimensions, allowing visualization of the movement patterns observed during the trials.

      In the plots, I would add sample sizes, for instance, in CLNF learning curve in Figure 4A, how many animals are in each group? 

      We have labeled Figure 4 with the number of animals used in CLNF (No-rule-change, N=23; Rule-change, N=17) and CLMF (Rule-change, N=8; No-rule-change, N=4; No-feedback, N=4).

      Also, Figure 7 for example, which figures are single-sessions, versus across animals? For Figure 7c, what time bin is the data taken from?

      We have now clarified this and indicated it in all the figures. Figure 7 in the previous version is Figure 9 in the current updated manuscript. Figure 9A is from individual sessions on different days from the same mouse. Figure 9B is the group-average, reward-centered ΔF/F<sub>0</sub> activity in different cortical regions (Rule-change, N=8; No-rule-change, N=4; No-feedback, N=4). Figure 9C shows average ΔF/F<sub>0</sub> peak values obtained within −1 s to +1 s centered on the reward point (N=8).

      It says "punish" in Figure 3, but there is no punishment?

      Yes, the task did not involve punishment. Each trial resulted in either a success, which is followed by a reward, or a failure, which is followed by a buzzer sound. To better reflect these outcomes, we have updated Figure 3 and replaced the labels “Reward” with “Success” and “Punish” with “Failure.”

      The regression on 5c doesn't look quite right, also this panel is not mentioned in the text.

      The figure referred to by the reviewer as Figure 5 is now presented as Figure 6 in the revised manuscript. Regarding the reviewer’s observation about the regression line in the left panel of Figure 5C, the apparent misalignment arises because the majority of the data points are densely clustered at the center of the scatter plot, where they overlap substantially. The regression line accurately reflects this concentration of overlapping data. To improve clarity, we have updated the figure and ensured that it is now appropriately referenced in the Results section.

      Reviewer #2 (Recommendations for the authors):

      (1) There would be many interesting observations and links between the peripheral and cortical studies if there was a body video available during the cortical study. Is there any such data available?

      We agree that a detailed analysis of behavior during the CLNF task would be necessary to explore any behavioral correlates of success in the task. Unfortunately, we do not have sufficient video of the whole body to perform such an analysis.

      (2) The text (p. 24) states: [intracortical GCAMP transients measured over days became more stereotyped in kinetics and were more correlated (to each other) as the task performance increased over the sessions (Figure 7E).] But I cannot find this quantification in the figures or text?

      Figure 7 in the previous version of the manuscript now appears as Figure 9. In this figure, we present cortical activity across selected regions during trials, and in Figure 9E we highlight that this activity becomes more correlated. Since we did not formally quantify variability, we have removed the previous claim that the activity became stereotyped and revised the text in the updated manuscript accordingly.

      Typos:

      10-serest c (page 13)

      Inverted color codes in figure 4E vs F

      Reviewer #3 (Recommendations for the authors):

      We have mostly attempted to limit the feedback to suggestions and posed a few questions that might be interesting to explore given the dataset the authors have collected.

      Comments:

      In closed-loop systems, latency is a primary concern, and the authors have successfully tested the latency of the system (Delay): the time from detection of an event to the reaction was less than 67 ms.

      We have commented on the issues and limitations caused by latency, and potential future directions to overcome these challenges in responses to some of the previous comments.

      Additional major comments:

      "In general, all ROIs assessed that encompassed sensory, pre-motor, and motor areas were capable of supporting increased reward rates over time (Figure 4A, Animation 1)." Fig 4A is merely showing change in task performance over time and does not have information regarding the changes observed specific to CLNF for each ROI.

      We acknowledge that the sample size for individual ROI rules was not sufficient for meaningful comparisons. To address this limitation, we pooled the data across all the rules tested. The manuscript includes a detailed list of the rules along with their corresponding sample sizes for transparency.

      A ΔF/F<sub>0</sub> threshold value was calculated from a baseline session on day 0 that would have allowed 25% performance. Starting from this basal performance of around 25% on day 1, mice (CLNF No-rule-change, n=28 and CLNF Rule-change, n=13). It is unclear what the replicates here are. Trials or mice? The corresponding Figure legend has a much smaller n value.

      Thank you for pointing this out. We realized that we had not indicated the sample replicates in the figure, and the use of n instead of N for the number of animals may have been misleading. We have now corrected the notation and clarified this information in the figure to resolve the discrepancy.

      What were the replicates for each ROI pairs evaluated?

      Each ROI rule and number of mice and trials are listed in Table 5 and Table 6.

      Our analysis revealed that certain ROI rules (see description in methods) lead to a greater increase in success rate over time than others (Supplementary Figure 3D). The Supplementary figures 3C and 3D are blurry and could use higher resolution images. 

      We have increased the font size of the text that was previously difficult to read and re-exported the figure at a higher resolution (300 DPI). We believe these changes will resolve the issue.

      Also, it will help the reader if a visual representation of the ROI pairs is provided, instead of the text view. One interesting question is whether there are anatomical biases to fast vs slow learning pairs (directionality - anterior/posterior, distance between the selected ROIs, etc.). This could be interesting to tease apart.

      We have added Supplementary Figure 7, which provides visualizations of the ROIs across all task rules used in the CLNF experiments. While a detailed investigation of the anatomical basis of fast versus slow learning cortical ROIs is beyond the scope of the present study, we agree that this represents an exciting future direction for further research.

      How distant should the ROIs be to achieve increased task performance?

      We appreciate this insightful question. We did not specifically test this scenario. In our study, we selected 0.3 × 0.3 mm ROIs centered on the standard AIBS mouse brain atlas (CCF). At this resolution, ROIs do not overlap, regardless of their placement in a two-ROI experiment. Furthermore, because our threshold calculations are based on baseline recordings, we expect the system would function for any combination of ROI placements. Nonetheless, exploring this systematically would be an interesting avenue for future experiments.
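
      As an illustration of the baseline-derived threshold mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes that each baseline trial is summarized by a single peak ΔF/F0 value and that the threshold is chosen so that a target fraction of baseline trials (here 25%) would have exceeded it. The function names and the simulated baseline values are hypothetical.

```python
import random


def baseline_threshold(baseline_peaks, target_rate=0.25):
    """Return a threshold that target_rate of the baseline peaks exceed.

    Assumption: one peak dF/F0 value per baseline trial (hypothetical summary).
    """
    ordered = sorted(baseline_peaks)
    n_success = round(target_rate * len(ordered))
    if not 0 < n_success < len(ordered):
        raise ValueError("target_rate leaves no usable threshold for this sample")
    # Exactly n_success values lie strictly above this element (if no ties).
    return ordered[len(ordered) - n_success - 1]


def success_rate(peaks, threshold):
    """Fraction of trials whose peak dF/F0 exceeds the threshold."""
    return sum(p > threshold for p in peaks) / len(peaks)


# Simulated baseline session (invented numbers, for illustration only).
random.seed(0)
baseline_peaks = [random.gauss(0.02, 0.01) for _ in range(1000)]

threshold = baseline_threshold(baseline_peaks, target_rate=0.25)
rate = success_rate(baseline_peaks, threshold)
print(f"threshold = {threshold:.4f}, baseline success rate = {rate:.2f}")
```

      Because the threshold is defined relative to each animal's own baseline distribution, the same procedure applies to any ROI or ROI combination, which is the point made in the response above.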

      Figures:

      I would leave out some of the methodological details such as the protocol for water restriction (Fig. 3) out of the legend. This will help with readability.

      We have removed some of the methodological details, including those mentioned above, from the legend of Figure 3 in the updated manuscript.

      Fig 1 and Fig 2: In my opinion, it would be easier for the reader if the current Fig. 2, which provides a high-level description of CLNF and CLBF, is presented as Fig. 1. The current Fig. 1 goes into a lot of methodological implementation details and also includes a lot of programming jargon that is hard to digest this early in the paper's narrative.

      Thank you for the suggestion. In the new manuscript, Figure 1 and Figure 2 have been swapped.

      Higher-resolution images/ plots are needed in many instances. Unsure if this is the pdf compression done by the manuscript portal that is causing this.

      All figures were prepared in vector graphics format using the open-source software Inkscape. For this manuscript, we exported the images at 300 DPI, which is generally sufficient for publication-quality documents. The submission portal may apply additional processing, which could have resulted in a reduction in image quality. We will carefully review the final submission files and ensure that all figures are clear and of high quality.

      The authors repeatedly show ROI specific analysis M1_L, F1_R etc. It will be helpful to provide a key, even if redundant in all figures to help the reader.

      We have now included keys to all such abbreviations in all the figures.

      There are also instances of editorialization and interpretation e.g., "Surprisingly, the "Rule-change" mice were able to discover the change in rule and started performing above 70% within a day of the rule change, on day 6" that would be more appropriate in the main body of the paper.

      Thank you for pointing this out. We have removed this statement from the figure legend, since it is already discussed in the Results.

      Minor comments

      (1) The description of Figure 1 is hard to follow and can be described better based on how the information is processed and executed in the system from source to processing and back. Using separated colors (instead of shaded of grey) for the neuro feedback and movement feedback would help as well. Common components could have a different color. The specification like the description of the config file should come later.

      Figure 1 in the previous version is Figure 2 in the updated version. We have taken suggestions from other reviewers and made the figure easier to understand by splitting it into two panels with color coding: green for CLNF-specific parts and pink for CLMF-specific parts, while shared parts are left uncolored.

      (2) Page 20 last paragraph:

      Authors are neglecting that the rule change is done one day prior and the results that you see in the second half on the 6th day are not just because of the first half of the 6th day instead combined training on the 5th day (rule change) and then the first half of the 6th day. Rephrasing this observation is essential.

      We have revised the text for clarity to indicate that the performance increase observed on day 6 is not necessarily attributable to training on that day. In fact, we noted and mentioned that mice began to perform the task better during the second half of the session on day 5 itself.

      (3)  The method section description of the CLMF setup (Page no 39 first paragraph) is more detailed, a diagram of this setup would make it easy to follow and a better read.

      We have made changes to the CLMF setup (Figure 1B) and CLMF schematic (Figure 2B) to make it easier to understand parts of the setup and flow of control.

    1. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Bansal et al examine and characterize feeding behaviour in Anopheles stephensi mosquitoes. While sharing some similarities to the well-studied Aedes aegypti mosquito, the authors demonstrate that mated-females, but not unmated (virgin) females, exhibit suppression in their blood-feeding behaviour. Using brain transcriptomic analysis comparing sugar fed, blood fed and starved mosquitoes, several candidate genes potentially responsible for influencing blood-feeding behaviour were identified, including two neuropeptides (short NPF and RYamide) that are known to modulate feeding behaviour in other mosquito species. Using molecular tools including in situ hybridization, the authors map the distribution of cells producing these neuropeptides in the nervous system and in the gut. Further, by implementing systemic RNA interference (RNAi), the study suggests that both neuropeptides appear to promote blood-feeding (but do not impact sugar feeding) although the impact was observed only after both neuropeptide genes underwent knockdown.

      While the authors have addressed most of the concerns of the original manuscript, a few issues remain. Particularly, the following two points:

      (5) Figure 4

      The authors state that there is more efficient knockdown in the head of unfed females; however, this is not accurate since they only get knockdown in unfed animals, and no evidence of any knockdown in fed animals (panel D). This point should be revised in the results text as well.

      Perhaps we do not understand the reviewer's point or there has been a misunderstanding. In Figure 4D, we show that while there is more robust gene knockdown in unfed females, blood-fed females also showed modest but measurable knockdowns ranging from 5-40% for RYamide and 2-21% for sNPF.

      NEW-

      In both the dsRNA treatments where animals were fed, neither was significantly different from control. Therefore, there is no change, and indeed this is confirmed by the author's labelling of the figure stats in panel 4D.

      In addition, do the uninjected and dsGFP-injected relative mRNA expression data reflect combined RYa and sNPF levels? Why is there no variation in these data,...

      In these qPCRs, we calculated relative mRNA expression using the delta-delta Ct method (see line 975). For each neuropeptide its respective control was used. For simplicity, we combined the RYa and sNPF control data into a single representation. The value of this control is invariant because this method sets the control baseline to a value of 1.

      NEW-

      The authors are claiming that there is no variation between individual qPCR experiments (particularly in their controls)? Normally, one uses a known standard value (or calibrator) across multiple experiments/plates so that variation across biological replicates can be assessed. This has an impact on statistical analyses since there is no variation in the control data. Indeed, this impacts all figures/datasets in the manuscript where qPCR data is presented. All the controls have zero variation!
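
      To make the point about control variation concrete, the following is a minimal sketch of a delta-delta Ct calculation, not the authors' analysis. It assumes that each biological replicate is normalized to a reference gene and that the mean control ΔCt is used as the calibrator. Under that assumption the control group averages to 1 (as a geometric mean) while individual control replicates still vary. All Ct values and function names are invented for illustration.

```python
from statistics import geometric_mean, mean


def delta_ct(target_ct, reference_ct):
    """Delta Ct: target gene Ct minus reference (housekeeping) gene Ct."""
    return target_ct - reference_ct


def relative_expression(sample_dcts, control_dcts):
    """Per-replicate fold change, 2^-(ddCt), against the mean control dCt."""
    calibrator = mean(control_dcts)
    return [2 ** -(dct - calibrator) for dct in sample_dcts]


# Hypothetical (target Ct, reference Ct) pairs, one per biological replicate.
control_pairs = [(24.0, 18.0), (24.4, 18.1), (23.7, 17.9)]
treated_pairs = [(25.2, 18.0), (25.6, 18.2), (25.0, 17.9)]

control_dcts = [delta_ct(t, r) for t, r in control_pairs]
treated_dcts = [delta_ct(t, r) for t, r in treated_pairs]

control_fold = relative_expression(control_dcts, control_dcts)
treated_fold = relative_expression(treated_dcts, control_dcts)

print("control replicates:", [round(f, 3) for f in control_fold])
print("treated replicates:", [round(f, 3) for f in treated_fold])
print("control geometric mean:", round(geometric_mean(control_fold), 3))
```

      If each control replicate is instead normalized to itself, every control value becomes exactly 1 and the control group carries no variance into the statistical comparison, which is the concern raised above.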

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Bansal et al. present a study on the fundamental blood and nectar feeding behaviors of the critical disease vector, Anopheles stephensi. The study not only characterizes the fundamental changes in blood feeding behaviors of this crucially understudied vector, but also uses a transcriptomic approach to identify candidate neuromodulation pathways which influence blood feeding behavior in this mosquito species. The authors then provide evidence through RNAi knockdown of candidate pathways that the neuromodulators sNPF and RYa modulate feeding either via their physiological activity in the brain alone or through joint physiological activity along the brain-gut axis (but critically not the gut alone). Overall, I found this study to be built on tractable, well-designed behavioral experiments.

      Their study begins with a well-structured experiment to assess how the feeding behaviors of A. stephensi change over the course of its life history and in response to its age, mating, and oviposition status. The authors are careful and validate their experimental paradigm in the more well-studied Ae. aegypti, and are able to recapitulate the results of prior studies, which show that mating is a prerequisite for blood feeding behaviors in Ae. aegypti. Here they find A. stephensi, like other Anopheline mosquitoes, has a more nuanced regulation of its blood and nectar feeding behaviors.

      The authors then go on to show in a Y-maze olfactometer that, to some degree, changes in blood feeding status depend on behavioral modulation to host cues, and this is not likely to be a simple change to the biting behaviors alone. I was especially struck by the swap in valence of the host cues for the blood-fed and mated individuals, which had not yet oviposited. This indicates that there is a change in behavior that is not simply desensitization to host cues while navigating in flight, but something much more exciting is happening.

      The authors then use a transcriptomic approach to identify candidate genes in the blood-feeding stages of the mosquito's life cycle to identify a list of 9 candidates that have a role in regulating the host-seeking status of A. stephensi. Then, through investigations of gene knockdown of candidates, they identify the dual action of RYa and sNPF as candidate neuromodulators of host-seeking in this species. Overall, I found the experiments to be well-designed. I found the molecular approach to be sound. While I do not think the molecular approach is necessarily an all-encompassing mechanism identification (owing mostly to the fact that genetic resources are not yet available in A. stephensi as they are in other dipteran models), I think it sets up a rich line of research questions for the neurobiology of mosquito behavioral plasticity and comparative evolution of neuromodulator action.

      We appreciate the reviewer’s detailed summary of our work. We thank them for their positive comments and agree with them on the shortcomings of our approach.

      Strengths:

      I am especially impressed by the authors' attention to small details in the course of this article. As I read and evaluated this article, I continued to think about how many crucial details could potentially have been missed if this had not been the approach. The attention to detail paid off in spades and allowed the authors to carefully tease apart molecular candidates of blood-seeking stages. The authors' top-down approach to identifying RYamide and sNPF starting from first principles behavioral experiments is especially comprehensive. The results from both the behavioral and molecular target studies will have broad implications for the vectorial capacity of this species and comparative evolution of neural circuit modulation.

      We really appreciate that the reviewer has recognised the attention to detail we have tried to bring to this work. Thank you!

      Weaknesses:

      There are a few elements of data visualizations and methodological reporting that I found confusing on a first few read-throughs. Figure 1F, for example, was initially confusing as it made it seem as though there were multiple 2-choice assays for each of the conditions. I would recommend removing the "X" marker from the x-axis to indicate the mosquitoes did not feed from either nectar, blood, or neither in order to make it clear that there was one assay in which mosquitoes had access to both food sources, and the data quantify if they took both meals, one meal, or no meals.

      We thank the reviewer for flagging the schematic in figure 1F. As suggested, we have removed the “X” markers from the x-axis and revised the axis label from “choice of food” to “choice made” to better reflect what food the mosquitoes chose in the assay. For clarity, we have now also plotted the same data as stacked graphs at the bottom of Fig. 1F, which clearly show the proportion of mosquitoes that fed on each particular choice. We avoid using the stacked graph as the sole representation of these data, as it does not capture their variability.

      I would also like to know more about how the authors achieved tissue-specific knockdown for RNAi experiments. I think this is an intriguing methodology, but I could not figure out from the methods why injections either had whole-body or abdomen-specific knockdown.

      The tissue-specific knockdown (abdomen only or abdomen+head) emerged from initial standardisations where we were unable to achieve knockdown in the head unless we used higher concentrations of dsRNA and did the injections in older females. We realised that this gave us the opportunity to isolate the neuronal contribution of these neuropeptides to the phenotype produced. Further optimisations revealed that injecting dsRNA into 0-10 h old females produced abdomen-specific knockdowns without affecting head expression, whereas injections into 4-day-old females resulted in knockdowns in both tissues. Moreover, head knockdowns in older females required higher dsRNA concentrations, with knockdown efficiency correlating with the amount injected. In contrast, abdominal knockdowns in younger females could be achieved even with lower dsRNA amounts.

      We have mentioned the knockdown conditions (time of injection and amount of dsRNA injected) for tissue-specific knockdowns in the methods, but realise now that this does not explain the approach well enough. We have now edited it to state our methodology more clearly (see lines 932-948).

      I also found some interpretations of the transcriptomic data to be overly broad for what transcriptomes can actually tell us about the organism's state. For example, the authors mention, "Interestingly, we found that after a blood meal, glucose is neither spent nor stored, and that the female brain goes into a state of metabolic 'sugar rest', while actively processing proteins (Figure S2B, S3)".

      This would require a physiological measurement to actually know. It certainly suggests that there are changes in carbohydrate metabolism, but there are too many alternative interpretations to make this broad claim from transcriptomic data alone.

      We thank the reviewer for pointing this out and agree with them. We have now edited our statement to read:

      “Instead, our data suggests altered carbohydrate metabolism after a blood meal, with the female brain potentially entering a state of metabolic 'sugar rest' while actively processing proteins (Figure S2B, S3). However, physiological measurements of carbohydrate and protein metabolism will be required to confirm whether glucose is indeed neither spent nor stored during this period.” See lines 271-277.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, Bansal et al examine and characterize feeding behaviour in Anopheles stephensi mosquitoes. While sharing some similarities to the well-studied Aedes aegypti mosquito, the authors demonstrate that mated females, but not unmated (virgin) females, exhibit suppression in their blood-feeding behaviour. Using brain transcriptomic analysis comparing sugar-fed, blood-fed, and starved mosquitoes, several candidate genes potentially responsible for influencing blood-feeding behaviour were identified, including two neuropeptides (short NPF and RYamide) that are known to modulate feeding behaviour in other mosquito species. Using molecular tools, including in situ hybridization, the authors map the distribution of cells producing these neuropeptides in the nervous system and in the gut. Further, by implementing systemic RNA interference (RNAi), the study suggests that both neuropeptides appear to promote blood-feeding (but do not impact sugar feeding), although the impact was observed only after both neuropeptide genes underwent knockdown.

      Strengths and/or weaknesses:

      Overall, the manuscript was well-written; however, the authors should review carefully, as some sections would benefit from restructuring to improve clarity. Some statements need to be rectified as they are factually inaccurate.

      Below are specific concerns and clarifications needed in the opinion of this reviewer:

      (1) What does "central brains" refer to in abstract and in other sections of the manuscript (including methods and results)? This term is ambiguous, and the authors should more clearly define what specific components of the central nervous system was/were used in their study.

      Central brain, or mid brain, is a commonly used term to refer to brain structures/neuropils without the optic lobes (For example: https://www.nature.com/articles/s41586-024-07686-5). In this study we have focused our analysis on the central brain circuits involved in modulating blood-feeding behaviour and have therefore excluded the optic lobes. As optic lobes account for nearly half of all the neurons in the mosquito brain (https://pmc.ncbi.nlm.nih.gov/articles/PMC8121336/), including them would have disproportionately skewed our transcriptomic data toward visual processing pathways. 

      We have indicated this in figure 3A and in the methods (see lines 800-801, 812). We have now also clarified it in the results section for neurotranscriptomics to avoid confusion (see lines 236-237).

      (2) The abstract states that two neuropeptides, sNPF and RYamide are working together, but no evidence is summarized for the latter in this section.

      We thank the reviewer for pointing this out. We have now added the statement “This occurs in the context of the action of RYa in the brain” to the end of the abstract, for a complete summary of our proposed model.

      (3) Figure 1

      Panel A: This should include mating events in the reproductive cycle to demonstrate differences in the feeding behavior of Ae. aegypti.

      Our data suggest that mating can occur at any time between eclosion and oviposition in An. stephensi and between eclosion and blood feeding in Ae. aegypti. Adding these events to the already busy Figure 1A would cloud the purpose of the schematic, which is to indicate the time points used in the behavioural assays and transcriptomics.

      Panel F: In treatments where insects were not provided either blood or sugar, how is it that some females and males had fed? Also, it is unclear why the y-axis label is % fed when the caption indicates this is a choice assay. Also, it is interesting that sugar-starved females did not increase sugar intake. Is there any explanation for this (was it expected)?

      We apologise for the confusion. The experiment is indeed a choice assay in which sugar-starved or sugar-sated females, co-housed with males, were provided simultaneous access to both blood and sugar, and were assessed for the choice made (indicated on the x-axis): both blood and sugar, blood only, sugar only, or neither. The x-axis indicates the choice made by the mosquitoes, not the choice provided in the assay, and the y-axis indicates the percentage of males or females that made each particular choice. We have now removed the “X” markers from the x-axis and revised the axis label from “choice of food” to “choice made” to better reflect what food the mosquitoes chose to take.

      In this assay, we scored females only for the presence or absence of each meal type (blood or sugar) and are therefore unable to comment on whether sugar-starved females consumed more sugar than sugar-sated females. However, when sugar-starved, a higher proportion of females consumed both blood and sugar, while fewer fed on blood alone.

      For clarity, we have now also plotted the same data as stacked graphs at the bottom of Fig. 1F, which clearly show the proportion of mosquitoes that fed on each particular choice. We avoid using the stacked graph as the sole representation of these data, as it does not capture their variability.

      (4) Figure 3

      In the neurotranscriptome analysis of the (central) brain involving the two types of comparisons, can the authors clarify what "excluded in males" refers to? Does this imply that only genes not expressed in males were considered in the analysis? If so, what about co-expressed genes that have a specific function in female feeding behaviour?

      This is indeed correct. We reasoned that since blood feeding is exclusive to females, we should focus our analysis on genes that were specifically upregulated in them. As the reviewer points out, it is very likely that genes commonly upregulated in males and females may also promote blood feeding and we will miss out on any such candidates based on our selection criteria. 

      (5) Figure 4

      The authors state that there is more efficient knockdown in the head of unfed females; however, this is not accurate since they only get knockdown in unfed animals, and no evidence of any knockdown in fed animals (panel D). This point should be revised in the results text as well.

      Perhaps we do not understand the reviewer’s point or there has been a misunderstanding. In figure 4D, we show that while there is more robust gene knockdown in unfed females, blood-fed females also showed modest but measurable knockdowns ranging from 5-40% for RYamide and 2-21% for sNPF. 

      Relatedly, blood-feeding is decreased when both neuropeptide transcripts are targeted compared to uninjected (panel C) but not compared to dsGFP injected (panel E). Why is this the case if authors showed earlier in this figure (panel B) that dsGFP does not impact blood feeding?

      We realise this concern stems from our representation of the data. Since we had earlier determined that dsGFP-injected females fed similarly to uninjected females (fig 4B), we used these controls interchangeably in subsequent experiments. To avoid confusion, we have now only used the label ‘control’ in figure 4 (and supplementary figure S9) and specified which control was used for each experiment in the legend.

      In addition to this, we wanted to clarify that fig 4C and 4E are independent experiments. 4C is the behaviour corresponding to when the neuropeptides were knocked down in both heads and abdomens. 4E is the behaviour corresponding to when the neuropeptides were knocked down in only the abdomens. We have now added a schematic in the plots to make this clearer.

      In addition, do the uninjected and dsGFP-injected relative mRNA expression data reflect combined RYa and sNPF levels? Why is there no variation in these data,…

      In these qPCRs, we calculated relative mRNA expression using the delta-delta Ct method (see line 975). For each neuropeptide its respective control was used. For simplicity, we combined the RYa and sNPF control data into a single representation. The value of this control is invariant because this method sets the control baseline to a value of 1.
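      The normalisation described above can be illustrated with a minimal sketch of the standard delta-delta Ct calculation, assuming roughly 100% amplification efficiency. The function name, the use of a single reference (housekeeping) gene, and all Ct values are hypothetical and are not taken from the study; the point is only to show why the control necessarily evaluates to 1.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_control, ct_reference_control):
    """Relative mRNA level by the delta-delta Ct method (2^-ddCt).

    All arguments are Ct values (illustrative, hypothetical inputs).
    """
    # Normalise the target gene to the reference gene within each sample.
    delta_ct_sample = ct_target - ct_reference
    delta_ct_control = ct_target_control - ct_reference_control
    # Express the sample relative to the control group.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** -delta_delta_ct


# Hypothetical Ct values: the control compared with itself has ddCt = 0,
# so its relative expression is 1 by construction.
control_level = relative_expression(24.0, 18.0, 24.0, 18.0)    # 1.0
# A sample whose target Ct is one cycle later than the control's.
knockdown_level = relative_expression(25.0, 18.0, 24.0, 18.0)  # 0.5
```

      Because every sample is divided by the control's own normalised value, the control baseline carries no variance in this representation, whereas a one-cycle delay in the target Ct corresponds to half the control expression.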

      …and how do transcript levels of RYa and sNPF compare in the brain versus the abdomen (the presentation of data doesn't make this relationship clear).

      The reviewer is correct in pointing out that we have not clarified this relationship in our current presentation. While we have not performed absolute mRNA quantifications, we extracted relative mRNA levels from qPCR data of 96h old unmanipulated control females. We observed that both sNPF and RYa transcripts are expressed at much lower levels in the abdomens, as compared to those in the heads, as shown in Author response Image 1 below. 

      Author response image 1.

      (6) As an overall comment, the figure captions are far too long and include redundant text presented in the methods and results sections.

      We thank the reviewer for flagging this and have now edited the legends to remove redundancy.  

      (7) Criteria used for identifying neuropeptides promoting blood-feeding: statement that reads "all neuropeptides, since these are known to regulate feeding behaviours". This is not accurate since not all neuropeptides govern feeding behaviors, while certainly a subset do play a role.

      We agree with the reviewer that not all neuropeptides regulate feeding behaviours. Our statement refers to the screening approach we used: in our shortlist of candidates, we chose to validate all neuropeptides.

      (8) In the section beginning with "Two neuropeptides - sNPF and RYa - showed about 25% and 40% reduced mRNA levels...", the authors state that there was no change in blood-feeding and later state the opposite. The wording should be clarified as it is unclear.

      Thank you for pointing this out. We were referring to an unchanged proportion of the blood fed females. We have now edited the text to the following: 

      “Two neuropeptides - sNPF and RYa - showed about 25% and 40% reduced mRNA levels in the heads but the proportion of females that took blood meals remained unchanged”. See lines 338-340.

      (9) Just before the conclusions section, the statement that "neuropeptide receptors are often ligand-promiscuous" is unjustified. Indeed, many studies have shown in heterologous systems that high concentrations of structurally related peptides, which are not physiologically relevant, might cross-react and activate a receptor belonging to a different peptide family; however, the natural ligand is often many times more potent (in most cases, orders of magnitude) than structurally related peptides. This is certainly the case for various RYamide and sNPF receptors characterized in various insect species.

      We agree with the reviewer and apologise for the mistake. We have now removed the statement.

      (10) Methods

      In the dsRNA-mediated gene knockdown section, the authors could more clearly describe how much dsRNA was injected per target. At the moment, the reader must carry out calculations based on the concentrations provided and the injected volume range provided later in this section.

      We have now edited the section to reflect the amount of dsRNA injected per target. Please see lines 921-931.

      It is also unclear how tissue-specific knockdown was achieved by performing injection on different days/times. The authors need to explain/support, and justify how temporal differences in injection lead to changes in tissue-specific expression. Does the blood-brain barrier limit knockdown in the brain instead, while leaving expression in the peripheral organs susceptible?

      To achieve tissue-specific knockdowns of sNPF and RYa, we optimised both the time of injection as well as the dsRNA concentration to be injected. Injecting dsRNA into 0-10h females produced abdomen-specific knockdowns without affecting head expression, whereas injections into 96h old females resulted in knockdowns in both tissues. Head knockdowns in older females required higher dsRNA concentrations, with knockdown efficiency correlating with the amount injected. In contrast, abdominal knockdowns in younger females could be achieved even with lower dsRNA amounts, reflecting the lower baseline expression of sNPF in abdomens compared to heads and the age-dependent increase in head expression (as confirmed by qPCR). It is possible that the blood-brain barrier also limits the dsRNA entering the brain, thereby requiring higher amounts to be injected for head knockdowns. 

      We have now edited this section to state our methodology more clearly (see lines 932-948).

      For example, in Figure 4, the data support that knockdown in the head/brain is only effective in unfed animals compared to uninjected animals, while there is no evidence of knockdown in the brain relative to dsGFP-injected animals. Comparatively, the data appear to show stronger evidence of abdominal knockdown, mostly for the RYa transcript (>90%) but still significant for the sNPF transcript (>60%).

      As we explained earlier, this concern likely stems from our representation of the data. Since we had earlier determined that dsGFP-injected females fed similarly to uninjected females (fig 4B), we used these controls interchangeably in subsequent experiments. To avoid confusion, we have now only used the label ‘control’ in figure 4 (and supplementary figure S9) and specified which control was used for each experiment in the legend.

      In addition to this, we wanted to clarify that fig 4C and 4E are independent experiments. 4C is the behaviour corresponding to when the neuropeptides were knocked down in both heads and abdomens.  4E is the behaviour corresponding to when the neuropeptides were knocked down in only the abdomen. We have now added a schematic in the plots to make this clearer.

      Reviewer #3 (Public review):

      Summary:

      This manuscript investigates the regulation of host-seeking behavior in Anopheles stephensi females across different life stages and mating states. Through transcriptomic profiling, the authors identify differential gene expression between "blood-hungry" and "blood-sated" states. Two neuropeptides, sNPF and RYamide, are highlighted as potential mediators of host-seeking behavior. RNAi knockdown of these peptides alters host-seeking activity, and their expression is anatomically mapped in the mosquito brain (sNPF and RYamide) and midgut (sNPF only).

      Strengths:

      (1) The study addresses an important question in mosquito biology, with relevance to vector control and disease transmission.

      (2) Transcriptomic profiling is used to uncover gene expression changes linked to behavioral states.

      (3) The identification of sNPF and RYamide as candidate regulators provides a clear focus for downstream mechanistic work.

      (4) RNAi experiments demonstrate that these neuropeptides are necessary for normal host-seeking behavior.

      (5) Anatomical localization of neuropeptide expression adds depth to the functional findings.

      Weaknesses:

      (1) The title implies that the neuropeptides promote host-seeking, but sufficiency is not demonstrated (for example, with peptide injection or overexpression experiments).

      Demonstrating sufficiency would require injecting sNPF peptide or its agonist. To date, no small-molecule agonists (or antagonists) that selectively mimic sNPF or RYa neuropeptides have been identified in insects. An NPY analogue, TM30335, has been reported to activate the Aedes aegypti NPY-like receptor 7 (NPYLR7; Duvall et al., 2019), which is also activated by sNPF peptides at higher doses (Liesch et al., 2013). Unfortunately, the compound is no longer available because its manufacturer, 7TM Pharma, has ceased operations. Synthesising the peptides is a possibility that we will explore in the future.

      (2) The proposed model regarding central versus peripheral (gut) peptide action is inconsistently presented and lacks strong experimental support.

      The best way to address this would be to conduct tissue-specific manipulations, the tools for which are not available in this species. Our approach to achieve head+abdomen and abdomen only knockdown was the closest we could get to achieving tissue specificity and allowed us to confirm that knockdown in the head was necessary for the phenotype. However, as the reviewer points out, this did not allow us to rule out any involvement of the abdomen. This point has been addressed in lines 364-371.

      (3) Some conclusions appear premature based on the current data and would benefit from additional functional validation.

      The most definitive way of demonstrating the necessity of sNPF and RYa in blood feeding would be to generate mutant lines. While we are pursuing this line of experiments, they lie beyond the scope of a revision. In the absence of mutants, we relied on dsRNA-mediated knockdown of the genes. We note that, despite only partial knockdown, mosquitoes display defects in blood-feeding behaviour while sugar feeding remains unaffected. We think this reflects the importance of sNPF in promoting blood feeding.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Overall, I found this manuscript to be well-prepared, visually the figures are great and clearly were carefully thought out and curated, and the research is impactful. It was a wonderful read from start to finish. I have the following recommendations:

      Thank you very much, we are very pleased to hear that you enjoyed reading our manuscript!

      (1) For future manuscripts, it would make things significantly easier on the reviewer side to submit a format that uses line numbers.

      We sincerely apologise for the oversight. We have now incorporated line numbers in the revised manuscript.

      (2) There are a few statements in the text that I think may need clarification or might be outside the bounds of what was actually studied here. For example, in the introduction "However, mating is dispensable in Anophelines even under conditions of nutritional satiety". I am uncertain what is meant by this statement - please clarify.

      We apologise for the lack of clarity in the statement and have now deleted it since we felt it was not necessary.

      (3) Typo/Grammatical minutiae:

      (a) A small idiosyncrasy of using hyphens in compound words should also be fixed throughout. Typically, you don't hyphenate if the words are being used as a noun, as in the case: e.g. "Age affects blood feeding.". However, you would hyphenate if the two words are used as a compound adjective "Age affects blood-feeding behavior". This may not be an all-inclusive list, but here are some examples where hyphens need to either be removed or added. Some examples:

      "Nutritional state also influences other internal state outputs on blood-feeding": blood-feeding -> blood feeding

      "... the modulation of blood-feeding": blood-feeding -> blood feeding

      "For example, whether virgin females take blood-meals...": blood-meals -> blood meals

      ".... how internal and external cues shape meal-choice"-> meal choice

      "blood-meal" is often used throughout the text, but is correctly "blood meal" in the figures.

      There are many more examples throughout.

      We apologise for these errors and appreciate the reviewer’s keen eye. We have now fixed them throughout the manuscript.  

      (b) Figure 1 Caption has a typo: "co-housed males were accessed for sugar-feeding" should be "co-housed males were assessed for sugar feeding"

      We apologise for the typo and thank the reviewer for spotting it. We have now corrected this.  

      (c) It would be helpful in some other figure captions to more clearly label which statement is relevant to which part of the text. For example, in Figure 4's caption.

      "C,D. Blood-feeding and sugar-feeding behaviour of females when both RYa and sNPF are knocked down in the head (C). Relative mRNA expressions of RYa and sNPF in the heads of dsRYa+dssNPF - injected blood-fed and unfed females, as compared to that in uninjected females, analysed via qPCR (D)."

      I found re-referencing C and D at the end of their statements makes it look as though C precedes the "Relative mRNA expression" and on a first read through, I thought the figure captions were backwards. I'd recommend reformatting here and throughout consistently to only have the figure letter precede its relevant caption information, e.g.:

      "C. Blood-feeding and sugar-feeding behaviour of females when both RYa and sNPF are knocked down in the head. D. Relative mRNA expressions of RYa and sNPF in the heads of dsRYa+dssNPF - injected blood-fed and unfed females, as compared to that in uninjected females, analysed via qPCR."

      We have now edited the legends as suggested.

      Reviewer #2 (Recommendations for the authors):

      Separately from the clarifications and limitations listed above, the authors could strengthen their study and the conclusions drawn if they could rescue the behavioural phenotype observed following knockdown of sNPF and RYamide. This could be achieved by injection of either sNPF or RYa peptide independently or combined following knockdown to validate the role of these peptides in promoting blood-feeding in An. stephensi. Additionally, the apparent (but unclear) regionalized (or tissue-specific) knockdown of sNPF and RYamide transcripts could be visualized and verified by implementing HCR in situ hyb in knockdown animals (or immunohistochemistry using antibodies specific for these two neuropeptides). 

      In a follow up of this work, we are generating mutants and peptides for these candidates and are planning to conduct exactly the experiments the reviewer suggests.

      Reviewer #3 (Recommendations for the authors):

      The loss-of-function data suggest necessity but not sufficiency. Synthetic peptide injection in non-host-seeking (blood-fed mated or juvenile) mosquitoes would provide direct evidence for peptide-induced behavioral activation. The lack of these experiments weakens the central claim of the paper that these neuropeptides directly promote blood feeding.

      As noted above, we plan to synthesise the peptide to test rescue in a mutant background and sufficiency.  

      Some of the claims about knockdown efficiency and interpretation are conflicting; the authors dismiss Hairy and Prp as candidates due to 30-35% knockdown, yet base major conclusions on sNPF and RYamide knockdowns with comparable efficiencies (25-40%). This inconsistency should be addressed, or the justification for different thresholds should be clearly stated.

      We have not defined any specific knockdown efficacy thresholds in the manuscript, as these can vary considerably between genes, and in some cases, even modest reductions can be sufficient to produce detectable phenotypes. For example, knockdown efficiencies of even as low as about 25% - 40% gave us observable phenotypes for sNPF and RYa RNAi (Figure S9B-G).

      No such phenotypes were observed for Hairy (30%) or Prp (35%) knockdowns. Either these genes are not involved in blood feeding, or the knockdown was not sufficient for these specific genes to induce phenotypes. We cannot distinguish between these scenarios. 

      The observation that knockdown animals take smaller blood meals is interesting and could reflect a downstream effect of altered host-seeking or an independent physiological change. The relationship between meal size and host-seeking behavior should be clarified.

      We agree with the reviewer that the reduced meal size observed in sNPF and RYa knockdown animals could result either from their inability to seek a host or from an independent effect on blood meal intake. Unfortunately, we did not measure host-seeking in these animals. We plan to distinguish between these possibilities using mutants in future work.

      Several figures are difficult to interpret due to cluttered labeling and poorly distinguishable color schemes. Simplifying these and improving contrast (especially for co-housed vs. virgin conditions) would enhance readability. 

      We regret that the reviewer found the figures difficult to follow. We have now revised our annotations throughout the manuscript for enhanced readability. For example, “D1<sup>B</sup>” is now “D1<sup>PBM</sup>” (post-bloodmeal) and “D1<sup>O</sup>” is now “D1<sup>PO</sup>” (post-oviposition). Wherever mated females were used, we have now appended “(m)” to the annotations and consistently depicted these females with striped abdomens in all the schematics. We believe these changes will improve clarity and readability.

      The manuscript does not clearly justify the use of whole-brain RNA sequencing to identify peptides involved in metabolic or peripheral processes. Given that anticipatory feeding signals are often peripheral, the logic for brain transcriptomics should be explained.

      The reviewer is correct in pointing out that feeding signals could also emerge from peripheral tissues. Signals from these tissues – in response to both changing nutritional and reproductive states – are then integrated by the central brain to modulate feeding choices. For example, in Drosophila, increased protein intake is mediated by central brain circuitry including those in the SEZ and central complex (Munch et al., 2022; Liu et al., 2017; Goldschmidt et al., 202ti). In the context of mating, male-derived sex peptide further increases protein feeding by acting on a dedicated central brain circuitry (Walker et al., 2015). We therefore focused on the central brain for our studies.

      The proposed model suggests brain-derived peptides initiate feeding, while gut peptides provide feedback. However, gut-specific knockdowns had no effect, undermining this hypothesis. Conversely, the authors also suggest abdominal involvement based on RNAi results. These contradictions need to be resolved into a consistent model.

      We thank the reviewer for raising this point and recognise their concern. Our reasons for invoking an involvement of the gut were two-fold:

      (1) We find increased sNPF transcript expression in the entero-endocrine cells of the midgut in blood-hungry females, which returns to baseline after a blood-meal (Fig. 4L, M).

      (2) While the abdomen-only knockdowns did not affect blood feeding, every effective head knockdown that affected blood feeding also abolished abdominal transcript levels (Fig. S9C, F). (Achieving a head-only reduction proved impossible because (i) systemic dsRNA delivery inevitably reaches the abdomen and (ii) abdominal expression of both peptides is low, leaving little dynamic range for selective manipulation.) Consequently, we can only conclude the following: 1) that brain expression is required for the behaviour, 2) that we cannot exclude a contributory role for gut-derived sNPF. We have discussed this in lines 364-371.

      The identification of candidate receptors is promising, but the manuscript would be significantly strengthened by testing whether receptor knockdowns phenocopy peptide knockdowns. Without this, it is difficult to conclude that the identified receptors mediate the behavioral effects.

      We agree that functional validation of the receptors would strengthen the evidence for sNPF and RYa-mediated control of blood feeding in An. stephensi. We selected these receptors based on sequence homology. A possibility remains that sNPF neuropeptides activate more than one receptor, each modulating a distinct circuit, as shown in the case of Drosophila Tachykinin (https://pmc.ncbi.nlm.nih.gov/articles/PMC10184743/). This will mean a systematic characterisation and knockdown of each of them to confirm their role. We are planning these experiments in the future.  

      The authors compared the percentage changes in sugar-fed and blood-fed animals under sugar-sated or sugar-starved conditions. Figure 1F should reflect what was discussed in the results.

      Perhaps this concern stems from our representation of the data in figure 1F? We have now edited the x-axis and revised its label from “choice of food” to “choice made” to better reflect what food the mosquitoes chose to take.

      For clarity, we have now also plotted the same data as stacked graphs at the bottom of Fig. 1F, which clearly show the proportion of mosquitoes that made each particular choice. We avoid the stacked graph as the sole representation of this data because it does not capture the variability in the data.

      Minor issues:

      (1) The authors used mosquitoes with belly stripes to indicate mated females. To be consistent, the post-oviposition females should also have belly stripes.

      We thank the reviewer for pointing this out. We have now edited all the figures as suggested.

      (2) In the first paragraph on the right column of the second page, the authors state, "Since females took blood-meals regardless of their prior sugar-feeding status and only sugar-feeding was selectively suppressed by prior sugar access." Just because the well-fed animals ate less than the starved animals does not mean their feeding behavior was suppressed.

      Perhaps there has been a misunderstanding in the experimental setup of figure 1F, probably stemming from our data representation. The experiment is a choice assay in which sugar-starved or sugar-sated females, co-housed with males, were provided simultaneous access to both blood and sugar, and were assessed for the choice made (indicated on the x-axis): both blood and sugar, blood only, sugar only, or neither. We scored females only for the presence or absence of each meal type (blood or sugar) and did not quantify the amount consumed.

      (3) The figure legend for Figure 1A and the naming convention for different experimental groups are difficult to follow. A simplified or consistently abbreviated scheme would help readers navigate the figures and text.

      We regret that the reviewer found the figure difficult to follow. We have now revised our annotations throughout the manuscript for enhanced readability. For example, “D1<sup>B</sup>” is now “D1<sup>PBM</sup>” (post-bloodmeal) and “D1<sup>O</sup>” is now “D1<sup>PO</sup>” (post-oviposition).

      (4) In the last paragraph of the Y-maze olfactory assay for host-seeking behaviour in An. stephensi in Methods, the authors state, "When testing blood-fed females, aged-matched sugar-fed females (blood-hungry) were included as positive controls where ever possible, with satisfactory results." The authors should explicitly describe what the criteria are for "satisfactory results".

      We apologise for the lack of clarity. We have now edited the statement to read:

      “When testing blood-fed females, age-matched sugar-fed females (blood-hungry) were included wherever possible as positive controls. These females consistently showed attraction to host cues, as expected.” See lines 786-790.

      (5) In the first paragraph of the dsRNA-mediated gene knockdown section in Methods, dsRNA against GFP is used as a negative control for the injection itself, but not for the potential off-target effect.

      We agree with the reviewer that dsGFP injections act as controls only for injection-related behavioural changes, and not for off-target effects of RNAi. We have now corrected the statement. See lines 919-920.

      To control for off-target effects, we could have designed multiple dsRNAs targeting different parts of a given gene. We regret not including these controls for potential off-target effects of dsRNAs injected. 

      (6) Reference numbers 48, 89, and 90 are not complete citations.

      We thank the reviewer for spotting these. We have now corrected these citations.

    1. Author response:

      The following is the authors’ response to the original reviews.

      First, we thank the reviewers for the valuable and constructive reviews. Thanks to these, we believe the article has been considerably improved. We have organized our response to address points that are relevant to both reviewers first, after which we address the unique concerns of each individual reviewer separately. We briefly paraphrase each concern and provide comments for clarification, outlining the precise changes that we have made to the text.

      Common Concerns (R1 & R2):

      Can you clarify how NREM and REM sleep relate to the oneirogen hypothesis?

      Within the submission draft we tried to stay agnostic as to whether mechanistically similar replay events occur during NREM or REM sleep; however, upon a more thorough literature review, we think that there is moderately greater evidence in favor of Wake-Sleep-type replay occurring during REM sleep which is related to classical psychedelic drug mechanisms of action.

      First, we should clarify that replay has been observed during both REM and NREM sleep, and dreams have been documented during both sleep stages, though the characteristics of dreams differ across stages, with NREM dreams being more closely tied to recent episodic experience and REM dreams being more bizarre/hallucinatory (see Stickgold et al., 2001 for a review). Replay during sleep has been studied most thoroughly during NREM sharp-wave ripple events, in which significant cortical-hippocampal coupling has been observed (Ji & Wilson, 2007). However, it is critical to note that the quantification methods used to identify replay events in the hippocampal literature usually focus on identifying what we term ‘episodic replay,’ which involves a near-identical recapitulation of neural trajectories that were recently experienced during waking experimental recordings (Tingley & Peyrach, 2020). In contrast, our model focuses on ‘generative replay,’ where one expects only a statistically similar reproduction of neural activity, without any particular bias towards recent or experimentally controlled experience. This latter form of replay may look closer to the ‘reactivation’ observed in cortex by many studies (e.g. Nguyen et al., 2024), where correlation structures of neural activity similar to those observed during stimulus-driven experience are recapitulated. Under experimental conditions in which an animal is experiencing highly stereotyped activity repeatedly, over extended periods of time, these two forms of replay may be difficult to dissociate.

      Interestingly, though NREM replay has been shown to couple hippocampal and cortical activity, a similar study in waking animals administered psychedelics found hippocampal replay without any obvious coupling to cortical activity (Domenico et al., 2021). This could be because the coupling was not strong enough to produce full trajectories in the cortex (psychedelic administration did not increase ‘alpha’ enough), in which case a causal manipulation of apical/basal influence in the cortex may be necessary to observe the increased coupling. Alternatively, as Reviewer 1 noted, it may be that psychedelics induce a form of hippocampus-decoupled replay, as one would expect from the REM stage of a recently proposed complementary learning systems model (Singh et al., 2022).

      Evidence in favor of a similarity between the mechanism of action of classical psychedelics and the mechanism of action of memory consolidation/learning during REM sleep is actually quite strong. In particular, studies have shown that REM sleep increases the activity of soma-targeting parvalbumin (PV) interneurons and decreases the activity of apical dendrite-targeting somatostatin (SOM) interneurons (Niethard et al., 2021), that this shift in balance is controlled by higher-order thalamic nuclei, and that this shift in balance is critical for synaptic consolidation of both monocular deprivation effects in early visual cortex (Zhou et al., 2020) and for the consolidation of auditory fear conditioning in the dorsal prefrontal cortex (Aime et al., 2022). These last studies were not discussed in our previous text–we have added them, in addition to a more nuanced description of the evidence connecting our model to NREM and REM replay. 

      Relevant modifications: Page 4, 1st paragraph; Page 11, 1st paragraph.

      Can you explain how synaptic plasticity induced by psychedelics within your model relates to learning at a behavioral level?

      While the Wake-Sleep algorithm is a useful model for unsupervised statistical learning, it is not a model of reward or fear-based conditioning, which likely occur via different mechanisms in the brain (e.g. dopamine-dependent reinforcement learning or serotonin-dependent emotional learning). The Wake-Sleep algorithm is a ‘normative plasticity algorithm,’ that connects synaptic plasticity to the formation of structured neural representations, but it is not the case that all synaptic plasticity induced by psychedelic administration within our model should induce beneficial learning effects. According to the Wake-Sleep algorithm, plasticity at apical synapses is enhanced during the Wake phase, and plasticity at basal synapses is enhanced during the Sleep phase; under the oneirogen hypothesis, hallucinatory conditions (increased ‘alpha’) cause an increase in plasticity at both apical and basal sites. Because neural activity is in a fundamentally aberrant state when ‘alpha’ is increased, there are no theoretical guarantees that plasticity will improve performance on any objective: psychedelic-induced plasticity within our model could perhaps better be thought of as ‘noise’ that may have a positive or negative effect depending on the context.

      In particular, such ‘noise’ may be beneficial for individuals or networks whose synapses have become locked in a suboptimal local minimum. The addition of large amounts of random plasticity could allow a system to extricate itself from such local minima over subsequent learning (or with careful selection of stimuli during psychedelic experience), similar to simulated annealing optimization approaches. If our model were fully validated, this view of psychedelic-induced plasticity as ‘noise’ could have relevance for efforts to alleviate the adverse effects of PTSD, early life trauma, or sensory deprivation; it may also provide a cautionary note against repeated use of psychedelic drugs within a short time frame, as the plasticity changes induced by psychedelic administration under our model are not guaranteed to be good or useful in-and-of themselves without subsequent re-learning and compensation.
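      The intuition in the paragraph above can be illustrated with a toy one-dimensional example. This is a minimal sketch under stated assumptions, not the model used in the study: the loss function, learning rate, and the size of the perturbation are all hypothetical, and a single deterministic "kick" stands in for noisy plasticity so that the outcome is reproducible.

```python
def loss(w):
    # Hypothetical double-well loss: a shallow minimum near w = +0.96
    # and a deeper minimum near w = -1.04.
    return (w * w - 1.0) ** 2 + 0.3 * w

def grad(w):
    # Derivative of the loss above.
    return 4.0 * w * (w * w - 1.0) + 0.3

def descend(w, steps=500, lr=0.05):
    # Plain gradient descent, standing in for ordinary learning.
    for _ in range(steps):
        w -= lr * grad(w)
    return w

# Learning from w = 1.0 settles in the shallow (suboptimal) minimum.
stuck = descend(1.0)

# A large perturbation (assumed size -1.5), followed by re-learning,
# lets the system settle in the deeper minimum instead.
escaped = descend(stuck - 1.5)

print(f"stuck:   w = {stuck:.3f}, loss = {loss(stuck):.3f}")
print(f"escaped: w = {escaped:.3f}, loss = {loss(escaped):.3f}")
```

      Note that the perturbation alone does not improve the loss; the improvement comes from the re-learning that follows it, and a perturbation in the other direction would not have helped.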

      We should also note that we have deliberately avoided connecting the oneirogen hypothesis model to fear extinction experimental results that have been observed through recordings of the hippocampus or the amygdala (Bombardi & Giovanni, 2013; Jiang et al., 2009; Kelly et al., 2024; Tiwari et al., 2024). Both regions receive extensive innervation directly from serotonergic synapses originating in the dorsal raphe nucleus, which have been shown to play an important role in emotional learning (Lesch & Waider, 2012); because classical psychedelics may play a more direct role in modulating this serotonergic innervation, it is possible that fear conditioning results (in addition to the anxiolytic effects of psychedelics) cannot be attributed to a shift in balance between apical and basal synapses induced by psychedelic administration. We have provided a more detailed review of these results in the text, as well as more clarity regarding their relation to our model.

      Relevant modifications: Page 9, final paragraph; Page 12, final paragraph.

      Reviewer 1 Concerns:

      Is it reasonable to assign a scalar parameter ‘alpha’ to the effects of classical psychedelics? And is your proposed mechanism of action unique to classical psychedelics? E.g. Could this idea also apply to kappa opioid agonists, ketamine, or the neural mechanisms of hallucination disorders?

      We have clarified that within our model ‘alpha’ is a parameter that reflects the balance between apical and basal synapses in determining the activity of neurons in the network. For the sake of simplicity we used a single ‘alpha’ parameter, but realistically, each neuron would have its own ‘alpha’ parameter, and different layers or individual neurons could be affected differentially by the administration of any particular drug; therefore, our scalar ‘alpha’ value can be thought of as a mean parameter for all neurons, disregarding heterogeneity across individual neurons.
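      One illustrative way to read this description is as a weighted blend of basal and apical input. The sketch below assumes a simple convex combination, which may differ from the exact equations in the manuscript; the function names and input values are hypothetical. It contrasts per-neuron ‘alpha’ values with a single scalar set to their mean.

```python
from statistics import mean

def mixed_drive(basal, apical, alpha):
    """Blend basal (bottom-up) and apical (top-down) input for one neuron.

    Assumed form: alpha = 0 is purely basal, alpha = 1 is purely apical.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return (1.0 - alpha) * basal + alpha * apical

def population_drive(basal, apical, alphas):
    """Apply a separate alpha to each neuron (heterogeneous case)."""
    return [mixed_drive(b, a, al) for b, a, al in zip(basal, apical, alphas)]

# Hypothetical inputs for three neurons.
basal = [1.0, 1.0, 1.0]
apical = [0.0, 0.0, 0.0]
per_neuron_alpha = [0.25, 0.5, 0.75]

# Each neuron uses its own alpha.
heterogeneous = population_drive(basal, apical, per_neuron_alpha)

# A single scalar alpha, taken as the mean of the per-neuron values.
scalar_alpha = mean(per_neuron_alpha)
homogeneous = population_drive(basal, apical, [scalar_alpha] * len(basal))

print(heterogeneous)
print(homogeneous)
```

      In this toy case the population-average drive is the same under both settings, while the per-neuron drives differ, which is the heterogeneity that a scalar ‘alpha’ disregards.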

      There are many different mechanisms that could theoretically affect this ‘alpha’ parameter, including: 5-HT2a receptor agonism, kappa opioid receptor binding, ketamine administration, or possibly the effects of genetic mutations underlying the pathophysiology of complex developmental hallucination disorders. We focused exclusively on 5-HT2a receptor agonism for this study because the mechanism is comparatively simple and extensively characterized, but similar mechanisms may well be responsible for the hallucinatory symptoms of a variety of drugs and disorders.

      Relevant modifications: Page 4, first paragraph; Page 13, first paragraph.

      Can you clarify the role of 5-HT2a receptor expression on interneurons within your model?

      While we mostly focused on the effects of 5-HT2a receptors on the apical dendrites of pyramidal neurons, these receptors are also expressed on soma-targeting parvalbumin (PV) interneurons. This expression on PV interneurons is consistent with our proposed psychedelic mechanism of action, because it could lead to a coordinated decrease in the influence of somatic and proximal dendritic inputs while increasing the influence of apical dendritic inputs. We have elaborated on this point, and moved the discussion earlier in the text.

      Relevant modifications: Page 1, 1st paragraph; Page 4, 2nd paragraph.

      Discussions of indigenous use of psychedelics over millennia may amount to over-romanticization.

      We ultimately decided to remove these discussions from the main text, as they had little bearing on the content of our work. Within the Ethics Declarations section we softened our claims from “millennia” to “centuries,” as indigenous psychedelic use over this latter period of time is well-substantiated.

      Relevant modifications: removed from introduction; modified Ethics Declarations

      You isolate the 5-HT2a agonism as the mechanism of action underlying ‘alpha’ in your model, but there exist 5-HT2a agonists that do not have hallucinatory effects (e.g. lisuride). How do you explain this?

      Lisuride has much-reduced hallucinatory effects compared to other psychedelic drugs at clinical doses (though it does indeed induce hallucinations at high doses; Marona-Lewicka et al., 2002), and we should note that serotonin (5-HT) itself is pervasive in the cortex without inducing hallucinatory effects during natural function. Similarly, MDMA is a partial agonist for 5-HT2a receptors, but it has much-reduced perceptual hallucination effects relative to classical psychedelics (Green et al., 2003) in addition to many other effects not induced by classical psychedelics.

      Therefore, while we argue that 5-HT2a agonism induces an increase in influence of apical dendritic compartments and a decrease in influence of basal/somatic compartments, and that this change induces hallucinations, we also note that there are many other factors that control whether or not hallucinations are ultimately produced, so that not all 5-HT2a agonists are hallucinogenic. There are two possible additional factors that could contribute to this phenomenon: 5-HT receptor binding affinity and cellular membrane permeability.

      Importantly, many 5-HT2a receptor agonists are also 5-HT1a receptor agonists (e.g. serotonin itself and lisuride), while MDMA has also been shown to increase serotonin, norepinephrine, and dopamine release (Green et al., 2003). While 5-HT2a receptor agonism has been shown to reduce sensory stimulus responses (Michaiel et al., 2019), 5-HT1a receptor agonism inhibits spontaneous cortical activity (Azimi et al., 2020); thus one might expect the net effect of administering serotonin or a nonselective 5-HT receptor agonist to be widespread inhibition of a circuit, as has been observed in visual cortex (Azimi et al., 2020). Therefore, selective 5-HT2a agonism is critical for the induction of hallucinations according to our model, though any intervention that jointly excites pyramidal neurons’ apical dendrites and inhibits their basal/somatic compartments across a broad enough area of cortex would be predicted to have a similar effect. Lisuride has a much higher binding affinity for 5-HT1a receptors than, for instance, LSD (Marona-Lewicka et al., 2002).

      Secondly, it has recently been shown that both the head-twitch effect (a coarse behavioral readout of hallucinations in animals) and the plasticity effects of psychedelics are abolished when administering 5-HT2a agonists that are impermeable to the cellular membrane because of high polarity, and that these effects can be rescued by temporarily rendering the cellular membrane permeable (Vargas et al., 2023). This suggests that the critical hallucinatory effects of psychedelics (apical excitation according to our model) may be mediated by intracellular 5-HT2a receptors. Notably, serotonin itself is not membrane permeable in the cortex.

      Therefore, either of these two properties could play a role in whether a given 5-HT2a agonist induces hallucinatory effects. We have provided an extended discussion of these nuances in our revision.

      Relevant modifications: Page 1, paragraph 2.

      Your model proposes that an increase in top-down influence on neural activity underlies the hallucinatory effects of psychedelics. How do you explain experimental results that show increases in bottom-up functional connectivity (either from early sensory areas or the thalamus)?

      Firstly, we should note that our proposed increase in top-down influence is a causal, biophysical property, not necessarily a statistical/correlative one. As such, we will stress that the best way to test our model is via direct intervention in cortical microcircuitry, as opposed to correlative approaches taken by most fMRI studies, which have shown mixed results with regard to this particular question. Correlative approaches can be misleading due to dense recurrent coupling in the system, and due to the coarse temporal and spatial resolution provided by noninvasive recording technologies (changes in statistical/functional connectivity do not necessarily correspond to changes in causal/mechanistic connectivity, i.e. correlation does not imply causation).

      Two experimental results that appear to contradict our hypothesis deserve special consideration. The first shows an increase in directional thalamic influence on distributed cortical networks after psychedelic administration (Preller et al., 2018). To explain this, we note that this study does not distinguish between lower-order sensory thalamic nuclei (e.g. the lateral and medial geniculate nuclei receiving visual and auditory stimuli respectively) and the higher-order thalamic nuclei that participate in thalamocortical connectivity loops (Whyte et al., 2024). Subsequent more fine-grained studies have noted an increase in influence of higher-order thalamic nuclei on the cortex (Pizzi et al., 2023; Gaddis et al., 2022), and in fact extensive causal intervention research has shown that classical psychedelics (and 5-HT2a agonism) decrease the influence of incoming sensory stimuli on the activity of early sensory cortical areas, indicating decoupling from the sensory thalamus (Evarts et al., 1955; Azimi et al., 2020; Michaiel et al., 2019). The increased influence of higher-order thalamic nuclei is consistent with both the cortico-striatal-thalamo-cortical (CSTC) model of psychedelic action and the oneirogen hypothesis, since higher-order thalamic inputs modulate the apical dendrites of pyramidal neurons in cortex (Whyte et al., 2024).

      The second experimental result notes that DMT induces traveling waves during resting state activity that propagate from early visual cortex to deeper cortical layers (Alamia et al., 2020). There are several possibilities that could explain this phenomenon: 1) it could be due to the aforementioned difficulties associated with directed functional connectivity analyses, 2) it could be due to a possible high binding affinity for DMT in the visual cortex relative to other brain areas, or 3) it could be due to increases in apical influence on activity caused by local recurrent connectivity within the visual cortex which, in the absence of sensory input, could lead to propagation of neural activity from the visual cortex to the rest of the brain. This last possibility is closest to the model proposed by Ermentrout & Cowan (1979), which we believe would be best explained within our framework by a topographically connected recurrent network architecture trained on video data; this is a potentially fruitful direction for future research.

      Relevant modifications: Page 9, paragraph 1; Page 10, final paragraph; Page 11, final paragraph.

      Shouldn’t the hallucinations generated by your model look more ‘psychedelic,’ like those produced by the DeepDream algorithm?

      We believe that the differences in hallucination visualization quality between our Wake-Sleep-trained models and DeepDream are mostly due to differences in the scale and power of the models used across these two studies. We are confident that with more resources (and potentially theoretical innovations to improve the Wake-Sleep algorithm’s performance) the produced hallucination visualizations could become more realistic.

      We note that more powerful generative models trained with backpropagation are able to produce surreal images of comparable quality (Rezende et al., 2014; Goodfellow et al., 2020; Vahdat & Kautz, 2020), though these have not yet been used as a model of psychedelic hallucinations. However, the DeepDream model operates on top of large pretrained image processing models, and does not provide a biologically mechanistic/testable interpretation of its hallucination effects. When training smaller models with a local synaptic plasticity rule (as opposed to backpropagation), the hallucination effects are less visually striking due to the reduced quality of our trained generative model, though they are still strongly tied to the statistics of sensory inputs, as quantified by our correlation similarity metric (Fig. 5b).

      To demonstrate that our proposed hallucination mechanism is capable of producing more complex hallucinations in larger, more powerful models, we employed our same hallucination generation mechanism in a pretrained Very Deep Variational Autoencoder (VDVAE) (Child et al., 2021), which is a hierarchical variational autoencoder with a structure nearly identical to that of our Wake-Sleep-trained networks, with both a bottom-up inference pathway and a top-down generative pathway that maps cleanly onto our multicompartmental neuron model. VDVAEs are trained on the same objective function as our Wake-Sleep-trained networks, but using the backpropagation algorithm. The VDVAE models were able to generate much more complex hallucinations (emergence of complex geometric patterns, smooth deformations of objects and faces), whose complexity arguably exceeds that of those produced by the DeepDream algorithm. Therefore, while the VDVAEs are less biologically realistic (they do not learn via local synaptic plasticity), they function as a valuable high-level model of hallucination generation that complements our Wake-Sleep-trained approach. As further validation, we were also able to replicate our key results and testable predictions with these models.

      Relevant modifications: Results section “Modeling hallucinations in large-scale pretrained networks”; Figure 6, S7, S8; Page 12, paragraph 3; Methods section “Generating hallucinations in hierarchical variational autoencoders.”

      Your model assumes domination by entirely bottom-up activity during the ‘wake’ phase, and domination entirely by top-down activity during ‘sleep,’ despite experimental evidence indicating that a mixture of top-down and bottom-up inputs influence neural activity during both stages in the brain. How do you explain this?

      Our use of the Wake-Sleep algorithm, in which top-down inputs (Sleep) or bottom-up inputs (Wake) dominate network activity, is an over-simplification made within our model for computational and theoretical reasons. Models that receive a mixture of top-down and bottom-up inputs during ‘Wake’ activity do exist (in particular the closely related Boltzmann machine (Ackley et al., 1985)), but these models are considerably more computationally costly to train due to a need to run extensive recurrent network relaxation dynamics for each input stimulus. Further, these models do not generalize as cleanly to processing temporal inputs. For this reason, we focused on the Wake-Sleep algorithm, at the cost of some biological realism, though we note that our model should certainly be extended to support mixed apical-basal waking regimes. We have added a discussion of this in our ‘Model Limitations’ section.

      Relevant modifications: Page 12, paragraph 4.

      Your model proposes that 5-HT2a agonism enhances glutamatergic transmission, but this is not true in the hippocampus, which shows decreases in glutamate after psychedelic administration.

      We should note that our model suggests only compartment-specific increases in glutamatergic transmission; as such, our model does not predict any particular directionality for measures of glutamatergic transmission that include signaling at both apical and basal compartments in aggregate, as was measured in the provided study (Mason et al., 2020).

      You claim that your model is consistent with the Entropic Brain theory, but you report increases in variance, not entropy. In fact, it has been shown that variance decreases while entropy increases under psychedelic administration. How do you explain this discrepancy?

      Unfortunately, ‘entropy’ and ‘variance’ are heavily overloaded terms in the noninvasive imaging literature, and the particularities of the method employed can exert a strong influence on the reported effects. The reduction in variance reported by (Carhart-Harris et al., 2016) is a very particular measure: they are reporting the variance of resting state synchronous activity, averaged across a functional subnetwork that spans many voxels; as such, the reduction in variance in this case is a reduction in broad, synchronous activity. We do not have any resting state synchronous activity in our network due to the simplified nature of our model (particularly an absence of recurrent temporal dynamics), so we see no reduction in variance in our model due to these effects.
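      To make the distinction concrete, the following toy sketch contrasts two quantities that can both be called ‘variance’: the variance over time of a unit-averaged (synchronous) signal, and the variability across units at each time point. It is an illustration under stated assumptions only; the function names and the tiny hand-made data sets are hypothetical and are not taken from any of the cited studies or from the model itself.

```python
from statistics import mean, pvariance

def synchronous_variance(activity):
    """Variance over time of the signal averaged across units.

    `activity` is a list of units, each a list of values over time.
    """
    n_time = len(activity[0])
    averaged = [mean(unit[t] for unit in activity) for t in range(n_time)]
    return pvariance(averaged)

def across_unit_variability(activity):
    """Mean over time of the variance across units.

    Subtracting the shared (unit-averaged) signal first would not change
    this value, since variance is unaffected by removing the mean.
    """
    n_time = len(activity[0])
    return mean(pvariance([unit[t] for unit in activity]) for t in range(n_time))

# Hypothetical condition A: two units oscillating in lockstep.
condition_a = [[2.0, -2.0, 2.0, -2.0],
               [2.0, -2.0, 2.0, -2.0]]

# Hypothetical condition B: weaker, desynchronized activity.
condition_b = [[1.0, -1.0, 1.0, -1.0],
               [-1.0, 1.0, -1.0, 1.0]]

for name, data in (("A", condition_a), ("B", condition_b)):
    print(name, synchronous_variance(data), across_unit_variability(data))
```

      In this toy example the synchronous variance falls from condition A to condition B while the across-unit variability rises, so the two measures can move in opposite directions on the same data.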

      Other studies estimate ‘entropy’ or network state disorder via three different methods that we have been able to identify. 1) (Carhart-Harris et al., 2014) uses a different measure of variance: in this case, they subtract out synchronous activity within functional subnetworks, and calculate variability across units in the network. This measure reports increases in variance (Fig. 6), and is the closest measure to the one we employ in this study. 2) (Lebedev et al., 2016) uses sample entropy, which is a measure of temporal sequence predictability. It is specifically designed to disregard highly predictable signals, and so one might imagine that it is a measure that is robust to shared synchronous activity (e.g. resting state oscillations). 3) (Mediano et al., 2024) uses Lempel-Ziv complexity, which is, similar to sample entropy, a measure of sequence diversity; in this case the signal is binarized before calculation, which makes this method considerably different from ours. All three of the preceding methods report increases in sequence diversity, in agreement with our quantification method. Our strongest explanation for the variance reduction reported by Carhart-Harris et al. (2016) is therefore a reduction in low-rank synchronous activity in subnetworks during resting state.
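      As a rough illustration of the third method, the sketch below binarizes a signal around its mean and counts phrases with a simple dictionary-based (LZ78-style) parse. This is an assumption-laden stand-in: the cited study may use a different Lempel-Ziv variant, binarization threshold, and normalization, and the function names here are hypothetical.

```python
def binarize(signal):
    """Map each sample to '1' if above the signal mean, else '0'."""
    threshold = sum(signal) / len(signal)
    return "".join("1" if x > threshold else "0" for x in signal)

def lz_phrase_count(bits):
    """Count phrases in a dictionary-based (LZ78-style) parse.

    The string is scanned left to right; each time the current chunk
    has not been seen before, it is stored as a new phrase and the
    chunk is reset. A trailing unfinished chunk counts as one phrase.
    """
    phrases = set()
    current = ""
    for bit in bits:
        current += bit
        if current not in phrases:
            phrases.add(current)
            current = ""
    return len(phrases) + (1 if current else 0)

constant = "0000000000"
alternating = "0101010101"

print(lz_phrase_count(constant))     # fewer phrases: highly repetitive
print(lz_phrase_count(alternating))  # more phrases: more varied
print(binarize([1.0, 2.0, 3.0, 4.0]))
```

      Because the signal is reduced to a binary string before parsing, amplitude information is discarded, which is one reason this kind of measure is not directly comparable to a variance.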

      As for whether the entropy increase is meaningful: we share Reviewer 1’s concern that increases in entropy could simply be due to a higher degree of cognitive engagement during resting state recordings, due to the presence of sensory hallucinations or due to an inability to fall asleep. This could explain why entropy increases are much more minimal relative to non-hallucinating conditions during audiovisual task performance (Siegel et al., 2024; Mediano et al., 2024). However, we can say that our model is consistent with the Entropic Brain Theory without including any form of ‘cognitive processing’: we observe increases in variability during resting state in our model, but we observe highly similar distributions of activity when averaging over a wide variety of sensory stimulus presentations (Fig. 5b-c). This is because variability in our model is not due to unstructured noise: it corresponds to an exploration of network states that would ordinarily be visited by some stimulus. Therefore, when averaging across a wide variety of stimuli, the distribution of network states under hallucinating or non-hallucinating conditions should be highly similar.

      One final point of clarification: here we are distinguishing Entropic Brain Theory from the REBUS model–the oneirogen hypothesis is consistent with the increase in entropy observed experimentally, but in our model this entropy increase is not due to increased influence of bottom-up inputs (it is due instead to an increase in top-down influence). Therefore, one could view the oneirogen hypothesis as consistent with EBT, but inconsistent with REBUS.

      Relevant modifications: Page 10, paragraph 1.

      You relate your plasticity rule to behavioral-timescale plasticity (BTSP) in the hippocampus, but plasticity has been shown to be reduced in the hippocampus after psychedelic administration. Could you elaborate on this connection?

      When we were establishing a connection between our ‘Wake-Sleep’ plasticity rule and BTSP learning, the intended connection was exclusively to the mathematical form of the plasticity rule, in which activity in the apical dendrites of pyramidal neurons functions as an instructive signal for plasticity in basal synapses (and vice versa): we will clarify this in the text. Similarly, we point out that such a plasticity rule tends to result in correlated tuning between apical and basal dendritic compartments, which has been observed in hippocampus and cortex: this is intended as a sanity check of our mapping of the Wake-Sleep algorithm to cortical microcircuitry, and has limited further bearing on the effects of psychedelics specifically.

      Reduction in plasticity in the hippocampus after psychedelic administration could be due to a complementary learning systems-type model, in which the hippocampus becomes partly decoupled from the cortex during REM sleep (Singh et al., 2022); were this to be the case, it would not be incompatible with our model, which is mostly focused on the cortex. Notably, potentiating 5-HT2a receptors in the ventral hippocampus does not induce the head-twitch response, though it does produce anxiolytic effects (Tiwari et al., 2024), indicating that the hallucinatory and anxiolytic effects of classical psychedelics may be partly decoupled.

      Reviewer 2 Concerns:

      Could you provide visualizations of the ‘ripple’ phenomenon that you’re referring to?

      In our revised submission, ‘ripple’ phenomena are now visible in two places: Fig 2c-d, and Fig 6 (rows 2 and 3). Because the VDVAE models used to generate Figure 6 produce higher quality generated images, the ripples appearing in these plots are likely more prototypical, but it is not easy to evaluate the quality of these visualizations relative to subjective hallucination phenomena.

      Could you provide a more nuanced description of alternative roles for top-down feedback, beyond being used exclusively for learning as depicted in your model?

      For the sake of simplicity, we only treat top-down inputs in our model as a source of an instructive teaching signal, the originator of generative replay events during the Sleep phase, and as the mechanism of hallucination generation. However, as discussed in a response to a previous question, in the cortex pyramidal neurons receive and respond to a mixture of top-down and bottom-up processing.

      There are a variety of theories for what role top-down inputs could play in determining network activity. To name several, top-down input could function as: 1) a denoising/pattern completion signal (Kadkhodaie & Simoncelli, 2021), 2) a feedback control signal (Podlaski & Machens, 2020), 3) an attention signal (Lindsay, 2020), or 4) ordinary inputs for dynamic recurrent processing that play no specialized role distinct from bottom-up or lateral inputs except to provide inputs from higher-order association areas or other sensory modalities (Kar et al., 2019; Tugsbayar et al., 2025). Though our model does not include these features, they are perfectly consistent with our approach.

      In particular, denoising/pattern completion signals in the predictive coding framework (closely related to the Wake-Sleep algorithm) also play a role as an instructive learning signal (Salvatori et al., 2021); and top-down control signals can play a similar role in some models (Gilra & Gerstner, 2017; Meulemans et al., 2021). Thus, options 1 and 2 are heavily overlapping with our approach, and are a natural consequence of many biologically plausible learning algorithms that minimize a variational free energy loss (Rao & Ballard, 1997; Ackley et al., 1985). Similarly, top-down attentional signals can exist alongside top-down learning signals, and some models have argued that such signals can be heavily overlapping or mutually interchangeable (Roelfsema & van Ooyen, 2005). Lastly, generic recurrent connectivity (from any source) can be incorporated into the Wake-Sleep algorithm (Dayan & Hinton, 1996), though we avoided doing this in the present study due to an absence of empirical architecture exploration in the literature and the computational complexity associated with training on time series data.

      To conclude, there are a variety of alternative functions proposed for top-down inputs onto pyramidal neurons in the cortex, and we view these additional features as mutually compatible with our approach; for simplicity we did not include them in our Wake-Sleep-trained model, but we believe that these features are unlikely to interfere with our testable predictions or empirical results. In fact, the pretrained VDVAE models that we worked with do include top-down influence during the Wake-stage inference process, and these models recapitulated our key results and testable predictions (Fig. S8).

      Relevant modifications: Fig. S8; Page 12, paragraph 4.

    1. As a girl, Ms. Book would save up her allowance then head to Indigo to pick out her next read, usually whatever had the coolest cover and best synopsis on the $6 book shelf.

      anecdotal lead

    1. Two of the fairest stars in all the heaven, Having some business, do entreat her eyes To twinkle in their spheres till they return. What if her eyes were there, they in her head?

      Hyperbole: Her eyes are so bright stars swap places with them. Extreme romantic exaggeration.

    2. Romeo's character traits are described in act II, scene 3: Romeo: "Good morrow, father." – His cheerful greeting contrasts with the Friar's philosophical mood. Shows Romeo's youthful energy and distraction by love. Friar: "Young son, it argues a distemper'd head" – "Distemper'd" = disordered, sick. He assumes Romeo is unwell to be up so early. He's right—love-sick. "Our Romeo hath not been in bed to-night" – The Friar deduces the truth. Romeo's behavior is so unusual it reveals his all-night adventure.

    1. Mercutio. Nay, an there were two such, we should have none shortly, for one would kill the other. Thou! why, thou wilt quarrel with a man that hath a hair more, or a hair less, in his beard, than thou hast: thou wilt quarrel with a man for cracking nuts, having no other reason but because thou hast hazel eyes: what eye but such an eye would spy out such a quarrel? Thy head is as full of quarrels as an egg is full of meat, and yet thy head hath been beaten as addle as an egg for quarrelling: thou hast quarrelled with a man for coughing in the street, because he hath wakened thy dog that hath lain asleep in the sun: didst thou not fall out with a tailor for wearing his new doublet before Easter? with another, for tying his new shoes with old riband? and yet thou wilt tutor me from quarrelling!

      Mercutio’s exaggeration. Hyperbole: Mercutio claims Benvolio would fight over anything, even a cough or someone’s shoes. This is funny exaggeration.

    1. The Spanish relied on Indigenous allies. The Tlaxcala were among the most important Spanish allies in their conquest. This sixteenth-century drawing depicts the Spanish and their Tlaxcalan allies fighting against the Purépecha

      I find this image interesting as there are many different things going on. I notice there is a man hanging from the top, as well as a man with his head chopped off in the bottom. I also see a dog fighting as well as horses. I also notice the different weapons and shields used.

    1. We think faith means shutting off our brains and hoping for the best. We act like NPCs (Non-Player Characters) walking into walls, expecting the wall to move because we "believed hard enough." This isn't faith; it's negligence.

      ⚠️ Glitch Detected: This is the "Brute Force" error. Status Check: Where are you currently "banging your head against the wall" expecting a breakthrough, when you should be stopping to analyse the lock? Compassion Note: If you are tired, you are likely using force, not Sakal.

    1. Prince Escalus. Seal up the mouth of outrage for a while, Till we can clear these ambiguities, And know their spring, their head, their true descent; And then will I be general of your woes,

      Prince longs for settlement of the feud between the Capulets and the Montagues.

    2. Prince Escalus. A glooming peace this morning with it brings; The sun, for sorrow, will not show his head: Go hence, to have more talk of these sad things; Some shall be pardon'd, and some punished: For never was a story of more woe Than this of Juliet and her Romeo.

      Conclusion. Prince's final words: "A glooming peace" - a sad peace comes. The Prince says this is the saddest story ever.

    1. Reviewer #1 (Public review):

      Summary:

      Here, the authors have addressed the recruitment and firing patterns of motor units (MUs) from the long and lateral heads of triceps in the mouse. They used their newly developed Myomatrix arrays to record from these muscles during treadmill locomotion at different speeds, and they used template-based spike sorting (Kilosort) to extract units. Between MUs from the two heads, the authors observe differences in their firing rates, recruitment probability, phase of activation within the locomotor cycle and interspike interval patterning. Examining different walking speeds, the authors find increases in both recruitment probability and firing rates as speed increases. The authors also observed differences in the relation between recruitment and the angle of elbow extension between motor units from each head. These differences indicate meaningful variation between motor units within and across motor pools, and may reflect the somewhat distinct joint actions of the two heads of triceps.

      Strengths:

      The extraction of MU spike timing for many individual units is an exciting new method that has great promise for exposing the fine detail in muscle activation and its control by the motor system. In particular, the methods developed by the authors for this purpose seem to be the only way to reliably resolve single MUs in the mouse, as the methods used previously in humans and in monkeys (e.g. Marshall et al. Nature Neuroscience, 2022) do not seem readily adaptable for use in rodents.

      The paper provides a number of interesting observations. There are signs of interesting differences in MU activation profiles for individual muscles here, consistent with those shown by Marshall et al. It is also nice to see fine scale differences in the activation of different muscle heads, which could relate to their partially distinct functions. The mouse offers greater opportunities for understanding the control of these distinct functions, compared to the other organisms in which functional differences between heads have previously been described.

      The Discussion is very thorough, providing a very nice recounting of a great deal of relevant previous results.

      Weaknesses:

      The findings are limited to one pair of muscle heads. While the findings are important in their own right, the lack of confirmation from analysis of other muscles acting at other joints leaves the generalization of these findings unclear.

      While differences between muscle heads with somewhat distinct functions are interesting and relevant to joint control, differences between MUs for individual muscles, like those in Marshall et al., are more striking because they cannot be attributed potentially to differences in each head's function. The present manuscript does show some signs of differences for MUs within individual heads (e.g. Figure 2C), but the manuscript falls short of providing a statistical basis for the existence of distinct subpopulations.

    2. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Here, the authors have addressed the recruitment and firing patterns of motor units (MUs) from the long and lateral heads of the triceps in the mouse. They used their newly developed Myomatrix arrays to record from these muscles during treadmill locomotion at different speeds, and they used template-based spike sorting (Kilosort) to extract units. Between MUs from the two heads, the authors observed differences in their firing rates, recruitment probability, phase of activation within the locomotor cycle, and interspike interval patterning. Examining different walking speeds, the authors find increases in both recruitment probability and firing rates as speed increases. The authors also observed differences in the relation between recruitment and the angle of elbow extension between motor units from each head. These differences indicate meaningful variation between motor units within and across motor pools and may reflect the somewhat distinct joint actions of the two heads of triceps.

      Strengths:

      The extraction of MU spike timing for many individual units is an exciting new method that has great promise for exposing the fine detail in muscle activation and its control by the motor system. In particular, the methods developed by the authors for this purpose seem to be the only way to reliably resolve single MUs in the mouse, as the methods used previously in humans and in monkeys (e.g. Marshall et al. Nature Neuroscience, 2022) do not seem readily adaptable for use in rodents.

      The paper provides a number of interesting observations. There are signs of interesting differences in MU activation profiles for individual muscles here, consistent with those shown by Marshall et al. It is also nice to see fine-scale differences in the activation of different muscle heads, which could relate to their partially distinct functions. The mouse offers greater opportunities for understanding the control of these distinct functions, compared to the other organisms in which functional differences between heads have previously been described.

      The Discussion is very thorough, providing a very nice recounting of a great deal of relevant previous results.

      We thank the Reviewer for these comments.

      Weaknesses:

      The findings are limited to one pair of muscle heads. While an important initial finding, the lack of confirmation from analysis of other muscles acting at other joints leaves the general relevance of these findings unclear.

      The Reviewer raises a fair point. While outside the scope of this paper, future studies should certainly address a wider range of muscles to better characterize motor unit firing patterns across different sets of effectors with varying anatomical locations. Still, the importance of results from the triceps long and lateral heads should not be understated as this paper, to our knowledge, is the first to capture the difference in firing patterns of motor units across any set of muscles in the locomoting mouse.

      While differences between muscle heads with somewhat distinct functions are interesting and relevant to joint control, differences between MUs for individual muscles, like those in Marshall et al., are more striking because they cannot be attributed potentially to differences in each head's function. The present manuscript does show some signs of differences for MUs within individual heads: in Figure 2C, we see what looks like two clusters of motor units within the long head in terms of their recruitment probability. However, a statistical basis for the existence of two distinct subpopulations is not provided, and no subsequent analysis is done to explore the potential for differences among MUs for individual heads.

      We agree with the Reviewer and have revised the manuscript to better examine potential subpopulations of units within each muscle as presented in Figure 2C. We performed Hartigan’s dip test on motor units within each muscle to test for multimodal distributions. For both muscles, p > 0.05, so we cannot reject the null hypothesis that the units in each muscle come from a unimodal distribution. However, Hartigan’s test and similar statistical methods have poor statistical power for the small sample sizes (n=17 and 16 for long and lateral heads, respectively) considered here, so the failure to achieve statistical significance might reflect either the absence of a true difference or a lack of statistical resolution.

      Still, the limited sample size warrants further data collection and analysis since the varying properties across motor units may lead to different activation patterns. Given these results, we have edited the text as follows:

      “A subset of units, primarily in the long head, were recruited in under 50% of the total strides and with lower spike counts (Figure 2C). This distribution of recruitment probabilities might reflect a functionally different subpopulation of units. However, the distribution of recruitment probabilities was not found to be significantly multimodal (p>0.05 in both cases, Hartigan’s dip test; Hartigan, 1985). However, Hartigan’s test and similar statistical methods have poor statistical power for the small sample sizes (n=17 and 16 for long and lateral heads, respectively) considered here, so the failure to achieve statistical significance might reflect either the absence of a true difference or a lack of statistical resolution.”

      The statistical foundation for some claims is lacking. In addition, the description of key statistical analysis in the Methods is too brief and very hard to understand. This leaves several claims hard to validate.

      We thank the Reviewer for these comments and have clarified the text related to key statistical analyses throughout the manuscript, as described in our other responses below.

      Reviewer #2 (Public review):

      The present study, led by Thomas and collaborators, aims to describe the firing activity of individual motor units in mice during locomotion. To achieve this, they implanted small arrays of eight electrodes in two heads of the triceps and performed spike sorting using a custom implementation of Kilosort. Simultaneously, they tracked the positions of the shoulder, elbow, and wrist using a single camera and a markerless motion capture algorithm (DeepLabCut). Repeated one-minute recordings were conducted in six mice at five different speeds, ranging from 10 to 27.5 cm·s<sup>-1</sup>.

      From these data, the authors reported that:

      (1) a significant portion of the identified motor units was not consistently recruited across strides,

      (2) motor units identified from the lateral head of the triceps tended to be recruited later than those from the long head,

      (3) the number of spikes per stride and peak firing rates were correlated in both muscles, and

      (4) the probability of motor unit recruitment and firing rates increased with walking speed.

      The authors conclude that these differences can be attributed to the distinct functions of the muscles and the constraints of the task (i.e., speed).

      Strengths:

      The combination of novel electrode arrays to record intramuscular electromyographic signals from a larger muscle volume with an advanced spike sorting pipeline capable of identifying populations of motor units.

      We thank the Reviewer for this comment.

      Weaknesses:

      (1) There is a lack of information on the number of identified motor units per muscle and per animal.

      The Reviewer is correct that this information was not explicitly provided in the prior submission. We have therefore added Table 1 that quantifies the number of motor units per muscle and per animal.

      (2) All identified motor units are pooled in the analyses, whereas per-animal analyses would have been valuable, as motor units within an individual likely receive common synaptic inputs. Such analyses would fully leverage the potential of identifying populations of motor units.

      Please see our answer to the following point, where we address questions (2) and (3) together.

      (3) The current data do not allow for determining which motor units were sampled from each pool. It remains unclear whether the sample is biased toward high-threshold motor units or representative of the full pool.

      We thank the Reviewer for these comments. To clarify how motor unit responses were distributed across animals and muscle targets, we updated or added the following figures:  

      Figure 2C

      Figure 4–figure supplement 1

      Figure 5–figure supplement 2

      Figure 6–figure supplement 2

      These provide a more complete look at the range of activity within each motor pool, suggesting that we do measure from units with different activation thresholds within the same motor pool, rather than this variation being due to cross-animal differences. For example, Figure 2C illustrates that motor units from the same muscle and animal show a wide variety of recruitment probabilities. However, the limited number of motor units recorded from each individual animal does not allow a statistically rigorous test for examining cross-animal differences.

      (4) The behavioural analysis of the animals relies solely on kinematics (2D estimates of elbow angle and stride timing). Without ground reaction forces or shoulder angle data, drawing functional conclusions from the results is challenging.

      The Reviewer is correct that we did not measure muscular force generation or ground reaction forces in the present study. Although outside the scope of this study, future work might employ buckle force transducers as used in larger animals (Biewener et al., 1988; Karabulut et al., 2020) to examine the complex interplay between neural commands, passive biomechanics, and the complex force-generating properties of muscle tissue.

      Major comments:

      (1) Spike sorting

      The conclusions of the study rely on the accuracy and robustness of the spike sorting algorithm during a highly dynamic task. Although the pipeline was presented in a previous publication (Chung et al., 2023, eLife), a proper validation of the algorithm for identifying motor unit spikes is still lacking. This is particularly important in the present study, as the experimental conditions involve significant dynamic changes. Under such conditions, muscle geometry is altered due to variations in both fibre pennation angles and lengths.

      This issue differs from electrode drift, and it is unclear whether the original implementation of Kilosort includes functions to address it. Could the authors provide more details on the various steps of their pipeline, the strategies they employed to ensure consistent tracking of motor unit action potentials despite potential changes in action potential waveforms, and the methods used for manual inspection of the spike sorting algorithm's output?

      This is an excellent point and we agree that the dynamic behavior used in this investigation creates potential new challenges for spike sorting. In our analysis, Kilosort 2.5 provides key advantages in comparing unit waveforms across multiple channels and in detecting overlapping spikes. We modified this version of Kilosort to construct unit waveform templates using only the channels within the same muscle (Chung et al., 2023), as clarified in the revised Methods section (see “Electromyography (EMG)”):

      “A total of 33 units were identified across all animals. Each unit’s isolation was verified by confirming that no more than 2% of inter-spike intervals violated a 1 ms refractory limit. Additionally, we manually reviewed cross-correlograms to ensure that each waveform was only reported as a single motor unit.”
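
      The isolation criterion quoted above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names and the example spike train are hypothetical, and only the 1 ms limit and the 2% ceiling come from the text.

```python
def isi_violation_fraction(spike_times_s, refractory_s=0.001):
    """Fraction of inter-spike intervals shorter than the refractory limit."""
    times = sorted(spike_times_s)
    isis = [later - earlier for earlier, later in zip(times, times[1:])]
    if not isis:
        return 0.0  # fewer than two spikes: no interval can violate the limit
    violations = sum(1 for isi in isis if isi < refractory_s)
    return violations / len(isis)


def passes_isolation(spike_times_s, max_fraction=0.02, refractory_s=0.001):
    """True if at most max_fraction of intervals violate the refractory limit."""
    return isi_violation_fraction(spike_times_s, refractory_s) <= max_fraction


# Hypothetical spike train in seconds: one 0.5 ms interval among four.
example = [0.0100, 0.0250, 0.0255, 0.0400, 0.0600]
print(isi_violation_fraction(example), passes_isolation(example))
```

      A unit with this spike train would fail the check, since one of its four intervals is shorter than 1 ms.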

      The Reviewer is correct that our ability to precisely measure a unit’s activity based on its waveform will depend on the relationship between the embedded electrode and the muscle geometry, which alters over the course of the stride. As a follow-up to the original text, we have included new analyses to characterize the waveform activity throughout the experiment and stride (also in Methods):

      “We further validated spike sorting by quantifying the stability of each unit’s waveform across time (Figure 1–figure supplement 1). First, we calculated the median waveform of each unit across every trial to capture long-term stability of motor unit waveforms. Additionally, we calculated the median waveform through the stride binned in 50 ms increments using spiking from a single trial. This second metric captures the stability of our spike sorting during the rapid changes in joint angles that occur during the burst of an individual motor unit. In doing so, we calculated each motor unit’s waveforms from the single channel in which that unit’s amplitude was largest and did not attempt to remove overlapping spikes from other units before measuring the median waveform from the data. We then calculated the correlation between a unit’s waveform over either trials or bins in which at least 30 spikes were present. The high correlation of a unit waveform over time, despite potential changes in the electrodes’ position relative to muscle geometry over the dynamic task, provides additional confidence in both the stability of our EMG recordings and the accuracy of our spike sorting.”
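
      The stability measure described in this passage might be computed roughly as follows. The sketch assumes each spike has already been cut out as a fixed-length snippet from the unit's largest-amplitude channel; the function names and the synthetic snippets are hypothetical, and only the median-waveform step, the correlation, and the 30-spike minimum come from the text.

```python
import statistics


def median_waveform(snippets):
    """Sample-by-sample median across equally long spike snippets."""
    return [statistics.median(samples) for samples in zip(*snippets)]


def pearson(a, b):
    """Pearson correlation between two equally long, non-constant waveforms."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def waveform_stability(groups, min_spikes=30):
    """Pairwise correlations between median waveforms of well-sampled groups.

    groups: one list of snippets per trial (or per 50 ms stride bin).
    Groups with fewer than min_spikes snippets are skipped.
    """
    medians = [median_waveform(g) for g in groups if len(g) >= min_spikes]
    return [
        pearson(medians[i], medians[j])
        for i in range(len(medians))
        for j in range(i + 1, len(medians))
    ]


# Hypothetical data: one template, rescaled per group, with small jitter.
template = [0.0, 0.2, 1.0, -0.6, -0.2, 0.0]
groups = [
    [[scale * v + (0.01 if (n + k) % 2 else -0.01) for k, v in enumerate(template)]
     for n in range(n_spikes)]
    for scale, n_spikes in [(1.0, 40), (0.8, 35), (1.1, 10)]
]
stability = waveform_stability(groups)  # the 10-spike group is skipped
print(stability)
```

      Because correlation ignores overall scale, this measure is sensitive to changes in waveform shape but not to changes in amplitude alone.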

      (2) Yield of the spike sorting pipeline and analyses per animal/muscle

      A total of 33 motor units were identified from two heads of the triceps in six mice (17 from the long head and 16 from the lateral head). However, precise information on the yield per muscle per animal is not provided. This information is crucial to support the novelty of the study, as the authors claim in the introduction that their electrode arrays enable the identification of populations of motor units. Beyond reporting the number of identified motor units, another way to demonstrate the effectiveness of the spike sorting algorithm would be to compare the recorded EMG signals with the residual signal obtained after subtracting the action potentials of the identified motor units, using a signal-to-residual ratio.

      Furthermore, motor units identified from the same muscle and the same animal are likely not independent due to common synaptic inputs. This dependence should be accounted for in the statistical analyses when comparing changes in motor unit properties across speeds and between muscles.

      We thank the Reviewer for this comment. Regarding motor unit yield, as described above the newly-added Table 1 displays the yield from each animal and muscle.

      Regarding spike sorting, while signal-to-residual is often an excellent metric, it is not ideal for our high-resolution EMG signals since isolated single motor units are typically superimposed on a “bulk” background consisting of the low-amplitude waveforms of other motor units. Because these smaller units typically cannot be sorted, it is challenging to estimate the “true” residual after subtracting (only) the largest motor unit, since subtracting each sorted unit’s waveform typically has a very small effect on the RMS of the total EMG signal. To further address concerns regarding spike sorting quality, we added Figure 1–figure supplement 1 that demonstrates motor units’ consistency over the experiment, highlighting that the waveform maintains its shape within each stride despite muscle/limb dynamics and other possible sources of electrical noise or artifact.

      Finally, the Reviewer is correct that individual motor units in the same muscle are very likely to receive common synaptic inputs. These common inputs may be reflected in sparsely recruited motor units being active in overlapping rather than different strides. Indeed, in the following passage added to the Results, we show that motor units are recruited with higher probability when additional units are recruited.

      “Probabilistic recruitment is correlated across motor units

      Our results show that the recruitment of individual motor units is probabilistic even within a single speed quartile (Figure 5A-C) and predicts body movements (Figure 6), raising the question of whether the recruitment of individual motor units is correlated or independent. Correlated recruitment might reflect shared input onto the population of motor units innervating the muscle (De Luca, 1985; De Luca & Erim, 1994; Farina et al., 2014). For example, two motor units, each with low recruitment probabilities, may still fire during the same set of strides. To assess the independence of motor unit recruitment across the recorded population, we compared each unit’s empirical recruitment probability across all strides to its conditional recruitment probability during strides in which another motor unit from the same muscle was recruited (Figure 7). Doing this for all motor unit pairs revealed that motor units in both muscles were biased towards greater recruitment when additional units were active (p<0.001, Wilcoxon signed-rank tests for both the lateral and long heads of triceps). This finding suggests that probabilistic recruitment reflects common synaptic inputs that covary together across locomotor strides.”
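
      The comparison between overall and conditional recruitment probability can be illustrated with a short sketch. The data layout (one 0/1 flag per stride for each unit), the function name, and the example values are assumptions made for illustration; the response does not specify how the pairs were tabulated.

```python
def recruitment_pairs(recruited):
    """Compare each unit's overall and conditional recruitment probability.

    recruited: dict mapping unit name -> list of 0/1 flags, one per stride,
    with all units from the same muscle sharing the same strides.
    Returns (unit, other, p_all, p_given_other) for every ordered pair.
    """
    rows = []
    for unit, flags in recruited.items():
        p_all = sum(flags) / len(flags)
        for other, other_flags in recruited.items():
            if other == unit:
                continue
            active = [k for k, flag in enumerate(other_flags) if flag]
            if not active:
                continue  # conditional probability is undefined
            p_given_other = sum(flags[k] for k in active) / len(active)
            rows.append((unit, other, p_all, p_given_other))
    return rows


# Hypothetical recruitment flags for three units over eight strides.
example = {
    "unit_a": [1, 1, 0, 0, 1, 0, 0, 0],
    "unit_b": [1, 0, 0, 0, 1, 0, 0, 0],
    "unit_c": [1, 1, 1, 0, 1, 0, 1, 0],
}
rows = recruitment_pairs(example)
for row in rows:
    print(row)
```

      The paired columns `p_all` and `p_given_other` are what a paired test would compare. With SciPy available, they could be passed to `scipy.stats.wilcoxon`, although ordered pairs that share a unit are not independent observations.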

      (3) Representativeness of the sample of identified motor units

      However, to draw such conclusions, the authors should exclusively compare motor units from the same pool and systematically track violations of the recruitment order. Alternatively, they could demonstrate that the motor units that are intermittently active across strides correspond to the smallest motor units, based on the assumption that these units should always be recruited due to their low activation thresholds.

      One way to estimate the size of motor units identified within the same muscle would be to compare the amplitude of their action potentials, assuming that all motor units are relatively close to the electrodes (given the selectivity of the recordings) and that motoneurons innervating more muscle fibres generate larger motor unit action potentials.

      We thank the Reviewer for this comment. Below, we provide more detailed analyses of the relationships between motor unit spike amplitude and the recruitment probability as well as latency (relative to stride onset) of activation.

      We generated the figures below to illustrate the relationship between the amplitude of motor units and their firing properties. As suspected, units with larger-amplitude waveforms fired with lower probability and produced their first spikes later in the stride. If we were comfortable assuming that larger spike amplitudes mean higher-force units, then this would be consistent with a key prediction of the size principle (i.e. that higher-force units are recruited later). However, we are hesitant to base any conclusions on this assumption or emphasize this point with a main-text figure, since EMG signal amplitude may also vary due to the physical properties of the electrode and distance from muscle fibers. Thus it is possible that a large motor unit may have a smaller waveform amplitude relative to the rest of the motor pool.

      Author response image 1.

      Relation between motor unit amplitude and (A) recruitment probability and (B) mean first spike time within the stride. Colored lines indicate the outcome of linear regression analyses.

      Currently, the data seem to support the idea that motor units that are alternately recruited across strides have recruitment thresholds close to the level of activation or force produced during slow walking. The fact that recruitment probability monotonically increases with speed suggests that the force required to propel the mouse forward exceeds the recruitment threshold of these "large" motor units. This pattern would primarily reflect spatial recruitment following the size principle rather than flexible motor unit control.

      We thank the Reviewer for this comment. We agree with this interpretation, particularly in relation to the references suggested in later comments, and have added the following text to the Discussion to better reflect this argument:

      “To investigate the neuromuscular control of locomotor speed, we quantified speed-dependent changes in both motor unit recruitment and firing rate. We found that the majority of units were recruited more often and with larger firing rates at faster speeds (Figure 5, Figure 5–figure supplement 1). This result may reflect speed-dependent differences in the common input received by populations of motor neurons with varying spiking thresholds (Henneman et al., 1965). In the case of mouse locomotion, faster speeds might reflect a larger common input, increasing the recruitment probability as more neurons, particularly those that are larger and generate more force, exceed threshold for action potentials (Farina et al., 2014).”

      (4) Analysis of recruitment and firing rates

      The authors currently report active duration and peak firing rates based on spike trains convolved with a Gaussian kernel. Why not report the peak of the instantaneous firing rates estimated from the inverse of the inter-spike interval? This approach appears to be more aligned with previous studies conducted to describe motor unit behaviour during fast movements (e.g., Desmedt & Godaux, 1977, J Physiol; Van Cutsem et al., 1998, J Physiol; Del Vecchio et al., 2019, J Physiol).

      We thank the Reviewer for this comment. In the revised Discussion (see ‘Firing rates in mouse locomotion compared to other species’) we reference several examples of previous studies that quantified spike patterns based on the instantaneous firing rate. We chose to report the peak of the smoothed firing rate because that quantification includes strides with zero spikes or only one spike, which occur regularly in our dataset (and for which ISI rate measures, which require two spikes to define an instantaneous firing rate, cannot be computed). Regardless, in the revised Figure 4B, we present an analysis that uses inter-spike intervals as suggested, which yielded similar ranges of firing rates as the primary analysis.
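
      The difference between the two rate measures discussed here can be made concrete with a small sketch. The kernel width, the time step, and the function names are hypothetical choices for illustration; the response does not state the kernel parameters in this passage.

```python
import math


def smoothed_peak_rate(spike_times_s, sigma_s=0.005, dt_s=0.0005):
    """Peak of the spike train convolved with a unit-area Gaussian (spikes/s).

    Defined for any number of spikes: 0.0 for an empty stride, and the
    kernel height for a single spike.
    """
    if not spike_times_s:
        return 0.0
    start = min(spike_times_s) - 4 * sigma_s
    stop = max(spike_times_s) + 4 * sigma_s
    norm = 1.0 / (sigma_s * math.sqrt(2 * math.pi))
    n_steps = int(round((stop - start) / dt_s)) + 1
    peak = 0.0
    for k in range(n_steps):
        t = start + k * dt_s
        rate = sum(norm * math.exp(-0.5 * ((t - s) / sigma_s) ** 2)
                   for s in spike_times_s)
        peak = max(peak, rate)
    return peak


def isi_peak_rate(spike_times_s):
    """Peak instantaneous rate, 1 / shortest ISI; None with fewer than two spikes."""
    times = sorted(spike_times_s)
    isis = [b - a for a, b in zip(times, times[1:]) if b > a]
    if not isis:
        return None
    return 1.0 / min(isis)


# Hypothetical strides with zero, one, and three spikes (times in seconds).
for stride in ([], [0.100], [0.000, 0.010, 0.030]):
    print(len(stride), smoothed_peak_rate(stride), isi_peak_rate(stride))
```

      The smoothed measure returns a value for every stride, whereas the ISI-based measure is undefined for strides with fewer than two spikes, which is the trade-off described in the response.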

      (5) Additional analyses of behaviour

      The authors currently analyse motor unit recruitment in relation to elbow angle. It would be valuable to include a similar analysis using the angular velocity observed during each stride. More broadly, comparing stride-by-stride changes in firing rates with changes in elbow angular velocity would further strengthen the final analyses presented in the results section.

      We thank the Reviewer for this comment. To address this, we have modified Figure 6 and the associated Supplemental Figures to show relationships in unit activation with both the range of elbow extension and the range of elbow velocity for each stride. These new Supplemental Figures show that the trends shown in main text Figure 6C and 6E (which show data from all speed quartiles on the same axes) are also apparent in both the slower and faster quartiles individually, although single-quartile statistical tests (with smaller sample size than the main analysis) do not reach statistical significance in all cases.

      Reviewer #3 (Public review):

      Summary:

      Using the approach of Myomatrix recording, the authors report that:

      (1) Motor units are recruited differently in the two types of muscles.

      (2) Individual units are probabilistically recruited during the locomotion strides, whereas the population bulk EMG has a more reliable representation of the muscle.

      (3) The recruitment of units was proportional to walking speed.

      Strengths:

      The new technique provides a unique data set, and the data analysis is convincing and well-performed.

      We thank the Reviewer for the comment.

      Weaknesses:

      The implications of "probabilistical recruitment" should be explored, addressed, and analyzed further.

      Comments:

      One of the study's main findings (perhaps the main finding) is that the motor units are "probabilistically" recruited. The authors do not define what they mean by probabilistically recruited, nor do they present an alternative scenario to such recruitment or discuss why this would be interesting or surprising. However, on page 4, they do indicate that the recruitment of units from both muscles was only active in a subset of strides, i.e., they are not reliably active in every step.

      If probabilistic means irregular spiking, this is not new. Variability in spiking has been seen numerous times, for instance in human biceps brachii motor units during isometric contractions (Pascoe, Enoka, Exp physiology 2014) and elsewhere. Perhaps the distinction the authors are seeking is between fluctuation-driven and mean-driven spiking of motor units as previously identified in spinal motor networks (see Petersen and Berg, eLife 2016, and Berg, Frontiers 2017). Here, it was shown that a prominent regime of irregular spiking is present during rhythmic motor activity, which also manifests as a positive skewness in the spike count distribution (i.e., log-normal).

      We thank the Reviewer for this comment and have clarified several passages in response. The Reviewer is of course correct that irregular motor unit spiking has been described previously and may reflect motor neurons’ operating in a high-sensitivity (fluctuation-driven) regime. We now cite these papers in the Discussion (see ‘Firing rates in mouse locomotion compared to other species’). Additionally, the revision clarifies that “probabilistically” - as defined in our paper - refers only to the empirical observation that a motor unit spikes during only a subset of strides, either when all locomotor speeds are considered together (Figure 2) or separately (Figure 5A-C):

      “Motor units in both muscles exhibited this pattern of probabilistic recruitment (defined as a unit’s firing on only a fraction of strides), but with differing distributions of firing properties across the long and lateral heads (Figure 2).”

      “Our findings (Figure 4) highlight that even with the relatively high firing rates observed in mice, there are still significant changes in firing rate and recruitment probability across the spikes within bursts (Figure 4B) and across locomotor speeds (Figure 5F). Future studies should more carefully examine how these rapidly changing spiking patterns derive from both the statistics of synaptic inputs and intrinsic properties of motor neurons (Manuel & Heckman, 2011; Petersen & Berg, 2016; Berg, 2017).”

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      As mentioned above, there are several issues with the statistics that need to be corrected to properly support the claims made in the paper.

      The authors compare the fractions of MUs that show significant variation across locomotor speeds in their firing rate and recruitment probability. However, it is not statistically sound to compare the results of separate statistical tests that are based on different kinds of measurements and thus differ in statistical power in unconstrained ways. The comparison of the fractional changes in firing rates and recruitment across speeds that follows is helpful, though in truth, by contemporary standards, one would like to see error bars on these estimates. These could be generated using bootstrapping.

      The Reviewer is correct, and we have revised the manuscript to better clarify which quantities should or should not be compared, including the following passage (see “Motor unit mechanisms of speed control” in Results):

      “Speed-dependent increases in peak firing rate were therefore also present in our dataset, although in a smaller fraction of motor units (22/33) than changes in recruitment probability (31/33). Furthermore, the mean (± SE) magnitude of speed-dependent increases was smaller for spike rates (mean rate<sub>fast</sub>/rate<sub>slow</sub> of 111% ± 20% across all motor units) than for recruitment probabilities (mean p(recruitment) <sub>fast</sub>/p(recruitment) <sub>slow</sub> of 179% ± 3% across all motor units). While fractional changes in rate and recruitment probability are not readily comparable given their different upper limits, these findings could suggest that while both recruitment and peak rate change across speed quartiles, increased recruitment probability may play a larger role in driving changes in locomotor speed.”

      The description in the Methods of the tests for variation in firing rates and recruitment probability across speeds is extremely hard to understand - after reading many times, it is still not clear what was done, or why the method used was chosen. In the main text, the authors quote p-values and then state "bootstrap confidence intervals," which is not a statistical test that yields a p-value. While there are mathematical relationships between confidence intervals and statistical tests such that a one-to-one correspondence between them can exist, the descriptions provided fall short of specifying how they are related in the present instance. For this reason, and those described in what follows, it is not clear what the p-values represent.

      Next, the authors refer to fitting a model ("a Poisson distribution") to the data to estimate firing rate and recruitment probability, that the model results agree with their actual data, and that they then bootstrapped from the model estimates to get confidence intervals and compute p-values. Why do this? Why not just do something much simpler, like use the actual spike counts, and resample from those? I understand that it is hard to distinguish between no recruitment and just no spikes given some low Poisson firing rate, but how does that challenge the ability to test if the firing rates or the number of spiking MUs changes significantly across speeds? I can come up with some reasons why I think the authors might have decided to do this, but reasoning like this really should be made explicit.

      In addition, the authors should provide an unambiguous description of the model, perhaps using an equation and a description of how it was fit. For the bootstrapping, a clear description of how the resampling was done should be included. The focus on peak firing rate instead of mean (or median) firing rate should also be justified. Since peaks are noisier, I would expect the statistical power to be lower compared to using the mean or median.

      We thank the Reviewer for the comments and have revised and expanded our discussion of the statistical tests employed. We expanded and clarified our description of these techniques in the updated Methods section:

      “Joint model of rate and recruitment

      We modeled the recruitment probability and firing rate based on empirical data to best characterize firing statistics within the stride. Particularly, this allowed for multiple solutions to explain why a motor unit would not spike within a stride. From the empirical data alone, strides with zero spikes would have been assumed to have no recruitment of a unit. However, to create a model of motor unit activity that includes both recruitment and rate, it must be possible that a recruited unit can have a firing rate of zero. To quantify the firing statistics that best represent all spiking and non-spiking patterns, we modeled recruitment probability and peak firing rate along the following piecewise function:

      where y denotes the observed peak firing rate on a given stride (determined by convolving motor unit spike times with a Gaussian kernel as described above), p denotes the probability of recruitment, and λ denotes the expected peak firing rate from a Poisson distribution of outcomes. Thus, an inactive unit on a given stride may be the result of either non-recruitment or recruitment with a stochastically zero firing rate. The above equations were fit by minimizing the negative log-likelihood of the parameters given the data.
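
      One standard model matching this description is a zero-inflated Poisson, in which P(y = 0) = (1 - p) + p·exp(-λ) and P(y = k) = p·λ^k·exp(-λ)/k! for k ≥ 1. Whether this is exactly the form of the authors' Eq. 1 and 2 is an assumption. The sketch below fits that assumed model to integer-valued observations (a further assumption, since a Poisson likelihood needs counts) and uses EM updates, each of which lowers the negative log-likelihood, in place of a generic optimizer. All names and data are hypothetical.

```python
import math


def zip_nll(p, lam, counts):
    """Negative log-likelihood of a zero-inflated Poisson (assumed model)."""
    nll = 0.0
    for y in counts:
        if y == 0:
            # Zero arises from non-recruitment or from a recruited unit firing 0.
            nll -= math.log((1 - p) + p * math.exp(-lam))
        else:
            nll -= math.log(p) - lam + y * math.log(lam) - math.lgamma(y + 1)
    return nll


def fit_zip(counts, n_iter=200):
    """Estimate (p, lam) by EM; requires at least one non-zero observation."""
    n = len(counts)
    total = sum(counts)
    n_zero = sum(1 for y in counts if y == 0)
    if total == 0:
        raise ValueError("no spikes observed: p and lam are not identifiable")
    p = (n - n_zero) / n          # start from the naive recruitment fraction
    lam = total / (n - n_zero)    # and the mean of the non-zero strides
    for _ in range(n_iter):
        # E-step: probability that a zero stride came from a recruited unit.
        p_zero_recruited = p * math.exp(-lam)
        w_zero = p_zero_recruited / ((1 - p) + p_zero_recruited)
        expected_recruited = (n - n_zero) + n_zero * w_zero
        # M-step: update recruitment probability and Poisson rate.
        p = expected_recruited / n
        lam = total / expected_recruited
    return p, lam


# Hypothetical per-stride values for one unit in one speed quartile.
counts = [0] * 50 + [1] * 10 + [2] * 20 + [3] * 15 + [4] * 5
p_hat, lam_hat = fit_zip(counts)
print(p_hat, lam_hat, zip_nll(p_hat, lam_hat, counts))
```

      In this example the fitted p exceeds the naive fraction of strides with spikes (0.5), because some zero strides are attributed to recruited units that happened not to fire.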

      “Permutation test for joint model of rate and recruitment and type 2 regression slopes

      To quantify differences in firing patterns across walking speeds, we subdivided each mouse’s total set of strides into speed quartiles and calculated rate (𝜆, Eq. 1 and 2, Fig. 5A-C) and recruitment probability terms (p, Eq. 1 and 2, Fig. 5D-F) for each unit in each speed quartile. Here we calculated the difference in both the rate and recruitment terms across the fastest and slowest speed quartiles (p<sub>fast</sub>-p<sub>slow</sub> and 𝜆<sub>fast</sub>-𝜆<sub>slow</sub>). To test whether these model parameters were significantly different depending on locomotor speed, we developed a null model combining strides from both the fastest and slowest speed quartiles. After pooling strides from both quartiles, we randomly distributed the pooled set of strides into two groups with sample sizes equal to the original slow and fast quartiles. We then calculated the null model parameters for each new group and found the difference between like terms. To estimate the distribution of possible differences, we bootstrapped this result using 1000 random redistributions of the pooled set of strides. Following the permutation test, the 95% confidence interval of this final distribution reflects the null hypothesis of no difference between groups. Thus, the null hypothesis can be rejected if the true difference in rate or recruitment terms exceeds this confidence interval.

      We followed a similar procedure to quantify cross-muscle differences in the relationship between firing parameters. For each muscle, we estimated the slope across firing parameters for each motor unit using type 2 regression. In this case, the true difference was the difference in slopes between muscles. To test the null hypothesis that there was no difference in slopes, the null model reflected the pooled set of units from both muscles. Again, slopes were calculated for 1000 random resamplings of this pooled data to estimate the 95% confidence interval.”
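
      The pooling-and-reshuffling procedure described in the quoted Methods can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: `permutation_test` is a hypothetical name, the sample mean stands in for the fitted rate or recruitment term (or the type 2 regression slope), and the toy values are invented.

```python
import random
from statistics import mean


def permutation_test(slow, fast, statistic=mean, n_perm=1000, alpha=0.05, seed=0):
    """Compare statistic(fast) - statistic(slow) against a shuffled null.

    Returns (observed difference, (lower, upper) null interval, reject flag).
    """
    rng = random.Random(seed)
    observed = statistic(fast) - statistic(slow)

    # Null model: pool both groups, then redistribute into groups whose
    # sizes match the original slow and fast samples.
    pooled = list(slow) + list(fast)
    n_slow = len(slow)
    null = []
    for _ in range(n_perm):
        rng.shuffle(pooled)
        null.append(statistic(pooled[n_slow:]) - statistic(pooled[:n_slow]))

    # Central (1 - alpha) interval of the null differences.
    null.sort()
    lower = null[round((alpha / 2) * n_perm)]
    upper = null[round((1 - alpha / 2) * n_perm) - 1]
    reject = not (lower <= observed <= upper)
    return observed, (lower, upper), reject


# Toy per-stride values for the slowest and fastest speed quartiles.
slow = [0, 1, 0, 2, 1, 0, 1, 1, 0, 2] * 5
fast = [5, 6, 7, 5, 6, 8, 7, 6, 5, 7] * 5
observed, interval, reject = permutation_test(slow, fast)
print(observed, interval, reject)
```

      Passing a different `statistic` callable (for example, one that refits the model on each reshuffled group) would bring the sketch closer to the procedure the authors describe.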

      The argument for delayed activation of the lateral head is interesting, but I am not comfortable saying the nervous system creates a delay just based on observations of the mean time of the first spike, given the potential for differential variability in spike timing across muscles and MUs. One way to make a strong case for a delay would be to show aggregate PSTHs for all the spikes from all the MUs for each of the two heads. That would distinguish between a true delay and more gradual or variable activation between the heads.

      This is a good point, and we agree that the claim made about the nervous system is too strong given the results. Even with the aggregate PSTH that the Reviewer suggested (Author response image 2, below), there is still not enough evidence to isolate the role of the nervous system in the muscles’ activation.

      Author response image 2.

      Aggregate peristimulus time histogram (PSTH) for all motor unit spike times in the long head (top) and lateral head (bottom) within the stride.
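
      An aggregate PSTH of this kind pools the spikes of every unit in a muscle before binning. A minimal sketch, assuming spike times have already been aligned to stride onset and normalized to a 0 to 1 stride phase; the function name, bin count, and example values are illustrative assumptions rather than the authors' settings.

```python
def aggregate_psth(units, n_bins=20, t_min=0.0, t_max=1.0):
    """Pool stride-aligned spike times from all units into one histogram.

    units : list of lists, one list of spike times per motor unit
    Returns (bin_edges, counts).
    """
    width = (t_max - t_min) / n_bins
    counts = [0] * n_bins
    for spikes in units:
        for t in spikes:
            if t_min <= t < t_max:
                # Clamp the index to guard against floating-point rounding at the top edge.
                idx = min(int((t - t_min) / width), n_bins - 1)
                counts[idx] += 1
    edges = [t_min + i * width for i in range(n_bins + 1)]
    return edges, counts


# Two hypothetical units from the same muscle head.
units = [[0.15, 0.12, 0.55], [0.11, 0.95]]
edges, counts = aggregate_psth(units, n_bins=10)
print(counts)
```

      Computing one such histogram per muscle head allows the onset and spread of activity to be compared directly, rather than relying on the mean time of the first spike.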

      In the ideal case, we would have more simultaneous recordings from both muscles to make a more direct claim on the delay. Still, within the current scope of the paper, to correct this and better describe the difference in timing of muscle activity, we edited the text to the following:

      “These findings demonstrate that despite the synergistic (extensor) function of the long and lateral heads of the triceps at the elbow, the motor pool for the long head becomes active roughly 100 ms before the motor pool supplying the lateral head during locomotion (Figure 3C).”

      The results from Marshall et al. 2022 suggest that the recruitment of some MUs is not just related to muscle force, but also the frequency of force variation - some of their MUs appear to be recruited only at certain frequencies. Figure 5C could have shown signs of this, but it does not appear to. We do not really know the force or its frequency of variation in the measurements here. I wonder whether there is additional analysis that could address whether frequency-dependent recruitment is present. It may not be addressable with the current data set, but this could be a fruitful direction to explore in the future with MU recordings from mice.

      We agree that this would be a fruitful direction to explore; however, the Reviewer is correct that this is not easily addressable with the current dataset. As the Reviewer points out, stride frequency increases with increased speed, potentially offering the opportunity to examine how motor unit activity varies with the frequency, phase, and amplitude of locomotor movements. However, given our lack of force data (either joint torques or ground reaction forces), we cannot dissociate the frequency/phase/amplitude of skeletal kinematics from the frequency/phase/amplitude of muscle force. Marshall et al. (2022) mitigated these issues by using an isometric force-production task. Therefore, while we agree that it would be a major contribution to extend such investigations to whole-body movements like locomotion, given the complexities described above we believe this is a project for the future, and beyond the scope of the present study.

      Minor:

      Page 5: "Units often displayed no recruitment in a greater proportion of strides than for any particular spike count when recruited (Figures 2A, B)," - I had to read this several times to understand it. I suggest rephrasing for clarity.

      We have changed the text to read:

      “Units demonstrated a variety of firing patterns, with some units producing 0 spikes more frequently than any non-zero spike count (Figure 2A, B),...”

      Figure 3 legend: "Mean phase (± SE) of motor unit burst duration across all strides.": It is unclear what this means - durations are not usually described as having a phase. Do we mean the onset phase?

      We have changed the text to read:

      “Mean phase ± SE of motor unit burst activity within each stride”

      Page 9: "suggesting that the recruitment of individual motor units in the lateral and long heads might have significant (and opposite) effects on elbow angle in strides of similar speed (see Discussion)." I wouldn't say "opposite" here - that makes it sound like the authors are calling the long head a flexor. The authors should rephrase or clarify the sense in which they are opposite.

      This is a fair point and we agree we should not describe the muscles as ‘opposite’ when both muscles are extensors. We have removed the phrase ‘and opposite’ from the text.

      Page 11: "in these two muscles across in other quadrupedal species" - typo.

      We have corrected this error.

      Page 16: This reviewer cannot decipher after repeated attempts what the first two sentences of the last paragraph mean. - “Future studies might also use perturbations of muscle activity to dissociate the causal properties of each motor unit’s activity from the complex correlation structure of locomotion. Despite the strong correlations observed between motor unit recruitment and limb kinematics (Fig. 6, Supplemental Fig. 3), these results might reflect covariations of both factors with locomotor speed rather than the causal properties of the recorded motor unit.”

      For better clarity, we have changed the text to read:

      “Although strong correlations were observed between motor unit recruitment and limb kinematics during locomotion (Figure 6, Figure 6–figure supplement 1), it remains unclear whether such correlations actually reflect the causal contributions that those units make to limb movement. To resolve this ambiguity, future studies could use electrical or optical perturbations of muscle contraction levels (Kim et al., 2024; Lu et al., 2024; Srivastava et al., 2015, 2017) to test directly how motor unit firing patterns shape locomotor movements. The short-latency effects of patterned motor unit stimulation (Srivastava et al., 2017) could then reveal the sensitivity of behavior to changes in muscle spiking and the extent to which the same behaviors can be performed with many different motor commands.”

      Reviewer #2 (Recommendations for the authors):

      Minor comments:

      Introduction:

      (1) "Although studies in primates, cats, and zebrafish have shown that both the number of active motor units and motor unit firing rates increase at faster locomotor speeds (Grimby, 1984; Hoffer et al., 1981, 1987; Marshall et al., 2022; Menelaou & McLean, 2012)." I would remove Marshall et al. (2022) as their monkeys performed pulling tasks with the upper limb. You can alternatively remove locomotor from the sentence and replace it with contraction speed.

      Thank you for the comment. While we intended to reference this specific paper to highlight the rhythmic activity in muscles, we agree that this deviates from ‘locomotion’ as it is referenced in the other cited papers which study body movement. We have followed the Reviewer’s suggestion to remove the citation to Marshall et al.

      (2) "The capability and need for faster force generation during dynamic behavior could implicate motor unit recruitment as a primary mechanism for modulating force output in mice."

      The authors could add citations to this sentence, of works that showed that recruitment speed is the main determinant of the rate of force development (see for example Dideriksen et al. (2020) J Neurophysiol; J. L. Dideriksen, A. Del Vecchio, D. Farina, Neural and muscular determinants of maximal rate of force development. J Neurophysiol 123, 149-157 (2020)).

      Thank you for pointing out this important reference. We have included this as a citation as recommended.

      Results:

      (3) "Electrode arrays (32-electrode Myomatrix array model RF-4x8-BHS-5) were implanted in the triceps brachii (note that Figure 1D shows the EMG signal from only one of the 16 bipolar recording channels), and the resulting data were used to identify the spike times of individual motor units (Figure 1E) as described previously (Chung et al., 2023)."

      This sentence can be misleading for the reader as the array used by the researchers has 4 threads of 8 electrodes. Would it be possible to specify the number of electrodes implanted per head of interest? I assume 8 per head in most mice (or 4 bipolar channels), even if that's not specifically written in the manuscript.

      Thank you for the suggestion. As described above, we have added Table 1, which includes all array locations, and we edited the statement referenced in the comment as follows:

      “Electrode arrays (32-electrode Myomatrix array model RF-4x8-BHS-5) were implanted in forelimb muscles (note that Figure 1D shows the EMG signal from only one of the 16 bipolar recording channels), and the resulting data were used to identify the spike times of individual motor units in the triceps brachii long and lateral heads (Table 1, Figure 1E) as described previously (Chung et al., 2023).”

      (4) "These findings demonstrate that despite the overlapping biomechanical functions of the long and lateral heads of the triceps, the nervous system creates a consistent, approximately 100 ms delay (Figure 3C) between the activation of the two muscles' motor neuron pools. This timing difference suggests distinct patterns of synaptic input onto motor neurons innervating the lateral and long heads."

      Both muscles don't have fully overlapping biomechanical functions, as one of them also acts on the shoulder joint. Please be more specific in this sentence, saying that both muscles are synergistic at the elbow level rather than "have overlapping biomechanical functions".

      We agree with the above reasoning and that our manuscript should be clearer on this point. We edited the above text in accordance with the Reviewer suggestion as follows:

      "These findings demonstrate that despite the synergistic (extensor) function of the long and lateral heads of the triceps at the elbow, …”  

      (5) "Together with the differences in burst timing shown in Figure 3B, these results again suggest that the motor pools for the lateral and long heads of the triceps receive distinct patterns of synaptic input, although differences in the intrinsic physiological properties of motor neurons innervating the two muscles might also play an important role."

      It is difficult to draw such an affirmative conclusion on the synaptic inputs from the data presented by the authors. The differences in firing rates may solely arise from other factors than distinct synaptic inputs, such as the different intrinsic properties of the motoneurons or the reception of distinct neuromodulatory inputs.

      To better explain our findings, we adjusted the above text in the Results (see “Motor unit firing patterns in the long and lateral heads of the triceps”):

      “Together with the differences in burst timing shown in Figure 3B, these results again suggest that the motor pools for the lateral and long heads of the triceps receive distinct patterns of synaptic input, although differences in the intrinsic physiological properties of motor neurons innervating the two muscles might also play an important role.”

      We also included the following distinction in the Discussion (see “Differences in motor unit activity patterns across two elbow extensors”) to address the other plausible mechanisms mentioned.

      “The large differences in burst timing and spike patterning across the muscle heads suggest that the motor pools for each muscle receive distinct inputs. However, differences in the intrinsic physiological properties of motor units and neuromodulatory inputs across motor pools might also make substantial contributions to the structure of motor unit spike patterns (Martínez-Silva et al., 2018; Miles & Sillar, 2011).”

      (6) "We next examined whether the probabilistic recruitment of individual motor units in the triceps and elbow extensor muscle predicted stride-by-stride variations in elbow angle kinematics."

      I'm not sure that the wording is appropriate here. The analysis does not predict elbow angle variations from parameters extracted from the spiking activity. It rather compares the average elbow angle between two conditions (motor unit active or not active).

      We thank the Reviewer for this comment and agree that the wording could be improved here to better reflect our analysis. To lower the strength of our claim, we replaced usage of the word ‘predict’ with ‘correlates’ in the above text and throughout the paper when discussing this result.

      Methods:

      (7) "Using the four threads on the customizable Myomatrix array (RF-4x8-BHS-5), we implanted a combination of muscles in each mouse, sometimes using multiple threads within the same muscle. [...] Some mice also had threads simultaneously implanted in their ipsilateral or contralateral biceps brachii although no data from the biceps is presented in this study."

      A precise description of the localisation of the array (muscles and the number of arrays per muscle) for each animal would be appreciated.

      (8) "A total of 33 units were identified and manually verified across all animals." A precise description of the number of motor units concurrently identified per muscle and per animal would be appreciated. Moreover, please add details on the manual inspection. Does it involve the manual selection of missing spikes? What are the criteria for considering an identified motor unit as valid?

      As discussed earlier, we added Table 1 to the main text to provide the details mentioned in the above comments.

      Regarding spike sorting, given the very large number of spikes recorded, we did not rely on manually adjusting mislabeled spikes. Instead, as described in the revised Methods section, we verified unit isolation by ensuring units had >98% of spikes outside of 1 ms of each other. Moreover, as described above we have added new analyses (Figure 1–figure supplement 1) confirming the stability of motor unit waveforms across both the duration of individual recording sessions (roughly 30 minutes) and across the rapid changes in limb position within individual stride cycles (roughly 250 ms).
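
      The isolation criterion can be illustrated with a short check on inter-spike intervals. This is a sketch under assumptions: the response does not state whether the 98% figure is computed per spike or per interval, so the version below counts intervals, and the function name and example spike times are hypothetical.

```python
def fraction_isolated(spike_times_s, min_interval_s=0.001):
    """Fraction of inter-spike intervals that are at least min_interval_s long."""
    times = sorted(spike_times_s)
    if len(times) < 2:
        return 1.0  # no intervals to violate the criterion
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    n_ok = sum(1 for d in intervals if d >= min_interval_s)
    return n_ok / len(intervals)


# Hypothetical spike times in seconds; one pair is only 0.5 ms apart.
spike_times = [0.0, 0.010, 0.0105, 0.030, 0.050]
score = fraction_isolated(spike_times)
print(score, score > 0.98)
```

      A unit would pass under this reading if the returned fraction exceeds 0.98.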

      Reviewer #3 (Recommendations for the authors):

      Figure 2 (and supplement) show spike count distributions with strong positive skewness, which is in accordance with the prediction of a fluctuation-driven regime. I suggest plotting these on a logarithmic x-axis (in addition to the linear axis), which should reveal a bell-shaped distribution, maybe even Gaussian, in a majority of the units.

      We thank the Reviewer for the suggestion. We present the requested analysis below, which shows bell-shaped distributions for some (but not all) units. However, we believe that investigating why some replotted distributions are Gaussian and others are not falls beyond the scope of this paper, and likely requires a larger dataset than the one we were able to obtain.

      Author response image 3.

      Spike count distributions for each motor unit on a logarithmic x-axis.

      Why not more data? I tried to get an overview of how much data was collected.

      Supplemental Figure 1 has all the isolated units, which amounts to 38 (are the colors the two muscle types?). Given there are 16 leads in each myomatrix, in two muscles, of six mice, this seems like a low yield. Could the authors comment on the reasons for this low yield?

      Regarding motor unit yield, even with multiple electrodes per muscle and a robust sorting algorithm, we often isolated only a few units per muscle. This yield likely reflects two factors. First, because of the highly dynamic nature of locomotion and high levels of muscle contraction, isolating individual spikes reliably across different locomotor speeds is inherently challenging, regardless of the algorithm being employed. Second, because the results of spike-train analyses can be highly sensitive to sorting errors, we have only included the motor units that we can sort with the highest possible confidence across thousands of strides.

      Minor:

      Figure captions especially Figure 6: The text is excessively long. Can the text be shortened?

      We thank the Reviewer for this comment. Generally, we seek to include a description of the methods and results within the figure captions, but we concede that we can condense the information in some cases. In a number of cases, we have moved some of the descriptive text from the caption to the Methods section.

      References

      Berg, R. W. (2017). Neuronal Population Activity in Spinal Motor Circuits: Greater Than the Sum of Its Parts. Frontiers in Neural Circuits, 11. https://doi.org/10.3389/fncir.2017.00103

      Biewener, A. A., Blickhan, R., Perry, A. K., Heglund, N. C., & Taylor, C. R. (1988). Muscle Forces During Locomotion in Kangaroo Rats: Force Platform and Tendon Buckle Measurements Compared. Journal of Experimental Biology, 137(1), 191–205. https://doi.org/10.1242/jeb.137.1.191

      Chung, B., Zia, M., Thomas, K. A., Michaels, J. A., Jacob, A., Pack, A., Williams, M. J., Nagapudi, K., Teng, L. H., Arrambide, E., Ouellette, L., Oey, N., Gibbs, R., Anschutz, P., Lu, J., Wu, Y., Kashefi, M., Oya, T., Kersten, R., … Sober, S. J. (2023). Myomatrix arrays for high-definition muscle recording. eLife, 12, RP88551. https://doi.org/10.7554/eLife.88551

      De Luca, C. J. (1985). Control properties of motor units. Journal of Experimental Biology, 115(1), 125–136. https://doi.org/10.1242/jeb.115.1.125

      De Luca, C. J., & Erim, Z. (1994). Common drive of motor units in regulation of muscle force. Trends in Neurosciences, 17(7), 299–305. https://doi.org/10.1016/0166-2236(94)90064-7

      Farina, D., Negro, F., & Dideriksen, J. L. (2014). The effective neural drive to muscles is the common synaptic input to motor neurons. The Journal of Physiology, 592(16), 3427–3441. https://doi.org/10.1113/jphysiol.2014.273581

      Hartigan, P. M. (1985). Algorithm AS 217: Computation of the Dip Statistic to Test for Unimodality. Applied Statistics, 34(3), 320. https://doi.org/10.2307/2347485

      Henneman, E., Somjen, G., & Carpenter, D. O. (1965). FUNCTIONAL SIGNIFICANCE OF CELL SIZE IN SPINAL MOTONEURONS. Journal of Neurophysiology, 28(3), 560–580. https://doi.org/10.1152/jn.1965.28.3.560

      Karabulut, D., Dogru, S. C., Lin, Y.-C., Pandy, M. G., Herzog, W., & Arslan, Y. Z. (2020). Direct Validation of Model-Predicted Muscle Forces in the Cat Hindlimb During Locomotion. Journal of Biomechanical Engineering, 142(5), 051014. https://doi.org/10.1115/1.4045660

      Kim, J. J., Wyche, I. S., Olson, W., Lu, J., Bakir, M. S., Sober, S. J., & O’Connor, D. H. (2024). Myo-optogenetics: Optogenetic stimulation and electrical recording in skeletal muscles. https://doi.org/10.1101/2024.06.21.600113

      Lu, J., Zia, M., Baig, D. A., Yan, G., Kim, J. J., Nagapudi, K., Anschutz, P., Oh, S., O’Connor, D., Sober, S. J., & Bakir, M. S. (2024). Opto-Myomatrix: μLED integrated microelectrode arrays for optogenetic activation and electrical recording in muscle tissue. https://doi.org/10.1101/2024.07.01.601601

      Manuel, M., & Heckman, C. J. (2011). Adult mouse motor units develop almost all of their force in the subprimary range: A new all-or-none strategy for force recruitment? Journal of Neuroscience, 31(42), 15188–15194. https://doi.org/10.1523/JNEUROSCI.2893-11.2011

      Marshall, N. J., Glaser, J. I., Trautmann, E. M., Amematsro, E. A., Perkins, S. M., Shadlen, M. N., Abbott, L. F., Cunningham, J. P., & Churchland, M. M. (2022). Flexible neural control of motor units. Nature Neuroscience, 25(11), 1492–1504. https://doi.org/10.1038/s41593-022-01165-8

      Martínez-Silva, M. de L., Imhoff-Manuel, R. D., Sharma, A., Heckman, C. J., Shneider, N. A., Roselli, F., Zytnicki, D., & Manuel, M. (2018). Hypoexcitability precedes denervation in the large fast-contracting motor units in two unrelated mouse models of ALS. eLife, 7(2007), 1–26. https://doi.org/10.7554/eLife.30955

      Miles, G. B., & Sillar, K. T. (2011). Neuromodulation of Vertebrate Locomotor Control Networks. Physiology, 26(6), 393–411. https://doi.org/10.1152/physiol.00013.2011

      Petersen, P. C., & Berg, R. W. (2016). Lognormal firing rate distribution reveals prominent fluctuation–driven regime in spinal motor networks. eLife, 5. https://doi.org/10.7554/elife.18805

      Srivastava, K. H., Elemans, C. P. H., & Sober, S. J. (2015). Multifunctional and Context-Dependent Control of Vocal Acoustics by Individual Muscles. The Journal of Neuroscience, 35(42), 14183–14194. https://doi.org/10.1523/JNEUROSCI.3610-14.2015

      Srivastava, K. H., Holmes, C. M., Vellema, M., Pack, A. R., Elemans, C. P. H., Nemenman, I., & Sober, S. J. (2017). Motor control by precisely timed spike patterns. Proceedings of the National Academy of Sciences of the United States of America, 114(5), 1171–1176. https://doi.org/10.1073/pnas.1611734114

    1. Act I, Scene 1 Verona. A public place.       next scene [Enter SAMPSON and GREGORY, of the house of Capulet, armed with swords and bucklers] Sampson. Gregory, o' my word, we'll not carry coals. Gregory. No, for then we should be colliers. Sampson. I mean, an we be in choler, we'll draw. Gregory. Ay, while you live, draw your neck out o' the collar. 20 Sampson. I strike quickly, being moved. Gregory. But thou art not quickly moved to strike. Sampson. A dog of the house of Montague moves me. Gregory. To move is to stir; and to be valiant is to stand: therefore, if thou art moved, thou runn'st away. 25 Sampson. A dog of that house shall move me to stand: I will take the wall of any man or maid of Montague's. Gregory. That shows thee a weak slave; for the weakest goes to the wall. Sampson. True; and therefore women, being the weaker vessels, 30are ever thrust to the wall: therefore I will push Montague's men from the wall, and thrust his maids to the wall. Gregory. The quarrel is between our masters and us their men. Sampson. 'Tis all one, I will show myself a tyrant: when I 35have fought with the men, I will be cruel with the maids, and cut off their heads. Gregory. The heads of the maids? Sampson. Ay, the heads of the maids, or their maidenheads; take it in what sense thou wilt. 40 Gregory. They must take it in sense that feel it. Sampson. Me they shall feel while I am able to stand: and 'tis known I am a pretty piece of flesh. Gregory. 'Tis well thou art not fish; if thou hadst, thou hadst been poor John. Draw thy tool! here comes 45two of the house of the Montagues. Sampson. My naked weapon is out: quarrel, I will back thee. Gregory. How! turn thy back and run? Sampson. Fear me not. Gregory. No, marry; I fear thee! 50 Sampson. Let us take the law of our sides; let them begin. Gregory. I will frown as I pass by, and let them take it as they list. Sampson. Nay, as they dare. I will bite my thumb at them; which is a disgrace to them, if they bear it. 
55 [Enter ABRAHAM and BALTHASAR] Abraham. Do you bite your thumb at us, sir? Sampson. I do bite my thumb, sir. Abraham. Do you bite your thumb at us, sir? Sampson. [Aside to GREGORY] Is the law of our side, if I say 60ay? Gregory. No. Sampson. No, sir, I do not bite my thumb at you, sir, but I bite my thumb, sir. Gregory. Do you quarrel, sir? 65 Abraham. Quarrel sir! no, sir. Sampson. If you do, sir, I am for you: I serve as good a man as you. Abraham. No better. Sampson. Well, sir. Gregory. Say 'better:' here comes one of my master's kinsmen. 70 Sampson. Yes, better, sir. Abraham. You lie. Sampson. Draw, if you be men. Gregory, remember thy swashing blow. [They fight] [Enter BENVOLIO] Benvolio. Part, fools! Put up your swords; you know not what you do. [Beats down their swords] [Enter TYBALT] Tybalt. What, art thou drawn among these heartless hinds? 80Turn thee, Benvolio, look upon thy death. Benvolio. I do but keep the peace: put up thy sword, Or manage it to part these men with me. Tybalt. What, drawn, and talk of peace! I hate the word, As I hate hell, all Montagues, and thee: 85Have at thee, coward! [They fight] [Enter, several of both houses, who join the fray; then enter Citizens, with clubs] First Citizen. Clubs, bills, and partisans! strike! beat them down! 90Down with the Capulets! down with the Montagues! [Enter CAPULET in his gown, and LADY CAPULET] Capulet. What noise is this? Give me my long sword, ho! Lady Capulet. A crutch, a crutch! why call you for a sword? Capulet. My sword, I say! Old Montague is come, 95And flourishes his blade in spite of me. [Enter MONTAGUE and LADY MONTAGUE] Montague. Thou villain Capulet,—Hold me not, let me go. Lady Montague. Thou shalt not stir a foot to seek a foe. [Enter PRINCE, with Attendants] Prince Escalus. Rebellious subjects, enemies to peace, Profaners of this neighbour-stained steel,— Will they not hear? What, ho! 
you men, you beasts, That quench the fire of your pernicious rage With purple fountains issuing from your veins, 105On pain of torture, from those bloody hands Throw your mistemper'd weapons to the ground, And hear the sentence of your moved prince. Three civil brawls, bred of an airy word, By thee, old Capulet, and Montague, 110Have thrice disturb'd the quiet of our streets, And made Verona's ancient citizens Cast by their grave beseeming ornaments, To wield old partisans, in hands as old, Canker'd with peace, to part your canker'd hate: 115If ever you disturb our streets again, Your lives shall pay the forfeit of the peace. For this time, all the rest depart away: You Capulet; shall go along with me: And, Montague, come you this afternoon, 120To know our further pleasure in this case, To old Free-town, our common judgment-place. Once more, on pain of death, all men depart. [Exeunt all but MONTAGUE, LADY MONTAGUE, and BENVOLIO] Montague. Who set this ancient quarrel new abroach? 125Speak, nephew, were you by when it began? Benvolio. Here were the servants of your adversary, And yours, close fighting ere I did approach: I drew to part them: in the instant came The fiery Tybalt, with his sword prepared, 130Which, as he breathed defiance to my ears, He swung about his head and cut the winds, Who nothing hurt withal hiss'd him in scorn: While we were interchanging thrusts and blows, Came more and more and fought on part and part, 135Till the prince came, who parted either part. Lady Montague. O, where is Romeo? saw you him to-day? Right glad I am he was not at this fray. Benvolio. 
Madam, an hour before the worshipp'd sun Peer'd forth the golden window of the east, 140A troubled mind drave me to walk abroad; Where, underneath the grove of sycamore That westward rooteth from the city's side, So early walking did I see your son: Towards him I made, but he was ware of me 145And stole into the covert of the wood: I, measuring his affections by my own, That most are busied when they're most alone, Pursued my humour not pursuing his, And gladly shunn'd who gladly fled from me. 150 Montague. Many a morning hath he there been seen, With tears augmenting the fresh morning dew. Adding to clouds more clouds with his deep sighs; But all so soon as the all-cheering sun Should in the furthest east begin to draw 155The shady curtains from Aurora's bed, Away from the light steals home my heavy son, And private in his chamber pens himself, Shuts up his windows, locks far daylight out And makes himself an artificial night: 160Black and portentous must this humour prove, Unless good counsel may the cause remove. Benvolio. My noble uncle, do you know the cause? Montague. I neither know it nor can learn of him. Benvolio. Have you importuned him by any means? 165 Montague. Both by myself and many other friends: But he, his own affections' counsellor, Is to himself—I will not say how true— But to himself so secret and so close, So far from sounding and discovery, 170As is the bud bit with an envious worm, Ere he can spread his sweet leaves to the air, Or dedicate his beauty to the sun. Could we but learn from whence his sorrows grow. We would as willingly give cure as know. 175 [Enter ROMEO] Benvolio. See, where he comes: so please you, step aside; I'll know his grievance, or be much denied. Montague. I would thou wert so happy by thy stay, To hear true shrift. Come, madam, let's away. 180 [Exeunt MONTAGUE and LADY MONTAGUE] Benvolio. Good-morrow, cousin. Romeo. Is the day so young? Benvolio. But new struck nine. Romeo. Ay me! sad hours seem long. 
185Was that my father that went hence so fast? Benvolio. It was. What sadness lengthens Romeo's hours? Romeo. Not having that, which, having, makes them short. Benvolio. In love? Romeo. Out— 190 Benvolio. Of love? Romeo. Out of her favour, where I am in love. Benvolio. Alas, that love, so gentle in his view, Should be so tyrannous and rough in proof! Romeo. Alas, that love, whose view is muffled still, 195Should, without eyes, see pathways to his will! Where shall we dine? O me! What fray was here? Yet tell me not, for I have heard it all. Here's much to do with hate, but more with love. Why, then, O brawling love! O loving hate! 200O any thing, of nothing first create! O heavy lightness! serious vanity! Mis-shapen chaos of well-seeming forms! Feather of lead, bright smoke, cold fire, sick health! 205Still-waking sleep, that is not what it is! This love feel I, that feel no love in this. Dost thou not laugh? Benvolio. No, coz, I rather weep. Romeo. Good heart, at what? 210 Benvolio. At thy good heart's oppression. Romeo. Why, such is love's transgression. Griefs of mine own lie heavy in my breast, Which thou wilt propagate, to have it prest With more of thine: this love that thou hast shown 215Doth add more grief to too much of mine own. Love is a smoke raised with the fume of sighs; Being purged, a fire sparkling in lovers' eyes; Being vex'd a sea nourish'd with lovers' tears: What is it else? a madness most discreet, 220A choking gall and a preserving sweet. Farewell, my coz. Benvolio. Soft! I will go along; An if you leave me so, you do me wrong. Romeo. Tut, I have lost myself; I am not here; 225This is not Romeo, he's some other where. Benvolio. Tell me in sadness, who is that you love. Romeo. What, shall I groan and tell thee? Benvolio. Groan! why, no. But sadly tell me who. 230 Romeo. Bid a sick man in sadness make his will: Ah, word ill urged to one that is so ill! In sadness, cousin, I do love a woman. Benvolio. I aim'd so near, when I supposed you loved. 
Romeo. A right good mark-man! And she's fair I love. Benvolio. A right fair mark, fair coz, is soonest hit. Romeo. Well, in that hit you miss: she'll not be hit With Cupid's arrow; she hath Dian's wit; And, in strong proof of chastity well arm'd, From love's weak childish bow she lives unharm'd. She will not stay the siege of loving terms, Nor bide the encounter of assailing eyes, Nor ope her lap to saint-seducing gold: O, she is rich in beauty, only poor, That when she dies with beauty dies her store. Benvolio. Then she hath sworn that she will still live chaste? Romeo. She hath, and in that sparing makes huge waste, For beauty starved with her severity Cuts beauty off from all posterity. She is too fair, too wise, wisely too fair, To merit bliss by making me despair: She hath forsworn to love, and in that vow Do I live dead that live to tell it now. Benvolio. Be ruled by me, forget to think of her. Romeo. O, teach me how I should forget to think. Benvolio. By giving liberty unto thine eyes; Examine other beauties. Romeo. 'Tis the way To call hers exquisite, in question more: These happy masks that kiss fair ladies' brows Being black put us in mind they hide the fair; He that is strucken blind cannot forget The precious treasure of his eyesight lost: Show me a mistress that is passing fair, What doth her beauty serve, but as a note Where I may read who pass'd that passing fair? Farewell: thou canst not teach me to forget. Benvolio. I'll pay that doctrine, or else die in debt. [Exeunt] Act I, Scene 2. A street. [Enter CAPULET, PARIS, and Servant] Capulet. But Montague is bound as well as I, In penalty alike; and 'tis not hard, I think, For men so old as we to keep the peace. Paris. Of honourable reckoning are you both; And pity 'tis you lived at odds so long. But now, my lord, what say you to my suit? Capulet.
But saying o'er what I have said before: My child is yet a stranger in the world; She hath not seen the change of fourteen years, Let two more summers wither in their pride, Ere we may think her ripe to be a bride. Paris. Younger than she are happy mothers made. Capulet. And too soon marr'd are those so early made. The earth hath swallow'd all my hopes but she, She is the hopeful lady of my earth: But woo her, gentle Paris, get her heart, My will to her consent is but a part; An she agree, within her scope of choice Lies my consent and fair according voice. This night I hold an old accustom'd feast, Whereto I have invited many a guest, Such as I love; and you, among the store, One more, most welcome, makes my number more. At my poor house look to behold this night Earth-treading stars that make dark heaven light: Such comfort as do lusty young men feel When well-apparell'd April on the heel Of limping winter treads, even such delight Among fresh female buds shall you this night Inherit at my house; hear all, all see, And like her most whose merit most shall be: Which on more view, of many mine being one May stand in number, though in reckoning none, Come, go with me. [To Servant, giving a paper] Go, sirrah, trudge about Through fair Verona; find those persons out Whose names are written there, and to them say, My house and welcome on their pleasure stay. [Exeunt CAPULET and PARIS] Servant. Find them out whose names are written here! It is written, that the shoemaker should meddle with his yard, and the tailor with his last, the fisher with his pencil, and the painter with his nets; but I am sent to find those persons whose names are here writ, and can never find what names the writing person hath here writ. I must to the learned.—In good time. [Enter BENVOLIO and ROMEO] Benvolio.
Tut, man, one fire burns out another's burning, One pain is lessen'd by another's anguish; Turn giddy, and be holp by backward turning; One desperate grief cures with another's languish: Take thou some new infection to thy eye, And the rank poison of the old will die. Romeo. Your plantain-leaf is excellent for that. Benvolio. For what, I pray thee? Romeo. For your broken shin. Benvolio. Why, Romeo, art thou mad? Romeo. Not mad, but bound more than a mad-man is; Shut up in prison, kept without my food, Whipp'd and tormented and—God-den, good fellow. Servant. God gi' god-den. I pray, sir, can you read? Romeo. Ay, mine own fortune in my misery. Servant. Perhaps you have learned it without book: but, I pray, can you read any thing you see? Romeo. Ay, if I know the letters and the language. Servant. Ye say honestly: rest you merry! Romeo. Stay, fellow; I can read. [Reads] 'Signior Martino and his wife and daughters; County Anselme and his beauteous sisters; the lady widow of Vitravio; Signior Placentio and his lovely nieces; Mercutio and his brother Valentine; mine uncle Capulet, his wife and daughters; my fair niece Rosaline; Livia; Signior Valentio and his cousin Tybalt, Lucio and the lively Helena.' A fair assembly: whither should they come? Servant. Up. Romeo. Whither? Servant. To supper; to our house. Romeo. Whose house? Servant. My master's. Romeo. Indeed, I should have ask'd you that before. Servant. Now I'll tell you without asking: my master is the great rich Capulet; and if you be not of the house of Montagues, I pray, come and crush a cup of wine. Rest you merry! [Exit] Benvolio. At this same ancient feast of Capulet's Sups the fair Rosaline whom thou so lovest, With all the admired beauties of Verona: Go thither; and, with unattainted eye, Compare her face with some that I shall show, And I will make thee think thy swan a crow. Romeo.
When the devout religion of mine eye Maintains such falsehood, then turn tears to fires; And these, who often drown'd could never die, Transparent heretics, be burnt for liars! One fairer than my love! the all-seeing sun Ne'er saw her match since first the world begun. Benvolio. Tut, you saw her fair, none else being by, Herself poised with herself in either eye: But in that crystal scales let there be weigh'd Your lady's love against some other maid That I will show you shining at this feast, And she shall scant show well that now shows best. Romeo. I'll go along, no such sight to be shown, But to rejoice in splendor of mine own. [Exeunt] Act I, Scene 3. A room in Capulet’s house. [Enter LADY CAPULET and Nurse] Lady Capulet. Nurse, where's my daughter? call her forth to me. Nurse. Now, by my maidenhead, at twelve year old, I bade her come. What, lamb! what, ladybird! God forbid! Where's this girl? What, Juliet! [Enter JULIET] Juliet. How now! who calls? Nurse. Your mother. Juliet. Madam, I am here. What is your will? Lady Capulet. This is the matter:—Nurse, give leave awhile, We must talk in secret:—nurse, come back again; I have remember'd me, thou's hear our counsel. Thou know'st my daughter's of a pretty age. Nurse. Faith, I can tell her age unto an hour. Lady Capulet. She's not fourteen. Nurse. I'll lay fourteen of my teeth,— And yet, to my teeth be it spoken, I have but four— She is not fourteen. How long is it now To Lammas-tide? Lady Capulet. A fortnight and odd days. Nurse. Even or odd, of all days in the year, Come Lammas-eve at night shall she be fourteen. Susan and she—God rest all Christian souls!— Were of an age: well, Susan is with God; She was too good for me: but, as I said, On Lammas-eve at night shall she be fourteen; That shall she, marry; I remember it well.
'Tis since the earthquake now eleven years; And she was wean'd,—I never shall forget it,— Of all the days of the year, upon that day: For I had then laid wormwood to my dug, Sitting in the sun under the dove-house wall; My lord and you were then at Mantua:— Nay, I do bear a brain:—but, as I said, When it did taste the wormwood on the nipple Of my dug and felt it bitter, pretty fool, To see it tetchy and fall out with the dug! Shake quoth the dove-house: 'twas no need, I trow, To bid me trudge: And since that time it is eleven years; For then she could stand alone; nay, by the rood, She could have run and waddled all about; For even the day before, she broke her brow: And then my husband—God be with his soul! A' was a merry man—took up the child: 'Yea,' quoth he, 'dost thou fall upon thy face? Thou wilt fall backward when thou hast more wit; Wilt thou not, Jule?' and, by my holidame, The pretty wretch left crying and said 'Ay.' To see, now, how a jest shall come about! I warrant, an I should live a thousand years, I never should forget it: 'Wilt thou not, Jule?' quoth he; And, pretty fool, it stinted and said 'Ay.' Lady Capulet. Enough of this; I pray thee, hold thy peace. Nurse. Yes, madam: yet I cannot choose but laugh, To think it should leave crying and say 'Ay.' And yet, I warrant, it had upon its brow A bump as big as a young cockerel's stone; A parlous knock; and it cried bitterly: 'Yea,' quoth my husband, 'fall'st upon thy face? Thou wilt fall backward when thou comest to age; Wilt thou not, Jule?' it stinted and said 'Ay.' Juliet. And stint thou too, I pray thee, nurse, say I. Nurse. Peace, I have done. God mark thee to his grace! Thou wast the prettiest babe that e'er I nursed: An I might live to see thee married once, I have my wish. Lady Capulet. Marry, that 'marry' is the very theme I came to talk of. Tell me, daughter Juliet, How stands your disposition to be married? Juliet. It is an honour that I dream not of. Nurse.
An honour! were not I thine only nurse, I would say thou hadst suck'd wisdom from thy teat. Lady Capulet. Well, think of marriage now; younger than you, Here in Verona, ladies of esteem, Are made already mothers: by my count, I was your mother much upon these years That you are now a maid. Thus then in brief: The valiant Paris seeks you for his love. Nurse. A man, young lady! lady, such a man As all the world—why, he's a man of wax. Lady Capulet. Verona's summer hath not such a flower. Nurse. Nay, he's a flower; in faith, a very flower. Lady Capulet. What say you? can you love the gentleman? This night you shall behold him at our feast; Read o'er the volume of young Paris' face, And find delight writ there with beauty's pen; Examine every married lineament, And see how one another lends content And what obscured in this fair volume lies Find written in the margent of his eyes. This precious book of love, this unbound lover, To beautify him, only lacks a cover: The fish lives in the sea, and 'tis much pride For fair without the fair within to hide: That book in many's eyes doth share the glory, That in gold clasps locks in the golden story; So shall you share all that he doth possess, By having him, making yourself no less. Nurse. No less! nay, bigger; women grow by men. Lady Capulet. Speak briefly, can you like of Paris' love? Juliet. I'll look to like, if looking liking move: But no more deep will I endart mine eye Than your consent gives strength to make it fly. [Enter a Servant] Servant. Madam, the guests are come, supper served up, you called, my young lady asked for, the nurse cursed in the pantry, and every thing in extremity. I must hence to wait; I beseech you, follow straight. Lady Capulet. We follow thee. [Exit Servant] Juliet, the county stays. Nurse. Go, girl, seek happy nights to happy days. [Exeunt] Act I, Scene 4. A street.
[Enter ROMEO, MERCUTIO, BENVOLIO, with five or six Maskers, Torch-bearers, and others] Romeo. What, shall this speech be spoke for our excuse? Or shall we on without apology? Benvolio. The date is out of such prolixity: We'll have no Cupid hoodwink'd with a scarf, Bearing a Tartar's painted bow of lath, Scaring the ladies like a crow-keeper; Nor no without-book prologue, faintly spoke After the prompter, for our entrance: But let them measure us by what they will; We'll measure them a measure, and be gone. Romeo. Give me a torch: I am not for this ambling; Being but heavy, I will bear the light. Mercutio. Nay, gentle Romeo, we must have you dance. Romeo. Not I, believe me: you have dancing shoes With nimble soles: I have a soul of lead So stakes me to the ground I cannot move. Mercutio. You are a lover; borrow Cupid's wings, And soar with them above a common bound. Romeo. I am too sore enpierced with his shaft To soar with his light feathers, and so bound, I cannot bound a pitch above dull woe: Under love's heavy burden do I sink. Mercutio. And, to sink in it, should you burden love; Too great oppression for a tender thing. Romeo. Is love a tender thing? it is too rough, Too rude, too boisterous, and it pricks like thorn. Mercutio. If love be rough with you, be rough with love; Prick love for pricking, and you beat love down. Give me a case to put my visage in: A visor for a visor! what care I What curious eye doth quote deformities? Here are the beetle brows shall blush for me. Benvolio. Come, knock and enter; and no sooner in, But every man betake him to his legs. Romeo. A torch for me: let wantons light of heart Tickle the senseless rushes with their heels, For I am proverb'd with a grandsire phrase; I'll be a candle-holder, and look on. The game was ne'er so fair, and I am done. Mercutio.
Tut, dun's the mouse, the constable's own word: If thou art dun, we'll draw thee from the mire Of this sir-reverence love, wherein thou stick'st Up to the ears. Come, we burn daylight, ho! Romeo. Nay, that's not so. Mercutio. I mean, sir, in delay We waste our lights in vain, like lamps by day. Take our good meaning, for our judgment sits Five times in that ere once in our five wits. Romeo. And we mean well in going to this mask; But 'tis no wit to go. Mercutio. Why, may one ask? Romeo. I dream'd a dream to-night. Mercutio. And so did I. Romeo. Well, what was yours? Mercutio. That dreamers often lie. Romeo. In bed asleep, while they do dream things true. Mercutio. O, then, I see Queen Mab hath been with you. She is the fairies' midwife, and she comes In shape no bigger than an agate-stone On the fore-finger of an alderman, Drawn with a team of little atomies Athwart men's noses as they lie asleep; Her wagon-spokes made of long spiders' legs, The cover of the wings of grasshoppers, The traces of the smallest spider's web, The collars of the moonshine's watery beams, Her whip of cricket's bone, the lash of film, Her wagoner a small grey-coated gnat, Not so big as a round little worm Prick'd from the lazy finger of a maid; Her chariot is an empty hazel-nut Made by the joiner squirrel or old grub, Time out o' mind the fairies' coachmakers.
And in this state she gallops night by night Through lovers' brains, and then they dream of love; O'er courtiers' knees, that dream on court'sies straight, O'er lawyers' fingers, who straight dream on fees, O'er ladies' lips, who straight on kisses dream, Which oft the angry Mab with blisters plagues, Because their breaths with sweetmeats tainted are: Sometime she gallops o'er a courtier's nose, And then dreams he of smelling out a suit; And sometime comes she with a tithe-pig's tail Tickling a parson's nose as a' lies asleep, Then dreams he of another benefice: Sometime she driveth o'er a soldier's neck, And then dreams he of cutting foreign throats, Of breaches, ambuscadoes, Spanish blades, Of healths five-fathom deep; and then anon Drums in his ear, at which he starts and wakes, And being thus frighted swears a prayer or two And sleeps again. This is that very Mab That plats the manes of horses in the night, And bakes the elflocks in foul sluttish hairs, Which once untangled, much misfortune bodes: This is the hag, when maids lie on their backs, That presses them and learns them first to bear, Making them women of good carriage: This is she— Romeo. Peace, peace, Mercutio, peace! Thou talk'st of nothing. Mercutio. True, I talk of dreams, Which are the children of an idle brain, Begot of nothing but vain fantasy, Which is as thin of substance as the air And more inconstant than the wind, who wooes Even now the frozen bosom of the north, And, being anger'd, puffs away from thence, Turning his face to the dew-dropping south. Benvolio. This wind, you talk of, blows us from ourselves; Supper is done, and we shall come too late. Romeo. I fear, too early: for my mind misgives Some consequence yet hanging in the stars Shall bitterly begin his fearful date With this night's revels and expire the term Of a despised life closed in my breast By some vile forfeit of untimely death. But He, that hath the steerage of my course, Direct my sail!
On, lusty gentlemen. Benvolio. Strike, drum. [Exeunt] Act I, Scene 5. A hall in Capulet’s house. [Musicians waiting. Enter Servingmen with napkins] First Servant. Where's Potpan, that he helps not to take away? He shift a trencher? he scrape a trencher! Second Servant. When good manners shall lie all in one or two men's hands and they unwashed too, 'tis a foul thing. First Servant. Away with the joint-stools, remove the court-cupboard, look to the plate. Good thou, save me a piece of marchpane; and, as thou lovest me, let the porter let in Susan Grindstone and Nell. Antony, and Potpan! Second Servant. Ay, boy, ready. First Servant. You are looked for and called for, asked for and sought for, in the great chamber. Second Servant. We cannot be here and there too. Cheerly, boys; be brisk awhile, and the longer liver take all. [Enter CAPULET, with JULIET and others of his house, meeting the Guests and Maskers] Capulet. Welcome, gentlemen! ladies that have their toes Unplagued with corns will have a bout with you. Ah ha, my mistresses! which of you all Will now deny to dance? she that makes dainty, She, I'll swear, hath corns; am I come near ye now? Welcome, gentlemen! I have seen the day That I have worn a visor and could tell A whispering tale in a fair lady's ear, Such as would please: 'tis gone, 'tis gone, 'tis gone: You are welcome, gentlemen! come, musicians, play. A hall, a hall! give room! and foot it, girls. [Music plays, and they dance] More light, you knaves; and turn the tables up, And quench the fire, the room is grown too hot. Ah, sirrah, this unlook'd-for sport comes well. Nay, sit, nay, sit, good cousin Capulet; For you and I are past our dancing days: How long is't now since last yourself and I Were in a mask? Second Capulet. By'r lady, thirty years. Capulet. What, man!
'tis not so much, 'tis not so much: 'Tis since the nuptials of Lucentio, Come pentecost as quickly as it will, Some five and twenty years; and then we mask'd. Second Capulet. 'Tis more, 'tis more, his son is elder, sir; His son is thirty. Capulet. Will you tell me that? His son was but a ward two years ago. Romeo. [To a Servingman] What lady is that, which doth enrich the hand Of yonder knight? Servant. I know not, sir. Romeo. O, she doth teach the torches to burn bright! It seems she hangs upon the cheek of night Like a rich jewel in an Ethiope's ear; Beauty too rich for use, for earth too dear! So shows a snowy dove trooping with crows, As yonder lady o'er her fellows shows. The measure done, I'll watch her place of stand, And, touching hers, make blessed my rude hand. Did my heart love till now? forswear it, sight! For I ne'er saw true beauty till this night. Tybalt. This, by his voice, should be a Montague. Fetch me my rapier, boy. What dares the slave Come hither, cover'd with an antic face, To fleer and scorn at our solemnity? Now, by the stock and honour of my kin, To strike him dead, I hold it not a sin. Capulet. Why, how now, kinsman! wherefore storm you so? Tybalt. Uncle, this is a Montague, our foe, A villain that is hither come in spite, To scorn at our solemnity this night. Capulet. Young Romeo is it? Tybalt. 'Tis he, that villain Romeo. Capulet. Content thee, gentle coz, let him alone; He bears him like a portly gentleman; And, to say truth, Verona brags of him To be a virtuous and well-govern'd youth: I would not for the wealth of all the town Here in my house do him disparagement: Therefore be patient, take no note of him: It is my will, the which if thou respect, Show a fair presence and put off these frowns, And ill-beseeming semblance for a feast. Tybalt. It fits, when such a villain is a guest: I'll not endure him. Capulet. He shall be endured: What, goodman boy!
I say, he shall: go to; Am I the master here, or you? go to. You'll not endure him! God shall mend my soul! You'll make a mutiny among my guests! You will set cock-a-hoop! you'll be the man! Tybalt. Why, uncle, 'tis a shame. Capulet. Go to, go to; You are a saucy boy: is't so, indeed? This trick may chance to scathe you, I know what: You must contrary me! marry, 'tis time. Well said, my hearts! You are a princox; go: Be quiet, or—More light, more light! For shame! I'll make you quiet. What, cheerly, my hearts! Tybalt. Patience perforce with wilful choler meeting Makes my flesh tremble in their different greeting. I will withdraw: but this intrusion shall Now seeming sweet convert to bitter gall. [Exit] Romeo. [To JULIET] If I profane with my unworthiest hand This holy shrine, the gentle fine is this: My lips, two blushing pilgrims, ready stand To smooth that rough touch with a tender kiss. Juliet. Good pilgrim, you do wrong your hand too much, Which mannerly devotion shows in this; For saints have hands that pilgrims' hands do touch, And palm to palm is holy palmers' kiss. Romeo. Have not saints lips, and holy palmers too? Juliet. Ay, pilgrim, lips that they must use in prayer. Romeo. O, then, dear saint, let lips do what hands do; They pray, grant thou, lest faith turn to despair. Juliet. Saints do not move, though grant for prayers' sake. Romeo. Then move not, while my prayer's effect I take. Thus from my lips, by yours, my sin is purged. Juliet. Then have my lips the sin that they have took. Romeo. Sin from thy lips? O trespass sweetly urged! Give me my sin again. Juliet. You kiss by the book. Nurse. Madam, your mother craves a word with you. Romeo. What is her mother? Nurse. Marry, bachelor, Her mother is the lady of the house, And a good lady, and a wise and virtuous I nursed her daughter, that you talk'd withal; I tell you, he that can lay hold of her Shall have the chinks. Romeo. Is she a Capulet? O dear account!
my life is my foe's debt. Benvolio. Away, begone; the sport is at the best. Romeo. Ay, so I fear; the more is my unrest. Capulet. Nay, gentlemen, prepare not to be gone; We have a trifling foolish banquet towards. Is it e'en so? why, then, I thank you all I thank you, honest gentlemen; good night. More torches here! Come on then, let's to bed. Ah, sirrah, by my fay, it waxes late: I'll to my rest. [Exeunt all but JULIET and Nurse] Juliet. Come hither, nurse. What is yond gentleman? Nurse. The son and heir of old Tiberio. Juliet. What's he that now is going out of door? Nurse. Marry, that, I think, be young Petrucio. Juliet. What's he that follows there, that would not dance? Nurse. I know not. Juliet. Go ask his name: if he be married. My grave is like to be my wedding bed. Nurse. His name is Romeo, and a Montague; The only son of your great enemy. Juliet. My only love sprung from my only hate! Too early seen unknown, and known too late! Prodigious birth of love it is to me, That I must love a loathed enemy. Nurse. What's this? what's this? Juliet. A rhyme I learn'd even now Of one I danced withal. [One calls within 'Juliet.'] Nurse. Anon, anon! Come, let's away; the strangers all are gone. [Exeunt]

      I can see various characterizations, themes, and stylistic devices, which I will discuss below.

    2. Here were the servants of your adversary, And yours, close fighting ere I did approach: I drew to part them: in the instant came The fiery Tybalt, with his sword prepared, Which, as he breathed defiance to my ears, He swung about his head and cut the winds, Who nothing hurt withal hiss'd him in scorn: While we were interchanging thrusts and blows, Came more and more and fought on part and part, Till the prince came, who parted either part.

      Even though Benvolio wants to keep the peace, Tybalt provokes him enough that he actually fights back. This shows the intensity of the feud: even people who would prefer peace are forced into violence.

    1. An obstetrician may elect to numb the perineum during childbirth and perform an episiotomy, an incision made in the posterior vaginal wall and perineum to prevent tearing and to also facilitate the birth of the fetus’ head. Perineal tears and episiotomies must be sutured shortly after birth to ensure optimal healing.

      As this procedure is no longer routinely recommended, I'm not sure we should include it.

    1. Ubuntu

      I found this very interesting. At my high school, Steve Valach, a teacher of 25+ years who is also the head football coach, emphasizes the word "Ubuntu" at the kickoff assembly during the first week of school each year. When I first heard it nearly 4.5 years ago, he made it memorable because it means "I am because we are." That is the key part of ubuntu: the connectedness it creates and its team aspect. It is harmonious, as the text says, because it is a feeling of unity. In football, when a game is really close, say 28-27, "ubuntu" comes to mind in the huddle, because winning takes everyone: receivers catching passes and running good routes, special teams making it harder for the other team to score, and the defense stepping up. When the clock hits 0, there is a feeling of "We won the game, everyone contributed." In my mind, that is the feeling of "Ubuntu": everyone pitched in, nobody hindered anyone else, and everyone was capable. This idea also connects to virtue ethics, because it emphasizes developing good character through cooperation, respect, and helping others succeed.

    1. Detailed information on fetal growth, such as that derived from serial ultrasound scans, is difficult to obtain. Epidemiological studies instead use proxy measures, such as birth weight, gestational age, and, less often, birth length, head circumference, placental weight, or ponderal index (4). We focus here on birth weight, which is the most commonly studied measure of fetal development, but it is important to note that birth weight, gestational age, and birth length may reflect different underlying mechanisms with independent effects on specific mental health outcomes

      Fetal growth is challenging to assess directly, so researchers rely on proxy measures such as birth weight, gestational age, birth length, head circumference, placental weight, and the ponderal index. However, these indicators may capture different biological processes and can therefore reflect distinct underlying mechanisms, each with independent associations with specific mental health outcomes.

      For example, low birth weight may reflect inadequate fetal nutrition, whereas reduced head circumference may indicate altered brain development; each may be linked to different risks for later mental health conditions. In addition, observed differences may be attributable to genetics rather than maternal health.

    1. Ironically, like Roger Taney’s opining in Dred Scott, it first raises its head in a “headnote” to a case not dealing with the issue. In remarks setting the scene for their decision in Santa Clara County v. Southern Pacific Railroad (1886), the court remarked, “The court does not wish to hear argument on the question whether the provision in the Fourteenth Amendment to the Constitution, which forbids a State to deny to any person within its jurisdiction the equal protection of the laws, applies to these corporations. We are all of the opinion that it does.” This statement, that the court were “all of the opinion” that “any person” applied to the fictional entities created by state charters, precluded any formal challenge by telegraphing the outcome. Thus, in a sneaky way, the court avoided having to actually produce a decision of an actual case to establish this principle. Talk about legislating from the bench!
    1. sperm

      Sperm (Chinese: 精子), with appropriate images

      Image



      English (thorough explanation)

      1) What sperm is (core definition)

      Sperm is the male gamete (sex cell) used in sexual reproduction. Its role is to deliver male genetic information to the egg during fertilization.

      In simple terms: 👉 Sperm = male sex cell that carries DNA to the egg


      2) Chromosome number

      • Sperm cells are haploid (n)
      • They contain half the normal number of chromosomes

      Example (humans):

      • Sperm: 23 chromosomes
      • Egg: 23 chromosomes
      • Fertilization → zygote with 46 chromosomes

      3) Structure of a sperm cell

      A sperm cell has three main parts, each with a specific function:

      • Head
        • Contains the nucleus (DNA)
        • Covered by the acrosome, which has enzymes to penetrate the egg
      • Midpiece
        • Packed with mitochondria
        • Produces energy (ATP) for movement
      • Tail (flagellum)
        • Enables the sperm to swim toward the egg

      This streamlined shape is adapted for speed and mobility.


      4) How sperm are produced

      Sperm are produced in the testes through spermatogenesis:

      • Starts from diploid cells
      • Uses meiosis
      • Produces four haploid sperm cells
      • Occurs continuously after puberty

      5) Role of sperm in fertilization

      During fertilization:

      1. Sperm swim toward the egg
      2. One sperm penetrates the egg membrane
      3. The nuclei fuse
      4. A diploid zygote forms

      Only one sperm normally fertilizes the egg.


      6) Why sperm are important

      Sperm:

      • Enable sexual reproduction
      • Contribute genetic diversity
      • Determine the biological sex of offspring in humans (X or Y chromosome)

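      Chinese (detailed explanation)

      1) What sperm is

      Sperm is the male gamete used in sexual reproduction. Its main role is to pass the father's genetic information to the egg cell.

      In one sentence: 👉 Sperm = the male sex cell that carries genetic information


      2) Chromosome number

      • Sperm are haploid (n)
      • They contain only half the chromosomes

      Human example:

      • Sperm: 23
      • Egg: 23
      • After fertilization: 46 (zygote)

      3) Structure of sperm

      A sperm cell consists of three main parts:

      • Head
        • Contains the nucleus (DNA)
        • The acrosome releases enzymes that help the sperm enter the egg
      • Midpiece
        • Rich in mitochondria
        • Supplies energy for movement
      • Tail
        • Helps the sperm swim


      4) How sperm are produced

      Sperm are produced in the testes through spermatogenesis (meiosis):

      • Each precursor cell can form 4 sperm
      • Continues from puberty onward

      5) Role of sperm in fertilization

      The fertilization process:

      • The sperm enters the egg cell
      • The nuclei fuse
      • A diploid zygote forms

      6) Why sperm are important

      Sperm:

      • Enable sexual reproduction
      • Increase genetic diversity
      • Determine the biological sex of the offspring (X or Y)

      One-sentence summary (bilingual)

      • Sperm = the male haploid gamete
      • 精子 (sperm) = the haploid male gamete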


    1. Movement, prolonged standing, and pressure accentuate the discomfort and associated muscle spasm.

      Carnett sign: A bedside maneuver used to help distinguish abdominal wall pain from visceral pain. The examiner identifies the point of maximal tenderness and maintains pressure while the patient tenses the abdominal muscles (by lifting the head/shoulders or raising the legs). Pain that remains unchanged or worsens with muscle tensing (positive Carnett sign) suggests an abdominal wall source, whereas decreased pain suggests intra-abdominal/visceral pathology. The sign is more specific than sensitive and is most useful in patients with localized tenderness and an otherwise benign evaluation.

    1. slugs

      Image


      English (thorough explanation)

      1) What slugs are

      Slugs are soft-bodied land mollusks, closely related to snails. Unlike snails, they do not have a visible external shell. Some species still possess a small internal shell plate, hidden beneath the mantle.

      2) Key body parts (refer to images)

      • Mantle – a thick, shield-like area behind the head; often covers the internal shell.
      • Tentacles (feelers) – usually two pairs:
        • Upper pair: eyes at the tips (detect light and shapes)
        • Lower pair: smell and touch
      • Foot – the wide underside used for movement.
      • Pneumostome – a small breathing hole on the side of the mantle.
      • Mucus (slime) – a slippery secretion covering the body and leaving trails.

      3) How slugs move

      Slugs move using wave-like muscle contractions along the foot. The mucus:

      • reduces friction so they can glide,
      • allows them to climb walls and leaves,
      • protects the body from injury and dehydration.

      4) Habitat and behavior

      Slugs prefer cool, moist environments, which is why they are commonly seen:

      • at night,
      • after rain,
      • under logs, rocks, and leaf litter.

      Sunlight and dry air can quickly dry them out, so they avoid exposed areas.

      5) What slugs eat

      • Decomposers: dead plants, fungi, algae
      • Herbivores: garden plants (lettuce, seedlings, strawberries)
      • Some species are predators of small invertebrates

      They play an important role in nutrient recycling, even though gardeners may see them as pests.

      6) Slugs in the food web

      Slugs are prey for:

      • birds,
      • frogs and salamanders,
      • toads,
      • beetles,
      • small mammals.

      They are an important link in ecosystem energy flow.

      7) Reproduction

      Most slugs are hermaphrodites (each individual has both male and female organs). After mating, they lay clusters of gel-like eggs in moist soil. This allows populations to increase rapidly under favorable conditions.


      中文(配图详解)

      1)什么是 slugs(蛞蝓)

      蛞蝓是一种陆生软体动物,和蜗牛同属腹足纲。 与蜗牛不同的是,蛞蝓没有明显的外壳,但很多种类仍保留着退化的内壳,藏在外套膜下面。

      2)主要身体结构(对应图片)

      • 外套膜(mantle):头后方厚实的区域,可能覆盖内壳
      • 触角:通常有两对
        • 上触角:顶端有眼点,感知光线
        • 下触角:负责嗅觉和触觉
      • 足(foot):腹部的爬行结构
      • 呼吸孔(pneumostome):侧面的小孔,用于呼吸
      • 黏液(slime):覆盖身体并留下痕迹

      3)蛞蝓如何移动

      蛞蝓靠足部产生的波浪式肌肉收缩前进。 分泌的黏液可以:

      • 减少摩擦,方便滑行
      • 增强附着力,能爬墙
      • 保护身体,减缓失水

      4)生活环境与活动时间

      蛞蝓喜欢阴凉、潮湿的环境,因此常见于:

      • 夜晚
      • 雨后
      • 石头、木头、落叶下面

      干燥和阳光会让它们迅速脱水

      5)吃什么?生态作用

      • 分解者:吃腐烂植物和真菌
      • 食草者:啃食园艺植物
      • 少数为捕食性

      它们在生态系统中帮助分解有机物、循环养分

      6)食物链中的位置

      蛞蝓是许多动物的食物,包括:

      • 鸟类
      • 青蛙、蝾螈
      • 蟾蜍
      • 昆虫
      • 小型哺乳动物

      是生态系统中重要的一环。

      7)繁殖方式

      多数蛞蝓是雌雄同体,交配后在潮湿环境中产下胶状卵团。 因此在条件合适时,数量会迅速增加。


      如果你愿意,我可以把这套内容整理成 Science 10 / Biology 图文 PPT 或 HTML 教学页面(双语、可直接课堂使用)。

    2. sculpin

      “Sculpin” is the common English name for several kinds of small, bottom-dwelling fish (mostly in the Northern Hemisphere) known for their big heads, spiny fins, and excellent camouflage.

      Core meaning

      • sculpin (noun): a fish—typically a benthic (bottom-living) predator/scavenger—that rests on rocks, sand, or mud and often blends into its surroundings.
      • Plural: sculpins (also sometimes used as a collective: “sculpin”).
      • Pronunciation: SKUHL-pin (IPA: /ˈskʌl.pɪn/)

      What sculpins are like (typical traits)

      • Body shape: large head, wide mouth, tapering body.
      • Spines: many species have sharp spines on the head and dorsal fin for defense.
      • Camouflage: mottled patterns that match the seafloor or riverbed.
      • Behavior: often sit still, then ambush prey (small fish, crustaceans, worms, insect larvae).
      • Where they live: depending on the group, sculpins may live in coastal oceans, tide pools, estuaries, or cold freshwater streams/lakes.

      Freshwater vs. marine usage

      • In North American freshwater, “sculpin” often refers to Cottus species (family Cottidae), common in cool, rocky streams.
      • In marine/coastal contexts, “sculpin” can refer to many “sculpin-like” fishes within the broader order Scorpaeniformes, including groups sometimes called sea scorpions.

      Examples of fish commonly called “sculpin”

      • Mottled sculpin (Cottus bairdii) — freshwater streams (North America).
      • Slimy sculpin (Cottus cognatus) — lakes/streams (North America).
      • Shorthorn sculpin (Myoxocephalus scorpius) — cold marine waters (North Atlantic/Arctic).
      • Cabezon (Scorpaenichthys marmoratus) — large Pacific coastal “sculpin” (often called a sculpin though it’s a distinct type).

      Word origin (etymology)

      The exact origin is debated, but “sculpin” appears in English as a fish name centuries ago and is often linked to older dialect forms used for spiny, broad-headed fishes. (It’s a folk/common-name term more than a single scientific category.)

      How it’s used in sentences

      • “We found a sculpin hiding under the rocks in the tide pool.”
      • “The stream has sculpins and trout, which usually indicates cold, well-oxygenated water.”
      • “That fish isn’t a ‘goby’—it’s a sculpin (bigger head, spiny profile).”

      Common confusions

      • Gobies/blennies vs. sculpins: gobies often have fused pelvic fins forming a “suction cup,” while sculpins typically have large pectoral fins and a spiny, armored head.
      • “Sculpin” is not one single species; it’s a common name applied to multiple related (and sometimes loosely related) fishes.

      If you tell me whether you mean freshwater stream sculpins or ocean/tide-pool sculpins, I can tailor the explanation (habitat, typical size, and the best representative images) to that context.

    3. sculpin

      English (thorough explanation)

      1) What “sculpin” literally means

      A sculpin is a small, bottom-dwelling fish found in cold freshwater streams and coastal marine waters, especially in the Northern Hemisphere. It belongs to a group of fishes known for their broad, flattened heads and excellent camouflage.

      In simple terms: sculpin = a small fish that lives on the bottom and blends in with rocks


      2) Key physical characteristics

      • Large, flat head compared to body size
      • Big pectoral fins used to rest on or grip the bottom
      • No swim bladder (in many species), so they stay on the bottom
      • Mottled colors (brown, gray, green) for camouflage
      • Often spiny or rough-skinned

      3) Habitat

      Sculpins live:

      • In cold, fast-flowing streams and rivers (freshwater sculpins)
      • Along rocky ocean bottoms and tide pools (marine sculpins)
      • Usually under rocks or among gravel

      They prefer clean, well-oxygenated water, so their presence often indicates good water quality.


      4) Behavior and diet

      • Bottom dwellers: they sit still and rely on camouflage
      • Carnivorous: eat insects, larvae, small crustaceans, and tiny fish
      • Usually poor swimmers compared to open-water fish

      5) Ecological importance

      Sculpins:

      • Are an important food source for trout and other larger fish
      • Help control insect populations
      • Serve as indicator species for healthy streams

      6) How the word is used if it appears alone

      If you see “sculpin” by itself, it usually refers to:

      • a type of fish in a freshwater or marine ecosystem,
      • a species example in ecology or biology,
      • a label on a stream or coastal food-web diagram.

      7) Pronunciation

      • sculpin /ˈskʌl.pɪn/ Sounds like SKULL-pin

      中文(详细解释)

      1)“sculpin”的基本意思

      sculpin 即杜父鱼,是一类生活在水底的小型鱼类,常见于寒冷、清澈的溪流和近海岩石区。

      简单说: sculpin = 生活在水底、善于伪装的小鱼


      2)主要特征

      • 头部宽大扁平
      • 胸鳍大,可支撑身体贴近水底
      • 体色斑驳,与石头、沙砾融为一体
      • 多数 没有鱼鳔,不容易漂浮

      3)栖息环境

      • 寒冷清洁的溪流和河流
      • 岩石或砾石底部
      • 近岸海域、潮池

      它们通常出现在 水质良好、含氧量高 的地方。


      4)行为与食性

      • 底栖鱼类,行动隐蔽
      • 肉食性:吃昆虫幼虫、小型甲壳动物、小鱼
      • 游泳能力较弱,但 贴底生活能力强

      5)生态意义

      • 鲑鳟类等大鱼的重要食物来源
      • 帮助控制 水生昆虫数量
      • 常被用作 水生态健康的指示物种

      6)单独出现 “sculpin” 的含义

      如果只看到 sculpin,通常表示:

      • 一种鱼类名称
      • 溪流或海岸生态系统成员
      • 食物网示例或图片标签

      7)发音

      • sculpin /ˈskʌlpɪn/ 中文可理解为:“斯卡尔平 / 杜父鱼”

      If you want, you can paste the full sentence, food-web diagram, or test question (e.g., stream ecosystems, indicator species, or predator–prey relationships), and I’ll explain exactly why sculpin is mentioned and what concept it illustrates(中英对照、直击考点).

    4. jaguars

      English (thorough explanation)

      1) What “jaguars” literally means

      Jaguars is the plural noun of jaguar. Jaguars are large wild cats native to the Americas, especially Central and South American rainforests. They are the largest big cat in the Americas.

      In simple terms: jaguars = powerful spotted big cats of the American tropics


      2) Key identifying features

      • Coat pattern: Yellow-gold fur with large rosettes that usually have a black spot in the center (this helps distinguish jaguars from leopards).
      • Build: Very stocky and muscular with an extremely strong bite—one of the strongest among big cats.
      • Head & jaws: Broad head, powerful jaws adapted for crushing.

      3) Habitat and range

      • Found from southern Mexico through Central America into South America.
      • Prefer dense rainforests, but also live in wetlands, riverbanks, and grasslands.
      • Closely associated with water—unusual for big cats.

      4) Hunting style and diet

      • Apex predators and mostly solitary.
      • Ambush hunters: rely on stealth rather than long chases.
      • Unique killing method: often bite directly through the skull of prey.
      • Diet includes deer, capybaras, peccaries, caimans, fish, and turtles.

      5) Jaguars vs. leopards (common confusion)

      | Feature  | Jaguars                  | Leopards                     |
      | -------- | ------------------------ | ---------------------------- |
      | Region   | Americas                 | Africa & Asia                |
      | Build    | Stocky, very powerful    | Slimmer                      |
      | Rosettes | Larger, with central dot | Smaller, no central dot      |
      | Water    | Strong swimmers          | Avoid water                  |
      | Bite     | Extremely strong         | Strong, but less than jaguar |


      6) Ecological role

      Jaguars are top predators that:

      • Control prey populations
      • Maintain ecosystem balance
      • Indicate healthy rainforest and wetland ecosystems

      7) How the word is used if it appears alone

      If you see “jaguars” by itself, it usually functions as:

      • a topic heading (rainforest animals),
      • an example of apex predators,
      • a label under an image,
      • a vocabulary term in biology or geography.

      8) Pronunciation

      • jaguar /ˈdʒæɡ.juː.ɑːr/ or /ˈdʒæɡ.wɑːr/
      • jaguars /ˈdʒæɡ.wɑːrz/

      中文(详细解释)

      1)“jaguars”的基本意思

      jaguars 是 jaguar(美洲豹)的复数形式。美洲豹是美洲体型最大的猫科动物,主要生活在中美洲和南美洲。

      简单说: jaguars = 生活在美洲热带地区的强壮大型猫科动物


      2)主要特征

      • 花纹: 玫瑰状斑点,内部常有黑点
      • 体型: 结实、力量极强
      • 咬合力大:能咬穿猎物头骨
      • 善游泳:常在水边活动、捕猎

      3)栖息地与分布

      • 热带雨林(最典型)
      • 河流、沼泽、湿地
      • 草原

      美洲豹常被视为 雨林顶级捕食者的代表


      4)捕猎方式

      • 伏击 为主
      • 捕食鹿、水豚、凯门鳄、鱼类等
      • 独居、领地性强

      5)与豹子(leopard)的区别(常考)

      | 特点 | 美洲豹     | 豹子       |
      | ---- | ---------- | ---------- |
      | 分布 | 美洲       | 非洲、亚洲 |
      | 体型 | 更强壮     | 较瘦       |
      | 斑点 | 中央有黑点 | 无黑点     |
      | 水性 | 喜水       | 避水       |


      6)生态系统中的作用

      • 属于 顶级捕食者
      • 控制猎物数量
      • 维持雨林生态平衡

      7)单独出现 “jaguars” 的含义

      如果只看到 jaguars,通常表示:

      • 雨林动物举例
      • 捕食者分类
      • 图片说明
      • 生物或地理词汇

      8)发音

      • jaguars 中文可近似理解为:“杰格瓦兹 / 美洲豹”

      If you want, paste the full sentence, diagram, or exam question (e.g., rainforest food webs or canopy adaptations), and I’ll explain exactly why jaguars are mentioned and what concept they illustrate(中英对照、直击考点).

    5. three-toed sloth

      English (thorough explanation)

      1) What “three-toed sloth” literally means

      A three-toed sloth is a slow-moving tree-dwelling mammal found in Central and South American rainforests. The name comes from the fact that it has three toes on each limb.

      In simple terms: three-toed sloth = a very slow rainforest animal that lives in trees and has three toes


      2) Key physical characteristics

      • Three toes: All four limbs have three long, curved claws used to hang from branches.
      • Body shape: Rounded head on a long, flexible neck, and long limbs adapted for hanging upside down.
      • Fur: Long and coarse; often looks greenish because algae grow in it.
      • Face: Often appears to be “smiling,” giving sloths a gentle look.

      3) Behavior and lifestyle

      • Extremely slow: One of the slowest mammals on Earth.
      • Arboreal: Lives almost entirely in the canopy of tropical forests.
      • Upside-down living: Eats, sleeps, and even gives birth while hanging from branches.
      • Low energy lifestyle: Moves slowly to conserve energy due to a low-calorie diet.

      4) Diet and digestion

      • Herbivore: Eats mainly leaves, plus some buds and flowers.
      • Slow digestion: Food can take weeks to digest.
      • Low metabolism: This explains their slow movement and calm behavior.

      5) Camouflage and survival

      Three-toed sloths survive by not being noticed:

      • Slow movement avoids attracting attention
      • Greenish fur blends with moss and leaves
      • Algae and insects living in the fur help camouflage the sloth

      This is a classic example of adaptation to the canopy environment.


      6) Difference from two-toed sloths (important clarification)

      | Feature  | Three-toed sloth             | Two-toed sloth   |
      | -------- | ---------------------------- | ---------------- |
      | Toes     | 3 on all limbs               | 2 on front limbs |
      | Neck     | Longer (can rotate head far) | Shorter          |
      | Activity | More diurnal                 | More nocturnal   |
      | Diet     | Mostly leaves                | Leaves + fruit   |

      Despite the names, both have three toes on their hind feet.


      7) Ecological role

      • Part of rainforest food webs
      • Provide habitat for algae, insects, and fungi in their fur
      • Help illustrate co-evolution and symbiosis

      8) How the term is used if it appears alone

      If you see “three-toed sloth” by itself, it usually functions as:

      • a species example in rainforest or canopy lessons,
      • an example of extreme adaptation,
      • a label or caption under a picture,
      • a biology vocabulary term.

      9) Pronunciation

      • three-toed sloth /θriː toʊd slɒθ/ Sounds like: THREE-tohd sloth

      中文(详细解释)

      1)“three-toed sloth”的基本意思

      three-toed sloth 即三趾树懒,是一种生活在中美洲和南美洲热带雨林的树栖哺乳动物,因每只脚有三根脚趾而得名。

      简单说: three-toed sloth = 生活在雨林树上的慢动作动物


      2)主要身体特征

      • 三根脚趾:四肢都有三根长而弯曲的爪
      • 适合倒挂:身体结构非常适合长期倒挂在树枝上
      • 毛发粗长:常因藻类生长而呈绿色
      • 表情温和

      3)生活方式

      • 动作极慢:是地球上最慢的哺乳动物之一
      • 树栖生活:几乎一生都在 树冠层
      • 节能生存:慢速行动有助于节省能量
      • 倒挂生活:吃、睡、繁殖都在树上完成

      4)饮食与消化

      • 食草动物:主要吃树叶
      • 消化极慢:消化一次食物可能需要数周
      • 新陈代谢低:这正是行动缓慢的原因

      5)伪装与生存策略

      三趾树懒的生存策略是 “不被发现”

      • 移动慢,不易被察觉
      • 绿色毛发与森林环境融为一体
      • 毛发中的藻类和昆虫形成 共生关系

      6)与二趾树懒的区别(常见考点)

      | 特征     | 三趾树懒 | 二趾树懒  |
      | -------- | -------- | --------- |
      | 前肢趾数 | 3        | 2         |
      | 颈部     | 较灵活   | 较短      |
      | 活动时间 | 偏白天   | 偏夜间    |
      | 食性     | 以叶为主 | 叶 + 果实 |

      注意:两种树懒的后肢都是三趾


      7)生态意义

      • 属于雨林 食物网的一部分
      • 体现 共生与适应性进化
      • 是研究 树冠层生态 的经典物种

      8)单独出现该词的含义

      如果只看到 three-toed sloth,通常表示:

      • 雨林动物示例
      • 树冠层适应性代表物种
      • 图片说明或词汇条目

      9)发音

      • three-toed sloth 中文可理解为:“斯里-托德 斯洛斯 / 三趾树懒”

      If you want, you can paste the full sentence, diagram, or exam question (for example about rainforest canopy adaptations or food webs), and I’ll explain exactly why the three-toed sloth is mentioned and what concept it illustrates(中英对照,直击考点).

    6. asters

      English (thorough explanation)

      1) What “asters” literally means

      Asters is the plural noun of aster. It refers to a group of wildflower plants in the daisy family (Asteraceae), known for their star-shaped flowers.

      The word aster comes from Greek, meaning “star”, describing the flower’s shape.

      In simple terms: asters = star-shaped wildflowers, often blooming in late summer and fall


      2) Key characteristics of asters

      • Flower shape: Looks like a single flower but is actually a composite flower head
        • Disk florets in the center
        • Ray florets (petal-like) around the edge
      • Colors: Purple, blue, pink, white (yellow center is common)
      • Blooming season: Late summer to autumn
      • Habitat: Meadows, grasslands, forest edges, roadsides

      3) Asters and pollination

      Asters are typically insect-pollinated:

      • Attract bees, butterflies, and other insects
      • Produce nectar and pollen
      • Important late-season food sources for pollinators

      This makes asters ecologically important in temperate ecosystems.


      4) Asters in ecosystems (Science 10 relevance)

      Asters:

      • Support pollinator populations
      • Contribute to biodiversity
      • Are part of wildflower communities
      • Often appear in discussions of wind- vs. insect-pollinated plants (asters = insect-pollinated)

      5) How the word is used if it appears alone

      If the excerpt is just “asters”, it usually functions as:

      • a plant category name,
      • an example of insect-pollinated wildflowers,
      • a label or caption under an image,
      • a vocabulary term in a science or ecology text.

      6) Pronunciation

      • aster /ˈæs.tɚ/
      • asters /ˈæs.tɚz/ Sounds like “ASS-ter(s)”.

      中文(详细解释)

      1)“asters”的基本意思

      asters 是 aster(紫菀)的复数形式,指一类野生开花植物,属于菊科。

      “aster” 一词源自希腊语,意思是 “星星”,形容花朵呈放射状。

      简单说: asters = 紫菀类野花,花形像星星


      2)紫菀的主要特征

      • 复合花序:看起来像一朵花,实际上由很多小花组成
        • 中央是管状花
        • 周围是舌状花
      • 颜色:紫、蓝、粉、白色居多
      • 花期:夏末到 秋季
      • 生境:草地、林缘、路旁

      3)紫菀与授粉

      紫菀属于 虫媒花

      • 吸引 蜜蜂、蝴蝶等昆虫
      • 提供 花蜜和花粉
      • 是秋季重要的 传粉植物

      4)生态系统中的作用(考试点)

      • 支持传粉昆虫
      • 增加植物多样性
      • 野花群落 的组成部分
      • 常作为 虫媒授粉植物 的例子

      5)单独出现 “asters” 的含义

      如果只看到 asters,通常表示:

      • 植物名称 / 分类名称
      • 野花举例
      • 图片说明
      • 科学词汇

      6)发音

      • asters /ˈæs.tɚz/ 中文可近似读作:“阿斯特兹”

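      If you want, you can paste the full sentence or textbook passage (for example, Science 10 content on wildflowers or pollination), and I can break it down word by word and mark the key exam points in both English and Chinese.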

    7. blue jays

      English (thorough explanation)

      1) What “blue jays” literally means

      Blue jays is the plural noun of blue jay, a medium-sized songbird native to North America. They are famous for their bright blue feathers, white chest, black markings, and loud, intelligent behavior.

      In simple terms: blue jays = bright blue birds known for loud calls and high intelligence.


      2) Key characteristics of blue jays

      • Color: Blue on top, white underneath, with black bars and a black “necklace”
      • Crest: A small crest on the head that rises when the bird is alert
      • Voice: Very vocal—can squawk loudly and even imitate hawks
      • Diet: Omnivorous—nuts (especially acorns), seeds, insects, fruit, eggs
      • Intelligence: Highly intelligent; excellent memory for food storage

      3) Behavior and habits

      • Food caching: Blue jays hide food (especially acorns) and remember where they put it
      • Territorial: Can be aggressive toward other birds
      • Social: Often seen alone or in small groups; sometimes form flocks in migration
      • Seasonal movement: Some migrate, some stay year-round (common in Canada)

      4) Ecological role

      Blue jays are important to ecosystems because they:

      • Help spread oak trees by burying acorns
      • Control insect populations
      • Act as alarm birds, warning other animals of predators

      5) How the phrase might be used if it appears alone

      If the excerpt is just “blue jays”, it most likely functions as:

      • a heading or topic label (e.g., forest animals),
      • an item in a list (e.g., blue jays, chipmunks, cougars),
      • a caption under a photo,
      • or a vocabulary term in a science reading.

      6) Pronunciation

      • blue jay /bluː dʒeɪ/
      • blue jays /bluː dʒeɪz/

      中文(详细解释)

      1)“blue jays”的基本意思

      blue jays 是 blue jay(蓝松鸦)的复数形式,指一种生活在北美的鸟类,以鲜艳的蓝色羽毛和吵闹、聪明的习性而著名。

      简单说: blue jays = 蓝松鸦,蓝色、聪明、声音很大的鸟


      2)蓝松鸦的主要特征

      • 羽毛颜色: 蓝色为主,腹部白色,带黑色条纹
      • 冠羽: 头顶有可竖起的小羽冠
      • 叫声: 非常吵,能模仿猛禽的叫声
      • 食性: 杂食(坚果、昆虫、水果等)
      • 智力高: 善于记忆和储存食物

      3)行为习性

      • 储存食物:会把橡子等藏起来,之后再取用
      • 领地意识强:对其他鸟类有攻击性
      • 部分迁徙:有的迁徙,有的全年居住(加拿大常见)

      4)生态作用

      • 帮助 传播橡树种子
      • 控制害虫数量
      • 作为“警戒鸟”,提醒周围动物有危险

      5)单独出现 “blue jays” 的含义

      如果只看到 blue jays,通常表示:

      • 标题 / 分类名称
      • 动物列表中的一项
      • 图片说明
      • 科学或阅读理解词汇

      6)发音

      • blue jays /bluː dʒeɪz/ 中文可近似读作:“布鲁 杰兹”

      If you’d like, paste the full sentence or paragraph (for example from a Science 10 ecosystem or food-web passage), and I’ll explain their exact role and meaning in context in both English and Chinese.

    1. Reviewer #1 (Public review):

      Summary:

      The authors explore how sex peptide (SP) affects post-mating behaviours in adult females, such as receptivity and egg laying. This study identifies different neurons in the adult brain and the VNC that become activated by SP, largely by using an intersectional gene expression approach (split-GAL4) to narrow down the specific neurons involved. They confirm that SP binds to the well-known Sex Peptide Receptor (SPR), initiating a cascade of physiological and behavioural changes related to receptivity and egg laying.

      Comments on revised version:

      The authors have substantially strengthened the manuscript in response to our main concerns.

      In particular, they now explicitly test multiple established PMR nodes (including SAG/SPSN as well as pC1, OviDN/OviEN/OviIN and vpoDN), which helps separate direct SP targets from downstream PMR circuitry and supports their interpretation that some of these known nodes can affect receptivity without necessarily inducing oviposition. They also addressed key technical/clarity points: the requested head/trunk expression controls are provided (Suppl Fig S1), and the VT003280 annotation is corrected (now FD6 rather than "SAG driver"). Overall, these additions make the central conclusion, that distinct CNS neuron subsets ("SPRINz") are sufficient to elicit PMR components, more convincing, and the added comparisons with genital tract expressing lines further argue against a simple "periphery only" explanation.

    2. Reviewer #2 (Public review):

      Sex peptide (SP) transferred during mating from male to female induces various physiological responses in the receiving female. Among those, the increase in oviposition and decrease in sexual receptivity are very remarkable. Naturally, a long-standing and significant question is the identity of the sex peptide target neurons that express the SP receptor and underlie these responses. Identification of these neurons will eventually lead to the identification of the underlying neuronal circuitry.

      The Soller lab has addressed this important question already several years ago (Haussmann et al. 2013), using relevant GAL4-lines and membrane-tethered SP. The results already showed that the action of SP on receptivity and oviposition is mediated by different neuronal subsets and hence can be separated. The GAL4-lines used at that time were, however, broad, and the individual identity of the relevant neurons remained unclear.

      In the present paper, Nallasivan and colleagues carried this analysis a significant step further, using new intersectional approaches and transsynaptic tracing.

      Strength:

      The intersectional approach is appropriate and state-of-the-art. The analysis is a very comprehensive tour de force, and the experiments are carefully performed to a high standard. The authors also produced a useful new transgenic line (UAS-FRTstopFRT mSP). The finding that neurons in the brain (head) mediate the SP effect on receptivity, while neurons in the abdomen and thorax (ventral nerve cord or peripheral neurons) mediate the SP effect on oviposition, is a significant step forward in the endeavour to identify the underlying neuronal networks and hence reach a mechanistic understanding of SP action. The analysis identifies a small set of neurons underlying SP responses. Some are part of the post-mating circuitry and influence receptivity, while others are likely involved in higher-order sensory processing. Though these results are not entirely unexpected, they are novel and represent a significant step forward, as the analysis is at a much higher resolution than previous work.

      Weakness:

      Though the analysis is at a much higher resolution than previous work on SP targets, it does not yet reach the resolution of single neuronal cell types. The last paragraph in the discussion rightfully speculates about the neurochemical identity of some of the intersection neurons (e.g. dopaminergic P1 neurons, NPF neurons). These suggested identities could have been confirmed by straightforward immunostainings against NPF or TH, for which antisera are available. Moreover, specific GAL4 lines for NPF or P1 or at least TH neurons are available, which could be used to express mSP to test whether SP activation of those neurons is sufficient to trigger the SP effect. Finally, the conclusion that SP target neurons operate as key integrators of sensory information for decisions on behavioural outputs needs further experimental confirmation.

    3. Reviewer #3 (Public review):

      Summary:

      This paper reports new findings regarding neuronal circuitries responsible for female post-mating responses (PMRs) in Drosophila. The PMRs are induced by sex peptide (SP) transferred from males during mating. The authors sought to identify SP target neurons using a membrane-tethered SP (mSP) and a collection of GAL4 lines, each containing a fragment derived from the regulatory regions of the SPR, fru, and dsx genes involved in PMR. They identified several lines that induced PMR upon expression of mSP. Using split-GAL4 lines, they identified distinct SP-sensing neurons in the central brain and ventral nerve cord. Analyses of pre- and post-synaptic connection using retro- and trans-Tango placed SP target neurons at the interface of sensory processing interneurons that connect to two common post-synaptic processing neuronal populations in the brain. The authors proposed that SP interferes with the processing of sensory inputs from multiple modalities.

      Strengths:

      Besides the main results described in the summary above, the authors discovered the following:

      (1) Reduction of receptivity and induction of egg-laying are separable by restricting the expression of membrane-tethered SP (mSP): head-specific expression of mSP induces reduction of receptivity only, whereas trunk-specific expression of mSP induces oviposition only. Also, they identified a GAL4 line (SPR12) that induced egg laying but did not reduce receptivity.

      (2) Expression of mSP in the genital tract sensory neurons does not induce PMR. The authors identified three GAL4 drivers (SPR3, SPR21, and fru9), which robustly expressed mSP in genital tract sensory neurons but did not induce PMRs. Also, SPR12 does not express in genital tract neurons but induces egg laying by expressing mSP.

    4. Author response:

      Public Reviews:

      Reviewer #1 (Public Review):

      Areas of improvement and suggestions:

      (1) "These results suggest the SP targets interneurons in the brain that feed into higher processing centers from different entry points likely representing different sensory input" and "All together, these data suggest that the abdominal ganglion harbors several distinct type of neurons involved in directing PMRs"

      The post-mating circuitry has been largely characterized by the group of Barry Dickson and other labs. I suggest ruling out a potential effect of mSP in any of the well-known post-mating neuronal circuitry, i.e., SPSN, SAG, pC1, vpoDN or OviDN neurons. A combination of available split-Gal4 lines should be sufficient to prove this.

      We agree that this information is important to distinguish neurons which are direct SP targets from neurons which are involved in directing reproductive behaviors. We have now tested drivers for these neurons and added these data in Fig 3 (SAG neurons) and as Suppl Figs S4 (SPSN and genital tract neuron drivers SPR3 and SPR21), Suppl Fig S6 (overlap in single cell expression atlas), Suppl Fig S7 (overlap of SPSN split drivers with SPR8, fru11/12 and dsx split drivers in the brain inducing PMRs) and Suppl Fig S9 (pC1, OviDNs, OviENs, OviINs and vpoDN).  

      The newly added data are in full support of our conclusion that SP targets central nervous system neurons, which we termed SP Response Inducing Neurons (SPRINz). In particular, we find lines that express in genital tract neurons, but do not induce an SP response (Supp Figs S4, S7 and S10) or do not express in genital tract neurons and induce an SP response (Fig 2 and Supp Fig S2).

      We have analysed the expression of SPSN in the brain and VNC and find expression in few neurons (Suppl Fig S4). This result is consistent with expression of the genes driving SPSN expression in the single cell expression atlas indicating overlap of expression in very few neurons (Suppl Fig S6). We have already shown that FD6 (VT003280) which is part of the SPSN splitGal4 driver, expresses in the brain and VNC and can induce PMRs from SP expression (Fig 4).

      We have taken this further to test another SPSN driver (VT058873) in combination with SPR8, fru11/12 and dsx and find PMRs induced by mSP expression (Suppl Fig S7). Moreover, if we restrict expression of mSP to the brain with otdflp we can induce PMRs from mSP expression and obtain the same response by activating these brain neurons (Suppl Fig S7). We note that the VT058873 ∩ fru11/12 intersection in combination with otdflp stopmSP or stopTrpA1 in the head, did not result in PMRs. Here, PMR inducing neurons likely reside in the VNC, but currently no tools are available to test this further.

      We further tested pC1, OviDNs, OviENs, OviINs and vpoDN for induction of PMRs from expression of mSP. We are pleased to see that OviEN-SS2s, OviIN-SS1 and vpoDN splitGal4 drivers can reduce receptivity, but not induce oviposition (Suppl Fig S8). We predicted such drivers based on previously published data (Haussmann et al. 2013), which we have now validated.

      (2) Authors must show how specific is their "head" (elav/otd-flp) and "trunk" (elav/tsh) expression of mSP by showing images of the same constructs driving GFP.

      The expression pattern of tshGAL, which expresses in the trunk, is already published (Soller et al., 2006). We have added images of “head” expression for tshGAL in Suppl Fig 1 and adjusted our statement to say that it is predominantly expressed in the VNC.

      (3) VT3280 is termed as a SAG driver. However, VT3280 is a SPSN specific driver (Feng et al., 2014; Jang et al., 2017; Scheunemann et al., 2019; Laturney et al., 2023). The authors should clarify this.

      Following the reviewer's suggestion, we have clarified the specificity of VT003280 and now say that this is FD6.

      (4) Intersectional approaches must rule out the influence of SP on sex-peptide sensing neurons (SPSN) in the ovary by combining their constructs with an SPSN-Gal80 construct. In line with this, most of their lines target the SAG circuit (4I, J and K). Again, here they need to rule out the involvement of SPSN in their receptivity/egg laying phenotypes, especially because "In the female genital tract, these split-Gal4 combinations show expression in genital tract neurons with innervations running along oviduct and uterine walls (Figures S3A-S3E)".

      We agree with this reviewer that we need a higher resolution of expression to only one cell type. However, this is a major task that we will continue in follow up studies.

      In principle, use of GAL80 is a valid approach to restrict expression, if levels of GAL80 are higher than those of GAL4, because GAL80 binds GAL4 to inhibit its activity. Hence, if levels of GAL80 are lower, results could be difficult to interpret.

      (5) The authors separate head (brain) from trunk (VNC) responses, but they don't narrow down the neural circuits involved on each response. A detailed characterization of the involved circuits especially in the case of the VNC is needed to (a) show that the intersectional approach is indeed labelling distinct subtypes and (b) how these distinct neurons influence oviposition.

      Again, we agree with this reviewer that we need a higher resolution of expression to only one cell type. However, this is a major task that we will continue in follow up studies.

      Reviewer #2 (Public Review):

      Strength:

      The intersectional approach is appropriate and state-of-the-art. The analysis is a very comprehensive tour-de-force and experiments are carefully performed to a high standard. The authors also produced a useful new transgenic line (UAS-FRTstopFRT mSP). The finding that neurons in the brain (head) mediate the SP effect on receptivity, while neurons in the abdomen and thorax (ventral nerve cord or peripheral neurons) mediate the SP effect on oviposition, is a significant step forward in the endeavour to identify the underlying neuronal networks and hence a mechanistic understanding of SP action. Though this result is not entirely unexpected, it is novel as it was not shown before.

      We thank reviewer 2 for recognizing the advance of our work.

      Weakness:

      Though the analysis identifies a small set of neurons underlying SP responses, it does not go the last step to individually identify at least a few of them. The last paragraph in the discussion rightfully speculates about the neurochemical identity of some of the intersection neurons (e.g. dopaminergic P1 neurons, NPF neurons). At least these suggested identities could have been confirmed by straight-forward immunostainings against NPF or TH, for which antisera are available. Moreover, specific GAL4 lines for NPF or P1 or at least TH neurons are available which could be used to express mSP to test whether SP activation of those neurons is sufficient to trigger the SP effect.

      We appreciate this reviewer's recognition of our previous work showing that receptivity and oviposition are separable. As pointed out, we have now gone one step further and identified in a tour de force approach subsets of neurons in the brain and VNC.

      We agree with this reviewer that we need a higher resolution of expression to only one cell type. As pointed out by this reviewer, the neurochemical identity is an excellent suggestion and will help to further restrict expression to just one type of neuron. However, this is a major task that we will continue in follow up studies.

      Reviewer #3 (Public Review):

      Strengths:

      Besides the main results described in the summary above, the authors discovered the following:

      (1) Reduction of receptivity and induction of egg-laying are separable by restricting the expression of membrane-tethered SP (mSP): head-specific expression of mSP induces reduction of receptivity only, whereas trunk-specific expression of mSP induces oviposition only. Also, they identified a GAL4 line (SPR12) that induced egg laying but did not reduce receptivity.

      (2) Expression of mSP in the genital tract sensory neurons does not induce PMR. The authors identified three GAL4 drivers (SPR3, SPR 21, and fru9), which robustly expressed mSP in genital tract sensory neurons but did not induce PMRs. Also, SPR12 does not express in genital tract neurons but induces egg laying by expressing mSP.

      We thank reviewer 2 for recognizing these two important points regarding the SP response that point to a revised model for how the underlying circuitry induces the post-mating response. To further substantiate these findings we now have added a splitGal4 nSyb ∩ ppk which expresses in genital tract neurons, but does not induce PMRs from mSP expression.

      Weaknesses:

      (1) Intersectional expression involving ppk-GAL4-DBD was negative in all GAL4AD lines (Supp. Fig.S5). As the authors mentioned, ppk neurons may not intersect with SPR, fru, dsx, and FD6 neurons in inducing PMRs by mSP. However, since there was no PMR induction and no GAL4 expression at all in any combination with GAL4-AD lines used in this study, I would like to have a positive control, where intersectional expression of mSP in ppk-GAL4-DBD and other GAL4-AD lines (e.g., ppk-GAL4-AD) would induce PMR.

      We have added a positive control for ppk expression by combining the ppk-DBD line with a nSyb-AD which expresses in all neurons in Supp Fig S8. This experiment confirms our previous observations that ppk splitGal4 in combination with other drivers does not induce an SP response despite driving expression in genital tract neurons. We have expanded the discussion section to point out that we have identified additional cells in the brain expressing ppkGAL4, but expression of split-GAL4 ppk is absent in these cells. Part of this work has previously been published (Nallasivan et al. 2021). Accordingly, we amended the text to say when expression was achieved with ppkGAL or ppk splitGAL4.

      (2) The results of SPR RNAi knock-down experiments are inconclusive (Figure 5). SPR RNAi cancelled the PMR in dsx ∩ fru11/12 and partially in SPR8 ∩ fru 11/12 neurons. SPR RNAi in dsx ∩ SPR8 neurons turned virgin females unreceptive; it is unclear whether SPR mediates the phenotype in SPR8 ∩ fru 11/12 and dsx ∩ SPR8 neurons.

      We agree with this reviewer that the interpretation of the SPR RNAi results is complicated by the fact that SP has additional receptors (Haussmann et al 2013). The results are conclusive for all three intersections when expressing UAS mSP in SPR RNAi with respect to oviposition, i.e. egg laying is not induced in the absence of SPR. For receptivity, the results are conclusive for dsx ∩ fru11/12 and partially for SPR8 ∩ fru 11/12.

      Potentially, SPR RNAi knock-down does not sufficiently reduce SPR levels to completely reduce receptivity in some intersection patterns, likely also because splitGal4 expression is less efficient.

      Why SPR RNAi in dsx ∩ SPR8 neurons turned virgin females unreceptive is unclear, but we anticipate that we need a higher resolution of expression to only one cell type to resolve this unexpected result. However, this is a major task that we will continue in follow up studies.

      SPR RNAi knock-down experiments may also help clarify whether mSP worked autocrine or juxtacrine to induce PMR. mSP may produce juxtacrine signaling, which is cell non-autonomous.

      Whether membrane-tethered SP induces the response in an autocrine manner is an important aspect in the interpretation of the results from mSP expression.

      Removing SPR by SPR RNAi and expression of mSP in the same neurons did not induce egg laying for all three intersections and did not reduce receptivity for dsx ∩ fru11/12 and for SPR8 ∩ fru 11/12. Accordingly, we can conclude that for these neurons the response is induced in an autocrine manner.

      We have added this aspect to the discussion section.

  5. Dec 2025
    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this manuscript, Chengjian Zhao et al. focused on the interactions between vascular, biliary, and neural networks in the liver microenvironment, addressing the critical bottleneck that the lack of high-resolution 3D visualization has hindered understanding of these interactions in liver disease.

      Strengths:

      This study developed a high-resolution multiplex 3D imaging method that integrates multicolor metallic compound nanoparticle (MCNP) perfusion with optimized CUBIC tissue clearing. This method enables the simultaneous 3D visualization of spatial networks of the portal vein, hepatic artery, bile ducts, and central vein in the mouse liver. The authors reported a perivascular structure termed the Periportal Lamellar Complex (PLC), which is identified along the portal vein axis. This study clarifies that the PLC comprises CD34⁺Sca-1⁺ dual-positive endothelial cells with a distinct gene expression profile, and reveals its colocalization with terminal bile duct branches and sympathetic nerve fibers under physiological conditions.

      Weaknesses:

      This manuscript is well-written, organized, and informative. However, there are some points that need to be clarified.

      (1) After MCNP-dye injection, does it remain in the blood vessels, adsorb onto the cell surface, or permeate into the cells? Does the MCNP-dye have cell selectivity?

      The experimental results showed that after injection, the MCNP series nanoparticles predominantly remained within the lumens of blood vessels and bile ducts, with their tissue distribution determined by physical perfusion. No diffusion of the dye signal into the surrounding parenchymal tissue was observed, nor was there any evidence of adsorption onto the cell surface or entry into cells. The newly added Supplementary Figure S2A–H further confirmed this feature, demonstrating that the dye signals were strictly confined to the luminal space, clearly delineating the continuous course of blood vessels and the branching morphology of bile ducts. These findings strongly support the conclusion that “MCNP dyes are distributed exclusively within the luminal compartments.”

      Therefore, the MCNP dyes primarily serve as intraluminal tracers within the tissue rather than as labels for specific cell types.

      (2) All MCNP-dyes were injected after the mice were sacrificed, and the mice's livers were fixed with PFA. After the blood flow had ceased, how did the authors ensure that the MCNP-dyes were fully and uniformly perfused into the microcirculation of the liver?

      Thank you for the reviewer’s valuable comments. Indeed, since all MCNP dyes were perfused after the mice were euthanized and blood circulation had ceased, we cannot fully ensure a homogeneous distribution of the dye within the hepatic microcirculation. The vascular labeling technique based on metallic nanoparticle dyes used in this study offers clear imaging, stable fluorescence intensity, and multiplexing advantages; however, it also has certain limitations. The main issue is that the dye distribution within the hepatic parenchyma can be affected by factors such as lobular overlap, local tissue compression, and variations in vascular pathways, resulting in regional inhomogeneity of dye perfusion. This is particularly evident in areas where multiple lobes converge or where anatomical structures are complex, leading to local dye accumulation or over-perfusion.

      In our experiments, we attempted to minimize local blockage or over-perfusion by performing PBS pre-flushing and low-pressure, constant-speed perfusion. Nevertheless, localized dye accumulation or uneven distribution may still occur in lobe junctions or structurally complex regions. Such variation represents one of the methodological limitations. Overall, the dye signals in most samples remained confined to the vascular and biliary lumens, and the distribution pattern was highly reproducible.

      We have addressed this issue in the Discussion section but would like to emphasize here that, although this system has clear advantages, it remains sensitive to anatomical variability in the liver—such as lobular overlap and vascular heterogeneity. At vascular junctions, local perfusion inhomogeneity or dye accumulation may occur; therefore, injection strategies and perfusion parameters should be adjusted according to liver size and vascular condition to improve reproducibility and imaging quality. It should also be noted that the results obtained using this method primarily aim to visualize the overall and fine anatomical structures of the hepatic vascular system rather than to quantitatively reflect hemodynamic processes. In the future, we plan to combine in vivo perfusion or dynamic fluid modeling to further validate the diffusion characteristics of the dyes within the hepatic microcirculation.

      (3) It is advisable to present additional 3D perspective views in the article, as the current images exhibit very weak 3D effects. Furthermore, it would be better to supplement with some videos to demonstrate the 3D effects of the stained blood vessels.

      Thank you for the reviewer’s valuable comments. In response to the suggestion, we have added perspective-rendered images generated from the 3D staining datasets to provide a more intuitive visualization of the spatial morphology of the hepatic vasculature. These images have been included in Figure S2A–J. In addition, we have prepared supplementary videos (available upon request) that dynamically display the three-dimensional distribution of the stained vessels, further enhancing the spatial perception and visualization of the results.

      (4) In Figure 1-I, the authors used MCNP-Black to stain the central veins; however, in addition to black, there are also yellow and red stains in the image. The authors need to explain what these stains are in the legend.

      Thank you for the reviewer’s constructive comment. In Figure 1I, MCNP-Black labels the central vein (black), MCNP-Yellow labels the portal vein (yellow), MCNP-Pink labels the hepatic artery (pink), and MCNP-Green labels the bile duct (green). We have revised the Figure 1 legend to include detailed descriptions of the color signals and their corresponding structures to avoid any potential confusion.

      (5) There is a typo in the title of Figure 4F; it should be "stem cell".

      Thank you for the reviewer’s careful correction. We have corrected the spelling error in the title of Figure 4F to “stem cell” and updated it in the revised manuscript.

      (6) Nuclear staining is necessary in immunofluorescence staining, especially for Figure 5e. This will help readers distinguish whether the green color in the image corresponds to cells or dye deposits.

      We thank the reviewer for the valuable suggestion. We understand that nuclear staining can help determine the origin of fluorescence signals. However, in our three-dimensional imaging system, the deep signal acquisition range after tissue clearing often causes nuclear dyes such as DAPI to generate highly dense and widespread fluorescence, especially in regions rich in vascular structures, which can obscure the fine vascular and perivascular details of interest. Therefore, this study primarily focuses on high-resolution visualization of the spatial architecture of the vascular and biliary systems. We have added an explanation regarding this point in Figures S2I–J.

      Reviewer #2 (Public review):

      Summary:

      The present manuscript of Xu et al. reports a novel clearing and imaging method focusing on the liver. The authors simultaneously visualized the portal vein, hepatic artery, central vein, and bile duct systems by injecting metal compound nanoparticles (MCNPs) with different colors into the portal vein, heart left ventricle, inferior vena cava, and the extrahepatic bile duct, respectively. The method involves: trans-cardiac perfusion with 4% PFA, the injection of MCNPs with different colors, clearing with the modified CUBIC method, cutting 200 micrometer thick slices by vibratome, and then microscopic imaging. The authors also perform various immunostaining (DAB or TSA signal amplification methods) on the tissue slices from MCNP-perfused tissue blocks. With the application of this methodical approach, the authors report dense and very fine vascular branches along the portal vein. The authors name them as 'periportal lamellar complex (PLC)' and report that PLC fine branches are directly connected to the sinusoids. The authors also claim that these structures co-localize with terminal bile duct branches and sympathetic nerve fibers, and contain endothelial cells with a distinct gene expression profile. Finally, the authors claim that PLC-s proliferate in liver fibrosis (CCl4 model) and act as a scaffold for proliferating bile ducts in ductular reaction and for ectopic parenchymal sympathetic nerve sprouting.

      Strengths:

      The simultaneous visualization of different hepatic vascular compartments and their combination with immunostaining is a potentially interesting novel methodological approach.

      Weaknesses:

      This reviewer has several concerns about the validity of the microscopic/morphological findings as well as the transcriptomics results. In this reviewer's opinion, the introduction contains overstatements regarding the potential of the method, there are severe caveats in the method descriptions, and several parts of the Results are not fully supported by the documentation. Thus, the conclusions of the paper may be critically viewed in their present form and may need reconsideration by the authors.

      We sincerely thank the reviewer for the thorough evaluation and constructive comments on our study. We fully understand and appreciate the reviewer’s concerns regarding the methodological validity and interpretation of the results. In response, we have made comprehensive revisions and additions to the manuscript as follows:

      First, we have carefully revised the Introduction and Discussion sections to provide a more balanced description of the methodological potential, removing statements that might be considered overstated, and clarifying the applicable scope and limitations of our approach (see the revised Introduction and Discussion).

      Second, we have substantially expanded the Methods section with detailed information on model construction, imaging parameters, data processing workflow, and technical aspects of the single-cell transcriptomic reanalysis, to enhance the transparency and reproducibility of the study.

      Third, we have added additional references and explanatory notes in the Results section to better support the main conclusions (see Section 6 of the Results).

      Finally, we have rechecked and validated all experimental data, and conducted a verification analysis using an independent single-cell RNA-seq dataset (Figure S6). The results confirm that the morphological observations and transcriptomic findings are consistent and reproducible across independent experiments.

      We believe these revisions have greatly strengthened the reliability of our conclusions and the overall scientific rigor of the manuscript. Once again, we sincerely appreciate the reviewer’s valuable comments, which have been very helpful in improving the logic and clarity of our work.

      Reviewer #3 (Public review):

      Summary:

      In the reviewed manuscript, researchers aimed to overcome the obstacles of high-resolution imaging of intact liver tissue. They report successful modification of the existing CUBIC protocol into Liver-CUBIC, a high-resolution multiplex 3D imaging method that integrates multicolor metallic compound nanoparticle (MCNP) perfusion with optimized liver tissue clearing, significantly reducing clearing time and enabling simultaneous 3D visualization of the portal vein, hepatic artery, bile ducts, and central vein spatial networks in the mouse liver. Using this novel platform, the researchers describe a previously unrecognized perivascular structure they termed Periportal Lamellar Complex (PLC), regularly distributed along the portal vein axis. The PLC originates from the portal vein and is characterized by a unique population of CD34⁺Sca-1⁺ dual-positive endothelial cells. Using available scRNAseq data, the authors assessed the CD34⁺Sca-1⁺ cells' expression profile, highlighting the mRNA presence of genes linked to neurodevelopment, biliary function, and hematopoietic niche potential. Different aspects of this analysis were then addressed by protein staining of selected marker proteins in the mouse liver tissue. Next, the authors addressed how the PLC and biliary system react to CCl4-induced liver fibrosis, implying PLC dynamically extends, acting as a scaffold that guides the migration and expansion of terminal bile ducts and sympathetic nerve fibers into the hepatic parenchyma upon injury.

      The work clearly demonstrates the usefulness of the Liver-CUBIC technique and the improvement of both resolution and complexity of the information, gained by simultaneous visualization of multiple vascular and biliary systems of the liver at the same time. The identification of PLC and the interpretation of its function represent an intriguing set of observations that will surely attract the attention of liver biologists as well as hepatologists; however, some claims need more thorough assessment by functional experimental approaches to decipher the functional molecules and the sequence of events before establishing the PLC as the key hub governing the activity of biliary, arterial, and neuronal liver systems. Similarly, the level of detail of the methods section does not appear to be sufficient to exactly recapitulate the performed experiments, which is of concern, given that the new technique is a cornerstone of the manuscript.

      Nevertheless, the work does bring a clear new insight into the liver structure and functional units and greatly improves the methodological toolbox to study it even further, and thus fully deserves the attention of readers.

      Strengths:

      The authors clearly demonstrate an improved technique tailored to the visualization of the liver vasculo-biliary architecture in unprecedented resolution.

      This work proposes a new biological framework between the portal vein, hepatic arteries, biliary tree, and intrahepatic innervation, centered at previously underappreciated protrusions of the portal veins - the Periportal Lamellar Complexes (PLCs).

      Weaknesses:

      Possible overinterpretation of the CD34+Sca1+ findings was built on re-analysis of one scRNAseq dataset.

      Lack of detail in the materials and methods section greatly limits the usefulness of the new technique to other researchers.

      We thank the reviewer for this important comment. We agree that when conclusions are mainly based on a single dataset, overinterpretation should be avoided. In response to this concern, we have carefully re-evaluated and clearly limited the scope of our interpretation of the scRNA-seq analysis. In addition, we performed a validation analysis using an independent single-cell RNA-seq dataset (see new Figure S6), which consistently confirmed the presence and characteristic transcriptional profile of the periportal CD34⁺Sca1⁺ endothelial cell population. These supplementary analyses strengthen the robustness of our findings and address the reviewer’s concern regarding potential overinterpretation.

      In the revised manuscript, we have also greatly expanded the Materials and Methods section by providing detailed information on sample preparation, imaging parameters, data processing workflow, and single-cell reanalysis procedures. These revisions substantially improve the transparency and reproducibility of our methodology, thereby enhancing the usability and reference value of this technique for other researchers.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      Introduction

      (1) In general, the Introduction is very lengthy and repetitive. It needs extensive shortening to a maximum of 2 A4 pages.

      We thank the reviewer for the valuable suggestions. We have thoroughly condensed and restructured the Introduction, removing redundant content and merging related paragraphs to make the theme more focused and the logic clearer. The revised Introduction has been shortened to within two A4 pages, emphasizing the scientific question, innovation, and technical approach of the study.

      (2) Please correct this erroneous sentence:

      '...the liver has evolved the most complex and densely n organized vascular network in the body, consisting primarily of the portal vein system, central vein system, hepatic artery system, biliary system, and intrahepatic autonomic nerve network [6, 7].'

      We thank the reviewer for pointing out this spelling error. The revised sentence is as follows:

      “…the liver has evolved the most complex and densely organized ductal-vascular network in the body, consisting primarily of the portal vein system, central vein system, hepatic artery system, biliary system, and intrahepatic autonomic nerve network [6, 7].”

      (3) '...we achieved a 63.89% improvement in clearing efficiency and a 20.12% increase in tissue transparency'

      Please clarify what you exactly mean by 'clearing efficiency' and 'increased tissue transparency'.

      We thank the reviewer for the valuable comments and have clarified the relevant terminology in the revised manuscript.

      “Clearing efficiency” refers to the improvement in the time required for the liver tissue to become completely transparent when treated with the optimized Liver-CUBIC protocol (40% urea + H₂O₂), compared with the conventional CUBIC method. In this study, the clearing time was reduced from 9 days to 3.25 days, representing a 63.89% increase in time efficiency.

      “Tissue transparency” refers to the ability of the cleared tissue to transmit visible light. We quantified the optical transparency by measuring light transmittance across the 400–900 nm wavelength range using a microplate reader. The results showed that the average transmittance increased by 20.12%, indicating that Liver-CUBIC treatment markedly enhanced the optical clarity of the liver tissue.
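
      The 63.89% figure follows directly from the two clearing times quoted above. As a minimal sketch (the function name is illustrative and does not come from the manuscript), the reduction can be checked as:

```python
def percent_reduction(before_days: float, after_days: float) -> float:
    """Relative reduction in clearing time, as a percentage of the original time."""
    return (before_days - after_days) / before_days * 100.0


# Clearing times quoted in the response: conventional CUBIC vs. Liver-CUBIC.
reduction = percent_reduction(9.0, 3.25)
print(f"{reduction:.2f}%")  # 63.89%
```

      Strictly, this is a 63.89% reduction in clearing time; expressed as a speed-up, 9 / 3.25 is roughly 2.8-fold.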

      (4) I am concerned about claiming this imaging method as real '3D imaging'. Namely, while the authors clear full lobes, they actually cut the cleared lobes into 200-micrometer-thick slices and perform further microscopy imaging on these slices. Considering that they focus on ductular structures of the liver (such as vasculature, bile duct system, and innervations), 200 micrometer allows a very limited 3D overview, particularly in comparison with the whole-mount immuno-imaging methods combined with light sheet microscopy (such as Adori 2021, Liu 2021, etc). In this context, I feel several parts of the Introduction to be an overstatement: besides emphasizing the advantages of the technique (such as simultaneous visualization of different hepatic vascular compartments and the bile duct system by MCNPs, the combination with immunostainings), the authors must honestly discuss the limitations (such as limited tissue overview, potential dye perfusion problems - uneven distribution of the dye etc).

      We appreciate the reviewer’s insightful comments. It is true that most of the imaging depth in this study was limited to approximately 200 μm, and thus it could not achieve whole-liver three-dimensional imaging comparable to light-sheet microscopy. However, the primary focus of our study was to resolve the microscopic intrahepatic architecture, particularly the spatial relationships among blood vessels, bile ducts, and nerve fibers. Through high-resolution imaging of thick tissue sections, combined with MCNP-based multichannel labeling and immunofluorescence co-staining, we were able to accurately delineate the three-dimensional distribution of these microstructures within localized regions.

      In addition to thick-section imaging, we also obtained whole-lobe dye perfusion data (as shown in Figure S1F), which comprehensively depict the three-dimensional branching patterns and distribution of the vascular systems within the liver lobe. These images were acquired from intact liver lobes perfused with MCNP dyes, revealing a continuous vascular network extending from major trunks to peripheral branches, thereby demonstrating that our approach is also capable of achieving organ-level visualization.

      We have added this image and a corresponding description in the revised manuscript to more comprehensively present the coverage of our imaging system, and we have incorporated this clarification into the Discussion section.

      Method

      (5) More information may be needed about MCNPs:

      a) As reported, there are nanoparticles with different colors in brightfield microscopy, but the particles are also excitable in fluorescence microscopy. Would you please provide a summary about excitation/emission wavelengths of the different MCNPs? This is crucial to understand to what extent the method is compatible with fluorescence immunohistochemistry.

      We thank the reviewer for the careful attention and professional suggestion. We fully agree that this issue is critical for evaluating the compatibility of our method with fluorescent immunohistochemistry. Different types of metal compound nanoparticles (MCNPs) have clearly distinguishable spectral properties:

      - MCNP-Green and MCNP-Yellow: AF488-matched spectra, with excitation/emission wavelengths of 495/519 nm.

      - MCNP-Pink: Designed for far-red spectra, with excitation/emission wavelengths of 561/640 nm.

      - MCNP-Black: Non-fluorescent, appearing black under bright-field microscopy only.

      The above information has been added to the Materials and Methods section.
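
      To make the compatibility question concrete, the listed values can be grouped by fluorescence channel. The following is a minimal sketch that uses only the excitation/emission peaks quoted above; the dictionary and function names are illustrative assumptions, not part of the manuscript:

```python
from collections import defaultdict

# Excitation/emission peaks (nm) as listed in the response; None = non-fluorescent.
MCNP_SPECTRA = {
    "MCNP-Green": (495, 519),
    "MCNP-Yellow": (495, 519),
    "MCNP-Pink": (561, 640),
    "MCNP-Black": None,
}


def group_by_channel(spectra):
    """Group dyes that share identical excitation/emission peaks."""
    channels = defaultdict(list)
    for dye, peaks in spectra.items():
        if peaks is not None:  # skip bright-field-only dyes
            channels[peaks].append(dye)
    return dict(channels)


channels = group_by_channel(MCNP_SPECTRA)
for (ex, em), dyes in channels.items():
    print(f"{ex}/{em} nm: {', '.join(dyes)}")
```

      On these values, MCNP-Green and MCNP-Yellow fall into the same fluorescence channel, so they would presumably have to be told apart by their bright-field colour, and an AF488-conjugated antibody would share that channel as well.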

      b) Also, is there more systematic information available concerning the advantage of these particles compared to 'traditional' fluorescence dyes, such as Alexa fluor or Cy-dyes, in fluorescence microscopy and concerning their compatibility with various tissue clearing methods (e.g., with the frequently used organic-solvent-based methods)?

      We thank the reviewer for the detailed question. Compared with conventional organic fluorescent dyes, MCNP offers the following advantages:

      - Enhanced photostability: Its inorganic core-shell structure resists fading even after hydrogen peroxide bleaching.

      - High signal stability: Fluorescence is maintained during aqueous-based clearing (e.g., CUBIC) and multiple rounds of staining without quenching.

      We appreciate the reviewer’s suggestion. In our Liver-CUBIC system, MCNP nanoparticles exhibited excellent multi-channel labeling stability and fluorescence signal retention. Regarding compatibility with other clearing methods (e.g., SCAFE, SeeDB, CUBIC), since these methods have limited effectiveness for whole-liver clearing (see Figure 2 of Tainaka, et al. 2014) and cannot meet the requirements for high-resolution microstructural imaging in this study, we consider further testing of their compatibility unnecessary.

      In summary, MCNP dye demonstrates superior signal stability and spectral separation compared with conventional organic fluorescent dyes in multi-channel, long-term, high-transparency three-dimensional tissue imaging.

      c) When you perfuse these particles, to which structures do they bind inside the ducts (vessels, bile ducts)? Is the 48h post-fixation enough to keep them inside the tubes/bind them to the vessel walls? Is there any 'wash-out' during the complex cutting/staining procedure? E.g., in Figure 2D: the 'classical' hepatic artery in the portal triad is not visible - but the MCNP apparently penetrated to the adjacent sinusoids at the edge of the lobulus. Also, in Figure 3B, there is a significant mismatch between the MCNP-green (bile duct) signal and the CK19 (epithelium marker) immunostaining. Please discuss these.

      The experimental results showed that following injection, MCNP nanoparticles primarily remained within the vascular and biliary lumens, and their tissue distribution depended on physical perfusion. No dye signal was observed to diffuse into the surrounding parenchyma, nor did the particles adhere to cell surfaces or enter cells. The newly added Supplementary Figures S2A–H further confirm this feature: the dye signal is strictly confined within the lumens, clearly delineating continuous vascular paths and biliary branching patterns, strongly supporting the conclusion that “MCNP dye is distributed only within luminal spaces.”

      Thus, MCNP dye mainly serves as an intraluminal tracer rather than a label for specific cell types.

      We provide the following explanations and analyses regarding MCNP distribution in the hepatic vascular and biliary systems and its post-fixation stability:

      - Potential signal displacement during sectioning/immunostaining: During slicing and immunostaining, a small number of particles may be washed away due to mechanical cutting or washing steps; however, the overall three-dimensional structure retains high spatial fidelity.

      - Observation in Figure 2D: MCNP was seen entering the sinusoidal spaces at the lobule periphery, but hepatic arteries were not visible, likely due to limitations in section thickness. Although arteries were not apparent in this slice, arterial distribution around the portal vein is visible in Figure 2C. It should be noted that Figures 2C, D, and E do not represent whole-liver imaging, so not all regions necessarily contain visible hepatic arteries. For easier identification, the main hepatic artery trunk is highlighted in cyan in Figure 2E.

      - Incomplete biliary signal in Figure 3B: This may be because CK19 labeling only covers biliary epithelial cells, whereas MCNP-green distributes throughout the biliary lumen. In Figure 3B, the terminal MCNP-green signal exhibits irregular polygonal structures, which we interpret as the canalicular regions.

      (6) Which fixative was used for 48h of postfixation (step 6) after MCNP injections?

      After MCNP injection, mouse livers were post-fixed in 4% paraformaldehyde (PFA) for 48 hours. This fixation condition effectively “locks” the MCNP particles within the vascular and biliary lumens, maintaining their spatial positions, while also being compatible with subsequent sectioning and multi-channel immunostaining analyses.

      The above information has been added to the Materials and Methods section.

      (7) What is the 'desired thickness' in step 7? In the case of immunostained tissue, a 200-micrometer slice thickness is mentioned. However, based on the Methods, it is not completely clear what the actual thickness of the tissue was that was examined ultimately in the microscopes, and whether or not the clearing preceded the cutting or vice versa.

      We appreciate the reviewer’s question. The “desired thickness” referred to in step 7 of the manuscript corresponds to the thickness of tissue sections used for immunostaining and high-resolution microscopic imaging, which is typically around 200 µm. We selected 200 µm because this thickness is sufficient to observe the PLC structure in its entirety, allows efficient staining, and preserves tissue architecture well. Other researchers may choose different section thicknesses according to their experimental needs.

      In this study, the processing order for immunostained tissue samples was sectioning followed by clearing, as detailed below:

      Section Thickness

      To ensure antibody penetration and preservation of three-dimensional structure, tissue sections were typically cut to ~200 µm. Thicker sections can be used if more complete three-dimensional structures are required, but adjustments may be needed based on antibody penetration and fluorescence detection conditions.

      Clearing Sequence

      After sectioning, slices were processed using the Liver-CUBIC aqueous-based clearing system.

      (8) More information is needed concerning the 'deep-focus microscopy' (Keyence), the applied confocal system, and the THUNDER 'high resolution imaging system': basic technical information, resolutions, objectives (N.A., working distance), lasers/illumination, filters, etc.

      Imaging Systems and Settings

      VHX-6000 Extended Depth-of-Field Microscope: Objective: VH-Z100R, 100×–1000×; resolution: 1 µm (typical); illumination: coaxial reflected; transmitted illumination on platform: ON.

      Zeiss Confocal Microscope (980): Objectives: 20× or 40×; image size: 1024 × 1024. Fluorescence detection was set up in three channels:

      - Channel 1: 639 nm laser, excitation 650 nm, emission 673 nm, detection range 673–758 nm, corresponding to Cy5-T1 (red).

      - Channel 2: 561 nm laser, excitation 548 nm, emission 561 nm, detection range 547–637 nm, corresponding to Cy3-T2 (orange).

      - Channel 3: 488 nm laser, excitation 493 nm, emission 517 nm, detection range 490–529 nm, corresponding to AF488-T3 (green).

      Leica THUNDER Imager 3D Tissue: Fluorescence detection in two channels:

      - Channel 1: FITC channel (excitation 488 nm, emission ~520 nm).

      - Channel 2: Orange-red channel (excitation/emission 561/640 nm).

      Equipped with matching filter sets to ensure signal separation.

      The above information has been added to the Materials and Methods section.
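
      As an illustrative cross-check of the confocal settings listed above, the following minimal sketch tests whether the three reported detection windows overlap. The dictionary layout and the function name `overlapping_pairs` are assumptions introduced here for illustration; they are not part of the acquisition software or of the authors' workflow.

```python
# Detection windows (nm) as reported for the three confocal channels.
# The data structure itself is a hypothetical convenience, not an instrument export.
CHANNELS = {
    "Cy5-T1 (red)": (673, 758),
    "Cy3-T2 (orange)": (547, 637),
    "AF488-T3 (green)": (490, 529),
}


def overlapping_pairs(channels):
    """Return pairs of channel names whose detection windows overlap."""
    # Sort by lower bound so pairs are reported in a stable order.
    names = sorted(channels, key=lambda name: channels[name][0])
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            lo_a, hi_a = channels[first]
            lo_b, hi_b = channels[second]
            # Two closed intervals overlap when each starts before the other ends.
            if lo_b <= hi_a and lo_a <= hi_b:
                pairs.append((first, second))
    return pairs


print(overlapping_pairs(CHANNELS))  # [] for the windows listed above
```

      An empty result only indicates that the nominal detection ranges are disjoint; it does not by itself rule out spectral bleed-through, which also depends on dye emission tails and filter characteristics.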

      (9) Liver-CUBIC, step 2: which lobe(s) did you clear (...whole liver lobes...).

      In this study, all liver lobes (left, right, caudate, and quadrate lobes) were subjected to Liver-CUBIC aqueous-based clearing to ensure uniform visualization of MCNP fluorescence and immunolabeling throughout the three-dimensional imaging of the entire liver.

      The above information has been added to the Materials and Methods section.

      (10) For the DAB and TSA IHC stainings, did you use free-floating slices, or did you mount the vibratome sections and do the staining on mounted sections?

      In this study, fixed livers were first sectioned into thick slices (~200 µm) using a vibratome. Subsequently, DAB and TSA immunohistochemical (IHC) staining were performed on free-floating sections. During the entire staining process, the slices were kept floating in the solutions, ensuring thorough antibody penetration in the thick sections while preserving the three-dimensional tissue architecture, thereby facilitating multiple rounds of staining and three-dimensional imaging.

      (11) Regarding the 'transmission quantification': this was measured on 1 mm thick slices. While it is interesting to make a comparison between different clearing methods in general, one must note that it is relatively easy to clear 1mm thick tissue slices with almost any kind of clearing technique and in any tissues. The 'real' differences come with thicker blocks, such as >5mm in the thinnest dimension. Do you have such experiences (e.g., comparison in whole 'left lateral liver lobes')?

      In this study, we performed three-dimensional visualization of entire liver lobes to depict the distribution of MCNPs and the overall spatial architecture of the vascular and biliary systems (Figure S1F). However, due to the limitations of the plate reader and fluorescence imaging systems in terms of spatial resolution and light penetration depth, quantitative analyses were conducted only on tissue sections approximately 1 mm thick.

      Regarding the comparative quantification of different clearing methods, as the reviewer noted, nearly all aqueous- or organic solvent–based clearing techniques can achieve relatively uniform transparency in 1 mm-thick tissue sections, so differences at this thickness are limited. We have not yet conducted systematic comparisons on whole-lobe sections thicker than 5 mm and therefore cannot provide “true” difference data for thicker tissues.

      (12) There is no method description for the ELMI studies in the Methods.

      Transmission Electron Microscopy (TEM) Analysis of MCNPs

      Before imaging, the MCNP dye solution was centrifuged at 14,000 × g for 10 minutes at 4 °C to remove aggregates and impurities. The supernatant was collected, diluted 50-fold, and 3–4 μL of the sample was applied onto freshly glow-discharged Quantifoil R1.2/1.3 copper grids (Electron Microscopy Sciences, 300 mesh). The sample was allowed to sit for 30 seconds to enable particle adsorption, after which excess liquid was gently wicked away with filter paper and the grid was air-dried at room temperature. The sample was then negatively stained with 1% uranyl acetate for 30 seconds and air-dried again before imaging.

      Negative-stain TEM images were acquired using a JEOL JEM-1400 transmission electron microscope operating at 120 kV and equipped with a CCD camera. Data acquisition followed standard imaging conditions.

      The above information has been added to the Materials and Methods section.

      (13) Please, provide a method description for the applied CCl4 cirrhosis model. This is completely missing.

      (1) Under a fume hood, carbon tetrachloride (CCl₄) was dissolved in corn oil at a 1:3 volume ratio to prepare a working solution, which was filtered through a 0.2 μm filter into a 30 mL glass vial. In our laboratory, to mimic chronic injury, mice in the experimental group were intraperitoneally injected at a dose of 1 mL/kg body weight per administration.

      (2) Mice were carefully removed from the cage and placed on a scale to record body weight for calculation of the injection volume.

      (3) The needle cap was carefully removed, and the required volume of the pre-prepared CCl₄ solution was drawn into the syringe. The syringe was gently flicked to remove any air bubbles.

      (4) Mice were placed on a textured surface (e.g., wire cage) and restrained. When the mouse was properly positioned, ideally with the head lowered about 30°, the left lower or right lower abdominal quadrant was identified.

      (5) Holding the syringe at a 45° angle, with the bevel facing up, the needle was inserted approximately 4–5 mm into the abdominal wall, and the calculated volume of CCl₄ was injected.

      (6) Mice were returned to their cage and observed for any signs of discomfort.

      (7) Needles and syringes were disposed of in a sharps container without recapping. A new syringe or needle was used for each mouse.

      (8) To establish a progressive liver fibrosis model, injections were administered twice per week (e.g., Monday and Thursday) for 3 or 6 consecutive weeks (n=3 per group). Control mice were injected with an equal volume of corn oil for 3 or 6 weeks (n=3 per group).

      (9) Forty-eight hours after the last injection, mice were euthanized by cervical dislocation, and livers were rapidly harvested. Portions of the liver were processed for paraffin embedding and histological sectioning, while the remaining tissue was either immediately frozen or used for subsequent molecular biology analyses.

      The above information has been added to the Materials and Methods section.
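
      The dosing arithmetic in the protocol above can be sketched as follows. This is a hedged illustration, not the authors' calculation: it assumes that the 1 mL/kg dose refers to the working solution, and that the "1:3 volume ratio" means one part CCl₄ to three parts corn oil (25% v/v). All function names are hypothetical.

```python
def injection_volume_ul(body_weight_g, dose_ml_per_kg=1.0):
    """Injection volume in microlitres for a mouse of the given weight in grams.

    1 mL/kg is numerically equal to 1 uL/g, so no further unit conversion is needed.
    """
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return dose_ml_per_kg * body_weight_g


def ccl4_fraction(ccl4_parts=1, oil_parts=3):
    """Volume fraction of CCl4, ASSUMING '1:3' means 1 part CCl4 to 3 parts oil."""
    return ccl4_parts / (ccl4_parts + oil_parts)


def total_injections(weeks, per_week=2):
    """Number of injections for a twice-weekly schedule over the given weeks."""
    return weeks * per_week


volume = injection_volume_ul(25.0)      # 25 g mouse
print(volume)                           # 25.0 (uL of working solution)
print(volume * ccl4_fraction())         # 6.25 (uL of CCl4 under the 25% v/v assumption)
print(total_injections(3), total_injections(6))  # 6 12
```

      If the 1 mL/kg dose instead refers to pure CCl₄, the injected volume of working solution would be four times larger under the same dilution assumption, which is why the interpretation should be stated explicitly in the Methods.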

      (14) Please provide a method description for the quantifications reported in Figures 5D, 5F, and 6E.

      ImageJ software was used to analyze 3D stained images (Figs. 5F, 6E), and the ultra-depth-of-field 3D analysis module was used to analyze 3D DAB images (Fig. 5D). The specific steps are as follows:

      Figure 5D: DAB-stained 3D images from the control group and the CCl<sub>4</sub> 6-week (CCl<sub>4</sub>-6W) group were analyzed. For each group, 20 terminal bile duct branch nodes were randomly selected, and the actual path distance along the branch to the nearest portal vein surface was measured. All measurements were plotted as scatter plots to reflect the spatial extension of bile ducts relative to the portal vein under different conditions.

      Figure 5F: TSA 3D multiplex-stained images from the control group, CCl<sub>4</sub> 3-week (CCl<sub>4</sub>-3W), and CCl<sub>4</sub> 6-week (CCl<sub>4</sub>-6W) groups were analyzed. For each group, 5 terminal bile duct branch nodes were randomly selected, and the actual path distance along the branch to the nearest portal vein surface was measured. Measurements were plotted as scatter plots to illustrate bile duct spatial extension.

      Figure 6E: TSA 3D multiplex-stained images from the control, CCl<sub>4</sub>-3W, and CCl<sub>4</sub>-6W groups were analyzed. For each group, 5 terminal nerve branch nodes were randomly selected, and the actual path distance along the branch to the nearest portal vein surface was measured. Scatter plots were generated to depict the spatial distribution of nerves under different treatment conditions.
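
      The "actual path distance along the branch" used in these measurements differs from a straight-line distance. The sketch below illustrates the distinction, assuming a branch has been traced as an ordered list of 3D points; the coordinates are invented purely for illustration, and the function name is hypothetical rather than an ImageJ or Keyence routine.

```python
import math


def path_length(points):
    """Sum of straight segment lengths along an ordered 3D polyline."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


# Hypothetical trace (in micrometres) from a terminal branch node
# to the nearest point on the portal vein surface.
trace = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]

print(path_length(trace))               # 17.0, distance along the branch
print(math.dist(trace[0], trace[-1]))   # 13.0, straight-line distance
```

      Because a curved or branching structure always has a path length at least as long as its end-to-end distance, reporting which of the two was measured matters when comparing groups.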

      (15) Please provide a method description for the human liver samples you used in Figure S6. Patient data, fixation, etc...

      The human liver tissue samples shown in Figure S6 were obtained from adjacent non-tumor liver tissues resected during surgical operations at West China Hospital, Sichuan University. All samples used were anonymized archived tissues, which were applied for scientific research in accordance with institutional ethical guidelines and did not involve any identifiable patient information. After being fixed in 10% neutral formalin for 24 hours, the tissues were routinely processed for paraffin embedding (FFPE), and sectioned into 4 μm-thick slices for immunostaining and fluorescence imaging.

      Results

      (16) While it is stated in the Methods that certain color MCNPs were used for labelling different structures (i.e., yellow: hepatic artery; green: bile duct; portal vein: pink; central veins: black), in some figures, apparently different color MCNPs are used for the respective structures. E.g., in Figure 1J, the artery is pink and the portal vein is green. Please clarify this.

      The color assignment of MCNP dyes is not fixed across different experiments or schematic illustrations. MCNP dyes of different colors are fundamentally identical in their physical and chemical properties and do not exhibit specific binding or affinity for particular vascular structures. We select different colors based on experimental design and imaging presentation needs to facilitate distinction and visualization, thereby enhancing recognition in 3D reconstruction and image display. Therefore, the color labeling in Figure 1F is primarily intended to illustrate the distribution of different vascular systems, rather than indicating a fixed correspondence to a specific dye or injection color.

      (17) In Figure 1J, the hepatic artery is extremely shrunk, while the portal vein is extremely dilated - compared to the physiological situation. Does it relate to the perfusion conditions?

      We appreciate the reviewer’s attention to this point. Under normal physiological conditions, the hepatic arteries labeled by CD31 are naturally narrow. The relatively thin hepatic arteries and thicker portal veins shown in Figure 1J are therefore normal and unrelated to the perfusion conditions (see Figure 1E of Adori et al., 2021).

      (18) Re: MCNP-black labelled 'oval fenestrae': the Results state 50-100 nm, while they are apparently 5-10-micron diameter in Figure 1I. Accordingly, the comparison with the ELMI studies in the subsequent paragraph is inappropriate.

      We thank the reviewer for the correction. The previously stated value of 50–100 nm was a typographical error. The “oval fenestrae” marked by MCNP-black are 5–10 μm in diameter, consistent with what is shown in Figure 1I.

      (19) Please, correct this erroneous sentence: 'Pink marked the hepatic arterial system by injection extrahepatic duct (Figure 2B).'

      Original sentence: “The hepatic arterial system was labeled in pink by injection through the extrahepatic duct (Figure 2B).”

      Revised sentence: “The hepatic arterial system was labeled in pink by injection through the left ventricle (Figure 2B).”

      (20) How do you define the 'primary portal vein tract'?

      We thank the reviewer for the question. The term “primary portal vein tract” refers to the first-order branches of the portal vein that enter the liver from the hepatic hilum. These are the major branches arising directly from the main portal vein trunk and are responsible for supplying blood to the respective hepatic lobes. This definition corresponds to the concept of the first-order portal vein in hepatic anatomy.

      (21) I am concerned that the 'periportal lamellar complex (PLC)' that the Authors describe really exists as a distinct anatomical or functional unit. I also see these in 3D scans - in my opinion, these are fine, lower-order portal vein branches that connect the portal veins to the adjacent sinusoid. The strong MCNP-labelling of these structures may be caused by the 'sticking' of the perfused MCNP solutions in these 'pockets' during the perfusion process. What do these structures look like with SMA or CD31 immunostaining? Also, one may consider that the anatomical evaluation of these structures may have limitations in tissue slices. Have you ever checked MCNP-perfused, cleared full liver lobes in light sheet microscope scans? I think this would be very useful to have a comprehensive morphological overview. Unfortunately, based on the presented documentation, I am also not convinced that PLCs 'co-localize' with fine terminal bile duct branches (Figure 3E, S3C), or with TH+ 'neuronal bead chain networks' (Fig 6C). More detailed and more convincing documentation is needed here.

      We thank the reviewer for the detailed comments. Regarding the existence and function of the periportal lamellar complex (PLC), our observations are based on MCNP-Pink labeling of the portal vein, through which we were able to identify the PLC structure surrounding the portal branches. It should be noted that the PLC represents a very small anatomical structure. Although we have not yet performed light-sheet microscopy scanning, we anticipate that such imaging would primarily visualize larger portal vein branches. Nevertheless, this does not affect our overall conclusions.

      We also appreciate the reviewer’s suggestion that the observed structures might result from MCNP adherence during perfusion. To verify the structural characteristics of the PLC, we performed immunostaining for SMA and CD31, which revealed a specific arrangement pattern of smooth muscle and endothelial markers rather than simple perfusion-induced deposition (Figures 4F and S6B).

      Regarding the apparent colocalization of the PLC with terminal bile duct branches (Figures 3E and S3C) and TH⁺ neuronal bead-like networks (Figure 6C), we acknowledge that current literature evidence remains limited. Therefore, we have carefully described these observations as possible spatial associations rather than definitive conclusions. Future studies integrating high-resolution three-dimensional imaging with functional analyses will help to further clarify the anatomical and physiological significance of the PLC.

      (22) 'Extended depth-of-field three-dimensional bright-field imaging revealed a strict 1:1 anatomical association between the primary portal vein trunk (diameter 280 ± 32 μm) and the first-order bile duct (diameter 69 ± 8 μm) (Figures 3A and S3A)'.

      How do you define '1:1 anatomical association'? How do you define and identify the 'order' (primary, secondary) of vessel and bile duct branches in 200-micrometer slices?

      We thank the reviewer for the question. In this study, the term “1:1 anatomical association” refers to the stable paired spatial relationship between the main portal vein trunk and its corresponding primary bile duct within the same portal territory. In other words, each main portal vein branch is accompanied by a primary bile duct of matching branching order and trajectory, together forming a “vascular–biliary bundle.”

      The definitions of “primary” and “secondary” branches were based on extended-depth 3D bright-field reconstructions, considering both branching hierarchy and vessel/duct diameters: primary branches arise directly from the main trunk at the hepatic hilum and exhibit the largest diameters (averaging 280 ± 32 μm for the portal vein and 69 ± 8 μm for the bile duct), whereas secondary branches extend from the primary branches toward the lobular interior with smaller calibers.

      (23) In my opinion, the applied methodical approach in the single cell transcriptomics part (data mining in the existing liver single cell database and performing Venn diagram intersection analysis in hepatic endothelial subpopulations) is largely inappropriate and thus, all the statements here are purely speculative. In my opinion, to identify the molecular characteristics of such small and spatially highly organized structures like those fine radial portal branches, the only way is to perform high-resolution spatial transcriptomics.

      We thank the reviewer for the comment. We fully acknowledge the importance of high-resolution spatial transcriptomics in identifying the fine structural characteristics of portal vein branches. Due to current funding and technical limitations, we were unable to perform such high-resolution spatial transcriptomic analyses. However, we validated the molecular features of the PLC using another publicly available liver single-cell RNA-sequencing dataset, which provided preliminary supporting evidence (Figures S6B and S6C). In the manuscript, we have carefully stated that this analysis is exploratory in nature and have avoided overinterpretation. In future studies, high-resolution spatial omics approaches will be invaluable for more precisely delineating the molecular characteristics of these fine structures.

      (24) 'How the autonomic nervous system regulates liver function in mice despite the apparent absence of substantive nerve fiber invasion into the parenchyma remains unclear.'

      Please consider the role of gap junctions between hepatocytes (e.g., Miyashita, 1991; Seseke, 1992).

      In this study, we analyzed the spatial distribution of hepatic nerves in mice using immunofluorescence staining and found that nerve fibers were almost exclusively confined to the portal vein region (Figure S6A). Notably, this distribution pattern differs markedly from that in humans. Previous studies have shown that, in human livers, nerves are not only located around the portal veins but also present along the central veins, interlobular septa, and within the parenchymal connective tissue (Miller et al., 2021; Yi, la Fleur, Fliers & Kalsbeek, 2010).

      Further research has provided a physiological explanation for this interspecies difference: even among species with distinct sympathetic innervation patterns in the parenchyma—i.e., with or without direct sympathetic input—the sympathetic efferent regulatory functions may remain comparable (Beckh, Fuchs, Ballé & Jungermann, 1990). This is because signals released from aminergic and peptidergic nerve terminals can be transmitted to hepatocytes through gap junctions as electrical signals (Hertzberg & Gilula, 1979; Jensen, Alpini & Glaser, 2013; Seseke, Gardemann & Jungermann, 1992; Taher, Farr & Adeli, 2017).

      However, the scarcity of nerve fibers within the mouse hepatic parenchyma suggests that the mechanisms by which the autonomic nervous system regulates liver function in mice may differ from those in humans. This observation prompted us to further investigate the potential role of PLC endothelial cells in this process.

      (25) Please, correct typos throughout the text.

      We thank the reviewer for this comment. We have carefully proofread the entire manuscript and corrected all typographical errors and minor language issues throughout the text.

      Reviewer #3 (Recommendations for the authors):

      (1) A strong recommendation - the authors ought to challenge their scRNA-seq re-analysis with another scRNA-seq dataset, namely a recently published atlas of adult liver endothelial, but also mesenchymal, immune, and parenchymal cell populations https://pubmed.ncbi.nlm.nih.gov/40954217/, performed with the Smart-seq2 approach, which is perfectly suitable as it brings higher resolution data, and extensive cluster identity validation with stainings. Pietilä et al. indicate a clear distinction of portal vein endothelial cells into two populations that express Adgrg6, Jag1 (e2c), from Vegfc double-positive populations (e5c and e2c). Moreover, the dataset also includes the arterial endothelial cells that were shown to be part of the PLC, but were not followed up with the scRNA-seq analysis. This distinction could help the authors to further validate their results, better controlling for cross-contaminations that may occur during scRNA-seq preparation.

      We thank the reviewer for the valuable suggestion. As noted, we have further validated the molecular characteristics of the PLC using a recently published atlas of adult liver endothelial cells (Pietilä et al., 2023, PMID: 40954217). This dataset, generated using the Smart-seq2 technique, provides high-resolution transcriptomic profiles. By analyzing this dataset, we identified a CD34⁺LY6A⁺ portal vein endothelial cell population within the e2 cluster, which is localized around the portal vein. We then examined pathways and gene expression patterns related to hematopoiesis, bile duct formation, and neural signaling within these cells. The results revealed gene enrichment patterns consistent with those observed in our primary dataset, further supporting the robustness of our analysis of the PLC’s molecular characteristics.
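
      The selection of a double-positive population within a cluster, as described above, can be sketched in a few lines. This is only a toy illustration under stated assumptions: it is not the authors' analysis code, the cell records and expression values are invented, the threshold is arbitrary, and the gene symbols `Cd34` and `Ly6a` (the gene encoding Sca-1) are used as the marker names.

```python
def double_positive(cells, genes=("Cd34", "Ly6a"), cluster=None, threshold=0.0):
    """Return IDs of cells expressing every gene in `genes` above `threshold`.

    `cells` maps a cell ID to {"cluster": str, "expr": {gene: value}}.
    If `cluster` is given, only cells annotated with that cluster are considered.
    """
    selected = []
    for cell_id, record in cells.items():
        if cluster is not None and record["cluster"] != cluster:
            continue
        # A gene missing from the record is treated as not expressed.
        if all(record["expr"].get(gene, 0.0) > threshold for gene in genes):
            selected.append(cell_id)
    return selected


# Invented toy records, for illustration only.
toy_cells = {
    "cell_1": {"cluster": "e2", "expr": {"Cd34": 2.1, "Ly6a": 1.4}},
    "cell_2": {"cluster": "e2", "expr": {"Cd34": 0.0, "Ly6a": 3.0}},
    "cell_3": {"cluster": "e5", "expr": {"Cd34": 1.0, "Ly6a": 1.0}},
}

print(double_positive(toy_cells, cluster="e2"))  # ['cell_1']
```

      In a real re-analysis the threshold choice and normalization would need to be reported, since the size of a "double-positive" population is sensitive to both.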

      (2) Improving the methods section is highly recommended, this includes more detailed information for material and protocols used - catalog numbers; protocol details of the usage - rocking platforms, timing, and tubes used for incubations; GitHub or similar page with code used for the scRNA seq re-analysis.

      We thank the reviewer for the valuable suggestion. We have added more detailed information regarding the materials and experimental procedures in the Methods section, including catalog numbers, incubation conditions (such as the type of shaker, incubation time, and tube specifications), and other relevant parameters.

      (3) In Figure 2A, the authors claim the size of the nanoparticle is 100nm, while based on the image, the size is ~150-180nm. A more thorough quantification of the particle size would help users estimate the usability of their method for further applications.

      We thank the reviewer for the comment. In the TEM image shown in Figure 2A, the nanoparticles indeed appear to be approximately 150–200 nm in size. We have re-verified the particle dimensions and will update the corresponding description in the Methods section to allow readers to more accurately assess the applicability of this approach.

      (4) In Figure 3E, it is not clear what is labeled by the pink signal. Please consider labeling the structures in the figure.

      We thank the reviewer for the valuable comment. The pink signal in Figure 3E was originally intended to label the hepatic artery. However, a slight spatial misalignment occurred during the labeling process, making its position appear closer to the central vein rather than the portal vein in the image. To avoid misunderstanding, we will add clear annotations to the image and clarify this deviation in the figure legend in the revised version. It should also be noted that this figure primarily aims to illustrate the spatial relationship between the bile duct and the portal vein, and this minor deviation does not affect the reliability of our experimental conclusions.

      (5) The following statement is not backed by quantification as it ought to be: “Dual-channel three-dimensional confocal imaging combined with CK19 immunostaining revealed that the sites of dye leakage did not coincide with the CK19-positive terminal bile duct epithelium, but instead were predominantly localized within regions adjacent to the PLC structures”.

      We thank the reviewer for the valuable comment. We have added the corresponding quantitative analysis to support this conclusion. Quantitative assessment of the extended-depth imaging data revealed that dye leakage predominantly occurred in regions adjacent to the PLC structure, rather than in the perivenous sinusoidal areas. The corresponding results have been presented in the revised Figure 3G.

      (6) Similarly, Figure 4F is central to the Sca1CD34 cell type identification but lacks any quantification; providing it would strengthen the key statement of the article. A possible way to approach this is also by FACS sorting the double-positive cells and bulk/qRT validation.

      We thank the reviewer for raising this point. We agree that quantitative validation of the Sca1⁺CD34⁺ population by FACS sorting could further support our conclusions. However, the primary focus of this study is on the spatial localization and transcriptional features of PLC endothelial cells. The identification of the Sca1⁺CD34⁺ subset is robustly supported by multiple complementary approaches, including three-dimensional imaging, co-staining with pan-endothelial markers, and projection mapping analyses. Collectively, these lines of evidence provide a solid basis for characterizing this unique endothelial population.

      (7) The images in Figure S4D are not comparable, as the Sca1-stained image shows a longitudinal section of the PV, but the other stainings are cross-sections of PVs.

      We thank the reviewer for the careful comment. We agree that the original Sca1-stained image, being a longitudinal section of the portal vein, was not optimal for direct comparison with other cross-sectional images. We have replaced it with a cross-sectional image of the portal vein to ensure comparability across all images. The updated image has been included in the revised Supplementary Figure S4D.

      (8) I might be wrong, but Figure 4J is entirely missing, and only a cartoon is provided. Either remove the results part or provide the data.

      We appreciate the reviewer’s careful observation. Figure 4J was intentionally designed as a schematic illustration to summarize the structural relationships and spatial organization of the portal vein, hepatic artery, and PLC identified in the previous panels (Figures 4A–4I). It does not represent newly acquired experimental data, but rather serves to provide a conceptual overview of the findings.

      To avoid misunderstanding, we have clarified this point in the figure legend and the main text, stating that Figure 4J is a schematic summary rather than an experimental image. Therefore, we respectfully prefer to retain the schematic figure to aid readers’ interpretation of the preceding results.

      (9) The methods section lacks information about the CCl4 concentration, and it is thus hard to estimate the dosage of CCl4 received (ml/kg). This is important for the interpretation of the severity of the fibrosis and presence of cirrhosis, as different doses may or may not lead to cirrhosis within the short regimen performed by the authors [PMID: 16015684 DOI: 10.3748/wjg.v11.i27.4167]. Validation of the fibrosis/cirrhosis severity is, in this case, crucial for the correct interpretation of the results. If the level of cirrhosis is not confirmed, only progressive fibrosis should be mentioned in the manuscript, as these two terms cannot be used interchangeably.

      We thank the reviewer for the comment. The concentration of carbon tetrachloride (CCl<sub>4</sub>) was indeed omitted from the Methods section. In our experiments, mice received intraperitoneal injections of CCl<sub>4</sub> at a dose of 1 mL/kg body weight, twice per week, for a total of six weeks. We have revised the manuscript accordingly, using the term “progressive fibrosis” to avoid confusion between fibrosis and cirrhosis.

      (10) The following statement is not backed by any correlation analysis: "Particularly during liver fibrosis progression, the PLC exhibits dynamic structural extension correlating with fibrosis severity,.. ".

      We thank the reviewer for the comment. The original statement that the “PLC correlates with fibrosis severity” lacked support from quantitative analysis. To ensure a precise description, we have revised the sentence as follows: “During liver fibrosis progression, the PLC exhibits dynamic structural extension.”

      (11) Similarly, the following statement is not followed by data that would address the impact of innervation on liver function: "How the autonomic nervous system regulates liver function in mice despite the apparent absence of substantive nerve fiber invasion into the parenchyma remains unclear.".

      This section has been revised. In this study, we analyzed the spatial distribution of nerves in the mouse liver using immunofluorescence staining. The results showed that nerve fibers were almost entirely confined to the portal vein region (Figure S6A). Notably, this distribution pattern differs significantly from that in humans. Previous studies have demonstrated that in the human liver, nerves are not only distributed around the portal vein but also present in the central vein, interlobular septa, and connective tissue of the hepatic parenchyma (Miller et al., 2021; Yi, la Fleur, Fliers & Kalsbeek, 2010).

      Previous studies have further explained the physiological basis for this difference: even among species with differences in parenchymal sympathetic innervation (i.e., species with or without direct sympathetic input), their sympathetic efferent regulatory functions may still be similar (Beckh, Fuchs, Ballé & Jungermann, 1990). This is because signals released by adrenergic and peptidergic nerve terminals can be transmitted to hepatocytes as electrical signals through intercellular gap junctions (Hertzberg & Gilula, 1979; Jensen, Alpini & Glaser, 2013; Seseke, Gardemann & Jungermann, 1992; Taher, Farr & Adeli, 2017). However, the scarcity of nerve fibers in the mouse hepatic parenchyma suggests that the mechanism by which the autonomic nervous system regulates liver function in mice may differ from that in humans. This finding also prompts us to further explore the potential role of PLC endothelial cells in this process.

      (12) Could the authors discuss their interpretation of the results in light of the fact that the innervation is lower in cirrhotic patients? https://pmc.ncbi.nlm.nih.gov/articles/PMC2871629/. Also, while ADGRG6 (Gpr126) may play important roles in liver Schwann cells, it is likely not through affecting myelination of the nerves, as the liver nerves are not myelinated https://pubmed.ncbi.nlm.nih.gov/2407769/ and https://www.pnas.org/doi/10.1073/pnas.93.23.13280.

      We have revised the text to state that although most hepatic nerves are unmyelinated, GPR126 (ADGRG6) may regulate hepatic nerve distribution via non-myelination-dependent mechanisms. Studies have shown that GPR126 exerts both Schwann cell–dependent and –independent functions during peripheral nerve repair, influencing axon guidance, mechanosensation, and ECM remodeling (Mogha et al., 2016; Monk et al., 2011; Paavola et al., 2014).

      (13) The manuscript would benefit from text curation that would:

      a) Unify the language describing the PLC, so it is clear that (if) it represents protrusions of the portal veins.

      We have standardized the description of the PLC throughout the manuscript, clearly specifying its anatomical relationship with the portal vein. Wherever appropriate, we indicate that the PLC represents protrusions associated with the portal vein, avoiding ambiguous or inconsistent statements.

      b) Increase the accuracy of the statements.

      Examples: "bile ducts, and the central vein in adult mouse livers."

      We have refined all statements for accuracy.

      c) Reduce the space given to discussion and results in the introduction, moving them to the respective parts. The same applies to the results section, where discussion occurs at more places than in the Discussion part itself.

      We have edited the Introduction, removing detailed results and functional explanations, and retaining only a concise overview.

      Examples: "The formation of PLC structures in the adventitial layer may participate in local blood flow regulation, maintenance of microenvironmental homeostasis, and vascular-stem cell interactions."

      "This finding suggests that PLC endothelial cells not only regulate the periportal microcirculatory blood flow, but also establish a specialized microenvironment that supports periportal hematopoietic regulation, contributing to stem cell recruitment, vascular homeostasis, and tissue repair. "

      "Together, these findings suggest the PLC endothelium may act as a key regulator of bile duct branching and fibrotic microenvironment remodeling in liver cirrhosis. " This one in particular would require further validation with protein stainings and similar, directly in your model.

      d) Provide a clear reference for the used scRNA seq so it's clear that the data were re-analyzed.

      Example: "single-cell transcriptomic analysis revealed significant upregulation of bile duct-related genes in the CD34<sup>+</sup>Sca-1<sup>+</sup> endothelium of PLC in cirrhotic liver, with notably high expression of Lgals1 (Galectin-1) and HGF(Figure 5G) "

      When describing the transcriptional analysis of PLC endothelial cells, we explicitly cited the original scRNA-seq dataset (Su et al., 2021), clarifying that these data were reanalyzed rather than newly generated.

      e) Introducing references for claims that, in places, are crucial for further interpretation of experiments.

      Examples: "It not only guides bile duct branching during development but also"; the authors show no data from liver development.

      Thank you for pointing this out. We have revised the relevant statement to ensure that the claim is accurate and well-supported.

      f) Results sentence "Instead, bile duct epithelial cells at the terminal ducts extended partially along the canalicular network without directly participating in the formation of the bile duct lumen." Lacks a callout to the respective Figure.

      We would like to thank the reviewers for pointing out this issue. In the revised manuscript, the relevant image (Figure 3D) has been clearly annotated with white arrows to indicate the phenomenon of terminal cholangiocytes extending along the bile canaliculi network. Additionally, the schematic diagram on the right side clearly shows the bile canaliculi, cholangiocytes, and bile flow direction using arrows and color coding, thus intuitively corresponding to the textual description.

      (14) Formal text suggestions: The manuscript text contains a lot of missed or excessive spaces and several typos that ought to be fixed. A few examples follow:

      a) "densely n organized vascular network "

      b) "analysis, while offering high spatial "

      c) "specific differences, In the human liver, "

      d) Figure 4F has a typo in the description.

      e) "generation of high signal-to-noise ratio, multi-target " SNR abbreviation was introduced earlier.

      f) Canals of Hering, CoH abbreviation comes much later than the first mention of the Canals of Hering.

      We thank the reviewer for the helpful comment regarding textual consistency. We have carefully reviewed and revised the entire manuscript to improve the accuracy, clarity, and consistency of the text.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study used explicit-solvent simulations and coarse-grained models to identify the mechanistic features that allow for the unidirectional motion of SMC on DNA. Shorter explicit-solvent models describe relevant hydrogen bond energetics, which were then encoded in a coarse-grained structure-based model. In the structure-based model, the authors mimic chemical reactions as signaling changes in the energy landscape of the assembly. By cycling through the chemical cycle repeatedly, the authors show how these time-dependent energetic shifts naturally lead SMC to undergo translocation steps along DNA that are on a length scale that has been identified.

      Strengths:

      Simulating large-scale conformational changes in complex assemblies is extremely challenging. This study utilizes highly-detailed models to parameterize a coarse-grained model, thereby allowing the simulations to connect the dynamics of precise atomistic-level interactions with a large-scale conformational rearrangement. This study serves as an excellent example for this overall methodology, where future studies may further extend this approach to investigate any number of complex molecular assemblies.

      We thank the reviewer for careful reading of our manuscript and highlighting the value of our bottom-up multiscale simulation approach.

      Weaknesses:

      The only relative weakness is that the text does not always clearly communicate which aspects of the dynamics are expected to be robust. That is, which aspects of the dynamics/energetics are less precisely described by this model? Where are the limits of the models, and why should the results be considered within the range of applicability of the models?

      We appreciate this insightful comment and agree that it is important to more explicitly describe the robustness and limitations of the simulation model used in this study. In response to this comment, we have revised the Discussion section of our manuscript.

      First, to clarify the robust aspects of our model, we have added a new subsection titled “Parametric choices and robustness of simulation model” to the Discussion, which is as follows:

      “The switching Gō approach adopted in this study is a powerful tool for providing the relationship between known large-scale conformational changes and the resulting functional and mechanical dynamics of the molecular machine (Brandani and Takada, 2018b; Koga and Takada, 2006b; Nagae et al., 2025). In this study, we mimic conformational change induced by ATP binding and hydrolysis events by instantaneously switching the potential energy function from one that stabilized a given conformation to another that stabilized a different conformation. This drives the protein to undergo a conformational transition toward the minimum of the new energy landscape.

      This approach is particularly well suited to investigate whether a given conformational change in a subunit of a molecular machine can produce the overall motion observed, and whether this process is mechanically feasible. Therefore, the fundamental mechanisms identified in this study, i.e., DNA segment capture mechanism, the correlation between step size and loop length, and the unidirectional translocation mechanism originating from the asymmetric kleisin path, can be considered as robust, as they emerge directly from the structural and topological constraints of the SMC-kleisin architecture rather than from tuned parameters.”
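      As an illustrative aside, the idea of instantaneously switching the potential energy function can be sketched with a toy model. The example below is not the authors' coarse-grained SMC model: it is a minimal one-dimensional sketch, assuming a particle in a harmonic well whose minimum is moved between three hypothetical "states" while the particle follows overdamped Langevin dynamics. All names, well positions, and parameter values are assumptions chosen for illustration.

```python
import numpy as np


def grad_harmonic(x, center, k=1.0):
    """Gradient of the toy potential U(x) = 0.5 * k * (x - center)**2."""
    return k * (x - center)


def run_switching(n_cycles=3, steps_per_state=2000, dt=0.01, kT=0.01, seed=0):
    """Overdamped Langevin dynamics with a potential that is switched instantly.

    The well centers stand in for three conformational states (hypothetical
    values). After each switch the particle relaxes toward the new minimum.
    """
    rng = np.random.default_rng(seed)
    centers = [0.0, 1.0, 2.0]  # hypothetical minima, one per state
    x = centers[0]
    traj = []
    for _ in range(n_cycles):
        for center in centers:
            for _ in range(steps_per_state):
                noise = rng.normal() * np.sqrt(2.0 * kT * dt)
                x += -grad_harmonic(x, center) * dt + noise
                traj.append(x)
    return np.array(traj)


traj = run_switching()
# Mean position over the second half of the final state (well centered at 2.0)
final_mean = traj[-1000:].mean()
```

      In this sketch the switch itself carries no kinetic information: only the relaxation after each switch is simulated, which mirrors the point that such a scheme does not yield transition rates or barrier heights.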

      Additionally, to more clearly define the limits of our model, we have expanded the "Limitations in current simulations" subsection. Specifically, we have added a detailed discussion regarding the energetics and transition pathways inherent to the switching Gō approach, which is as follows:

      “First, the use of switching potentials to trigger conformational changes imposes a limitation on predictive power for energetics and transition pathways. The switching of potentials is akin to a “vertical excitation” from one energy landscape to another, rather than a thermally activated crossing of an energy barrier. Consequently, the model cannot provide quantitative predictions of the transition rates or the free energy barriers associated with these changes. Furthermore, while the subsequent relaxation follows the new potential landscape, it is not guaranteed to reproduce the unique, physically correct transition pathway. Nevertheless, this simplification is justified because conformational changes within the protein are expected to occur on a much faster timescale than the large-scale motion of the DNA. Thus, this simplification has a limited impact on our main conclusions regarding the functional DNA dynamics driven by these large-scale conformational changes.”

      We have not made any additions regarding the timescale and dwell times for each ATP state, as these were already discussed in the original manuscript.

      Reviewer #2 (Public review):

      Summary:

      The authors perform coarse grained and all atom simulations to provide a mechanism for loop extrusion that is involved in genome compaction.

      Strengths:

      The simulations are very thoughtful. They provide insights into the translocation process, which is only one of the mechanisms. Many of the analyses are very good. Overall, the study advances the use of simulations in these complicated systems.

      We sincerely thank the reviewer for their thoughtful and encouraging comments.

      Weaknesses:

      Even the authors point out several limitations, which cannot be easily overcome in the paper because of the paucity of experimental data. Nevertheless, the authors could have done so to illustrate the main assertion that loop extrusion occurs by the motor translocating on DNA. They should mention more clearly that there are alternative theories that have accounted for a number of experimental data.

      We thank the reviewer for these constructive suggestions. As the reviewer pointed out, it is important to state more explicitly how the unidirectional DNA translocation revealed in this study relates to the widely recognized loop-extrusion hypothesis of genome organization and to situate our findings within the context of major alternative theories.

      To address this, we first clarify the relationship between the translocation mechanism we observed and the phenomenon of loop extrusion. We emphasize that our simulations were designed to elucidate the core motor activity of the SMC complex, and we explicitly state our view that loop extrusion is a functional consequence of this motor activity when the complex is anchored to DNA.

      Second, as the reviewer also suggested, we addressed in more detail the alternative models of loop extrusion that also have experimental support. We have revised the Discussion accordingly to provide a more balanced and comprehensive context. Further details are provided in our separate response to the comment below.

      Reviewer #3 (Public review):

      Summary:

      In this manuscript, Yamauchi and colleagues combine all-atom and coarse-grained MD simulations to investigate the mechanism of DNA translocation by prokaryotic SMC complexes. Their multiscale approach is well-justified and supports a segment-capture model in which ATP-dependent conformational changes lead to the unidirectional translocation of DNA. A key insight from the study is that asymmetry in the kleisin path enforces directionality. The work introduces an innovative computational framework that captures key features of SMC motor action, including DNA binding, conformational switching, and translocation.

      This work is well executed and timely, and the methodology offers a promising route for probing other large molecular machines where ATP activity is essential.

      Strengths:

      This manuscript introduces an innovative yet simple method that merges all-atom and coarse-grained, purely equilibrium, MD simulations to investigate DNA translocation by SMC complexes, which is triggered by activated ATP processes. Investigating the impact of ATP on large molecular motors like SMC complexes is extremely challenging, as ATP catalyses a series of chemical reactions that take and keep the system out of equilibrium. The authors simulate the ATP cycle by cycling through distinct equilibrium simulations where the force field changes according to whether the system is assumed to be in the disengaged, engaged, and V-shaped states; this is very clever as it avoids attempting to model the non-equilibrium process of ATP hydrolysis explicitly. This equilibrium switching approach is shown to be an effective way to probe the mechanistic consequences of ATP binding and hydrolysis in the SMC complex system.

      The simulations reveal several important features of the translocation mechanism. These include identifying that a DNA segment of ~200 bp is captured in the engaged state and pumped forward via coordinated conformational transitions, yielding a translocation step size in good agreement with experimental estimates. Hydrogen bonding between DNA and the top of the ATPase heads is shown to be critical for segment capture, as without it, translocation is shown to fail. Finally, asymmetry in the kleisin subunit path is shown to be responsible for unidirectionality.

      This work highlights how molecular simulations are an excellent complement to experiments, as they can exploit experimental findings to provide high-resolution mechanistic views currently inaccessible to experiments. The findings of these simulations are plausible and expand our understanding of how ATP hydrolysis induces directional motion of the SMC complex.

      We thank the reviewer for the thoughtful and encouraging assessment of our work. We appreciate the reviewer’s summary of our key contributions, especially our switching Gō strategy, the segment-capture mechanism of SMC translocation, and the role of kleisin-path asymmetry in ensuring unidirectionality.

      Weaknesses:

      There are aspects of the methodology and modelling assumptions that are not clear and could be better justified. The major ones are listed below:

      (1) The all-atom MD simulations involve a 47-bp DNA duplex interacting with the ATPase heads, from which key residues involved in hydrogen bonding are identified. However, DNA mechanics, including flexibility and hydrogen bond formation, are known to be sequence-dependent. The manuscript uses a single arbitrary sequence but does not discuss potential biases. Could the authors comment on how sequence variability might affect binding geometry or the number of hydrogen bonds observed?

      We thank the reviewer for this insightful comment regarding the potential effects of DNA sequence.

      The primary biological role of the SMC complex is to organize genome architecture on a global scale; as such, its fundamental interaction with DNA is considered not to be sequence-specific. Our all-atom MD simulations and analysis pipeline were designed to probe the nature of this general interaction. Our approach confirms this rationale: the analysis exclusively identified hydrogen bonds formed between amino acid residues and the phosphate groups of the DNA's sugar-phosphate backbone. As shown in Figs. 1B and 1C, the results confirm that the key stabilizing interactions occur between basic residues on the SMC head surface and the DNA backbone. Since the backbone is chemically uniform, the stable binding mode we characterized is inherently sequence-independent.

      While the final bound state is likely sequence-independent, we agree that sequence-dependent properties such as local DNA flexibility or intrinsic curvature could influence the kinetics of the binding process. For example, the rate of initial recognition or the ease of DNA bending on the head surface might vary between AT-rich and GC-rich regions. However, once the DNA is bound, we expect the stable binding geometry and the identity of the key interacting residues to be conserved across different sequences.

      Therefore, we are confident that using a single, representative DNA sequence is a valid approach for elucidating the fundamental, non-sequence-specific aspects of SMC-DNA interaction and does not alter the general validity of the translocation mechanism proposed in this work.

      (2) A key feature of the coarse-grained model is the inclusion of a specific hydrogen-bonding potential between DNA and residues on the ATPase heads. The authors select the top 15 hydrogen-bond-forming residues from the all-atom simulations (with contact probability > 0.05), but the rationale for this cutoff is not explained. Also, the strength of hydrogen bonds in coarse-grained models can be sensitive to context. How did the authors calibrate the strength of this interaction relative to electrostatics, and did they test its robustness (e.g., by varying epsilon or residue set)? Could this interaction be too strong or too weak under certain ionic conditions? What happens when salt is changed?

      Thank you for these comments. We provide our rationale for the parameter choices below.

      The contact probability cutoff of 0.05 was chosen to create a comprehensive set of residues that form physically robust interactions with DNA. To establish this robustness, we performed a parallel set of all-atom simulations using a different force field (see Fig. S2). This cross-validation revealed two key points. First, the top six residues (Arg120, Arg123, Ile63, Arg111, Arg62, and Lys56), which include experimentally confirmed DNA-binding sites, consistently exhibited the highest contact probabilities in both force fields, confirming the reliability of our identification. Second, and just as importantly, many residues with lower contact probabilities (e.g., Trp115, Tyr107, Arg105, Ser124, and Ser54) were also consistently detected across both simulations. This reproducibility suggests that these interactions are physically robust and not artifacts of a specific force field. We therefore concluded that a 0.05 cutoff is a well-balanced threshold that ensures the inclusion of not only the primary anchor residues but also the secondary, moderately interacting residues that are crucial for cooperatively stabilizing the DNA. We discuss this point in the Methods section of the revised manuscript, as follows:

      “The rationale for this cutoff is the physical robustness of the identified interactions; all-atom simulations using a different force field confirmed that the same set of key interacting residues, including both strong and moderate binders, was consistently identified (Fig. S2).”

      The strength of the hydrogen bond potential was set to ϵ = 4.0 k<sub>B</sub>T (≈2.4 kcal/mol), a physically plausible value corresponding to an ideal hydrogen bond. To test the robustness of this parameterization, we performed preliminary simulations where we varied these parameters by (i) reducing the value of ϵ and (ii) restricting the interaction to only the top six anchor residues. In both test cases, while a short DNA duplex (47 bp) could still bind to the ATPase heads, simulations with a long DNA (800 bp) failed to form a stable DNA loop after initial docking. These tests demonstrated that a larger set of cooperative interactions with a physically realistic strength was necessary for the full segment capture mechanism. Our final parameter set (15 residues at ϵ = 4.0 k<sub>B</sub>T) was thus chosen as the one required to capture both the initial anchoring of DNA and the subsequent cooperative stabilization of the captured loop.
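      As a quick check of the quoted conversion, the snippet below converts an energy given in units of thermal energy into kcal/mol. It is a minimal sketch: the temperature of 300 K is an assumption (the response does not state the temperature used for the conversion), and the function name is hypothetical.

```python
# Molar gas constant R = k_B * N_A, expressed in kcal/(mol*K)
R_KCAL_PER_MOL_K = 0.0019872041


def thermal_units_to_kcal_per_mol(energy_in_kbt, temperature_k=300.0):
    """Convert an energy in units of k_B*T to kcal/mol (temperature assumed)."""
    return energy_in_kbt * R_KCAL_PER_MOL_K * temperature_k


epsilon_kcal = thermal_units_to_kcal_per_mol(4.0)
print(f"epsilon = {epsilon_kcal:.2f} kcal/mol")
```

      Under this assumed temperature the result rounds to the ≈2.4 kcal/mol quoted above; a different simulation temperature would shift the value proportionally.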

      As correctly pointed out, ionic conditions are a critical factor. Our simulations revealed that the salt concentration had a more pronounced effect on the kinetics of the DNA finding its correct binding site than on the thermodynamic stability of the final bound state. During our parameter tuning, we found that at physiological salt conditions (150 mM), long-range electrostatic interactions become dominant. This caused the DNA to be non-specifically captured by positively charged patches on the sides of the heads, which are not the functional binding sites. This off-pathway trapping kinetically prevented the DNA from reaching its proper location within the simulation timeframe. In contrast, the high-salt conditions (300 mM) used in this study screen these long-range interactions, suppressing non-specific trapping and allowing the DNA to efficiently explore the protein surface. This enables the correct binding to be established via the specific, short-range hydrogen bonds. Therefore, the ion concentration in our model serves more as a crucial kinetic control factor for reproducing the correct binding pathway within a realistic simulation timeframe. This point is discussed in the new subsection entitled “Parametric choices and robustness of simulation model”.
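      To give a sense of how much the two salt conditions differ in screening, the sketch below evaluates the textbook Debye screening length for a monovalent (1:1) salt. It is an illustration only: the temperature (298.15 K) and the relative permittivity of water (78.4) are assumptions and may differ from the settings used in the authors' simulations, and the function name is hypothetical.

```python
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity, F/m
K_B = 1.380649e-23           # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19   # elementary charge, C
N_A = 6.02214076e23          # Avogadro constant, 1/mol


def debye_length_nm(ionic_strength_molar, temperature_k=298.15, rel_permittivity=78.4):
    """Debye length in nm; for a 1:1 salt the ionic strength equals the molarity."""
    ionic_strength_si = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    length_m = math.sqrt(
        EPS0 * rel_permittivity * K_B * temperature_k
        / (2.0 * N_A * E_CHARGE**2 * ionic_strength_si)
    )
    return length_m * 1e9


for conc in (0.150, 0.300):
    print(f"{conc * 1000:.0f} mM: Debye length = {debye_length_nm(conc):.2f} nm")
```

      Because the Debye length scales as the inverse square root of the ionic strength, doubling the salt concentration shortens the screening length by a factor of √2 regardless of the assumed temperature and permittivity.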

      (3) To enhance sampling, the translocation simulations are run at 300 mM monovalent salt. While this is argued to be physiological for Pyrococcus yayanosii, such a concentration also significantly screens electrostatics, possibly altering the interaction landscape between DNA and protein or among protein domains. This may significantly impact the results of the simulations. Why did the authors not use enhanced sampling methods to sample rare events instead of relying on a high-salt regime to accelerate dynamics?

      We agree that enhanced sampling methods are powerful for exploring rare events. However, many of these techniques require the pre-definition of a suitable, low-dimensional reaction coordinate (RC) to guide the simulation. The primary goal of our study was to discover the DNA translocation mechanism as it emerges naturally from fundamental physical interactions, without imposing a priori assumptions about the specific pathway.

      The DNA segment capture process is complex, involving the coordinated motion of a long DNA polymer and multiple protein domains. Defining a simple RC in advance was not feasible and would have carried a significant risk of biasing the system toward an artificial pathway. Therefore, to avoid such bias, we chose to perform direct, unbiased molecular dynamics simulations. Using a physiologically relevant high-salt concentration (300 mM) for Pyrococcus yayanosii was a strategy to accelerate the system's natural dynamics, allowing us to observe these unbiased trajectories within a feasible computational timescale.

      Because our current work has elucidated the fundamental steps of this mechanism, we agree that this work provides a foundation for more quantitative analyses. As suggested, future studies using methods like Markov State Model analysis or enhanced sampling techniques, guided by more sophisticated RCs defined from the insights of this work, would be a valuable next step for characterizing the free-energy landscape of the process or longer time scale dynamics.

      (4) Only a small fraction of the simulated trajectories complete successful translocation (e.g., 45 of 770 in one set), and this is attributed to insufficient simulation time. While the authors are transparent about this, it raises questions about the reliability of inferred success rates and about possible artefacts (e.g., DNA trapping in coiled-coil arms). Could the authors explore or at least discuss whether alternative sampling strategies (e.g., Markov State Models, transition path sampling) might address this limitation more systematically?

      We thank the reviewer for raising this point that is crucial for considering limitations and future directions of our study.

      As we noted in a previous response, the primary reason we did not employ such enhanced sampling methods was the limited prior knowledge available to define the previously uncharacterized DNA translocation process. Therefore, we first tried to define the key conformational states and transitions without the potential bias of a predefined model or reaction coordinate. This approach was successful, as it allowed us to identify critical on-pathway states like “DNA segment capture” and significant off-pathway or kinetically trapped states such as “DNA trapping” between the coiled-coil arms.

      We fully agree that the low success rate observed is a key finding that points to significant kinetic bottlenecks, and that a more systematic analysis is required. Having identified the essential states, applying techniques such as Markov State Models (MSMs) or transition path sampling represents a powerful and logical next step. These methods, using a state-space definition based on our findings, will enable a quantitative characterization of the free-energy landscape and the transition rates between states. This will provide a rigorous understanding of the kinetic factors, such as the depth of the trapped-state energy well, that underlie the low translocation efficiency.
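      For readers who want a feel for the statistical uncertainty on a success rate such as the 45 of 770 trajectories mentioned by the reviewer, a Wilson score interval is one simple option. The sketch below is an illustration only and assumes that the trajectories can be treated as independent trials with a common success probability, which the source does not state; the function name is hypothetical.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    p = successes / trials
    denom = 1.0 + z**2 / trials
    center = (p + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return center - half_width, center + half_width


low, high = wilson_interval(45, 770)
print(f"observed rate = {45 / 770:.3f}, 95% interval = ({low:.3f}, {high:.3f})")
```

      Under these assumptions the interval spans roughly 4% to 8%, so the observed rate is reasonably well determined statistically even though it is low; this says nothing about systematic effects such as limited simulation time.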

      In the revised manuscript, we discuss the application of these advanced sampling methods as a feasible and promising future direction, which is as follows:

      “Future studies can leverage the insights from this work to overcome the current timescale limitations. Techniques such as Markov state modeling (Husic and Pande, 2018; Prinz et al., 2011) or enhanced sampling methods (Hénin et al., 2022) may be employed to quantitatively characterize the free-energy landscape and transition rates. Such an approach would provide a rigorous understanding of the kinetic barriers, such as the stability of the trapped state, that govern the efficiency of SMC translocation.”
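      The core of a Markov state model can be illustrated with a few lines of code. The sketch below is an assumption-laden toy, not the analysis the authors propose: it takes a trajectory already assigned to discrete states (the labels 0, 1, 2 are hypothetical stand-ins for states such as "segment capture" or "trapped") and estimates a transition matrix by counting transitions at a fixed lag and normalizing each row. Dedicated MSM packages typically use more careful estimators, for example ones that enforce detailed balance.

```python
import numpy as np


def estimate_transition_matrix(dtraj, n_states, lag=1):
    """Naive row-normalized count estimate of a transition matrix."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):
        counts[i, j] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows for states that were never left stay at zero instead of dividing by zero
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


# Hypothetical discrete trajectory over three states
dtraj = [0, 0, 1, 1, 2, 0, 1, 2, 2, 0]
T = estimate_transition_matrix(dtraj, n_states=3)
print(T)
```

      Each entry T[i, j] is the estimated probability of being in state j one lag time after being in state i; rates and free-energy differences would be derived from such a matrix only after validating the choice of states and lag time.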

      Reviewer #1 (Recommendations for the authors):

      As noted in the public review, there could be a more systematic description of the limits of the model. The model appears to be carefully crafted, though every model has limits. It could be helpful for the general readership to give some idea of which parametric choices are more critical, and which mechanistic features should be robust to minor changes in parameters.

      We sincerely thank the reviewer for this constructive comment. We agree that clarifying which aspects of our model are robust and which are sensitive to specific parameter choices is crucial for the reader's understanding.

      We have expanded the Discussion to clarify how specific simulation parameters affect the efficiency and success rate of DNA translocation in our coarse-grained simulations. In particular, we have added a description of the parametric choices for (i) selection and strength of hydrogen bonds, (ii) ionic strength, and (iii) interaction strength between the coiled-coil arms. The discussion can be found in the subsection entitled “Parametric choices and robustness of simulation model” in the Discussion, which is as follows:

      “On the other hand, the efficiency and success rate of DNA translocation in our simulations are more sensitive to certain parametric choices. For instance, the selection and strength of hydrogen bond-like interactions are a key factor. Our model incorporates specific hydrogen bonds between the upper surface of the ATPase heads and DNA, based on all-atom simulations. These interactions are essential for initiating segment capture; without them, DNA fails to migrate to the correct binding surface. While the identification of these key residues is a robust finding that persists across different all-atom force fields (Fig. S2), their strength and number in the coarse-grained potential are critical parameters that directly influence the probability and kinetics of DNA capture. Another critical parameter is the ionic strength. We performed translocation simulations at an ionic strength of 300 mM to accelerate DNA dynamics. At lower concentrations, non-specific electrostatic interactions between DNA and positively charged patches on the sides of the ATPase heads or coiled-coil arm became dominant, hindering the efficient migration of DNA to its functional binding site. Using a higher-than-physiological ionic strength is a justified practice in coarse-grained simulations employing the Debye-Hückel approximation, as it serves as a first-order correction to mimic the strong local charge screening by condensed counterions that is not explicitly captured by the mean-field model (Brandani et al., 2021; Niina et al., 2017b). Finally, the interaction strength between the coiled-coil arms is also important. In our model, once the arms closed during the transition from the V-shaped to the disengaged state, they remained closed on the simulated timescale, frequently trapping DNA pushed from the hinge and thereby leading to failed translocation. This behavior suggests that the arm–arm interactions may be overestimated. A parameterization that allows for more frequent, transient opening of the arms could increase the success rate of DNA pumping.”
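
      The effect of ionic strength on electrostatic screening can be illustrated with a rough calculation. The sketch below uses the textbook approximation for the Debye length of a monovalent salt in water at 25 °C; the function name and the 150 mM comparison value are illustrative assumptions, not parameters taken from the manuscript.

```python
import math

def debye_length_nm(ionic_strength_molar: float) -> float:
    """Approximate Debye screening length (nm) for a 1:1 salt in water at 25 C."""
    # Textbook approximation: lambda_D ~ 0.304 nm / sqrt(I in mol/L).
    return 0.304 / math.sqrt(ionic_strength_molar)

# 0.15 M is an assumed near-physiological comparison; 0.30 M is the simulated value.
for conc in (0.15, 0.30):
    print(f"I = {conc * 1000:.0f} mM -> Debye length ~ {debye_length_nm(conc):.2f} nm")
```

      Under these assumptions, doubling the ionic strength shortens the screening length by a factor of √2, which weakens the non-specific attraction between DNA and positively charged patches.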

      Reviewer #2 (Recommendations for the authors):

      This paper reports simulations (all atom and coarse grained) to provide molecular details of loop extrusion. In general, it is a well done paper. There are a few issues that the authors should address.

      (1) The study supposes that loop extrusion occurs by translocation. Although they point out alternate models like scrunching (C Dekker; the theory by Takaki is also based on the scrunching model that the authors should mention), they should discuss this further. After all, the Takaki theory does predict several experimental outcomes very accurately. The precise mechanism has not been nailed down - The paper by Terakawa in Science suggests the extrusion is by translocation, but the evidence is not clear.

      We thank the reviewer for this insightful comment. We agree that our discussion should briefly acknowledge alternative models such as scrunching. We have therefore revised the manuscript to mention the theory by Takaki et al. (Nat. Commun., 2021), which reproduces several experimental outcomes.

      Because our present work specifically addresses the translocation mechanism based on DNA segment capture, we now state that scrunching and related models represent alternative proposals for loop extrusion.

      In this revision, we have added discussion to the end of the subsection titled "DNA segment capture as the mechanism of the DNA translocation by SMC complexes." in the Discussion section, which is as follows:

      “Turning to loop extrusion mechanisms, alternative mechanisms have been proposed in addition to the DNA-segment capture model. For example, Takaki et al. developed a scrunching-based theory that quantitatively accounts for several experimental observations, including force-velocity relationships and step-size distributions. While our present study focuses on the DNA translocation mechanism via segment capture, it is important to note that scrunching and other models remain plausible alternatives for loop extrusion. The precise mechanism may depend on the specific SMC complex and its subunits and remains to be fully resolved.”

      (2) It is unclear how one can say from Figure 4I and J that translocation has taken place. These panels show that the base pair length increases. This should be explained more clearly. They should also simultaneously plot the location of the heads (2D plot).

      Thank you for this valuable suggestion. In response to the comment on how translocation is presented in Fig. 4I and J, we have revised the text in the subsection entitled “DNA translocation via DNA-segment capture” to make it clear that the SMC complex moves along DNA, as follows:

      “Fig. 4I represents the one-dimensional contour coordinate of the DNA molecule, indexed by base pairs (1-800). In this plot, translocation is visualized as a discontinuous shift in the range of base-pair indices that the SMC complex contacts over one complete ATP cycle”

      “This translocation is recorded in Fig. 4I as the average coordinate of the kleisin contact region (red dots) jumps from ~400 bp before the cycle to ~600 bp after, which corresponds to a translocation event of ~200 bp”

      We believe that adding this explanation makes it clearer to readers that Fig. 4I and 4J provide direct evidence for unidirectional translocation of the SMC complex.
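
      As a minimal sketch of how such a step could be read off a contact plot, the snippet below computes the shift in the mean contacted base-pair index across one cycle. The contact indices and the function name are hypothetical values chosen to mimic the ~400 bp to ~600 bp jump; they are not data from the simulations.

```python
from statistics import mean

def translocation_step(contacts_before, contacts_after):
    """Shift (bp) of the mean contacted base-pair index across one ATP cycle."""
    return mean(contacts_after) - mean(contacts_before)

# Hypothetical kleisin contact indices before and after one cycle.
before = [390, 395, 400, 405, 410]
after = [590, 595, 600, 605, 610]
step = translocation_step(before, after)
print(f"translocation step: {step} bp")
```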

      (3) The transitions between the states are very abrupt (see Figure 2). Please explain. Also, in which state does extrusion take place? What is the role of the V-shape - is it part of the ATPase cycle?

      We thank the reviewer for raising these questions.

      In our simulations, we implemented changes in the nucleotide-binding state by instantaneously switching the structure-based (Gō-type) potential between reference conformations for the disengaged (apo), engaged (ATP-bound), and V-shaped (ADP-bound) states at predetermined times. The system rapidly relaxes along the new funnel-shaped potential energy surface toward its minimum. This rapid relaxation is why the transition appears abrupt in metrics such as the Q-score in Fig. 2.
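
      The switching scheme described above can be sketched as a simple lookup from simulation time to reference state. The switching times and all names below are hypothetical placeholders; only the order of states follows the description.

```python
from bisect import bisect_right

# Order of reference states over one cycle, as described in the response.
STATES = ["disengaged", "engaged", "V-shaped", "disengaged"]
# Hypothetical switching times (arbitrary simulation steps), one per state.
SWITCH_TIMES = [0, 1000, 2000, 3000]

def reference_state(step: int) -> str:
    """Return the reference conformation targeted by the potential at `step`."""
    index = bisect_right(SWITCH_TIMES, step) - 1
    return STATES[max(index, 0)]

print(reference_state(999), reference_state(1000))
```

      Because the target changes from one step to the next, any order parameter measured against the new reference (such as a Q-score) is expected to change quickly after each switch.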

      The V-shaped state corresponds to a key ADP-bound intermediate within the ATP hydrolysis cycle. Its primary role in our model is preparatory; it establishes the necessary open geometry that allows for the subsequent "zipping" of the coiled-coil arms. Crucially, unidirectional pumping motion is generated during the transition from the V-shaped state to the disengaged state. That is, the zipping motion of the coiled-coil arm pushes the captured DNA segment forward, resulting in a net translocation along the DNA.

      (4) It appears the heads do not move between the disengaged to engaged states. Why not in their model?

      Thank you for pointing out the lack of clarity in our explanation of the SMC head movement in our simulations.

      In our model, the transition from the disengaged to the engaged state involves a dynamic rearrangement of the SMC heads. Specifically, one ATPase head slides (~10 Å) and rotates (~85°) relative to the other ATPase head to re-associate at a new dimer interface. This movement drives the global conformational change of the complex from a rod-like shape to an open ring, a mechanism proposed in a previous structural study (Diebold-Durand et al., Mol. Cell, 2017).

      As reviewer 2 noted, this crucial motion, which is reflected in the changing head-head distance and hinge angle in Fig. 2A, was not sufficiently highlighted in the text. We have therefore revised the manuscript to explicitly describe this head rearrangement to improve clarity, which is as follows:

      “Upon transition to the engaged state, the two ATPase heads were quickly rearranged to form the new inter-subunit contacts. Specifically, this rearrangement involves one ATPase head sliding by approximately 10 Å and rotating by 85° relative to the other, allowing it to associate through a different interface (Diebold-Durand et al., 2017b). The fractions of formed contacts, Q-scores, that exist at the disengaged (engaged) states quickly decreased (increased) (Fig. 2A, top two plots).”

      (5) What is pumping - it has been used in Marko NAR in the DNA capture model. How is that illustrated in the simulations?

      We thank the reviewer for raising this point. In the context of the DNA segment-capture model by Marko et al. (NAR, 2019), "pumping" refers to the conceptual process where a DNA loop, captured in an upper compartment of the SMC ring, is transferred to a lower compartment, resulting in net translocation.

      Our simulations provide a direct, molecular-resolution visualization of the physical mechanism underlying this concept. We illustrate that the "pumping" action is not a passive transfer but an active, mechanical process driven by a specific conformational change. This occurs during the transition from the V-shaped (ADP-bound) to the disengaged state. As shown in our trajectories, the two coiled-coil arms close in a zipper-like manner, beginning from the hinge and progressing toward the ATPase heads. This zipping motion physically pushes the captured DNA segment from the hinge region toward the kleisin ring.

      This process is visualized in our simulations as a clear, unidirectional translocation step (see Figs. 4B–D, 4I, and S6). The result is a net forward movement of the DNA by a distance that corresponds to the length of the initially captured loop, a key prediction of Marko’s model that we quantify in our step-size analysis (Figs. 4K–L and S8).

      To make this point clearer for the reader, we have revised the manuscript. We have explicitly defined this "zipping and pushing" action as the physical basis for the "pumping" mechanism in the subsection titled "Zipping motion of coiled-coil arms pushes the DNA from hinge domain toward kleisin ring", as follows:

      “This active, mechanical pushing of the DNA loop, driven by the sequential closing of the coiled-coil arm, constitutes the physical basis of the “pumping” mechanism that drives unidirectional translocation. Our simulations thus provide a concrete, molecular-level visualization for this key step in the DNA segment-capture model.”

      (6) The length of DNA simulated is small for understandable reasons. Both experiments and theory show that loop extrusion sizes can be very large, far exceeding the sizes of the SMC complex. Could the small size of DNA be affecting the results?

      We thank the reviewer for this important comment. The relationship between our simulated system size and the large-scale phenomena observed experimentally is a key point.

      Our study was specifically designed to elucidate the fundamental mechanism of the elementary, single-cycle translocation step at near-atomic resolution. For this purpose, the 800 bp DNA length was sufficient. The observed translocation step size per cycle was 216 ± 71 bp, which is substantially smaller than the total length of the simulated DNA. This confirms that the boundaries of our system did not artificially constrain the core translocation process we aimed to investigate. Therefore, we think that the DNA length used in this study did not systematically bias our main findings regarding the motor mechanism itself.

      On the other hand, as the reviewer pointed out, our current setup cannot reproduce the formation of kilobase-scale loops. We hypothesize that these large-scale events are intrinsically linked to the stochastic nature of the ATP hydrolysis cycle, which was simplified in our simulation model. We used fixed durations for each state for computational feasibility. In a more realistic scenario, a stochastically prolonged engaged state would provide more time for a captured DNA loop to grow via thermal diffusion. This could lead to occasional, much larger translocation steps upon ATP hydrolysis, contributing to the large loop sizes seen experimentally.

      (7) Minor point: The first CG model using three sites was introduced in PNAS vol 102, 6789 2005. The authors should consider citing it.

      Thank you for this suggestion. We have now cited the paper the reviewer recommended. Please see the subsection entitled “Coarse-grained simulations” in Materials and Methods.

    1. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Weaknesses:

      (1) Weaknesses of this study include a proposed mechanism underlying the sexual dimorphism phenotype based on experimentation in only males, and widespread reliance on over-expression when investigating protein-protein interaction and localization. Additionally, a minor weakness is that the text describing the identification of cyp17a2 as a candidate contains errors that are confusing.

      We thank the reviewer for these insightful comments, which have helped us improve the manuscript.

      (1) Experimentation in males. We focused on male zebrafish for our mechanistic studies to preclude potential confounding effects from female hormones and to directly interrogate the basis of the observed male-biased resistance. As confirmed in the manuscript (lines 151-153), both wild-type and cyp17a2⁻/⁻ males developed normal male sex organs and exhibited comparable androgen levels. This crucial control gives us confidence that the differences in antiviral immunity we observed are a direct consequence of Cyp17a2 loss-of-function, rather than secondary to developmental or hormonal abnormalities. We fully agree that elucidating the mechanism in females represents a valuable and interesting direction for future research.

      (2) Over-expression studies. We acknowledge that overexpression approaches can have inherent limitations. To mitigate this and strengthen our conclusions, we complemented these experiments with loss-of-function data from both knockout zebrafish and knockdown cells, as well as validation at the endogenous level (e.g., Fig. 4J and S4C). The consistent results obtained across these diverse experimental models collectively reinforce our conclusion that Cyp17a2 interacts with and stabilizes STING.

      (3) We thank the reviewer for pointing out the lack of clarity in the text regarding the selection process of Cyp17a2. We have thoroughly revised the manuscript to provide a precise and accurate description of our methodology. The relevant text is now as follows: “Differential expression analysis identified 1511 upregulated and 1117 downregulated genes (Fig. 2A and Table S2). We then focused on a subset of known or putative sex-related genes. Among these eight candidates, cyp17a2 exhibited the most significant male-biased upregulation, a finding that was subsequently confirmed by qPCR (Fig. 2B and S1A)” (lines 142-144).

      (2) Lines 139-140 describe the data for Figure 2 as deriving from "healthy hermaphroditic adult zebrafish". This appears to be a language error and should be corrected to something that specifies that the comparison made is between healthy adult male and female kidneys.

      We thank the reviewer for pointing out this inaccuracy. This was a terminological error, and we have corrected the text to accurately state “transcriptome sequencing was performed on head-kidney tissues from healthy adult male and female zebrafish” (lines 139-140). We have carefully reviewed the manuscript to ensure no similar errors are present.

      (3) In Figure 2A and associated text cyp17a2 is highlighted but the volcano plot does not indicate why this was an obvious choice. For example, many other genes are also highly induced in male vs female kidneys. Figure 2B and line 143 describe a subset of "eight sex-related genes" but it is not clear how these relate to Figure 2A. The narrative could be improved to clarify how cyp17a2 was selected from Figure 2A and it seems that the authors made an attempt to do this with Figure 2B but it is not clear how these are related. This is important because the available data do not rule out the possibility that other factors also mediate the sexual dimorphism they observed either in combination, in a redundant fashion, or in a more complex genetic fashion. The narrative of the text and title suggests that they consider this to be a monogenic trait but more evidence is needed.

      We thank the reviewer for raising these important points. We have revised the manuscript to clarify the candidate gene selection process and to avoid any implication that the trait is monogenic.

      The selection of cyp17a2 was not based solely on its position in the volcano plot (Fig. 2A), but on a multi-faceted rationale. We first prioritized genes with known or putative sex-related functions from the pool of differentially expressed genes. From this subset, cyp17a2 emerged as the lead candidate due to a combination of unique attributes: it exhibited the most significant and consistent male-biased upregulation among the validated candidates (Fig. 2B and S1A); it is a teleost-specific autosomal gene, suggesting a novel mechanism for sexual dimorphism independent of canonical sex chromosomes; and it showed conserved male-biased expression across multiple tissues (Fig. 2C and 2D). Regarding its representation in the volcano plot, cyp17a2 was included in the underlying dataset but was not explicitly labeled in the revised Figure 2A to maintain visual clarity, as the plot aimed to illustrate the global transcriptomic landscape rather than highlight individual genes.

      We agree with the reviewer that other genetic factors may contribute to the observed sexual dimorphism. Accordingly, we have modified the text throughout the manuscript to remove any suggestion of a purely monogenic trait. Our functional data position cyp17a2 as a key and sufficient factor, as its knockout in males was sufficient to ablate the antiviral resistance phenotype (Fig. 2E-G), demonstrating a major, nonredundant role without precluding potential contributions from other genes.

      The following specific changes have been made to the text.

      (1) The title has been revised by replacing “governs” with “orchestrates.” (line 1)  

      (2) The abstract now states “the male-biased gene cyp17a2 as a critical mediator of this enhanced response” instead of “which are driven by the male-biased gene Cyp17a2 rather than by hormones or sex chromosomes.” (lines 33-34)

      (3) The discussion now states “Our study leverages this unique context to demonstrate that enhanced antiviral immunity in males is mediated by the male-biased expression of the autosomal gene cyp17a2,” removing the comparative phrasing regarding hormones or sex chromosomes. (lines 364-366)

    1. At the time she published these essays, she was chief of the reference service at the Bibliotheque Nationale in Paris. She had already been heavily involved in the development of the documentation profession, including being one of the founders and leaders of the Union Francaise des Organismes de Documentation. However, only three years after publishing Qu’est-ce que la documentation?, Briet took early retirement

      Briet published Qu'est-ce que la documentation? at the height of her professional life, working at the national library as head of the reference service, and 3 yrs before her early retirement.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Zhou and colleagues developed a computational model of replay that heavily builds on cognitive models of memory in context (e.g., the context-maintenance and retrieval model), which have been successfully used to explain memory phenomena in the past. Their model produces results that mirror previous empirical findings in rodents and offers a new computational framework for thinking about replay.

      Strengths:

      The model is compelling and seems to explain a number of findings from the rodent literature. It is commendable that the authors implement commonly used algorithms from wakefulness to model sleep/rest, thereby linking wake and sleep phenomena in a parsimonious way. Additionally, the manuscript's comprehensive perspective on replay, bridging humans and non-human animals, enhanced its theoretical contribution.

      Weaknesses:

      This reviewer is not a computational neuroscientist by training, so some comments may stem from misunderstandings. I hope the authors would see those instances as opportunities to clarify their findings for broader audiences.

      (1) The model predicts that temporally close items will be co-reactivated, yet evidence from humans suggests that temporal context doesn't guide sleep benefits (instead, semantic connections seem to be of more importance; Liu and Ranganath 2021, Schechtman et al 2023). Could these findings be reconciled with the model or is this a limitation of the current framework?

      We appreciate the encouragement to discuss this connection. Our framework can accommodate semantic associations as determinants of sleep-dependent consolidation, which can in principle outweigh temporal associations. Indeed, prior models in this lineage have extensively simulated how semantic associations support encoding and retrieval alongside temporal associations. It would therefore be straightforward to extend our model to simulate how semantic associations guide sleep benefits, and to compare their contribution against that conferred by temporal associations across different experimental paradigms. In the revised manuscript, we have added a discussion of how our framework may simulate the role of semantic associations in sleep-dependent consolidation.

      “Several recent studies have argued for dominance of semantic associations over temporal associations in the process of human sleep-dependent consolidation (Schechtman et al., 2023; Liu and Ranganath 2021; Sherman et al., 2025), with one study observing no role at all for temporal associations (Schechtman et al., 2023). At first glance, these findings appear in tension with our model, where temporal associations drive offline consolidation. Indeed, prior models have accounted for these findings by suppressing temporal context during sleep (Liu and Ranganath 2024; Sherman et al., 2025). However, earlier models in the CMR lineage have successfully captured the joint contributions of semantic and temporal associations to encoding and retrieval (Polyn et al., 2009), and these processes could extend naturally to offline replay. In a paradigm where semantic associations are especially salient during awake learning, the model could weight these associations more and account for greater co-reactivation and sleep-dependent memory benefits for semantically related than temporally related items. Consistent with this idea, Schechtman et al. (2023) speculated that their null temporal effects likely reflected the task’s emphasis on semantic associations. When temporal associations are more salient and task-relevant, sleep-related benefits for temporally contiguous items are more likely to emerge (e.g., Drosopoulos et al., 2007; King et al., 2017).”

      The reviewer’s comment points to fruitful directions for future work that could employ our framework to dissect the relative contributions of semantic and temporal associations to memory consolidation.

      (2) During replay, the model is set so that the next reactivated item is sampled without replacement (i.e., the model cannot get "stuck" on a single item). I'm not sure what the biological backing behind this is and why the brain can't reactivate the same item consistently.

      Furthermore, I'm afraid that such a rule may artificially generate sequential reactivation of items regardless of wake training. Could the authors explain this better or show that this isn't the case?

      We appreciate the opportunity to clarify this aspect of the model. We first note that this mechanism has long been a fundamental component of this class of models (Howard & Kahana 2002). Many classic memory models (Brown et al., 2000; Burgess & Hitch, 1991; Lewandowsky & Murdock 1989) incorporate response suppression, in which activated items are temporarily inhibited. The simplest implementation, which we use here, removes activated items from the pool of candidate items. Alternative implementations achieve this through transient inhibition, often conceptualized as neuronal fatigue (Burgess & Hitch, 1991; Grossberg 1978). Our model adopts a similar perspective, interpreting this mechanism as mimicking a brief refractory period that renders reactivated neurons unlikely to fire again within a short physiological event such as a sharp-wave ripple. Importantly, this approach does not generate spurious sequences. Instead, the model’s ability to preserve the structure of wake experience during replay depends entirely on the learned associations between items (without these associations, item order would be random). Similar assumptions are also common in models of replay. For example, reinforcement learning models of replay incorporate mechanisms such as inhibition to prevent repeated reactivations (e.g., Diekmann & Cheng, 2023) or prioritize reactivation based on ranking to limit items to a single replay (e.g., Mattar & Daw, 2018). We now discuss these points in the section titled “A context model of memory replay”:

      “This mechanism of sampling without replacement, akin to response suppression in established context memory models (Howard & Kahana 2002), could be implemented by neuronal fatigue or refractory dynamics (Burgess & Hitch, 1991; Grossberg 1978). Non-repetition during reactivation is also a common assumption in replay models that regulate reactivation through inhibition or prioritization (Diekmann & Cheng 2023; Mattar & Daw 2018; Singh et al., 2022).”
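
      A minimal sketch of sampling without replacement is given below. It is a deliberate simplification: transitions depend only on a hypothetical item-to-item association matrix passed through a softmax, rather than on the drifting context representation used in the model, and all names and values are illustrative.

```python
import math
import random

def replay_sequence(assoc, start, length, rng):
    """Sample a replay sequence in which each item is reactivated at most once."""
    sequence = [start]
    candidates = set(range(len(assoc))) - {start}
    while candidates and len(sequence) < length:
        current = sequence[-1]
        pool = sorted(candidates)
        # Softmax over association strengths from the current item.
        weights = [math.exp(assoc[current][j]) for j in pool]
        nxt = rng.choices(pool, weights=weights, k=1)[0]
        sequence.append(nxt)
        candidates.remove(nxt)  # response suppression: drop item from the pool
    return sequence

# Hypothetical associations for a 4-item track: neighbours are strongly linked.
assoc = [[0, 3, 1, 0], [3, 0, 3, 1], [1, 3, 0, 3], [0, 1, 3, 0]]
print(replay_sequence(assoc, start=0, length=4, rng=random.Random(0)))
```

      In this sketch the removal step only prevents repeats. Any sequential structure comes from the association matrix: with uniform associations, every remaining item is equally likely at each step.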

      (3) If I understand correctly, there are two ways in which novelty (i.e., less exposure) is accounted for in the model. The first and more talked about is the suppression mechanism (lines 639-646). The second is a change in learning rates (lines 593-595). It's unclear to me why both procedures are needed, how they differ, and whether these are two different mechanisms that the model implements. Also, since the authors controlled the extent to which each item was experienced during wakefulness, it's not entirely clear to me which of the simulations manipulated novelty on an individual item level, as described in lines 593-595 (if any).

      We agree that these mechanisms and their relationships would benefit from clarification. As noted, novelty influences learning through two distinct mechanisms. First, the suppression mechanism is essential for capturing the inverse relationship between the amount of wake experience and the frequency of replay, as observed in several studies. This mechanism ensures that items with high wake activity are less likely to dominate replay. Second, the decrease in learning rates with repetition is crucial for preserving the stochasticity of replay. Without this mechanism, the model would increase weights linearly, leading to an exponential increase in the probability of successive wake items being reactivated back-to-back due to the use of a softmax choice rule. This would result in deterministic replay patterns, which are inconsistent with experimental observations.

      We have revised the Methods section to explicitly distinguish these two mechanisms:

      “This experience-dependent suppression mechanism is distinct from the reduction of learning rates through repetition; it does not modulate the update of memory associations but exclusively governs which items are most likely to initiate replay.”

      We have also clarified our rationale for including a learning rate reduction mechanism:

      “The reduction in learning rates with repetition is important for maintaining a degree of stochasticity in the model’s replay during task repetition, since linearly increasing weights would, through the softmax choice rule, exponentially amplify differences in item reactivation probabilities, sharply reducing variability in replay.”
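
      The amplification argument can be checked numerically. In the sketch below, the weight values and the unit increment per session are hypothetical; it shows only that a linear gap between two weights gives an exponential ratio between their softmax probabilities.

```python
import math

def softmax(weights):
    """Softmax choice probabilities for a list of association weights."""
    peak = max(weights)  # subtract the maximum for numerical stability
    exps = [math.exp(w - peak) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical two-item case: the trained item's weight grows by 1 per session.
for session in range(1, 6):
    p_trained, p_other = softmax([float(session), 0.0])
    print(session, round(p_trained / p_other, 1))
```

      The printed ratio equals exp(session), so without a decreasing learning rate the choice would quickly become close to deterministic.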

      Finally, we now specify exactly where the learning-rate reduction is applied, namely in simulations where sequences are repeated across multiple sessions:

      “In this simulation, the learning rates progressively decrease across sessions, as described above.”

      As to the first mechanism - experience-based suppression - I find it challenging to think of a biological mechanism that would achieve this and is selectively activated immediately before sleep (somehow anticipating its onset). In fact, the prominent synaptic homeostasis hypothesis suggests that such suppression, at least on a synaptic level, is exactly what sleep itself does (i.e., prune or weaken synapses that were enhanced due to learning during the day). This begs the question of whether certain sleep stages (or ultradian cycles) may be involved in pruning, whereas others leverage its results for reactivation (e.g., a sequential hypothesis; Rasch & Born, 2013). That could be a compelling synthesis of this literature. Regardless of whether the authors agree, I believe that this point is a major caveat to the current model. It is addressed in the discussion, but perhaps it would be beneficial to explicitly state to what extent the results rely on the assumption of a pre-sleep suppression mechanism.

      We appreciate the reviewer raising this important point. Unlike the mechanism proposed by the synaptic homeostasis hypothesis, the suppression mechanism in our model does not suppress items based on synapse strength, nor does it modify synaptic weights. Instead, it determines the level of suppression for each item based on activity during awake experience. The brain could implement such a mechanism by tagging each item according to its activity level during wakefulness. During subsequent consolidation, the initial reactivation of an item during replay would reflect this tag, influencing how easily it can be reactivated.

      A related hypothesis has been proposed in recent work, suggesting that replay avoids recently active trajectories due to spike frequency adaptation in neurons (Mallory et al., 2024). Similarly, the suppression mechanism in our model is critical for explaining the observed negative relationship between the amount of recent wake experience and the degree of replay.

      We discuss the biological plausibility of this mechanism and its relationship with existing models in the Introduction. In the section titled “The influence of experience”, we have added the following:

      “Our model implements an activity‑dependent suppression mechanism that, at the onset of each offline replay event, assigns each item a selection probability inversely proportional to its activation during preceding wakefulness. The brain could implement this by tagging each memory trace in proportion to its recent activation; during consolidation, that tag would then regulate starting replay probability, making highly active items less likely to be reactivated. A recent paper found that replay avoids recently traversed trajectories through awake spike‑frequency adaptation (Mallory et al., 2025), which could implement this kind of mechanism. In our simulations, this suppression is essential for capturing the inverse relationship between replay frequency and prior experience. Note that, unlike the synaptic homeostasis hypothesis (Tononi & Cirelli 2006), which proposes that the brain globally downscales synaptic weights during sleep, this mechanism leaves synaptic weights unchanged and instead biases the selection process during replay.”
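
      The selection rule described in this passage can be sketched as follows. The function name and activation values are hypothetical; the only property taken from the text is that the probability of starting replay at an item is inversely proportional to its activation during preceding wakefulness.

```python
def start_probabilities(wake_activation):
    """Replay-start probabilities inversely proportional to wake activation."""
    inverse = [1.0 / a for a in wake_activation]
    total = sum(inverse)
    return [x / total for x in inverse]

# Hypothetical activations: item 0 was experienced four times as much as item 2.
probs = start_probabilities([4.0, 2.0, 1.0])
print([round(p, 3) for p in probs])
```

      Note that this operates on selection probabilities only; no association weights are modified, which is the distinction drawn with synaptic downscaling.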

      (4) As the manuscript mentions, the only difference between sleep and wake in the model is the initial conditions (a0). This is an obvious simplification, especially given the last author's recent models discussing the very different roles of REM vs NREM. Could the authors suggest how different sleep stages may relate to the model or how it could be developed to interact with other successful models such as the ones the last author has developed (e.g., C-HORSE)? 

      We appreciate the encouragement to comment on the roles of different sleep stages in the manuscript, especially since, as noted, the lab is very interested in this and has explored it in other work. We chose to focus on NREM in this work because the vast majority of electrophysiological studies of sleep replay have identified these events during NREM. In addition, our lab’s theory of the role of REM (Singh et al., 2022, PNAS) is that it is a time for the neocortex to replay remote memories, in complement to the more recent memories replayed during NREM. The experiments we simulate all involve recent memories. Indeed, our view is that part of the reason that there is so little data on REM replay may be that experimenters are almost always looking for traces of recent memories (for good practical and technical reasons).

      Regarding the simplicity of the distinction between simulated wake and sleep replay, we view it as an asset of the model that it can account for many of the different characteristics of awake and NREM replay with very simple assumptions about differences in the initial conditions. There are of course many other differences between the states that could be relevant to the impact of replay, but the current target empirical data did not necessitate us taking those into account. This allows us to argue that differences in initial conditions should play a substantial role in an account of the differences between wake and sleep replay.

      We have added discussion of these ideas and how they might be incorporated into future versions of the model in the Discussion section:

      “Our current simulations have focused on NREM, since the vast majority of electrophysiological studies of sleep replay have identified replay events in this stage. We have proposed in other work that replay during REM sleep may provide a complementary role to NREM sleep, allowing neocortical areas to reinstate remote, already-consolidated memories that need to be integrated with the memories that were recently encoded in the hippocampus and replayed during NREM (Singh et al., 2022). An extension of our model could undertake this kind of continual learning setup, where the student but not teacher network retains remote memories, and the driver of replay alternates between hippocampus (NREM) and cortex (REM) over the course of a night of simulated sleep. Other differences between stages of sleep and between sleep and wake states are likely to become important for a full account of how replay impacts memory. Our current model parsimoniously explains a range of differences between awake and sleep replay by assuming simple differences in initial conditions, but we expect many more characteristics of these states (e.g., neural activity levels, oscillatory profiles, neurotransmitter levels, etc.) will be useful to incorporate in the future.”

      Finally, I wonder how the model would explain findings (including the authors') showing a preference for reactivation of weaker memories. The literature seems to suggest that it isn't just a matter of novelty or exposure, but encoding strength. Can the model explain this? Or would it require additional assumptions or some mechanism for selective endogenous reactivation during sleep and rest?

      We appreciate the encouragement to discuss this, as we do think the model could explain findings showing a preference for reactivation of weaker memories, as in Schapiro et al. (2018). In our framework, memory strength is reflected in the magnitude of each memory’s associated synaptic weights, so that stronger memories yield higher retrieved-context activity during wake encoding than weaker ones. Because the model’s suppression mechanism reduces an item’s replay probability in proportion to its retrieved-context activity, items with larger weights (strong memories) are more heavily suppressed at the onset of replay, while those with smaller weights (weaker memories) receive less suppression. When items have matched reward exposure, this dynamic would bias offline replay toward weaker memories, which are therefore preferentially reactivated.
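
      The argument above can be illustrated with a minimal sketch. It is not the CMR-replay implementation: the function name, the subtractive form of the suppression, and the parameters `baseline` and `suppression_gain` are assumptions made only for illustration, and wake retrieved-context activity is simply set equal to memory strength.

```python
def replay_start_probabilities(weights, baseline=1.0, suppression_gain=0.5):
    """Probability that replay starts at each item (hypothetical form).

    weights: memory strengths, assumed here to equal the retrieved-context
    activity that each item evoked during wake encoding.
    """
    # Suppression grows in proportion to wake activity, so stronger
    # memories begin replay with lower initial activity (floored at 0).
    initial = [max(baseline - suppression_gain * w, 0.0) for w in weights]
    total = sum(initial)
    if total == 0.0:
        raise ValueError("all items fully suppressed; lower suppression_gain")
    # Normalize initial activities into starting probabilities.
    return [a / total for a in initial]


# Two items with matched reward exposure but different encoding strength.
strengths = {"strong": 1.5, "weak": 0.5}
probs = dict(zip(strengths, replay_start_probabilities(list(strengths.values()))))
print(probs)
```

      Under these assumed values, the weakly encoded item is the more likely starting point for replay, which is the qualitative pattern described above.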

      In the section titled “The influence of experience”, we updated a sentence to discuss this idea more explicitly: 

      “Such a suppression mechanism may be adaptive, allowing replay to benefit not only the most recently or strongly encoded items but also to provide opportunities for the consolidation of weaker or older memories, consistent with empirical evidence (e.g., Schapiro et al. 2018; Yu et al., 2024).”

      (5) Lines 186-200 - Perhaps I'm misunderstanding, but wouldn't it be trivial that an external cue at the end-item of Figure 7a would result in backward replay, simply because there is no potential for forward replay for sequences starting at the last item (there simply aren't any subsequent items)? The opposite is true, of course, for the first-item replay, which can't go backward. More generally, my understanding of the literature on forward vs backward replay is that neither is linked to the rodent's location. Both commonly happen at a resting station that is further away from the track. It seems as though the model's result may not hold if replay occurs away from the track (i.e. if a0 would be equal for both pre- and post-run).

      In studies where animals run back and forth on a linear track, replay events are decoded separately for left and right runs, identifying both forward and reverse sequences for each direction, for example using direction-specific place cell sequence templates. Accordingly, in our simulation of, e.g., Ambrose et al. (2016), we use two independent sequences, one for left runs and one for right runs (an approach that has been taken in prior replay modeling work). Crucially, our model assumes a context reset between running episodes, preventing the final item of one traversal from acquiring contextual associations with the first item of the next. As a result, learning in the two sequences remains independent, and when an external cue is presented at the track’s end, replay predominantly unfolds in the backward direction, only occasionally producing forward segments when the cue briefly reactivates an earlier sequence item before proceeding forward.

      We added a note to the section titled “The context-dependency of memory replay” to clarify this:

      “In our model, these patterns are identical to those in our simulation of Ambrose et al. (2016), which uses two independent sequences to mimic the two run directions. This is because the drifting context resets before each run sequence is encoded, with the pause between runs acting as an event boundary that prevents the final item of one traversal from associating with the first item of the next, thereby keeping learning in each direction independent.”
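
      The effect of the context reset can be sketched as follows. This is a simplified illustration rather than the model's code: context is a leaky average of one-hot item features, and the drift rate `beta`, the function names, and the choice to bind each item to the context after it is updated are all assumptions.

```python
def encode_runs(runs, n_items, beta=0.5, reset_between_runs=True):
    """Return the context vector bound to each item during encoding."""
    contexts = {}
    context = [0.0] * n_items
    for run in runs:
        if reset_between_runs:
            # The pause between runs acts as an event boundary.
            context = [0.0] * n_items
        for item in run:
            # Old context decays, and the current item's feature is added.
            context = [(1.0 - beta) * c for c in context]
            context[item] += beta
            contexts[item] = list(context)
    return contexts


def overlap(u, v):
    """Dot product between two context vectors."""
    return sum(a * b for a, b in zip(u, v))


left_run, right_run = [0, 1, 2], [3, 4, 5]
with_reset = encode_runs([left_run, right_run], 6, reset_between_runs=True)
without_reset = encode_runs([left_run, right_run], 6, reset_between_runs=False)

# Overlap between the last item of one run and the first item of the next.
print(overlap(with_reset[2], with_reset[3]))
print(overlap(without_reset[2], without_reset[3]))
```

      In this sketch the overlap across the run boundary is zero with the reset and positive without it, so only the no-reset variant would link the end of one traversal to the start of the next.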

      To our knowledge, no study has observed a similar asymmetry when animals are fully removed from the track, although both types of replay can be observed when animals are away from the track. For example, Gupta et al. (2010) demonstrated that when animals replay trajectories far from their current location, the ratio of forward vs. backward replay appears more balanced. We now highlight this result in the manuscript and explain how it aligns with the predictions of our model:

      “For example, in tasks where the goal is positioned in the middle of an arm rather than at its end, CMR-replay predicts a more balanced ratio of forward and reverse replay, whereas the EVB model still predicts a dominance of reverse replay due to backward gain propagation from the reward. This contrast aligns with empirical findings showing that when the goal is located in the middle of an arm, replay events are more evenly split between forward and reverse directions (Gupta et al., 2010), whereas placing the goal at the end of a track produces a stronger bias toward reverse replay (Diba & Buzsaki 2007).” 

      Although no studies, to our knowledge, have observed a context-dependent asymmetry between forward and backward replay when the animal is away from the track, our model does posit conditions under which such an asymmetry could arise. Specifically, it predicts that deliberation on a specific memory, such as during planning, could generate an internal context input that biases replay: actively recalling the first item of a sequence may favor forward replay, while thinking about the last item may promote backward replay, even when the individual is physically distant from the track.

      We now discuss this prediction in the section titled “The context-dependency of memory replay”:

      “Our model also predicts that deliberation on a specific memory, such as during planning, could serve to elicit an internal context cue that biases replay: actively recalling the first item of a sequence may favor forward replay, while thinking about the last item may promote backward replay, even when the individual is physically distant from the track. While not explored here, this mechanism presents a potential avenue for future modeling and empirical work.”

      (6) The manuscript describes a study by Bendor & Wilson (2012) and tightly mimics their results. However, notably, that study did not find triggered replay immediately following sound presentation, but rather a general bias toward reactivation of the cued sequence over longer stretches of time. In other words, it seems that the model's results don't fully mirror the empirical results. One idea that came to mind is that perhaps it is the R/L context - not the first R/L item - that is cued in this study. This is in line with other TMR studies showing what may be seen as contextual reactivation. If the authors think that such a simulation may better mirror the empirical results, I encourage them to try. If not, however, this limitation should be discussed.

      Although our model predicts that replay is triggered immediately by the sound cue, it also predicts a sustained bias toward the cued sequence. Replay in our model unfolds across the rest phase as multiple successive events, so the bias observed in our sleep simulations indeed reflects a prolonged preference for the cued sequence.

      We now discuss this issue, acknowledging the discrepancy:

      “Bendor and Wilson (2012) found that sound cues during sleep did not trigger immediate replay, but instead biased reactivation toward the cued sequence over an extended period of time. While the model does exhibit some replay triggered immediately by the cue, it also captures the sustained bias toward the cued sequence over an extended period.”

      Regarding the suggestion that the R/L context, rather than the first R/L item, is cued: within this framework, context is modeled as a weighted average of the features associated with items. As a result, cueing the model with the first R/L item produces qualitatively similar outcomes to cueing it with a more extended R/L cue that incorporates features of additional items. This is because both approaches ultimately use context features unique to the two sides.
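
      A small sketch makes the equivalence concrete. It assumes orthogonal (one-hot) item features and hypothetical cue weights, and it is not drawn from the model's code: whichever cue is used, all of its activity falls on features unique to the cued side.

```python
def weighted_average(item_weights, n_items):
    """Build a cue as a normalized weighted average of one-hot item features."""
    total = sum(item_weights.values())
    cue = [0.0] * n_items
    for item, weight in item_weights.items():
        cue[item] = weight / total
    return cue


def side_support(cue, side_items):
    """Total cue activity falling on the features of one side's items."""
    return sum(cue[i] for i in side_items)


right_items, left_items = [0, 1, 2], [3, 4, 5]

# Cue with only the first R item versus an extended R cue (weights assumed).
first_item_cue = weighted_average({0: 1.0}, 6)
extended_cue = weighted_average({0: 0.5, 1: 0.3, 2: 0.2}, 6)

for name, cue in [("first item", first_item_cue), ("extended", extended_cue)]:
    print(name, side_support(cue, right_items), side_support(cue, left_items))
```

      Both cues place all of their support on the R side and none on the L side, which is why the two cueing schemes would be expected to bias replay in the same direction.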

      (7) There is some discussion about replay's benefit to memory. One point of interest could be whether this benefit changes between wake and sleep. Relatedly, it would be interesting to see whether the proportion of forward replay, backward replay, or both correlated with memory benefits. I encourage the authors to extend the section on the function of replay and explore these questions.

      We thank the reviewer for this suggestion. Regarding differences in the contribution of wake and sleep to memory, our current simulations predict that compared to rest in the task environment, sleep is less biased toward initiating replay at specific items, leading to a more uniform benefit across all memories. Regarding the contributions of forward and backward replay, our model predicts that both strengthen bidirectional associations between items and contexts, benefiting memory in qualitatively similar ways. Furthermore, we suggest that the offline learning captured by our teacher-student simulations reflects consolidation processes that are specific to sleep.

      We have expanded the section titled “The influence of experience” to discuss these predictions of the model:

      “The results outlined above arise from the model's assumption that replay strengthens bidirectional associations between items and contexts to benefit memory. This assumption leads to several predictions about differences across replay types. First, the model predicts that sleep yields different memory benefits compared to rest in the task environment: Sleep is less biased toward initiating replay at specific items, resulting in a more uniform benefit across all memories. Second, the model predicts that forward and backward replay contribute to memory in qualitatively similar ways but tend to benefit different memories. This divergence arises because forward and backward replay exhibit distinct item preferences, with backward replay being more likely to include rewarded items, thereby preferentially benefiting those memories.”

      We also updated the “The function of replay” section to include our teacher-student speculation:

      “We speculate that the offline learning observed in these simulations corresponds to consolidation processes that operate specifically during sleep, when hippocampal-neocortical dynamics are especially tightly coupled (Klinzing et al., 2019).”

      (8) Replay has been mostly studied in rodents, with few exceptions, whereas CMR and similar models have mostly been used in humans. Although replay is considered a good model of episodic memory, it is still limited due to limited findings of sequential replay in humans and its reliance on very structured and inherently autocorrelated items (i.e., place fields). I'm wondering if the authors could speak to the implications of those limitations on the generalizability of their model. Relatedly, I wonder if the model could or does lead to generalization to some extent in a way that would align with the complementary learning systems framework.

      We appreciate these insightful comments. Traditionally, replay studies have focused on spatial tasks with autocorrelated item representations (e.g., place fields). However, an increasing number of human studies have demonstrated sequential replay using stimuli with distinct, unrelated representations. Our model is designed to accommodate both scenarios. In our current simulations, we employ orthogonal item representations while leveraging a shared, temporally autocorrelated context to link successive items. We anticipate that incorporating autocorrelated item representations would further enhance sequence memory by increasing the similarity between successive contexts. Overall, we believe that the model generalizes across a broad range of experimental settings, regardless of the degree of autocorrelation between items. Moreover, the underlying framework has been successfully applied to explain sequential memory both in spatial domains, where it accounts for place cell firing properties (e.g., Howard et al., 2004), and in non-spatial domains, such as free recall experiments where items are arbitrarily related.

      In the section titled “A context model of memory replay”, we added this comment to address this point:

      “Its contiguity bias stems from its use of shared, temporally autocorrelated context to link successive items, despite the orthogonal nature of individual item representations. This bias would be even stronger if items had overlapping representations, as observed in place fields.”
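
      The source of this contiguity bias can be sketched in a few lines. The sketch assumes one-hot (orthogonal) item features and a leaky-average context with a hypothetical drift rate `beta`. It is a simplification for illustration, not the model's update rule.

```python
def drifting_contexts(n_items, beta=0.5):
    """Context bound to each item of a sequence under leaky drift."""
    context = [0.0] * n_items
    contexts = []
    for item in range(n_items):
        # Context decays and then incorporates the current item's feature.
        context = [(1.0 - beta) * c for c in context]
        context[item] += beta
        contexts.append(list(context))
    return contexts


def overlap(u, v):
    """Dot product between two context vectors."""
    return sum(a * b for a, b in zip(u, v))


contexts = drifting_contexts(6)
probe = 3
# Context overlap between the probe item and its neighbors at each lag.
overlap_by_lag = {
    lag: overlap(contexts[probe], contexts[probe + lag]) for lag in (-2, -1, 1, 2)
}
print(overlap_by_lag)
```

      Although the item features themselves never overlap, context overlap falls off with absolute lag in both directions, so nearby items are more strongly linked than distant ones.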

      Since CMR-replay learns distributed context representations where overlap across context vectors captures associative structure, and replay helps strengthen that overlap, this could indeed be viewed as consonant with complementary learning systems integration processes. 

      Reviewer #2 (Public Review):

      This manuscript proposes a model of replay that focuses on the relation between an item and its context, without considering the value of the item. The model simulates awake learning, awake replay, and sleep replay, and demonstrates parallels between memory phenomena driven by encoding strength, replay of sequence learning, and activation of nearest neighbor to infer causality. There is some discussion of the importance of suppression/inhibition to reduce activation of only dominant memories to be replayed, potentially boosting memories that are weakly encoded. Very nice replications of several key replay findings including the effect of reward and remote replay, demonstrating the equally salient cue of context for offline memory consolidation.

      I have no suggestions for the main body of the study, including methods and simulations, as the work is comprehensive, transparent, and well-described. However, I would like to understand how the CMR-replay model fits with the current understanding of the importance of excitation vs inhibition, remembering vs forgetting, activation vs deactivation, strengthening vs elimination of synapses, and even NREM vs REM as Schapiro has modeled. There seems to be a strong association with the efforts of the model to instantiate a memory as well as how that reinstantiation changes across time. But that is not all there is to consolidation. The specific roles of different brain states and how they might change replay are also important considerations.

      We are gratified that the reviewer appreciated the work, and we agree that the paper would benefit from comment on the connections to these other features of consolidation.

      Excitation vs. inhibition: CMR-replay does not model variations in the excitation-inhibition balance across brain states (as in other models, e.g., Chenkov et al., 2017), since it does not include inhibitory connections. However, we posit that the experience-dependent suppression mechanism in the model might, in the brain, involve inhibitory processes. Supporting this idea, studies have observed increased inhibition with task repetition (Berners-Lee et al., 2022). We hypothesize that such mechanisms may underlie the observed inverse relationship between task experience and replay frequency in many studies. We discuss this in the section titled “A context model of memory replay”:

      “The proposal that a suppression mechanism plays a role in replay aligns with models that regulate place cell reactivation via inhibition (Malerba et al., 2016) and with empirical observations of increased hippocampal inhibitory interneuron activity with experience (Berners-Lee et al., 2022). Our model assumes the presence of such inhibitory mechanisms but does not explicitly model them.”

      Remembering/forgetting, activation/deactivation, and strengthening/elimination of synapses: The model does not simulate synaptic weight reduction or pruning, so it does not forget memories through the weakening of associated weights. However, forgetting can occur when a memory is replayed less frequently than others, leading to reduced activation of that memory compared to its competitors during context-driven retrieval. In the Discussion section, we acknowledge that a biologically implausible aspect of our model is that it implements only synaptic strengthening: 

      “Aspects of the model, such as its lack of regulation of the cumulative positive weight changes that can accrue through repeated replay, are biologically implausible (as biological learning results in both increases and decreases in synaptic weights) and limit the ability to engage with certain forms of low level neural data (e.g., changes in spine density over sleep periods; de Vivo et al., 2017; Maret et al., 2011). It will be useful for future work to explore model variants with more elements of biological plausibility.”

      Different brain states and NREM vs REM: Reviewer 1 also raised this important issue (see above). We have added the following thoughts on differences between these states and the relationship to our prior work to the Discussion section:

      “Our current simulations have focused on NREM, since the vast majority of electrophysiological studies of sleep replay have identified replay events in this stage. We have proposed in other work that replay during REM sleep may provide a complementary role to NREM sleep, allowing neocortical areas to reinstate remote, already-consolidated memories that need to be integrated with the memories that were recently encoded in the hippocampus and replayed during NREM (Singh et al., 2022). An extension of our model could undertake this kind of continual learning setup, where the student but not teacher network retains remote memories, and the driver of replay alternates between hippocampus (NREM) and cortex (REM) over the course of a night of simulated sleep. Other differences between stages of sleep and between sleep and wake states are likely to become important for a full account of how replay impacts memory. Our current model parsimoniously explains a range of differences between awake and sleep replay by assuming simple differences in initial conditions, but we expect many more characteristics of these states (e.g., neural activity levels, oscillatory profiles, neurotransmitter levels, etc.) will be useful to incorporate in the future.”

      We hope these points clarify the model’s scope and its potential for future extensions.

      Do the authors suggest that these replay systems are more universal to offline processes beyond episodic memory? What about procedural memories and working memory?

      We thank the reviewer for raising this important question. We have clarified in the manuscript:

      “We focus on the model as a formulation of hippocampal replay, capturing how the hippocampus may replay past experiences through simple and interpretable mechanisms.”

      With respect to other forms of memory, we now note that:

      “This motor memory simulation using a model of hippocampal replay is consistent with evidence that hippocampal replay can contribute to consolidating memories that are not hippocampally dependent at encoding (Schapiro et al., 2019; Sawangjit et al., 2018). It is possible that replay in other, more domain-specific areas could also contribute (Eichenlaub et al., 2020).”

      Though this is not a biophysical model per se, can the authors speak to the neuromodulatory milieus that give rise to the different types of replay?

      Our work aligns with the perspective proposed by Hasselmo (1999), which suggests that waking and sleep states differ in the degree to which hippocampal activity is driven by external inputs. Specifically, high acetylcholine levels during waking bias activity to flow into the hippocampus, while low acetylcholine levels during sleep allow hippocampal activity to influence other brain regions. Consistent with this view, our model posits that wake replay is more biased toward items associated with the current resting location due to the presence of external input during waking states. In the Discussion section, we have added a comment on this point:

      “Our view aligns with the theory proposed by Hasselmo (1999), which suggests that the degree of hippocampal activity driven by external inputs differs between waking and sleep states: High acetylcholine levels during wakefulness bias activity into the hippocampus, while low acetylcholine levels during slow-wave sleep allow hippocampal activity to influence other brain regions.”

      Reviewer #3 (Public Review):

      In this manuscript, Zhou et al. present a computational model of memory replay. Their model (CMR-replay) draws from temporal context models of human memory (e.g., TCM, CMR) and claims replay may be another instance of a context-guided memory process. During awake learning, CMR-replay (like its predecessors) encodes items alongside a drifting mental context that maintains a recency-weighted history of recently encoded contexts/items. In this way, the presently encoded item becomes associated with other recently learned items via their shared context representation - giving rise to typical effects in recall such as primacy, recency, and contiguity. Unlike its predecessors, CMR-replay has built-in replay periods. These replay periods are designed to approximate sleep or wakeful quiescence, in which an item is spontaneously reactivated, causing a subsequent cascade of item-context reactivations that further update the model's item-context associations.

      Using this model of replay, Zhou et al. were able to reproduce a variety of empirical findings in the replay literature: e.g., greater forward replay at the beginning of a track and more backward replay at the end; more replay for rewarded events; the occurrence of remote replay; reduced replay for repeated items, etc. Furthermore, the model diverges considerably (in implementation and predictions) from other prominent models of replay that, instead, emphasize replay as a way of predicting value from a reinforcement learning framing (i.e., EVB, expected value backup).

      Overall, I found the manuscript clear and easy to follow, despite not being a computational modeller myself. (Which is pretty commendable, I'd say). The model also was effective at capturing several important empirical results from the replay literature while relying on a concise set of mechanisms - which will have implications for subsequent theory-building in the field.

      With respect to weaknesses, additional details for some of the methods and results would help the readers better evaluate the data presented here (e.g., explicitly defining how the various 'proportion of replay' DVs were calculated).

      For example, for many of the simulations, the y-axis scale differs from the empirical data despite using comparable units, like the proportion of replay events (e.g., Figures 1B and C). Presumably, this was done to emphasize the similarity between the empirical and model data. But, as a reader, I often found myself doing the mental manipulation myself anyway to better evaluate how the model compared to the empirical data. Please consider using comparable y-axis ranges across empirical and simulated data wherever possible.

      We appreciate this point. As in many replay modeling studies, our primary goal is to provide a qualitative fit that demonstrates the general direction of differences between our model and empirical data, without engaging in detailed parameter fitting for a precise quantitative fit. Still, we agree that where possible, it is useful to better match the axes. We have updated figures 2B and 2C so that the y-axis scales are more directly comparable between the empirical and simulated data. 

      In a similar vein to the above point, while the DVs in the simulations/empirical data made intuitive sense, I wasn't always sure precisely how they were calculated. Consider the "proportion of replay" in Figure 1A. In the Methods (perhaps under Task Simulations), it should specify exactly how this proportion was calculated (e.g., proportions of all replay events, both forwards and backwards, combining across all simulations from Pre- and Post-run rest periods). In many of the examples, the proportions seem to possibly sum to 1 (e.g., Figure 1A), but in other cases, this doesn't seem to be true (e.g., Figure 3A). More clarity here is critical to help readers evaluate these data. Furthermore, sometimes the labels themselves are not the most informative. For example, in Figure 1A, the y-axis is "Proportion of replay" and in 1C it is the "Proportion of events". I presumed those were the same thing - the proportion of replay events - but it would be best if the axis labels were consistent across figures in this manuscript when they reflect the same DV.

      We appreciate these useful suggestions. We have revised the Methods section to explain in detail how DVs are calculated for each simulation. The revisions clarify the differences between related measures, such as those shown in Figures 1A and 1C, so that readers can more easily see how the DVs are defined and interpreted in each case. 
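
      To illustrate the kind of definition at issue, the sketch below shows one possible way to compute the proportion of forward and backward replay events. It is purely hypothetical and does not reproduce the definitions in the revised Methods: the classification rule (strictly consecutive item indices), the "other" category, and the function names are all assumptions.

```python
from collections import Counter


def classify_event(event):
    """Label a replayed item sequence as forward, backward, or other."""
    steps = [b - a for a, b in zip(event, event[1:])]
    if not steps:
        return "other"  # single-item events have no direction
    if all(step == 1 for step in steps):
        return "forward"
    if all(step == -1 for step in steps):
        return "backward"
    return "other"


def replay_proportions(events):
    """Proportion of all replay events falling in each category."""
    if not events:
        raise ValueError("no replay events to summarize")
    counts = Counter(classify_event(event) for event in events)
    return {label: counts[label] / len(events)
            for label in ("forward", "backward", "other")}


# Hypothetical replay events, each a list of item indices in replay order.
events = [[0, 1, 2], [2, 1, 0], [3, 2], [1, 3]]
proportions = replay_proportions(events)
print(proportions)
```

      Because every event receives exactly one label and the denominator is the total number of events, proportions defined this way sum to 1. A measure computed against a different denominator (for example, per rest period or per item) would not, which is the kind of distinction an explicit definition makes visible.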

      Reviewer #4/Reviewing Editor (Public Review):

      Summary:

      With their 'CMR-replay' model, Zhou et al. demonstrate that the use of spontaneous neural cascades in a context-maintenance and retrieval (CMR) model significantly expands the range of captured memory phenomena.

      Strengths:

      The proposed model compellingly outperforms its CMR predecessor and, thus, makes important strides towards understanding the empirical memory literature, as well as highlighting a cognitive function of replay.

      Weaknesses:

      Competing accounts of replay are acknowledged but there are no formal comparisons and only CMR-replay predictions are visualized. Indeed, other than the CMR model, only one alternative account is given serious consideration: A variant of the 'Dyna-replay' architecture, originally developed in the machine learning literature (Sutton, 1990; Moore & Atkeson, 1993) and modified by Mattar et al (2018) such that previously experienced event-sequences get replayed based on their relevance to future gain. Mattar et al acknowledged that a realistic Dyna-replay mechanism would require a learned representation of transitions between perceptual and motor events, i.e., a 'cognitive map'. While Zhou et al. note that the CMR-replay model might provide such a complementary mechanism, they emphasize that their account captures replay characteristics that Dyna-replay does not (though it is unclear to what extent the reverse is also true).

      We thank the reviewer for these thoughtful comments and appreciate the opportunity to clarify our approach. Our goal in this work is to contrast two dominant perspectives in replay research: replay as a mechanism for learning reward predictions and replay as a process for memory consolidation. These models were chosen as representatives of their classes of models because they use simple and interpretable mechanisms that can simulate a wide range of replay phenomena, making them ideal for contrasting these two perspectives.

      Although we implemented CMR-replay as a straightforward example of the memory-focused view, we believe the proposed mechanisms could be extended to other architectures, such as recurrent neural networks, to produce similar results. We now discuss this possibility in the revised manuscript (see below). However, given our primary goal of providing a qualitative contrast of these two broad perspectives, we decided not to undertake simulations with additional individual models for this paper.

      Regarding the Mattar & Daw model, it is true that a mechanistic implementation would require a mechanism that avoids precomputing priorities before replay. However, the "need" component of their model already incorporates learned expectations of transitions between actions and events. Thus, the model's limitations are not due to the absence of a cognitive map.

      In contrast, while CMR-replay also accumulates memory associations that reflect experienced transitions among events, it generates several qualitatively distinct predictions compared to the Mattar & Daw model. As we note in the manuscript, these distinctions make CMR-replay a contrasting rather than complementary perspective.

      Another important consideration, however, is how CMR-replay compares to alternative mechanistic accounts of cognitive maps. For example, Recurrent Neural Networks are adept at detecting spatial and temporal dependencies in sequential input; these networks are being increasingly used to capture psychological and neuroscientific data (e.g., Zhang et al, 2020; Spoerer et al, 2020), including hippocampal replay specifically (Haga & Fukai, 2018). Another relevant framework is provided by Associative Learning Theory, in which bidirectional associations between static and transient stimulus elements are commonly used to explain contextual and cue-based phenomena, including associative retrieval of absent events (McLaren et al, 1989; Harris, 2006; Kokkola et al, 2019). Without proper integration with these modeling approaches, it is difficult to gauge the innovation and significance of CMR-replay, particularly since the model is applied post hoc to the relatively narrow domain of rodent maze navigation.

      First, we would like to clarify that our principal aim in this work is to characterize the nature of replay, rather than to model cognitive maps per se. Accordingly, CMR-replay is not designed to simulate head-direction signals, perform path integration, or explain the spatial firing properties of neurons during navigation. Instead, it focuses squarely on sequential replay phenomena, simulating classic rodent maze reactivation studies and human sequence-learning tasks. These simulations span a broad array of replay experimental paradigms to ensure extensive coverage of the replay findings reported across the literature. As such, the contribution of this work is in explaining the mechanisms and functional roles of replay, and demonstrating that a model that employs simple and interpretable memory mechanisms not only explains replay phenomena traditionally interpreted through a value-based lens but also accounts for findings not addressed by other memory-focused models.

      As the reviewer notes, CMR-replay shares features with other memory-focused models. However, to our knowledge, none of these related approaches have yet captured the full suite of empirical replay phenomena, suggesting the combination of mechanisms employed in CMR-replay is essential for explaining these phenomena. In the Discussion section, we now discuss the similarities between CMR-replay and related memory models and the possibility of integrating these approaches:

      “Our theory builds on a lineage of memory-focused models, demonstrating the power of this perspective in explaining phenomena that have often been attributed to the optimization of value-based predictions. In this work, we focus on CMR-replay, which exemplifies the memory-centric approach through a set of simple and interpretable mechanisms that we believe are broadly applicable across memory domains. Elements of CMR-replay share similarities with other models that adopt a memory-focused perspective. The model learns distributed context representations whose overlaps encode associations among items, echoing associative learning theories in which overlapping patterns capture stimulus similarity and learned associations (McLaren & Mackintosh 2002). Context evolves through bidirectional interactions between items and their contextual representations, mirroring the dynamics found in recurrent neural networks (Haga & Fukai 2018; Levenstein et al., 2024). However, these related approaches have not been shown to account for the present set of replay findings and lack mechanisms (such as reward-modulated encoding and experience-dependent suppression) that our simulations suggest are essential for capturing these phenomena. While not explored here, we believe these mechanisms could be integrated into architectures like recurrent neural networks (Levenstein et al., 2024) to support a broader range of replay dynamics.”

      Recommendations For The Authors

      Reviewer #1 (Recommendations For The Authors):

      (1) Lines 94-96: These lines may be better positioned earlier in the paragraph.

      We now introduce these lines earlier in the paragraph.

      (2) Line 103 - It's unclear to me what is meant by the statement that "the current context contains contexts associated with previous items". I understand why a slowly drifting context will coincide and therefore link with multiple items that progress rapidly in time, so multiple items will be linked to the same context and each item will be linked to multiple contexts. Is that the idea conveyed here or am I missing something? I'm similarly confused by line 129, which mentions that a context is updated by incorporating other items' contexts. How could a context contain other contexts?

      In the model, each item has an associated context that can be retrieved via Mfc. This is true even before learning, since Mfc is initialized as an identity matrix. During learning and replay, we have a drifting context c that is updated each time an item is presented. At each timestep, the model first retrieves the current item’s associated context cf by Mfc, and incorporates it into c. Equation #2 in the Methods section illustrates this procedure in detail. Because of this procedure, the drifting context c is a weighted sum of past items’ associated contexts. 

      We recognize that these descriptions can be confusing. We have updated the Results section to better distinguish the drifting context from items’ associated context. For example, we note that:

      “We represent the drifting context during learning and replay with c and an item's associated context with cf.”

      We have also updated our description of the context drift procedure to distinguish these two quantities: 

      “During awake encoding of a sequence of items, for each item f, the model retrieves its associated context cf via Mfc. The drifting context c incorporates the item's associated context cf and downweights its representation of previous items' associated contexts (Figure 1c). Thus, the context layer maintains a recency weighted sum of past and present items' associated contexts.”
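
      The drift procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the decay factor rho and the input weight beta are hypothetical values, the update is written as a plain weighted sum, and the exact form of Equation 2 in the manuscript (including any normalization) is not reproduced here. Mfc is taken to be the identity matrix, as stated for the pre-learning case.

```python
def retrieve_context(M_fc, item):
    """Return the context associated with `item` (the item's column of M_fc)."""
    return [row[item] for row in M_fc]


def drift_context(c, c_f, rho, beta):
    """Downweight the old context by rho and add the item's context scaled by beta.

    Assumed simplified update: c <- rho * c + beta * c_f (no normalization).
    """
    return [rho * ci + beta * fi for ci, fi in zip(c, c_f)]


n_items = 4
# Before learning, M_fc is the identity: each item retrieves a one-hot context.
M_fc = [[1.0 if i == j else 0.0 for j in range(n_items)] for i in range(n_items)]

c = [0.0] * n_items  # drifting context, empty before the sequence starts
for item in range(n_items):  # present items 0, 1, 2, 3 in order
    c_f = retrieve_context(M_fc, item)
    c = drift_context(c, c_f, rho=0.5, beta=1.0)  # hypothetical rho and beta

print(c)  # most recent item has the largest weight
```

      With these illustrative values, the drifting context ends up weighting each past item's associated context by a power of rho, which is the recency-weighted sum the response describes.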

      (3) Figure 1b and 1d - please clarify which axis in the association matrices represents the item and the context.

      We have added labels to show what the axes represent in Figure 1.

      (4) The terms "experience" and "item" are used interchangeably and it may be best to stick to one term.

      We now use the term “item” wherever we describe the model results. 

      (5) The manuscript describes Figure 6 ahead of earlier figures - the authors may want to reorder their figures to improve readability.

      We appreciate this suggestion. We decided to keep the current figure organization since it allows us to group results into different themes and avoid redundancy. 

      (6) Lines 662-664 are repeated with a different ending, this is likely an error.

      We have fixed this error.

      Reviewer #3 (Recommendations For The Authors):

      Below, I have outlined some additional points that came to mind in reviewing the manuscript - in no particular order.

      (1) Figure 1: I found the ordering of panels a bit confusing in this figure, as the reading direction changes a couple of times in going from A to F. Would perhaps putting panel C in the bottom left corner and then D at the top right, with E and F below (also on the right) work?

      We agree that this improves the figure. We have restructured the ordering of panels in this figure. 

      (2) Simulation 1: When reading the intro/results for the first simulation (Figure 2a; Diba & Buzsáki, 2007; "When animals traverse a linear track...", page 6, line 186), it wasn't clear to me why pre-run rest would have any forward replay, particularly if pre-run implied that the animal had no experience with the track yet. But in the Methods this becomes clearer, as the model encodes the track eight times prior to the rest periods. Making this explicit in the text would make it easier to follow. Also, was there a reason why eight sessions of awake learning, in particular, were used?

      We now make more explicit that the animals have experience with the track before pre-run rest recording:

      “Animals first acquire experience with a linear track by traversing it to collect a reward. Then, during the pre-run rest recording, forward replay predominates.”

      We included eight sessions of awake learning to match the number of sessions in Shin et al. (2019), since this simulation attempts to explain data from that study. After each repetition, the model engages in rest. We have revised the Methods section to indicate the motivation for this choice:

      “In the simulation that examines context-dependent forward and backward replay through experience (Figs. 2a and 5a), CMR-replay encodes an input sequence shown in Fig. 7a, which simulates a linear track run with no ambiguity in the direction of inputs, over eight awake learning sessions (as in Shin et al. 2019)”

      (3) Frequency of remote replay events: In the simulation based on Gupta et al, how frequently overall does remote replay occur? In the main text, the authors mention the mean frequency with which shortcut replay occurs (i.e., the mean proportion of replay events that contain a shortcut sequence = 0.0046), which was helpful. But, it also made me wonder about the likelihood of remote replay events. I would imagine that remote replay events are infrequent as well - given that it is considerably more likely to replay sequences from the local track, given the recency-weighted mental context. Reporting the above mean proportion for remote and local replay events would be helpful context for the reader.

      In Figure 4c, we report the proportion of remote replay in the two experimental conditions of Gupta et al. that we simulate. 

      (4) Point of clarification re: backwards replay: Is backwards replay less likely to occur than forward replay overall because of the forward asymmetry associated with these models? For example, for a backwards replay event to occur, the context would need to drift backwards at least five times in a row, in spite of a higher probability of moving one step forward at each of those steps. Am I getting that right?

      The reviewer’s interpretation is correct: CMR-replay is more likely to produce forward than backward replay in sleep because of its forward asymmetry. We note that this forward asymmetry leads to a high likelihood of forward replay in the section titled “The context-dependency of memory replay”:

      “As with prior retrieved context models (Howard & Kahana 2002; Polyn et al., 2009), CMR-replay encodes stronger forward than backward associations. This asymmetry exists because, during the first encoding of a sequence, an item's associated context contributes only to its ensuing items' encoding contexts. Therefore, after encoding, bringing back an item's associated context is more likely to reactivate its ensuing than preceding items, leading to forward asymmetric replay (Fig. 6d left).”

      (5) On terminating a replay period: "At any t, the replay period ends with a probability of 0.1 or if a task-irrelevant item is reactivated." (Figure 1 caption; see also pg 18, line 635). How was the 0.1 decided upon? Also, could you please add some detail as to what a 'task-irrelevant item' would be? From what I understood, the model only learns sequences that represent the points in a track - wouldn't all the points in the track be task-relevant?

      This value was arbitrarily chosen as a small value that allows probabilistic stopping. It was not motivated by prior modeling or a systematic search. We have added: “At each timestep, the replay period ends either with a stop probability of 0.1 or if a task-irrelevant item becomes reactivated. (The choice of the value 0.1 was arbitrary; future work could explore the implications of varying this parameter).” 

      In addition, we now explain in the paper that task-irrelevant items “do not appear as inputs during awake encoding, but compete with task-relevant items for reactivation during replay, simulating the idea that other experiences likely compete with current experiences during periods of retrieval and reactivation.”
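
      The stopping rule can be illustrated with a short simulation. This is a sketch, not the authors' code: the order of the two checks within a timestep and the per-step probability of reactivating a task-irrelevant item (`p_irrelevant`, a hypothetical parameter) are assumptions. In the model itself, which item is reactivated depends on the current context rather than on a fixed probability.

```python
import random


def replay_length(rng, p_stop=0.1, p_irrelevant=0.0, max_steps=10_000):
    """Count task-relevant items reactivated before the replay period ends.

    Assumed order within a timestep: first the probabilistic stop (p_stop),
    then a check for a task-irrelevant reactivation (hypothetical p_irrelevant).
    """
    n_reactivated = 0
    for _ in range(max_steps):
        if rng.random() < p_stop:
            break  # probabilistic stop
        if rng.random() < p_irrelevant:
            break  # a task-irrelevant item was reactivated
        n_reactivated += 1
    return n_reactivated


rng = random.Random(0)
lengths = [replay_length(rng) for _ in range(20_000)]
mean_length = sum(lengths) / len(lengths)

# Same rule, but with competing task-irrelevant items ending replay early.
short_lengths = [replay_length(rng, p_irrelevant=0.5) for _ in range(20_000)]
mean_short = sum(short_lengths) / len(short_lengths)

print(mean_length, mean_short)
```

      Under these assumptions, and with no task-irrelevant items, the replay length is geometrically distributed with mean (1 - p)/p, i.e. about 9 items for p = 0.1. Varying `p_stop` in such a sketch is one simple way to explore the parameter dependence the response leaves to future work.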

      (6) Minor typos:

      Turn all instances of "nonlocal" into "non-local", or vice versa

      "For rest at the end of a run, cexternal is the context associated with the final item in the sequence. For rest at the end of a run, cexternal is the context associated with the start item." (pg 20, line 663) - I believe this is a typo and that the second sentence should begin with "For rest at the START of a run".

      We have updated the manuscript to correct these typos. 

      (7) Code availability: I may have missed it, but it doesn't seem like the code is currently available for these simulations. Including the commented code in a public repository (Github, OSF) would be very useful in this case.

      We now include a Github link to our simulation code: https://github.com/schapirolab/CMR-replay.

    1. As Dallas overshadows the battle for Black Lives Matter

      This is a reference to a news story from 2016, when Micah Xavier Johnson shot five Dallas police officers dead and wounded eleven other people. Since the killer was a Black man and the shooting happened during a peaceful Black Lives Matter rally, the assassination was connected to the movement itself. However, "the BLM organization responded to these critiques head-on, calling the attack 'the result of the actions of a lone gunman' and calling it 'dangerous and irresponsible' to 'assign the actions of one person to an entire movement'." https://www.theguardian.com/us-news/2016/jul/09/black-lives-matter-dallas-protest-shooting https://www.billboard.com/music/rb-hip-hop/eminem-untouchable-lyrics-decoded-8062711/

    2. another slave in my head

      Double consciousness, again, is a key concept to interpret this line: Lamar feels like a prisoner in his own head, enchained by his own thoughts. This occurs because he has internalized a way of perceiving and judging reality which pertains to the oppressor (in this case, whites).

    3. The blacker the berry, the sweeter the juice

      "This Black proverb appears as early as 1929 in the title of Wallace Thurman’s novel, The Blacker the Berry. Most agree that the phrase is meant as an affirmation of the richness and beauty of Black people and of darker skin Blacks. In many ways it is a counter response to the pervasive celebration of white or lighter skin Black Americans. The phrase appears in Tupac Shakur’s 1993 song “Keep Your Head Up,” and continues to flow through Black culture as a form of praise and affirmation." https://thedig.howard.edu/all-stories/save-culture-slang-exploration-black-language-expressions

    1. Reviewer #2 (Public review):

      This paper introduces a framework for modeling individual differences in decision-making by learning a low-dimensional representation (the "individuality index") from one task and using it to predict behaviour in a different task. The approach is evaluated on two types of tasks: a sequential value-based decision-making task and a perceptual decision task (MNIST). The model shows improved prediction accuracy when incorporating this learned representation compared to baseline models.

      The motivation is solid, and the modelling approach is interesting, especially the use of individual embeddings to enable cross-task generalization. That said, several aspects of the evaluation and analysis could be strengthened.

      (1) The MNIST SX baseline appears weak. RTNet isn't directly comparable in structure or training. A stronger baseline would involve training the GRU directly on the task without using the individuality index, e.g., by fixing the decoder head. This would provide a clearer picture of what the index contributes.

      (2) Although the focus is on prediction, the framework could offer more insight into how behaviour in one task generalizes to another. For example, simulating predicted behaviours while varying the individuality index might help reveal what behavioural traits it encodes.

      (3) It's not clear whether the model can reproduce human behaviour when acting on-policy. Simulating behaviour using the trained task solver and comparing it with actual participant data would help assess how well the model captures individual decision tendencies.

      (4) Figures 3 and S1 aim to show that individuality indices from the same participant are closer together than those from different participants. However, this isn't fully convincing from the visualizations alone. Including a quantitative presentation would help support the claim.

      (5) The transfer scenarios are often between very similar task conditions (e.g., different versions of MNIST or two-step vs three-step MDP). This limits the strength of the generalization claims. In particular, the effects in the MNIST experiment appear relatively modest, and the transfer is between experimental conditions within the same perceptual task. To better support the idea of generalizing behavioural traits across tasks, it would be valuable to include transfers across more structurally distinct tasks.

      (6) For both experiments, it would help to show basic summaries of participants' behavioural performance. For example, in the MDP task, first-stage choice proportions based on transition types are commonly reported. These kinds of benchmarks provide useful context.

      (7) For the MDP task, consider reporting the number or proportion of correct choices in addition to negative log-likelihood. This would make the results more interpretable.

      (8) In Figure 5, is there a difference between "% correct" and "% match to behaviour"? If so, it would help to clarify the distinction in the text or figure captions.

      (9) For the cognitive model, it would be useful to report the fitted parameters (e.g., learning rate, inverse temperature) per individual. This can offer insight into what kinds of behavioural variability the individuality index might be capturing.

      (10) A few of the terms and labels in the paper could be made more intuitive. For example, the name "individuality index" might give the impression of a scalar value rather than a latent vector, and the labels "SX" and "SY" are somewhat arbitrary. You might consider whether clearer or more descriptive alternatives would help readers follow the paper more easily.

      (11) Please consider including training and validation curves for your models. These would help readers assess convergence, overfitting, and general training stability, especially given the complexity of the encoder-decoder architecture.

      Comments on revisions:

      Thank you to the authors for the updated manuscript. The authors have addressed the majority of my concerns, and the paper is now in a much better form.

      Regarding my previous Comment 6, I still believe it would be helpful to include a graph similar to what is typically reported for these tasks, specifically a breakdown of choices based on rare versus common transitions (see "Model-Based Influences on Humans' Choices and Striatal Prediction Errors", Figure 2). Presenting this for both the actual behaviour and the simulated data would strengthen the paper and allow for clearer comparison.
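
      For readers unfamiliar with this breakdown, the following sketch shows one common way to compute it: the probability of repeating the previous first-stage choice, split by whether the previous trial was rewarded and whether its transition was common or rare. The trial format (dictionaries with `choice`, `transition`, and `rewarded` keys) and the example data are hypothetical, not taken from the manuscript.

```python
from collections import defaultdict


def stay_probabilities(trials):
    """Return P(stay) for each (rewarded, transition) condition of the previous trial."""
    counts = defaultdict(lambda: [0, 0])  # condition -> [number of stays, number of trials]
    for prev, cur in zip(trials, trials[1:]):
        key = (prev["rewarded"], prev["transition"])
        counts[key][0] += int(cur["choice"] == prev["choice"])
        counts[key][1] += 1
    return {key: stays / total for key, (stays, total) in counts.items()}


# Made-up first-stage data for illustration only.
trials = [
    {"choice": 0, "transition": "common", "rewarded": True},
    {"choice": 0, "transition": "rare", "rewarded": True},
    {"choice": 1, "transition": "common", "rewarded": False},
    {"choice": 1, "transition": "common", "rewarded": True},
    {"choice": 0, "transition": "common", "rewarded": False},
]

stay = stay_probabilities(trials)
for condition, p in sorted(stay.items()):
    print(condition, p)
```

      Applying the same function to participants' choices and to choices simulated by the model would give the side-by-side comparison requested here.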

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Because the "source" and "target" tasks are merely parameter variations of the same paradigm, it is unclear whether EIDT achieves true cross-task transfer. The manuscript provides no measure of how consistent each participant's behaviour is across these variants (e.g., two- vs three-step MDP; easy vs difficult MNIST). Without this measure, the transfer results are hard to interpret. In fact, Figure 5 shows a notable drop in accuracy when transferring between the easy and difficult MNIST conditions, compared to transfers between accuracy-focused and speed-focused conditions. Does this discrepancy simply reflect larger within-participant behavioural differences between the easy and difficult settings? A direct analysis of intra-individual similarity for each task pair and how that similarity is related to EIDT's transfer performance is needed.

      Thank you for your insightful comment. We agree that the tasks used in our study are variations of the same paradigm. Accordingly, we have revised the manuscript to consistently frame our findings as demonstrating individuality transfer "across task conditions" rather than "across distinct tasks."

      In response to your suggestion, we have conducted a new analysis to directly investigate the relationship between individual behavioural patterns and transfer performance. As shown in the new Figures 4, 11, S8, and S9, we found a clear relationship between the distance in the space of individual latent representation (called individuality index in the previous manuscript) and prediction performance. Specifically, prediction accuracy for a given individual's behaviour degrades as the latent representation of the model's source individual becomes more distant. This result directly demonstrates that our framework captures meaningful individual differences that are predictive of transfer performance across conditions.

      We have also expanded the Discussion (Lines 332–343) to address the potential for applying this framework to more structurally distinct tasks, hypothesizing that this would rely on shared underlying cognitive functions.

      Related to the previous comment, the individuality index is central to the framework, yet remains hard to interpret. It shows much greater within-participant variability in the MNIST experiment (Figure S1) than in the MDP experiment (Figure 3). Is such a difference meaningful? It is hard to know whether it reflects noisier data, greater behavioural flexibility, or limitations of the model.

      Thank you for raising this important point about interpretability. To enhance the interpretability of the individual latent representation, we have added a new analysis for the MDP task (see Figures 6 and S4). By applying our trained encoder to data from simulated Q-learning agents with known parameters, we demonstrate that the dimensions of the latent space systematically map onto the agents' underlying cognitive parameters (learning rate and inverse temperature). This analysis provides a clearer interpretation by linking our model's data-driven representation to established theoretical constructs.
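
      As a rough illustration of what a simulated Q-learning agent with a learning rate and an inverse temperature looks like, here is a minimal sketch. It uses a two-armed bandit rather than the MDP task from the study, and the function names, reward probabilities, and parameter values are all hypothetical.

```python
import math
import random


def softmax_choice_probs(q_values, inverse_temperature):
    """Softmax over Q-values; a higher inverse temperature gives more deterministic choices."""
    q_max = max(q_values)  # subtract the maximum for numerical stability
    exps = [math.exp(inverse_temperature * (q - q_max)) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def simulate_agent(learning_rate, inverse_temperature, reward_probs, n_trials, seed=0):
    """Simulate choices of a Q-learning agent on a simple bandit (illustrative task only)."""
    rng = random.Random(seed)
    q = [0.0] * len(reward_probs)
    choices = []
    for _ in range(n_trials):
        probs = softmax_choice_probs(q, inverse_temperature)
        action = rng.choices(range(len(q)), weights=probs)[0]
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        q[action] += learning_rate * (reward - q[action])  # delta-rule update
        choices.append(action)
    return choices, q


choices, q = simulate_agent(learning_rate=0.3, inverse_temperature=3.0,
                            reward_probs=[0.8, 0.2], n_trials=200, seed=1)
print(sum(choices) / len(choices), q)
```

      Sweeping `learning_rate` and `inverse_temperature` over a grid and passing each agent's choice sequence through a trained encoder is the kind of analysis the response describes, although the actual task and encoder are those of the manuscript, not this sketch.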

      Regarding the greater within-participant variability observed in the MNIST task (visualized now in Figure S7), this could be attributed to several factors, such as greater behavioural flexibility in the perceptual task. However, disentangling these potential factors is complex and falls outside the primary scope of the current study, which prioritizes demonstrating robust prediction accuracy across different task conditions.

      The authors suggest that the model's ability to generalize to new participants "likely relies on the fact that individuality indices form clusters and individuals similar to new participants exist in the training participant pool". It would be helpful to directly test this hypothesis by quantifying the similarity (or distance) of each test participant's individuality index to the individuals or identified clusters within the training set, and assessing whether greater similarity (or closer proximity) to the clusters in the training set is associated with higher prediction accuracy for those individuals in the test set.

      Thank you for this excellent suggestion. We have performed the analysis you proposed to directly test this hypothesis. Our new results, presented in Figures 4, 11, S5, S8, and S9, quantify the distance between the latent representation of a test participant and that of the source participant used to generate the prediction model.

      The results show a significant negative correlation: prediction accuracy consistently decreases as the distance in the latent space increases. This confirms that generalization performance is directly tied to the similarity of behavioural patterns as captured by our latent representation, strongly supporting our hypothesis.
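
      The analysis described here can be sketched as follows: compute the distance between a test participant's latent representation and each source participant's, then correlate those distances with prediction accuracy. The Euclidean metric, the two-dimensional latent vectors, and the accuracy values below are illustrative assumptions, not data or choices from the manuscript.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def latent_distances(test_latent, source_latents):
    """Euclidean distance from the test participant to each source participant."""
    return [math.dist(test_latent, z) for z in source_latents]


# Made-up latent vectors and prediction accuracies for illustration only.
test_latent = (0.0, 0.0)
source_latents = [(0.1, 0.0), (0.3, 0.4), (0.6, 0.8), (1.2, 0.9)]
match_rates = [0.82, 0.74, 0.66, 0.58]

distances = latent_distances(test_latent, source_latents)
r = pearson_r(distances, match_rates)
print(distances, r)
```

      A negative `r` in such an analysis corresponds to the pattern reported above: accuracy falls as the source participant's representation moves further from the test participant's.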

      Reviewer #2 (Public review):

      The MNIST SX baseline appears weak. RTNet isn't directly comparable in structure or training. A stronger baseline would involve training the GRU directly on the task without using the individuality index, e.g., by fixing the decoder head. This would provide a clearer picture of what the index contributes.

      We agree that a more direct baseline is crucial for evaluating the contribution of our transfer mechanism. For the Within-Condition Prediction scenario, the comparison with RTNet was intended only to validate that our task solver architecture could achieve average human-level task performance (Figure 7).

      For the critical Cross-Condition Transfer scenario, we have now implemented a stronger and more appropriate baseline, which we call "task solver (source)." This model has the same architecture as our EIDT task solver but is trained directly on the source task data of the specific test participant. As shown in revised Figure 9, our EIDT framework significantly outperforms this direct-training approach, clearly demonstrating the benefit of the individuality transfer mechanism.

      Although the focus is on prediction, the framework could offer more insight into how behaviour in one task generalizes to another. For example, simulating predicted behaviours while varying the individuality index might help reveal what behavioural traits it encodes.

      Thank you for this valuable suggestion. To provide more insight into the encoded behavioural traits, we have conducted a new analysis linking the individual latent representation to a theoretical cognitive model. As detailed in the revised manuscript (Figures 6 and S4), we applied our encoder to simulated data from Q-learning agents with varying parameters. The results show a systematic relationship between the latent space coordinates and the agents' learning rates and inverse temperatures, providing a clearer interpretation of what the representation captures.

      It's not clear whether the model can reproduce human behaviour when acting on-policy. Simulating behaviour using the trained task solver and comparing it with actual participant data would help assess how well the model captures individual decision tendencies.

      We have added the suggested on-policy evaluation (Lines 195–207). In the revised manuscript (Figure 5), we present results from simulations where the trained task solvers performed the MDP task. We compared their performance (total reward and rate of the highly-rewarding action selected) against their corresponding human participants. The strong correlations observed demonstrate that our model successfully captures and reproduces individual-specific behavioural tendencies in an on-policy setting.

      Figures 3 and S1 aim to show that individuality indices from the same participant are closer together than those from different participants. However, this isn't fully convincing from the visualizations alone. Including a quantitative presentation would help support the claim.

      We agree that the original visualizations of inter- and intra-participant distances were not sufficiently convincing. We have therefore removed that analysis. In its place, we have introduced a more direct and quantitative analysis that explicitly links the individual latent representation to prediction performance (see Figures 4, 11, S5, S8, and S9). This new analysis demonstrates that prediction error for an individual is a function of distance in the latent space, providing stronger evidence that the representation captures meaningful, individual-specific information.

      The transfer scenarios are often between very similar task conditions (e.g., different versions of MNIST or two-step vs three-step MDP). This limits the strength of the generalization claims. In particular, the effects in the MNIST experiment appear relatively modest, and the transfer is between experimental conditions within the same perceptual task. To better support the idea of generalizing behavioural traits across tasks, it would be valuable to include transfers across more structurally distinct tasks.

      We agree with this limitation and have revised the manuscript to be more precise. We now frame our contribution as "individuality transfer across task conditions" rather than "across tasks" to accurately reflect the scope of our experiments. We have also expanded the Discussion section (Lines 332–343) to address the potential and challenges of applying this framework to more structurally distinct tasks, noting that it would likely depend on shared underlying cognitive functions.

      For both experiments, it would help to show basic summaries of participants' behavioural performance. For example, in the MDP task, first-stage choice proportions based on transition types are commonly reported. These kinds of benchmarks provide useful context.

      We have added behavioral performance summaries as requested. For the MDP task, Figure 5 now compares the total reward and rate of highly-rewarding action selected between humans and our model. For the MNIST task, Figure 7 shows the rate of correct responses for humans, RTNet, and our task solver across all conditions. These additions provide better context for the model's performance.

      For the MDP task, consider reporting the number or proportion of correct choices in addition to negative log-likelihood. This would make the results more interpretable.

      Thank you for the suggestion. To make the results more interpretable, we have added a new prediction performance metric: the rate for behaviour matched. This metric measures the proportion of trials where the model's predicted action matches the human's actual choice. This is now included alongside the negative log-likelihood in Figures 2, 3, 4, 8, 9, and 11.
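
      The two prediction metrics can be computed as in the sketch below. Two details are assumptions rather than facts from the manuscript: the negative log-likelihood is averaged per trial (it could equally be summed), and the model's "predicted action" is taken to be the action with the highest predicted probability. The example probabilities and choices are made up.

```python
import math


def negative_log_likelihood(pred_probs, actions):
    """Mean negative log-probability the model assigns to the observed actions."""
    return -sum(math.log(p[a]) for p, a in zip(pred_probs, actions)) / len(actions)


def behaviour_match_rate(pred_probs, actions):
    """Proportion of trials where the most probable predicted action equals the observed one."""
    matches = 0
    for p, a in zip(pred_probs, actions):
        predicted = max(range(len(p)), key=p.__getitem__)  # argmax over actions
        matches += int(predicted == a)
    return matches / len(actions)


# Made-up model outputs (one probability vector per trial) and observed choices.
pred_probs = [[0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
actions = [0, 1, 1]

nll = negative_log_likelihood(pred_probs, actions)
match = behaviour_match_rate(pred_probs, actions)
print(nll, match)
```

      The two metrics are complementary: the match rate is easier to read, while the negative log-likelihood also penalizes confident predictions that turn out to be wrong.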

      In Figure 5, is there a difference between "% correct" and "% match to behaviour"? If so, it would help to clarify the distinction in the text or figure captions.

      We have clarified these terms in the revised manuscript. As defined in the Results section (Lines 116–122, 231), "%correct" (now "rate of correct responses") is a measure of task performance, whereas "%match to behaviour" (now "rate for behaviour matched") is a measure of prediction accuracy.

      For the cognitive model, it would be useful to report the fitted parameters (e.g., learning rate, inverse temperature) per individual. This can offer insight into what kinds of behavioural variability the individual latent representation might be capturing.

      We have added histograms of the fitted Q-learning parameters for the human participants in Supplementary Materials (Figure S1). This analysis revealed which parameters varied most across the population and directly informed the design of our subsequent simulation study with Q-learning agents (see response to Comment 2-2), where we linked these parameters to the individual latent representation (Lines 208–223).

      A few of the terms and labels in the paper could be made more intuitive. For example, the name "individuality index" might give the impression of a scalar value rather than a latent vector, and the labels "SX" and "SY" are somewhat arbitrary. You might consider whether clearer or more descriptive alternatives would help readers follow the paper more easily.

      We have adopted the suggested changes for clarity.

      "Individuality index" has been changed to "individual latent representation".

      "Situation SX" and "Situation SY" have been renamed to the more descriptive "Within-Condition Prediction" and "Cross-Condition Transfer", respectively.

      We have also added a table in Figure 7 to clarify the MNIST condition acronyms (EA/ES/DA/DS).

      Please consider including training and validation curves for your models. These would help readers assess convergence, overfitting, and general training stability, especially given the complexity of the encoder-decoder architecture.

      Training and validation curves for both the MDP and MNIST tasks have been added to Supplementary Materials (Figures S2 and S6) to show model convergence and stability.

      Reviewer #3 (Public review):

      To demonstrate the effectiveness of the approach, the authors compare a Q-learning cognitive model (for the MDP task) and RTNet (for the MNIST task) against the proposed framework. However, as I understand it, neither the cognitive model nor RTNet is designed to fit or account for individual variability. If that is the case, it is unclear why these models serve as appropriate baselines. Isn't it expected that a model explicitly fitted to individual data would outperform models that do not? If so, does the observed superiority of the proposed framework simply reflect the unsurprising benefit of fitting individual variability? I think the authors should either clarify why these models constitute fair control or validate the proposed approach against stronger and more appropriate baselines.

      Thank you for raising this critical point. We wish to clarify the nature of our baselines:

      For the MDP task, the cognitive model baseline was indeed designed to account for individual variability. We estimated its parameters (e.g., learning rate) from each individual's source task behaviour and then used those specific parameters to predict their behaviour in the target task. This makes it a direct, parameter-based transfer model and thus a fair and appropriate baseline for individuality transfer.

      For the MNIST task, we agree that the RTNet baseline was insufficient for evaluating individual-level transfer in the "Cross-Condition Transfer" scenario. We have now introduced a much stronger baseline, the "task solver (source)," which is trained specifically on the source task data of each test participant. Our results (Figure 9) show that the EIDT framework significantly outperforms this more appropriate, individualized baseline, highlighting the value of our transfer method over direct, within-condition fitting.

      It's not very clear in the results section what it means by having a shorter within-individual distance than between-individual distances. Related to the comment above, is there any control analysis performed for this? Also, this analysis appears to have nothing to do with predicting individual behavior. Is this evidence toward successfully parameterizing individual differences? Could this be task-dependent, especially since the transfer is evaluated on exceedingly similar tasks in both experiments? I think a bit more discussion of the motivation and implications of these results will help the reader in making sense of this analysis.

      We agree that the previous analysis on inter- and intra-participant distances was not sufficiently clear or directly linked to the model's predictive power. We have removed this analysis from the manuscript. In its place, we have introduced a new, more direct analysis (Figures 4, 11, S5, S8, and S9) that demonstrates a quantitative relationship between the distance in the latent space and prediction accuracy. This new analysis shows that prediction error for an individual increases as a function of this distance, providing much stronger and clearer evidence that our framework successfully parameterizes meaningful individual differences.

      The authors have to better define what exactly they meant by transferring across different "tasks" and testing the framework in "more distinctive tasks". All presented evidence, taken at face value, demonstrated transfer across different "conditions" of the same task within the same experiment. It is unclear to me how generalizable the framework will be when applied to different tasks.

      Conceptually, it is also unclear to me how plausible it is that the framework could generalize across tasks spanning multiple cognitive domains (if that's what is meant by more distinctive). For instance, how can an individual's task performance on a Posner task predict task performance on the Cambridge face memory test? Which part of the framework could have enabled such a cross-domain prediction of task performance? I think these have to be at least discussed to some extent, since without it the future direction is meaningless.

      We agree with your assessment and have corrected our terminology throughout the manuscript. We now consistently refer to the transfer as being "across task conditions" to accurately describe the scope of our findings.

      We have also expanded our Discussion (Lines 332--343) to address the important conceptual point about cross-domain transfer. We hypothesize that such transfer would be possible if the tasks, even if structurally different, rely on partially shared underlying cognitive functions (e.g., working memory). In such a scenario, the individual latent representation would capture an individual's specific characteristics related to that shared function, enabling transfer. Conversely, we state that transfer between tasks with no shared cognitive basis would not be expected to succeed with our current framework.

      How is the negative log-likelihood, which seems to be the main metric for comparison, computed? Is this based on trial-by-trial response prediction or on the probability of responses, as is usually done in cognitive modelling?

      The negative log-likelihood is computed on a trial-by-trial basis. It is based on the probability the model assigned to the specific action that the human participant actually took on that trial. This calculation is applied consistently across all models (cognitive models, RTNet, and EIDT). We have added sentences to the Results section to clarify this point (Lines 116--122).
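
      The calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `trialwise_nll`, the clipping constant `eps`, and the toy probabilities are assumptions, and the response does not say whether the manuscript reports the sum or the per-trial mean, so both are computed.

```python
import math

def trialwise_nll(action_probs, chosen_actions, eps=1e-12):
    """Sum of -log p(chosen action) over trials.

    action_probs: per-trial lists of model probabilities, one entry per action.
    chosen_actions: index of the action the participant actually took on each trial.
    """
    if len(action_probs) != len(chosen_actions):
        raise ValueError("one chosen action is required per trial")
    total = 0.0
    for probs, action in zip(action_probs, chosen_actions):
        # Clip so that a zero-probability prediction does not produce log(0).
        total += -math.log(max(probs[action], eps))
    return total

# Hypothetical three-trial example with two available actions.
probs = [[0.5, 0.5], [0.25, 0.75], [0.9, 0.1]]
choices = [0, 1, 0]
nll = trialwise_nll(probs, choices)
mean_nll = nll / len(choices)  # per-trial average, comparable across participants
```

      Because only the probability assigned to the observed action enters the sum, the same function can score any model that outputs per-trial action probabilities.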

      None of the presented evidence is cross-validated. The authors should consider performing K-fold cross-validation on the train, test, and evaluation split of subjects to ensure robustness of the findings.

      All prediction performance results reported in the revised manuscript are now based on a rigorous leave-one-participant-out cross-validation procedure to ensure the robustness of our findings. We have updated the Methods section to reflect this (Lines 127--129 and 229).
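
      For readers unfamiliar with the procedure, leave-one-participant-out cross-validation can be sketched as below. The helper names (`leave_one_participant_out`, `cross_validate`) and the toy "model" (a training-set mean) are hypothetical stand-ins; the manuscript's actual training and evaluation steps, including any additional validation split, are not reproduced here.

```python
def leave_one_participant_out(participant_ids):
    """Yield (train_ids, test_id) pairs, holding out each participant exactly once."""
    unique_ids = sorted(set(participant_ids))
    for held_out in unique_ids:
        train_ids = [pid for pid in unique_ids if pid != held_out]
        yield train_ids, held_out

def cross_validate(participant_ids, fit, evaluate):
    """Fit on all but one participant, score the held-out participant, repeat."""
    scores = {}
    for train_ids, test_id in leave_one_participant_out(participant_ids):
        model = fit(train_ids)
        scores[test_id] = evaluate(model, test_id)
    return scores

# Toy stand-ins: one number per participant, "model" = mean of the training data.
data = {"p1": 1.0, "p2": 2.0, "p3": 6.0}

def fit(train_ids):
    return sum(data[pid] for pid in train_ids) / len(train_ids)

def evaluate(model, test_id):
    return abs(data[test_id] - model)  # absolute error on the held-out participant

scores = cross_validate(list(data), fit, evaluate)
```

      The key property is that the held-out participant never contributes to fitting, so each score reflects prediction for an unseen individual.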

      For some purely illustrative visualizations (e.g., plotting the entire latent space in Figures S3 and S7), we used a model trained on all participants' data to provide a single, representative example and avoid clutter. We have explicitly noted this in the relevant figure captions.

      The authors excluded 25 subjects (20% of the data) for different reasons. This is a substantial proportion, especially by the standards of what is typically observed in behavioral experiments. The authors should provide a clear justification for these exclusion criteria and, if possible, cite relevant studies that support the use of such stringent thresholds.

      We acknowledge the concern regarding the exclusion rate. The previous criteria were indeed empirical. We have now implemented a more systematic exclusion procedure based on the interquartile range of performance metrics, which is detailed in Section 4.2.2 (Lines 489--498). This revised, objective criterion resulted in the exclusion of 42 participants (34% of the initial sample). While this rate is high, we attribute it to the online nature of the data collection, where participant engagement can be more variable. We believe applying these strict criteria was necessary to ensure the quality and reliability of the behavioural data used for modeling.
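
      As an illustration of what an interquartile-range criterion can look like, the sketch below flags participants whose performance falls below the lower Tukey fence. The multiplier `k = 1.5`, the use of only the lower fence, and the example accuracies are assumptions made for illustration; the exact rule is the one given in Section 4.2.2 of the manuscript.

```python
import statistics

def iqr_exclusions(metric_by_participant, k=1.5):
    """Return the participants whose metric lies below Q1 - k * IQR."""
    values = sorted(metric_by_participant.values())
    q1, _median, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    lower_fence = q1 - k * (q3 - q1)
    return {pid for pid, value in metric_by_participant.items() if value < lower_fence}

# Hypothetical task accuracies; "p1" performs far below the rest of the sample.
accuracy = {
    "p1": 0.40, "p2": 0.80, "p3": 0.84, "p4": 0.86,
    "p5": 0.88, "p6": 0.90, "p7": 0.92, "p8": 0.94,
}
excluded = iqr_exclusions(accuracy)
kept = sorted(set(accuracy) - excluded)
```

      A rule of this kind depends only on the sample's own distribution, which is what makes it more objective than a hand-picked threshold.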

      The authors should do a better job of creating the figures and writing the figure captions. It is unclear which specific claim the authors are addressing with each figure. For example, what is the key message of Figure 2C regarding transfer within and across participants? Why is the presentation of statistics different between the Cognitive model and the EIDT framework plots? In Figure 3, it's unclear what these dots and clusters represent and how they support the authors' claim that the same individual forms clusters. And doesn't this experiment have 98 subjects after exclusion? This plot has far fewer than 98 dots, as far as I can tell. Furthermore, I find Figure 5 particularly confusing, as the underlying claim it is meant to illustrate is unclear. Clearer figures and more informative captions are needed to guide the reader effectively.

      We agree that several figures and analyses in the original manuscript were unclear, and we have thoroughly revised our figures and their captions to improve clarity.

      The confusing analyses in the old Figures 2C and 5 (Original/Others comparison) have been completely removed. The unclear visualization of the latent space for the test pool (old Figure 3 showing representations only from test participants) has also been removed to avoid confusion. For visualization of the overall latent space, we now use models trained on all data (Figures S3 and S7) and have clarified this in the captions. In place of these removed analyses, we have introduced a new, more intuitive "cross-individual" analysis (presented in Figures 4, 11, S5, S8, and S9). As explained in the new, more detailed captions, this analysis directly plots prediction performance as a function of the distance in latent space, providing a much clearer demonstration of how the latent representation relates to predictive accuracy.

      I also find the writing somewhat difficult to follow. The subheadings are confusing, and it's often unclear which specific claim the authors are addressing. The presentation of results feels disorganized, making it hard to trace the evidence supporting each claim. Also, the excessive use of acronyms (e.g., SX, SY, CG, EA, ES, DA, DS) makes the text harder to parse. I recommend restructuring the results section to be clearer and significantly reducing the use of unnecessary acronyms.

      Thank you for this feedback. We have made significant revisions to improve the clarity and organization of the manuscript. We have renamed confusing acronyms: "Situation SX" is now "Within-Condition Prediction," and "Situation SY" is now "Cross-Condition Transfer." We also added a table to clarify the MNIST condition acronyms (EA/ES/DA/DS) in Figure 7.

      The Results section has been substantially restructured with clearer subheadings.

    1. Reviewer #1 (Public review):

      Summary of goals:

      The authors' stated goal (line 226) was to compare gene expression levels for gut hormones between males and females. As female flies contain more fat than males, they also sought to identify hormones that control this sex difference. Finally, they attempted to place their findings in the broader context of what is already known about established underlying mechanisms.

      Strengths:

      (1) The core research question of this work is interesting. The authors provide a reasonable hypothesis (neuro/entero-peptides may be involved) and well-designed experiments to address it.

      (2) Some of the data are compelling, especially positive results that clearly implicate enteropeptides in sex-biased fat contents (Figures 1 and 3).

      Weaknesses:

      (1) The greatest weakness of this work is that it falls short of providing a clear mechanism for the regulation of sex-biased fat content by AstC and Tk. By and large, feminization of neurons or enteroendocrine cells with UAS-traF did not increase fat in males (Figure 2). The authors mention that ecdysone, juvenile hormone or Sex-lethal may instead play a role (lines 258-270), but this is speculative, making this study incomplete.

      (2) Related to the above point, the cellular mechanisms by which AstC and Tk regulate fat content in males and females are only partially characterized. For example, knockdown of TkR99D in insulin-producing neurons (Figure 4E) but not pan-neuronally (Figure 4B) increases fat in males, but Tk itself only shows a tendency (Figure 3B). In females, the situation is even less clear: again, Tk only shows a tendency (Figure 3B), and pan-neuronal, but not IPC-specific knockdown of TkR99D decreases fat.

      (3) The text sometimes misrepresents or contradicts the Results shown in the figures. UAS-traF expression in neurons or enteroendocrine cells did sometimes alter fat contents (Figure 2H, S), but the authors report that sex differences were unaffected (lines 164-166). On the other hand, although knockdown of Tk in enteroendocrine cells caused no significant effect (Figure 3B), the authors report this as a trend towards reduction (lines 182-183). This biased representation raises concerns about the interpretation of the data and the authors' conclusions.

      (4) The authors find that in males, neuropeptide expression in the head is higher (Figure 1F-J). This may also play an important role in maintaining lower levels of fat in males, but this finding is not explored in the manuscript.

      Appraisal of goal achievement & conclusions:

      The authors were successful in identifying hormones that show sex bias in their expression and also control the male vs. female difference in fat content. However, elucidation of the relevant cellular pathways is incomplete. Additionally, some of their conclusions are not supported by the data (see Weaknesses, point 3).

      Impact:

      It is difficult to evaluate the impact of this study. This is in great part because the authors do not attempt to systematically place their findings about AstC/Tk in the broader context of their previous studies, which investigated the same phenomenon (Wat et al., 2021, eLife and Biswas et al., 2025, Cell Reports). As the underlying mechanisms are complex and likely redundant, it is necessary to generate a visual model to explain the pathways which regulate fat content in males and females.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript by Biswas and Rideout investigates sex differences in the expression and function of hormones derived from Drosophila enteroendocrine cells (EE). The authors report that while whole-body and head expression of several EE hormones (AstA, AstC, Tk, NPF, Dh31) is male-biased, gut-specific expression of AstC, Tk, and NPF is female-biased. Intriguingly, this sex-specific effect is not dependent on Tra - a surprising and important result. The authors then used an RNAi-based approach to demonstrate that gut-derived AstC and Tk promote fat storage specifically in females. Similar effects are observed when their receptors are knocked down in neurons. In addition, the authors were able to demonstrate that Tk promotes female body fat via the insulin-producing cells. Together, these findings suggest that EE cell-derived hormones contribute to sex-specific fat storage regulation.

      Strengths:

      Overall, I find the paper quite interesting. While the findings are brief, they reveal novel aspects of the sex-specific lipid storage program that I believe are important. As noted by the authors in the discussion, there are many open questions, including how these neuronal effects translate into systemic sex-specific regulation of lipid storage. Regardless, I find the results to be convincing - this paper will serve as the launching point of many future studies.

      Weaknesses:

      My main criticisms are focused on two points:

      (1) If the sex-specific differences are eliminated by tra overexpression, what else might be responsible? As the authors note, the differences in 20E titers might be responsible. I would encourage the authors to simply feed adult flies with food containing 20E and determine if this alters sex-specific 20E expression.

      (2) I'm quite intrigued by the discovery that Tra does not eliminate the sex-specific differences. There are quite a few recent studies demonstrating that fruitless influences sex-specific neuronal function - here too, I would encourage the authors to examine whether this aspect of the sex-determination pathway is involved in the lipid accumulation phenotype.

    1. There was one thing to be done before I left, an awkward, unpleasant thing that perhaps had better have been let alone. But I wanted to leave things in order and not just trust that obliging and indifferent sea to sweep my refuse away. I saw Jordan Baker and talked over and around what had happened to us together, and what had happened afterward to me, and she lay perfectly still, listening, in a big chair. She was dressed to play golf, and I remember thinking she looked like a good illustration, her chin raised a little jauntily, her hair the colour of an autumn leaf, her face the same brown tint as the fingerless glove on her knee. When I had finished she told me without comment that she was engaged to another man. I doubted that, though there were several she could have married at a nod of her head, but I pretended to be surprised. For just a minute I wondered if I wasn’t making a mistake, then I thought it all over again quickly and got up to say goodbye.

      Wow, the breakup trauma. They didn't end well.

    2. He murdered her.” “It was an accident, George.” Wilson shook his head. His eyes narrowed and his mouth widened slightly with the ghost of a superior “Hm!” “I know,” he said definitely. “I’m one of these trusting fellas and I don’t think any harm to nobody, but when I get to know a thing I know it. It was the man in that car. She ran out to speak to him and he wouldn’t stop.” Michaelis had seen this too, but it hadn’t occurred to him that there was any special significance in it. He believed that Mrs. Wilson had been running away from her husband, rather than trying to stop any particular car.

      George starts to accuse Gatsby of murdering his wife. I think he's a poor guy; everyone has been hiding secrets from him.

    1. much like a key unlocking a lock. This allows the myosin heads to attach to actin.

      FWIW, the analogy I like to use here is a garage door opener (troponin) pulling open the garage door (tropomyosin), allowing the car (myosin head) to enter the garage (myosin binding site on actin).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary

      This is a strong paper that presents a clear advance in multi-animal tracking. The authors introduce an updated version of idtracker.ai that reframes identity assignment as a contrastive learning problem rather than a classification task requiring global fragments. This change leads to gains in speed and accuracy. The method eliminates a known bottleneck in the original system, and the benchmarking across species is comprehensive and well executed. I think the results are convincing and the work is significant.

      Strengths

      The main strengths are the conceptual shift from classification to representation learning, the clear performance gains, and the fact that the new version is more robust. Removing the need for global fragments makes the software more flexible in practice, and the accuracy and speed improvements are well demonstrated. The software appears thoughtfully implemented, with GUI updates and integration with pose estimators.

      Weaknesses

      I don't have any major criticisms, but I have identified a few points that should be addressed to improve the clarity and accuracy of the claims made in the paper.

      (1) The title begins with "New idtracker.ai," which may not age well and sounds more promotional than scientific. The strength of the work is the conceptual shift to contrastive representation learning, and it might be more helpful to emphasize that in the title rather than branding it as "new."

      We considered using “Contrastive idtracker.ai”. However, we thought that readers might then assume that we consider the old idtracker.ai and this contrastive version equally valid options. We want to make clear that the new version is the one to use, as it is better in both accuracy and tracking time. We think “New idtracker.ai” communicates better that this is the version we recommend.

      (2) Several technical points regarding the comparison between TRex (a system evaluated in the paper) and idtracker.ai should be addressed to ensure the evaluation is fair and readers are fully informed.

      (2.1) Lines 158-160: The description of TRex as based on "Protocol 2 of idtracker.ai" overlooks several key additions in TRex, such as posture image normalization, tracklet subsampling, and the use of uniqueness feedback during training. These features are not acknowledged, and it's unclear whether TRex was properly configured - particularly regarding posture estimation, which appears to have been omitted but isn't discussed. Without knowing the actual parameters used to make comparisons, it's difficult to assess how the method was evaluated.

      We added the information about the key additions of TRex in the section “The new idtracker.ai uses representation learning”, lines 153-157. Posture estimation in TRex was neither explicitly used nor disabled during the benchmark; we clarified this in the last paragraph of “Benchmark of accuracy and tracking time”, lines 492-495.

      (2.2) Lines 162-163: The paper implies that TRex gains speed by avoiding Protocol 3, but in practice, idtracker.ai also typically avoids using Protocol 3 due to its extremely long runtime. This part of the framing feels more like a rhetorical contrast than an informative one.

      We removed this, see new lines 153-157.

      (2.3) Lines 277-280: The contrastive loss function is written using the label l, but since it refers to a pair of images, it would be clearer and more precise to write it as l_{I,J}. This would help readers unfamiliar with contrastive learning understand the formulation more easily.

      We added this change in lines 613-620.
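
      To make the role of the pair label l_{I,J} concrete, here is a minimal sketch of a double-margin contrastive loss on Euclidean distance. It is an assumed, commonly used form and not necessarily the expression in lines 613-620 of the manuscript; the function names and the values chosen for `d_pos` and `d_neg` are hypothetical.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def pair_loss(emb_i, emb_j, l_ij, d_pos=1.0, d_neg=10.0):
    """Loss for one image pair (I, J).

    l_ij = 1 for a positive pair (same fragment), 0 for a negative pair.
    d_pos and d_neg are illustrative margins, not values from the manuscript.
    """
    d = euclidean(emb_i, emb_j)
    pull = max(0.0, d - d_pos) ** 2  # active only when a positive pair is farther than d_pos
    push = max(0.0, d_neg - d) ** 2  # active only when a negative pair is closer than d_neg
    return l_ij * pull + (1 - l_ij) * push

# The same two embeddings, 5 units apart, scored under each label.
loss_if_positive = pair_loss([0.0, 0.0], [3.0, 4.0], l_ij=1)
loss_if_negative = pair_loss([0.0, 0.0], [3.0, 4.0], l_ij=0)
```

      Writing the label with explicit pair indices makes it clear that it is a property of the pair, not of either image alone.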

      (2.4) Lines 333-334: The manuscript states that TRex can fail to track certain videos, but this may be inaccurate depending on how the authors classify failures. TRex may return low uniqueness scores if training does not converge well, but this isn't equivalent to tracking failure. Moreover, the metric reported by TRex is uniqueness, not accuracy. Equating the two could mislead readers. If the authors did compare outputs to human-validated data, that should be stated more explicitly.

      We observed TRex crashing without outputting any trajectories on some occasions (Appendix 1—figure 1), and this is what we labeled as “failure”. These failures happened in the most difficult videos of our benchmark, that’s why we treated them the same way as idtracker.ai going to P3. We clarified this in new lines 464-469.

      The accuracy measured in our benchmark is not estimated; it is human-validated (see section Computation of tracking accuracy in Appendix 1). Both software packages report some quality estimators at the end of a tracking (“estimated accuracy” for idtracker.ai and "uniqueness” for TRex) but these were not used in the benchmark.

      (2.5) Lines 339-341: The evaluation approach defines a "successful run" and then sums the runtime across all attempts up to that point. If success is defined as simply producing any output, this may not reflect how experienced users actually interact with the software, where parameters are iteratively refined to improve quality.

      Yes, our benchmark was designed to be agnostic to the user's level of experience. It also models users who do not inspect the trajectories in order to choose parameters again, so as not to leave room for potential subjectivity.

      (2.6) Lines 344-346: The simulation process involves sampling tracking parameters 10,000 times and selecting the first "successful" run. If parameter tuning is randomized rather than informed by expert knowledge, this could skew the results in favor of tools that require fewer or simpler adjustments. TRex relies on more tunable behavior, such as longer fragments improving training time, which this approach may not capture.

      We precisely used the TRex parameter track_max_speed to elongate fragments for optimal tracking. Rather than randomized parameter tuning, we defined the “valid range” for this parameter so that all values in it would produce a decent fragment structure. We used this procedure to avoid worsening those methods that use more parameters.

      (2.7) Line 354 onward: TRex was evaluated using two varying parameters (threshold and track_max_speed), while idtracker.ai used only one (intensity_threshold). With a fixed number of samples, this asymmetry could bias results against TRex. In addition, users typically set these parameters based on domain knowledge rather than random exploration.

      idtracker.ai and TRex have several parameters. Some of them have a single correct value (e.g. number of animals) or the default value that the system computes is already good (e.g. minimum blob size). For a second type of parameters, the system finds a value that is in general not as good, so users need to modify them. In general, users find that for this second type of parameter there is a valid interval of possible values, from which they need to choose a single value to run the system. idtracker.ai has intensity_threshold as the only parameter of this second type and TRex has two: threshold and track_max_speed. For these parameters, choosing one value or another within the valid interval can give different tracking results. Therefore, when we model a user that wants to run the system once except if it goes to P3 (idtracker.ai) or except if it crashes (TRex), it is these parameters we sample from within the valid interval to get a different value for each run of the system. We clarify this in lines 452-469 of the section “Benchmark of accuracy and tracking time”.

      Note that if we chose to simply run old idtracker.ai (v4 or v5) or TRex a single time, this would benefit the new idtracker.ai (v6). This is because old idtracker.ai can enter the very slow Protocol 3 and TRex can fail to track. Running old idtracker.ai or TRex up to 5 times, until old idtracker.ai does not use Protocol 3 and TRex does not fail, therefore makes them as good as they can be with respect to the new idtracker.ai.
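
      The simulated-user procedure described in these responses can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the benchmark code: each entry of `runs` is taken to be a recorded run at one parameter value inside the valid interval, a "success" means not entering Protocol 3 (idtracker.ai) or not crashing (TRex), and if all attempts fail the last attempt's accuracy is kept. The names `simulate_user` and `simulate_benchmark` are hypothetical.

```python
import random

def simulate_user(runs, rng, max_attempts=5):
    """Draw runs at random until one succeeds or max_attempts is reached.

    runs: list of (runtime_seconds, succeeded, accuracy) tuples.
    Returns the runtime summed over all attempts and the accuracy of the last attempt.
    """
    total_time = 0.0
    accuracy = None
    for _ in range(max_attempts):
        runtime, succeeded, accuracy = rng.choice(runs)
        total_time += runtime
        if succeeded:
            break
    return total_time, accuracy

def simulate_benchmark(runs, n_users=10_000, seed=0):
    """Average total time and final accuracy over many simulated users."""
    rng = random.Random(seed)
    results = [simulate_user(runs, rng) for _ in range(n_users)]
    mean_time = sum(t for t, _ in results) / n_users
    mean_accuracy = sum(a for _, a in results) / n_users
    return mean_time, mean_accuracy

# Two extreme hypothetical cases: every run succeeds, or every run fails.
always_ok = simulate_benchmark([(10.0, True, 0.99)])
always_fail = simulate_benchmark([(10.0, False, 0.50)])
```

      In the second case every simulated user pays for all five attempts, which is how failed or very slow runs show up as longer total tracking time.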

      (2.8) Figure 2-figure supplement 3: The memory usage comparison lacks detail. It's unclear whether RAM or VRAM was measured, whether shared or compressed memory was included, or how memory was sampled. Since both tools dynamically adjust to system resources, the relevance of this comparison is questionable without more technical detail.

      We modified the text in the caption (new Figure 1-figure supplement 2) adding the kind of memory we measured (RAM) and how we measured it. We already have a disclaimer for this plot saying that memory management depends on the machine's available resources. We agree that this is a simple analysis of the usage of computer resources.

      (3) While the authors cite several key papers on contrastive learning, they do not use the introduction or discussion to effectively situate their approach within related fields where similar strategies have been widely adopted. For example, contrastive embedding methods form the backbone of modern facial recognition and other image similarity systems, where the goal is to map images into a latent space that separates identities or classes through clustering. This connection would help emphasize the conceptual strength of the approach and align the work with well-established applications. Similarly, there is a growing literature on animal re-identification (ReID), which often involves learning identity-preserving representations across time or appearance changes. Referencing these bodies of work would help readers connect the proposed method with adjacent areas using similar ideas, and show that the authors are aware of and building on this wider context.

      We have now added a new section in Appendix 3, “Differences with previous work in contrastive/metric learning” (lines 792-841) to include references to previous work and a description of what we do differently.

      (4) Some sections of the Results text (e.g., lines 48-74) read more like extended figure captions than part of the main narrative. They include detailed explanations of figure elements, sorting procedures, and video naming conventions that may be better placed in the actual figure captions or moved to supplementary notes. Streamlining this section in the main text would improve readability and help the central ideas stand out more clearly.

      Thank you for pointing this out. We have rewritten the Results, for example streamlining the old lines 48-74 (new lines 42-48) by moving the comments about names, files and order of videos to the caption of Figure 1.

      Overall, though, this is a high-quality paper. The improvements to idtracker.ai are well justified and practically significant. Addressing the above comments will strengthen the work, particularly by clarifying the evaluation and comparisons.

      We thank the reviewer for the detailed suggestions. We believe we have taken all of them into consideration to improve the manuscript.

      Reviewer #2 (Public review):

      Summary:

      This work introduces a new version of the state-of-the-art idtracker.ai software for tracking multiple unmarked animals. The authors aimed to solve a critical limitation of their previous software, which relied on the existence of "global fragments" (video segments where all animals are simultaneously visible) to train an identification classifier network, in addition to addressing concerns with runtime speed. To do this, the authors have both re-implemented the backend of their software in PyTorch (in addition to numerous other performance optimizations) as well as moving from a supervised classification framework to a self-supervised, contrastive representation learning approach that no longer requires global fragments to function. By defining positive training pairs as different images from the same fragment and negative pairs as images from any two co-existing fragments, the system cleverly takes advantage of partial (but high-confidence) tracklets to learn a powerful representation of animal identity without direct human supervision. Their formulation of contrastive learning is carefully thought out and comprises a series of empirically validated design choices that are both creative and technically sound. This methodological advance is significant and directly leads to the software's major strengths, including exceptional performance improvements in speed and accuracy and a newfound robustness to occlusion (even in severe cases where no global fragments can be detected). Benchmark comparisons show the new software is, on average, 44 times faster (up to 440 times faster on difficult videos) while also achieving higher accuracy across a range of species and group sizes. 
This new version of idtracker.ai is shown to consistently outperform the closely related TRex software (Walter & Couzin, 2021), which, together with the engineering innovations and usability enhancements (e.g., outputs convenient for downstream pose estimation), positions this tool as an advancement on the state-of-the-art for multi-animal tracking, especially for collective behavior studies.
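
      The pair definition summarized above (positive pairs from within one fragment, negative pairs from two co-existing fragments) can be sketched as below. This is an illustration under assumptions: "co-existing" is read as overlapping in at least one frame, sampling is uniform, and the `Fragment` class and function names are hypothetical. The software's actual pair sampling strategy is more elaborate and is not reproduced here.

```python
import random
from dataclasses import dataclass

@dataclass
class Fragment:
    start: int    # first frame of the fragment (inclusive)
    end: int      # last frame of the fragment (inclusive)
    images: list  # image crops of one animal; strings stand in for images here

def coexist(a, b):
    """Fragments that overlap in time must belong to different animals."""
    return a.start <= b.end and b.start <= a.end

def sample_positive(fragments, rng):
    """Two different images from the same fragment, labelled 1."""
    frag = rng.choice([f for f in fragments if len(f.images) >= 2])
    img_i, img_j = rng.sample(frag.images, 2)
    return img_i, img_j, 1

def sample_negative(fragments, rng):
    """One image from each of two co-existing fragments, labelled 0."""
    pairs = [(a, b) for i, a in enumerate(fragments)
             for b in fragments[i + 1:] if coexist(a, b)]
    a, b = rng.choice(pairs)
    return rng.choice(a.images), rng.choice(b.images), 0

# Toy data: A and B overlap in frames 5-10; C does not overlap with either.
fragments = [
    Fragment(0, 10, ["a0", "a1", "a2"]),
    Fragment(5, 15, ["b0", "b1"]),
    Fragment(20, 30, ["c0", "c1"]),
]
rng = random.Random(0)
positive = sample_positive(fragments, rng)
negative = sample_negative(fragments, rng)
```

      Note that fragment C never contributes a negative pair in this toy example, because nothing guarantees that it is a different animal from A or B.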

      Despite these advances, we note a number of weaknesses and limitations that are not well addressed in the present version of this paper:

      Weaknesses

      (1) The contrastive representation learning formulation. Contrastive representation learning using deep neural networks has long been used for problems in the multi-object tracking domain, popularized through ReID approaches like DML (Yi et al., 2014) and DeepReID (Li et al., 2014). More recently, contrastive learning has become more popular as an approach for scalable self-supervised representation learning for open-ended vision tasks, as exemplified by approaches like SimCLR (Chen et al., 2020), SimSiam (Chen et al., 2020), and MAE (He et al., 2021) and instantiated in foundation models for image embedding like DINOv2 (Oquab et al., 2023). Given their prevalence, it is useful to contrast the formulation of contrastive learning described here relative to these widely adopted approaches (and why this reviewer feels it is appropriate):

      (1.1) No rotations or other image augmentations are performed to generate positive examples. These are not necessary with this approach since the pairs are sampled from heuristically tracked fragments (which produces sufficient training data, though see weaknesses discussed below) and the crops are pre-aligned egocentrically (mitigating the need for rotational invariance).

      (1.2) There is no projection head in the architecture, like in SimCLR. Since classification/clustering is the only task that the system is intended to solve, the more general "nuisance" image features that this architectural detail normally affords are not necessary here.

      (1.3) There is no stop gradient operator like in BYOL (Grill et al., 2020) or SimSiam. Since the heuristic tracking implicitly produces plenty of negative pairs from the fragments, there is no need to prevent representational collapse due to class asymmetry. Some care is still needed, but the authors address this well through a pair sampling strategy (discussed below).

      (1.4) Euclidean distance is used as the distance metric in the loss rather than cosine similarity as in most contrastive learning works. While cosine similarity coupled with L2-normalized unit hypersphere embeddings has proven to be a successful recipe to deal with the curse of dimensionality (with the added benefit of bounded distance limits), the authors address this through a cleverly constructed loss function that essentially allows direct control over the intra- and inter-cluster distance (D_pos and D_neg). This is a clever formulation that aligns well with the use of K-means for the downstream assignment step.

      No concerns here, just clarifications for readers who dig into the review. Referencing the above literature would enhance the presentation of the paper to align with the broader computer vision literature.

      Thank you for this detailed comparison. We have now added a new section in Appendix 3, “Differences with previous work in contrastive/metric learning” (lines 792-841) to include references to previous work and a description of what we do differently, including the points raised by the reviewer.
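
To make the formulation discussed in point (1.4) concrete, the following minimal sketch shows one common margin-based contrastive loss on Euclidean distances: positive pairs are penalized only while they are farther apart than D_pos, and negative pairs only while they are closer than D_neg. The squared-hinge form, the function names, and the numerical values chosen for D_pos and D_neg are assumptions made for illustration; they are not taken from the manuscript.

```python
import math

D_POS = 1.0   # assumed target distance for positive pairs (illustrative value)
D_NEG = 10.0  # assumed target distance for negative pairs (illustrative value)


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pair_loss(a, b, is_positive, d_pos=D_POS, d_neg=D_NEG):
    """Squared-hinge contrastive loss for one pair of embeddings.

    Positive pairs are penalized only while farther apart than d_pos;
    negative pairs are penalized only while closer than d_neg.
    """
    d = euclidean(a, b)
    if is_positive:
        return max(0.0, d - d_pos) ** 2
    return max(0.0, d_neg - d) ** 2


# A positive pair already within D_POS contributes no loss.
print(pair_loss([0.0, 0.0], [0.6, 0.0], is_positive=True))
# A negative pair at distance 5 is still inside the D_NEG margin.
print(pair_loss([0.0, 0.0], [3.0, 4.0], is_positive=False))
```

Because both targets are expressed directly in embedding-space units, a loss of this kind gives explicit control over how tight clusters are and how far apart they sit, which is the property the reviewer highlights as fitting well with K-means.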

      (2) Network architecture for image feature extraction backbone. As most of the computations that drive up processing time happen in the network backbone, the authors explored a variety of architectures to assess speed, accuracy, and memory requirements. They land on ResNet18 due to its empirically determined performance. While the experiments that support this choice are solid, the rationale behind the architecture selection is somewhat weak. The authors state that: "We tested 23 networks from 8 different families of state-of-the-art convolutional neural network architectures, selected for their compatibility with consumer-grade GPUs and ability to handle small input images (20 × 20 to 100 × 100 pixels) typical in collective animal behavior videos."

      (2.1) Most modern architectures have variants that are compatible with consumer-grade GPUs. This is true of, for example, HRNet (Wang et al., 2019), ViT (Dosovitskiy et al., 2020), SwinT (Liu et al., 2021), or ConvNeXt (Liu et al., 2022), all of which report single GPU training and fast runtime speeds through lightweight configuration or subsequent variants, e.g., MobileViT (Mehta et al., 2021). The authors may consider revising that statement or providing additional support for that claim (e.g., empirical experiments) given that these have been reported to outperform ResNet18 across tasks.

      Following the recommendation of the reviewer, we tested the architectures SwinT, ConvNeXt and ViT. We found out that none of them outperformed ResNet18 since they all showed a slower learning curve. This would result in higher tracking times. These tests are now included in the section “Network architecture” (lines 550-611).

      (2.2) The compatibility of different architectures with small image sizes is configurable. Most convolutional architectures can be readily adapted to work with smaller image sizes, including 20x20 crops. With their default configuration, they lose feature map resolution through repeated pooling and downsampling steps, but this can be readily mitigated by swapping out standard convolutions with dilated convolutions and/or by setting the stride of pooling layers to 1, preserving feature map resolution across blocks. While these are fairly straightforward modifications (and are even compatible with using pretrained weights), an even more trivial approach is to pad and/or resize the crops to the default image size, which is likely to improve accuracy at a possibly minimal memory and runtime cost. These techniques may even improve the performance with the architectures that the authors did test out.

      The only two tested architectures that require a minimum image size are AlexNet and DenseNet. DenseNet proved to underperform ResNet18 in the videos where the images are sufficiently large. We have tested AlexNet with padded images to see that it also performs worse than ResNet18 (see Appendix 3—figure 1).

      We also tested the initialization of ResNet18 with pre-trained weights from ImageNet (in Appendix 3—figure 2) and it proved to bring no benefit to the training speed (added in lines 591-592).
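
As a small illustration of the padding approach mentioned in point (2.2), the sketch below centers a small crop inside a larger canvas. Zero fill, centered placement, and the 64 x 64 target size are assumptions for the example only; the responses do not state how the padded images were constructed.

```python
def pad_to_size(crop, target_h, target_w, fill=0):
    """Center a 2D crop (list of rows) inside a target_h x target_w canvas."""
    h, w = len(crop), len(crop[0])
    if h > target_h or w > target_w:
        raise ValueError("crop is larger than the target size")
    top = (target_h - h) // 2
    left = (target_w - w) // 2
    canvas = [[fill] * target_w for _ in range(target_h)]
    for r, row in enumerate(crop):
        # Overwrite the central window of this canvas row with the crop row.
        canvas[top + r][left:left + w] = row
    return canvas


crop = [[1] * 20 for _ in range(20)]   # stand-in for a 20 x 20 identification image
padded = pad_to_size(crop, 64, 64)     # 64 is an arbitrary example size
print(len(padded), len(padded[0]))
```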

      (2.3) The authors do not report whether the architecture experiments were done with pretrained or randomly initialized weights.

      We adapted the text to make it clear that the networks are always randomly initialized (lines 591-592, lines 608-609 and the captions of Appendix 3—figure 1 and 2).

      (2.4) The authors do not report some details about their ResNet18 design, specifically whether a global pooling layer is used and whether the output fully connected layer has any activation function. Additionally, they do not report the version of ResNet18 employed here, namely, whether the BatchNorm and ReLU are applied after (v1) or before (v2) the conv layers in the residual path.

      We use ResNet18 v1 with no activation function nor bias in its last layer (this has been clarified in the lines 606-608). Also, by design, ResNet has a global average pool right before the last fully connected layer which we did not remove. In response to the reviewer, Resnet18 v2 was tested and its performance is the same as that of v1 (see Appendix 3—figure 1 and lines 590-591).

      (3) Pair sampling strategy. The authors devised a clever approach for sampling positive and negative pairs that is tailored to the nature of the formulation. First, since the positive and negative labels are derived from the co-existence of pretracked fragments, selection has to be done at the level of fragments rather than individual images. This would not be the case if one of the newer approaches for contrastive learning were employed, but it serves as a strength here (assuming that fragment generation/first pass heuristic tracking is achievable and reliable in the dataset). Second, a clever weighted sampling scheme assigns sampling weights to the fragments that are designed to balance "exploration and exploitation". They weigh samples both by fragment length and by the loss associated with that fragment to bias towards different and more difficult examples.

      (3.1) The formulation described here resembles and uses elements of online hard example mining (Shrivastava et al., 2016), hard negative sampling (Robinson et al., 2020), and curriculum learning more broadly. The authors may consider referencing this literature (particularly Robinson et al., 2020) for inspiration and to inform the interpretation of the current empirical results on positive/negative balancing.

      Following this recommendation, we added references of hard negative mining in the new section “Differences with previous work in contrastive/metric learning”, lines 792-841. Regarding curriculum learning, even though in spirit it might have parallels with our sampling method in the sense that there is a guided training of the network, we believe the approach is more similar to an exploration-exploitation paradigm.
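
The weighted sampling described in point (3) can be sketched as follows. This is only a hedged illustration of the general idea of weighting fragments by both their length and their current loss; the linear mixture, the `alpha` coefficient, and the function names are hypothetical and do not reproduce the sampling weights used in the manuscript.

```python
import random


def fragment_weights(lengths, losses, alpha=0.5):
    """Mix length-based and loss-based sampling probabilities.

    alpha is a hypothetical mixing coefficient: alpha=1 samples by
    fragment length only, alpha=0 by current loss only.
    """
    total_len = sum(lengths)
    total_loss = sum(losses)
    weights = []
    for n, loss in zip(lengths, losses):
        p_len = n / total_len
        # Fall back to uniform if no loss has been recorded yet.
        p_loss = loss / total_loss if total_loss > 0 else 1.0 / len(lengths)
        weights.append(alpha * p_len + (1.0 - alpha) * p_loss)
    return weights


def sample_fragments(weights, k, rng=random):
    """Draw k fragment indices (with replacement) according to weights."""
    return rng.choices(range(len(weights)), weights=weights, k=k)


lengths = [100, 300, 600]   # images per fragment (made-up values)
losses = [0.9, 0.1, 0.0]    # running loss per fragment (made-up values)
w = fragment_weights(lengths, losses)
print(w)
print(sample_fragments(w, k=5, rng=random.Random(0)))
```

In this toy example the shortest fragment receives the largest weight because its loss is high, which is the kind of bias towards difficult examples that the reviewer relates to hard example mining.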

      (4) Speed and accuracy improvements. The authors report considerable improvements in speed and accuracy of the new idTracker (v6) over the original idTracker (v4?) and TRex. It's a bit unclear, however, which of these are attributable to the engineering optimizations (v5?) versus the representation learning formulation.

      (4.1) Why is there an improvement in accuracy in idTracker v5 (L77-81)? This is described as a port to PyTorch and improvements largely related to the memory and data loading efficiency. This is particularly notable given that the progression went from 97.52% (v4; original) to 99.58% (v5; engineering enhancements) to 99.92% (v6; representation learning), i.e., most of the new improvement in accuracy owes to the "optimizations" which are not the central emphasis of the systematic evaluations reported in this paper.

      V5 was a two-year effort designed to improve the time efficiency of v4. The higher accuracy was also a surprise to us, but it likely comes from the fact that the code replaced from v4 contained one or more small bugs. The improvements in v5 are retained in v6 (contrastive learning), and v6 has higher accuracy and shorter tracking times. This additional gain in accuracy and speed in v6 is due to contrastive learning.

      (4.2) What about the speed improvements? Relative to the original (v4), the authors report average speed-ups of 13.6x in v5 and 44x in v6. Presumably, the drastic speed-up in v6 comes from a lower Protocol 2 failure rate, but v6 is not evaluated in Figure 2 - figure supplement 2.

      idtracker.ai v5 runs an optimized Protocol 2 and, sometimes, Protocol 3, whereas v6 runs neither. While P2 is still present in v6 as a fallback protocol when contrastive learning fails, P2 was never needed in our v6 benchmark. So the v6 speedup comes from replacing both P2 and P3 with the contrastive algorithm.

      (5) Robustness to occlusion. A major innovation enabled by the contrastive representation learning approach is the ability to tolerate the absence of a global fragment (contiguous frames where all animals are visible) by requiring only co-existing pairs of fragments owing to the paired sampling formulation. While this removes a major limitation of the previous versions of idtracker.ai, its evaluation could be strengthened. The authors describe an ablation experiment where an arc of the arena is masked out to assess the accuracy under artificially difficult conditions. They find that the v6 works robustly up to significant proportions of occlusions, even when doing so eliminates global fragments.

      (5.1) The experiment setup needs to be more carefully described.

      (5.1.1) What does the masking procedure entail? Are the pixels masked out in the original video or are detections removed after segmentation and first pass tracking is done?

      The mask is defined as a region of interest in the software. This means that it is applied at the segmentation step where the video frame is converted to a foreground-background binary image. The region of interest is applied here, converting to background all pixels not inside of it. We clarified this in the newly added section Occlusion tests, lines 240-244.

      (5.1.2) What happens at the boundary of the mask? (Partial segmentation masks would throw off the centroids, and doing it after original segmentation does not realistically model the conditions of entering an occlusion area.)

      Animals at the boundaries of the mask are partially detected. This can change the location of their detected centroid. For that reason, when computing the ground-truth accuracy for these videos, only the ground-truth centroids that were at least 15 pixels away from the mask were considered. We clarified this in the newly added section Occlusion tests, lines 248-251.

      (5.1.3) Are fragments still linked for animals that enter and then exit the mask area?

      No artificial fragment linking was added in these videos. Detected fragments are linked the usual way. If an animal moves into the masked area, it disappears, so the fragment breaks. We clarified this in the newly added section Occlusion tests, lines 245-247.

      (5.1.4) How is the evaluation done? Is it computed with or without the masked region detections?

      The groundtruth used to validate these videos contains the positions of all animals at all times. But only the positions outside the mask at each frame were considered to compute the tracking accuracy. We clarified this in the newly added section Occlusion tests, lines 248-251.
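
The evaluation rule described in responses (5.1.2) and (5.1.4) can be sketched as a filter on ground-truth centroids. The 15-pixel margin is the value stated above; the geometry is an assumption for illustration: the mask is modeled as an unbounded angular sector spanning angles from 0 to `mask_angle` (in radians) around an arena center `(cx, cy)`, and all function names are hypothetical.

```python
import math

MARGIN_PX = 15.0  # exclusion margin stated in the response


def in_sector(x, y, cx, cy, mask_angle):
    """True if (x, y) lies in the masked sector [0, mask_angle] around (cx, cy)."""
    theta = math.atan2(y - cy, x - cx) % (2 * math.pi)
    return theta <= mask_angle


def dist_to_ray(px, py, angle):
    """Distance from point (px, py) to the ray from the origin at `angle`."""
    ux, uy = math.cos(angle), math.sin(angle)
    t = max(0.0, px * ux + py * uy)  # clamp the projection onto the ray
    return math.hypot(px - t * ux, py - t * uy)


def dist_to_sector(x, y, cx, cy, mask_angle):
    """Distance from a point to the masked sector (0 if inside it)."""
    if in_sector(x, y, cx, cy, mask_angle):
        return 0.0
    px, py = x - cx, y - cy
    # Outside the sector, the nearest masked point lies on one of its two edges.
    return min(dist_to_ray(px, py, 0.0), dist_to_ray(px, py, mask_angle))


def usable_ground_truth(points, cx, cy, mask_angle, margin=MARGIN_PX):
    """Keep ground-truth centroids at least `margin` pixels from the mask."""
    return [p for p in points
            if dist_to_sector(p[0], p[1], cx, cy, mask_angle) >= margin]


pts = [(10, 10), (50, -10), (50, -20), (-100, -100), (-10, 50)]
print(usable_ground_truth(pts, 0.0, 0.0, math.pi / 2))
```

With a quarter-circle mask, the example drops the point inside the sector and the two points within 15 pixels of its edges, and keeps the rest for the accuracy computation.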

      (5.2) The circular masking is perhaps not the most appropriate for the mouse data, which is collected in a rectangular arena.

      We wanted to show the same proof of concept in different videos. For that reason, in all of them we covered a part of the arena parametrized by an angle. In the rectangular arena, the circular masking uses an external circle, so it also covers a part of the rectangle parametrized by an angle.

      (5.3) The number of co-existing fragments, which seems to be the main determinant of performance that the authors derive from this experiment, should be reported for these experiments. In particular, a "number of co-existing fragments" vs accuracy plot would support the use of the 0.25(N-1) heuristic and would be especially informative for users seeking to optimize experimental and cage design. Additionally, the number of co-existing fragments can be artificially reduced in other ways other than a fixed occlusion, including random dropout, which would disambiguate it from potential allocentric positional confounds (particularly relevant in arenas where egocentric pose is correlated with allocentric position).

      We included the requested analysis about the fragment connectivity in Figure 3-figure supplement 1. We agree that there can be additional ways of reducing co-existing fragments, but we think the occlusion tests have the additional value that there are many real experiments similar to this test.

      (6) Robustness to imaging conditions. The authors state that "the new idtracker.ai can work well with lower resolutions, blur and video compression, and with inhomogeneous light (Figure 2 - figure supplement 4)." (L156). Despite this claim, there are no speed or accuracy results reported for the artificially corrupted data, only examples of these image manipulations in the supplementary figure.

      We added this information in the same image, new Figure 1 - figure supplement 3.

      (7) Robustness across longitudinal or multi-session experiments. The authors reference idmatcher.ai as a compatible tool for this use case (matching identities across sessions or long-term monitoring across chunked videos), however, no performance data is presented to support its usage. This is relevant as the innovations described here may interact with this setting. While deep metric learning and contrastive learning for ReID were originally motivated by these types of problems (especially individuals leaving and entering the FOV), it is not clear that the current formulation is ideally suited for this use case. Namely, the design decisions described in point 1 of this review are at times at odds with the idea of learning generalizable representations owing to the feature extractor backbone (less scalable), low-dimensional embedding size (less representational capacity), and Euclidean distance metric without hypersphere embedding (possible sensitivity to drift). It's possible that data to support point 6 can mitigate these concerns through empirical results on variations in illumination, but a stronger experiment would be to artificially split up a longer video into shorter segments and evaluate how generalizable and stable the representations learned in one segment are across contiguous ("longitudinal") or discontiguous ("multi-session") segments.

      We have now added a test to prove the reliability of idmatcher.ai in v6. In this test, 14 videos are taken from the benchmark and split into two non-overlapping parts (with a 200-frame gap in between). idmatcher.ai is run between the two parts, achieving 100% identity-matching accuracy across all of them (see section “Validity of idmatcher.ai in the new idtracker.ai”, lines 969-1008).

      We thank the reviewer for the detailed suggestions. We believe we have taken all of them into consideration to improve the ms.

      Reviewer #3 (Public review):

      Summary

      The authors propose a new version of idTracker.ai for animal tracking. Specifically, they apply contrastive learning to embed cropped images of animals into a feature space where clusters correspond to individual animal identities.

      Strengths

      By doing this, the new software alleviates the requirement for so-called global fragments - segments of the video, in which all entities are visible/detected at the same time - which was necessary in the previous version of the method. In general, the new method reduces the tracking time compared to the previous versions, while also increasing the average accuracy of assigning the identity labels.

      Weaknesses

      The general impression of the paper is that, in its current form, it is difficult to disentangle the old from the new method and understand the method in detail. The manuscript would benefit from a major reorganization and rewriting of its parts. There are also certain concerns about the accuracy metric and reducing the computational time.

      We have made the following modifications in the presentation:

      (1) We have added section titles to the main text so it is clearer what tracking system we are referring to. For example, we now have sections “Limitation of the original idtracker.ai”, “Optimizing idtracker.ai without changes in the learning method” and “The new idtracker.ai uses representation learning”.

      (2) We have completely rewritten all the text of the ms until we start with contrastive learning. Old L20-89 is now L20-L66, much shorter and easier to read.

      (3) We have rewritten the first 3 paragraphs in the section “The new idtracker.ai uses representation learning” (lines 68-92).

      (4) We have now expanded Appendix 3 to discuss the details of our approach (lines 539-897). It discusses in detail the steps of the algorithm, the network architecture, the loss function, the sampling strategy, the clustering and identity assignment, and the stopping criteria in training.

      (5) To cite previous work in detail and explain what we do differently, we have now added in Appendix 3 the new section “Differences with previous work in contrastive/metric learning” (lines 792-841).

      Regarding accuracy metrics, we have replaced our accuracy metric with the standard metric IDF1. IDF1 is the standard metric that is applied to systems in which the goal is to maintain consistent identities across time. See also the section in Appendix 1 "Computation of tracking accuracy” (lines 414-436) explaining IDF1 and why this is an appropriate metric for our goal.

      Using IDF1 we obtain slightly higher accuracies for the idtracker.ai systems. This is the comparison of mean accuracy over our whole benchmark between our previous accuracy score and the new one, for the full trajectories:

      v4:   97.42% -> 98.24%

      v5:   99.41% -> 99.49%

      v6:   99.74% -> 99.82%

      trex: 97.89% -> 97.89%
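
For readers unfamiliar with the metric, IDF1 = 2·IDTP / (2·IDTP + IDFP + IDFN), where the identity true positives are counted under the one-to-one mapping between true and predicted identities that maximizes agreement. The sketch below is a simplified illustration, not the authors' implementation: it assumes each detection has already been associated with a ground-truth position, so only identity labels are compared, and it finds the mapping by brute force, which is only practical for small groups (an assignment solver such as the Hungarian algorithm would normally be used).

```python
from itertools import permutations


def idf1(gt_ids, pred_ids):
    """IDF1 for per-detection identity labels.

    gt_ids[i] and pred_ids[i] are the true and predicted identity of
    detection i; None marks a missed ground truth or a spurious prediction.
    """
    gt_set = sorted({g for g in gt_ids if g is not None})
    pred_set = sorted({p for p in pred_ids if p is not None})
    # Count how often each (true, predicted) identity pair co-occurs.
    overlap = {}
    for g, p in zip(gt_ids, pred_ids):
        if g is not None and p is not None:
            overlap[(g, p)] = overlap.get((g, p), 0) + 1
    # Pad the shorter side so identities may also be left unmatched.
    size = max(len(gt_set), len(pred_set))
    gt_pad = gt_set + [None] * (size - len(gt_set))
    pred_pad = pred_set + [None] * (size - len(pred_set))
    # Brute-force the one-to-one identity mapping that maximizes IDTP.
    idtp = max(
        sum(overlap.get((g, p), 0) for g, p in zip(gt_pad, perm))
        for perm in permutations(pred_pad)
    )
    idfn = sum(g is not None for g in gt_ids) - idtp
    idfp = sum(p is not None for p in pred_ids) - idtp
    return 2 * idtp / (2 * idtp + idfp + idfn)


gt = [1, 1, 1, 2, 2, 2]
pred = ["a", "a", "b", "b", "b", "b"]   # one detection of animal 1 mislabeled
print(idf1(gt, pred))
```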

      We thank the reviewer for the suggestions about presentation and about the use of more standard metrics.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      (1) Figure 1a: A graphical legend inset would make it more readable since there are multiple colors, line styles, and connecting lines to parse out.

      Following this recommendation, we added a graphical legend in the old Figure 1 (new Figure 2).

      (2) L46: "have images" → "has images".

      We applied this correction. Line 35.

      (3) L52: "videos start with a letter for the species (z,**f**,m)", but "d" is used for fly videos.

      We applied this correction in the caption of Figure 1.

      (4) L62: "with Protocol 3 a two-step process" → "with Protocol 3 being a two-step process".

      We rewrote this paragraph without mentioning Protocol 3, lines 37-41.

      (5) L82-89: This is the main statement of the problems that are being addressed here (speed and relaxing the need for global fragments). This could be moved up, emphasized, and made clearer without the long preamble and results on the engineering optimizations in v5. This lack of linearity in the narrative is also evident in the fact that after Figure 1a is cited, inline citations skip to Figure 2 before returning to Figure 1 once the contrastive learning is introduced.

      We have rewritten all the text up to the contrastive learning section (old lines 20-89 are now lines 20-66). The text is shorter, more linear and easier to read.

      (6) L114: "pairs until the distance D_{pos}" → "pairs until the distance approximates D_{pos}".

      We rewrote this as “pairs until the distance 𝐷pos (or 𝐷neg) is reached” in line 107.

      (7) L570: Missing a right parenthesis in the equation.

      We no longer have this equation in the ms.

      (8) L705: "In order to identify fragments we, not only need" → "In order to identify fragments, we not only need".

      We applied this correction, Line 775.

      (9) L819: "probably distribution" → "probability distribution".

      We applied this correction, Line 776.

      (10) L833: "produced the best decrease the time required" → "produced the best decrease of the time required".

      We applied this correction, Line 746.

      Reviewer #3 (Recommendations for the authors):

      (1) We recommend rewriting and restructuring the manuscript. The paper includes a detailed explanation of the previous approaches (idTracker and idTracker.ai) and their limitations. In contrast, the description of the proposed method is short and unstructured, which makes it difficult to distinguish between the old and new methods as well as to understand the proposed method in general. Here are a few examples illustrating the problem. 

      (1.1) Only in line 90 do the authors start to describe the work done in this manuscript. The previous 3 pages list limitations of the original method.

      We have now divided the main text into sections, so it is clearer what is the previous method (“Limitation of the original idtracker.ai”, lines 28-51), the new optimization we did of this method (“Optimizing idtracker.ai without changes in the learning method”, lines 52-66) and the new contrastive approach that also includes the optimizations (“The new idtracker.ai uses representation learning”, lines 66-164). Also, the new text has now been streamlined up to the contrastive section, following your suggestion. In the new writing, the three sections are 25, 15 and 99 lines long. The most detailed section is the one on the new system; the other two are needed as reference, to describe which problem we are solving and the extra new optimizations.

      (1.2) The new method does not have a distinct name, and it is hard to follow which idtracker.ai is a specific part of the text referring to. Not naming the new method makes it difficult to understand.

      We use the name new idtracker.ai (v6) so it becomes the current default version. v5 is now obsolete, as well as v4. And from the point of view of the end user, no new name is needed since v6 is just an evolution of the same software they have been using. Also, we added sections in the main text to clarify the ideas in there and indicate the version of idtracker.ai we are referring to.

      (1.3) There are "Protocol 2" and "Protocol 3" mixed with various versions of the software scattered throughout the text, which makes it hard to follow. There should be some systematic naming of approaches and a listing of results introduced.

      Following this recommendation, we no longer talk about the specific protocols of the old version of idtracker.ai in the main text. We have rewritten the explanation of these versions in a clearer and more straightforward way, lines 29-36.

      (2) To this end, the authors leave some important concepts either underexplained or only referenced indirectly via prior work. For example, the explanation of how the fragments are created (line 15) is only explained by the "video structure" and the algorithm that is responsible for resolving the identities during crossings is not detailed (see lines 46-47, 149-150). Including summaries of these elements would improve the paper's clarity and accessibility.

      We listed the specific sections from our previous publication where the reader can find information about the entire tracking pipeline (lines 539-549). This way, we keep the ms clear and focused on the new identification algorithm while indicating where to find such information.

      (3) Accuracy metrics are not clear. In line 319, the authors define it as based on "proportion of errors in the trajectory". This proportion is not explained. How is the error calculated if a trajectory is lost or there are identity swaps? Multi-object tracking has a range of accuracy metrics that account for such events but none of those are used by the authors. Estimating metrics that are common for MOT literature, for example, IDF1, MOTA, and MOTP, would allow for better method performance understanding and comparison.

      In the new ms, we replaced our accuracy metric with the standard metric IDF1. IDF1 is the standard metric that is applied to systems in which the goal is to maintain consistent identities across time. See also the section in Appendix 1 "Computation of tracking accuracy” explaining why IDF1 and not MOTA or MOTP is the adequate metric for a system that wants to give correct tracking by identification in time. See lines 416-436.

      Using IDF1 we obtain slightly higher accuracies for the idtracker.ai systems. This is the comparison of mean accuracy between our previous accuracy score and the new one, for the full trajectories:

      v4:   97.42% -> 98.24%

      v5:   99.41% -> 99.49%

      v6:   99.74% -> 99.82%

      trex: 97.89% -> 97.89%

      (4) Additionally, the authors distinguish between tracking with and without crossings, but do not provide statistics on the frequency of crossings per video. It is also unclear how the crossings are considered for the final output. Including information such as the frame rate of the videos would help to better understand the temporal resolution and the differences between consecutive frames of the videos.

      We added this information in Appendix 1, “Benchmark of accuracy and tracking time”, lines 445-451. The frame rate in our benchmark videos ranges from 25 to 60 fps (average of 37 fps). On average, 2.6% of the blobs are crossings (1.1% for zebrafish, 0.7% for drosophila, 9.4% for mice).

      (5) In the description of the dataset used for evaluation (lines 349-365), the authors describe the random sampling of parameter values for each tracking run. However, it is unclear whether the same values were used across methods. Without this clarification, comparisons between the proposed method, older versions, and TRex might be biased due to lucky parameter combinations. In addition, the ranges from which the values were randomly sampled were also not described.

      Only one parameter is shared between idtracker.ai and TRex: intensity_threshold (in idtracker.ai) and threshold (in TRex). Both are conceptually equivalent but differ in their numerical values since they affect different algorithms. V4, v5, and TRex each required the same process of independent expert visual inspection of the segmentation to select the valid value range. Since versions 5 and 6 use exactly the same segmentation algorithm, they share the same parameter ranges.

      All the ranges of valid values used in our benchmark are public here https://drive.google.com/drive/folders/1tFxdtFUudl02ICS99vYKrZLeF28TiYpZ as stated in the section “Data availability”, lines 227-228.

      (6) Lines 122-123, Figure 1c. "batches" - is an imprecise metric of training time as there is no information about the batch size.

      We clarified the Figure caption, new Figure 2c.

      (7) Line 145 - "we run some steps... For example..." leaves the method description somewhat unclear. It would help if you could provide more details about how the assignments are carried out and which metrics are being used.

      Following this recommendation, we listed the specific sections from our previous publication where the reader can find information about the entire tracking pipeline (lines 539-549). This way, we keep the ms clear and focused on the new identification algorithm while indicating where to find such information.

      (8) Figure 3. How is tracking accuracy assessed with occlusions? Are the individuals correctly recognized when they reappear from the occluded area?

      The groundtruth for this video contains the positions of all animals at all times. Only the groundtruth points inside the region of interest are taken into account when computing the accuracy. When the tracking reaches high accuracy, it means that animals are successfully relabeled every time they enter the non-masked region. Note that this software works all the time by identification of animals, so crossings and occlusion are treated the same way. What is new here is that the occlusions are so large that there are no global fragments. We clarified this in the new section “Occlusion tests” in Methods, lines 239-251.

      (9) Lines 185-187 this part of the sentence is not clear.

      We rewrote this part in a clearer way, lines 180-182.

      (10) The authors also highlight the improved runtime performance. However, they do not provide a detailed breakdown of the time spent on each component of the tracking/training pipeline. A timing breakdown would help to compare the training duration with the other components. For example, the calculation of the Silhouette Score alone can be time-consuming and could be a bottleneck in the training process. Including this information would provide a clearer picture of the overall efficiency of the method.

      We measured that the training of ResNet takes on average 47% of the tracking time in our benchmark (we added this information in line 551, section “Network Architecture”). In this training stage, the bottleneck is the network forward and backward pass, limited by GPU performance. All other processes happening during training have been deeply optimized and parallelized when needed, so their contribution to the training time is minimal. Apart from the training, we also measured that 24.4% of the total tracking time is spent reading and segmenting the video files and 11.1% processing the identification images and detecting crossings.
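
As a quick check on the breakdown above, the short calculation below sums the three reported shares and derives the share that is left for all other pipeline steps. It assumes the three percentages refer to the same total tracking time and do not overlap; the labels are shortened paraphrases of the response.

```python
shares = {
    "network training": 47.0,
    "reading and segmenting video": 24.4,
    "identification images and crossing detection": 11.1,
}
accounted = sum(shares.values())
remainder = round(100.0 - accounted, 1)
print(f"accounted: {accounted:.1f}%, remainder: {remainder:.1f}%")
```

Under these assumptions, roughly a sixth of the tracking time is spent in steps not itemized in the response.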

      (11) An important part of the computational cost is related to model training. It would be interesting to test whether a model trained on one video of a specific animal type (e.g., zebrafish_5) generalizes to another video of the same type (e.g., zebrafish_7). This would assess the model's generalizability across different videos of the same species and spare a lot of compute. Alternatively, instead of training a model from scratch for each video, the authors could also consider training a base model on a superset of images from different videos and then fine-tuning it with a lower learning rate for each specific video. This could potentially save time and resources while still achieving good performance.

      Even before v6, the user could start training the identification network by copying the final weights from another tracking session. This knowledge transfer feature is still present in v6 and it still decreases the training times significantly. This information has been added in Appendix 4, lines 906-909.

      We have already begun working on the interesting idea of a general base model but it brings some complex challenges. It could be a very useful new feature for future idtracker.ai releases.

      We thank the reviewer for the many suggestions. We have implemented all of them.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #2 (Public review):

      (1) Vglut2 isn't a very selective promoter for the STN. Did the authors verify every injection across brain slices to ensure the para-subthalamic nucleus, thalamus, lateral hypothalamus, and other Vglut2-positive structures were never infected?

      The STN is anatomically well-confined, with its borders and the overlying zona incerta (composed of GABAergic neurons) providing protection against off-target expression in most neighboring forebrain regions. All viral injections were histologically verified and did not extend into thalamic or hypothalamic areas. As described in the Methods, we employed an app we developed (Brain Atlas Analyzer, available on OriginLab) that aligns serial histological sections with the Allen Brain Atlas to precisely assess viral spread and confirm targeting accuracy. The experiments included in the revised manuscript now focus on optogenetic inhibition and irreversible lesion approaches, three complementary methods that consistently targeted the STN and yielded similar behavioral effects.

      (2) The authors say in the methods that the high vs low power laser activation for optogenetic experiments was defined by the behavioral output. This is misleading, and the high vs low power should be objectively stated and the behavioral results divided according to the power used, not according to the behavioral outcome.

      Optogenetic excitation is no longer part of the study.

      (3) In the fiber photometry experiments exposing mice to the range of tones, it is impossible to separate the STN response to the tone from the STN response to the movement evoked by the tone. The authors should expose the mouse to the tones in a condition that prevents movement, such as anesthetized or restrained, to separate out the two components.

      The new mixed-effects modeling approach clearly differentiates sensory (auditory) from motor contributions during tone-evoked STN activation. In prior work (see Hormigo et al, 2023, eLife), we explored experimental methods such as head restraint or anesthesia to reduce movement, but we concluded that these approaches are unsuitable for addressing this question. Mice exhibit substantial residual movement even when head-fixed, and anesthesia profoundly alters neural excitability and behavioral state, introducing major confounds. To fully eliminate movement would require paralysis and artificial ventilation, which would again disrupt physiological network dynamics and raise ethical concerns. Therefore, the current modeling approach—incorporating window-specific covariates for movement—is the most appropriate and rigorous way to dissociate tone-evoked sensory activity from motor activity in behaving animals.
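
      As an illustration only, the logic of including a movement covariate can be sketched with a much simpler stand-in for the mixed-effects models described above. The sketch removes each mouse's mean (a fixed-effects approximation to a random intercept) and then fits ordinary least squares with a tone term and a head-speed term. It is not the authors' model or code; the trial values, variable names, and effect sizes are hypothetical and synthetic.

```python
from collections import defaultdict


def center_within_groups(values, groups):
    """Subtract each group's mean, removing between-mouse offsets."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def fit_two_predictors(y, x1, x2):
    """Least-squares slopes for y ~ b1*x1 + b2*x2 (inputs already centered)."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("predictors are collinear")
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


# Hypothetical trials: (mouse id, tone present 0/1, head speed in the tone window).
trials = [
    ("m1", 0, 1.0), ("m1", 1, 3.0), ("m1", 0, 2.0), ("m1", 1, 2.5),
    ("m2", 0, 0.5), ("m2", 1, 4.0), ("m2", 0, 1.5), ("m2", 1, 1.0),
]

# Synthetic dF/F built from known (assumed) effects plus a per-mouse offset.
TRUE_TONE, TRUE_SPEED = 0.5, 0.2
mouse_offset = {"m1": 1.0, "m2": -0.5}
mice = [m for m, _, _ in trials]
tone = [float(t) for _, t, _ in trials]
speed = [s for _, _, s in trials]
dff = [mouse_offset[m] + TRUE_TONE * t + TRUE_SPEED * s for m, t, s in trials]

# Naive tone effect: difference of means, ignoring movement and mouse identity.
tone_on = [d for d, t in zip(dff, tone) if t == 1.0]
tone_off = [d for d, t in zip(dff, tone) if t == 0.0]
naive_effect = sum(tone_on) / len(tone_on) - sum(tone_off) / len(tone_off)

# Tone effect after accounting for head speed and per-mouse offsets.
b_tone, b_speed = fit_two_predictors(
    center_within_groups(dff, mice),
    center_within_groups(tone, mice),
    center_within_groups(speed, mice),
)
print("naive tone effect:", naive_effect)
print("adjusted tone effect:", b_tone, "speed effect:", b_speed)
```

      In this toy data the animals move faster on tone trials, so the naive difference of means overstates the tone effect, while the adjusted fit attributes part of the signal to head speed. A full mixed-effects analysis would additionally model sessions and mice as random effects.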

      (4) The claim 'STN activation is ideally suited to drive active avoids' needs more explanation. This claim comes after the fiber photometry experiments during active avoidance tasks, so there has been no causality established yet.

      Text adjusted. 

      (5) The statistical comparisons in Figure 7E need some justification and/or clarification. The 9 neuron types are originally categorized based on their response during avoids, then statistics are run showing that they respond differently during avoids. It is no surprise that they would have significantly different responses, since that is how they were classified in the first place. The authors must explain this further and show that this is not a case of circular reasoning.

      Statistically verifying the clustering is useful to ensure that the selected number of clusters reflects distinct classes. It is also necessary when different measurements are used to classify (movement time series classified the avoids) and to compare neuronal types within each avoid mode/class (now called “mode”). Moreover, the new modeling approach goes beyond the prior statistical limitations related to considering movement and neuronal variables separately.

      (6) The authors show that neurons that have strong responses to orientation show reduced activity during avoidance. What are the implications of this? The author should explain why this is interesting and important.

      The new modeling approach goes beyond the prior analysis limitations. For instance, it shows that most of the prior orienting related activations closely reflect the orienting movement, and only in a few cases (noted and discussed in the results) orienting activations are related to the behavioral contingencies or behavioral outcomes in the task. 

      (7) It is not clear which conditions each mouse experienced in which order. This is critical to the interpretation of Figure 9 and the reduction of passive avoids during STN stimulation. Did these mice have the CS1+STN stimulation pairing or the STN+US pairing prior to this experiment? If they did, the stimulation of the STN could be strongly associated with either punishment or with the CS1 that predicts punishment. If that is the case, stimulating the STN during CS2 could be like presenting CS1+CS2 at the same time and could be confusing.

      Optogenetic excitation is no longer part of the study. 

      (8) The experiments in Figure 10 are used to say that STN stimulation is not aversive, but they only show that STN stimulation cannot be used as punishment in place of a shock. This doesn't mean that it is not aversive; it just means it is not as aversive as a shock. The authors should do a simpler aversion test, such as conditioned or real-time place preference, to claim that STN stimulation is not aversive. This is particularly surprising as previous work (Serra et al., 2023) does show that STN stimulation is aversive.

      Optogenetic excitation is no longer part of the study.

      (9) In the discussion, the idea that the STN encodes 'moving away' from contralateral space is pretty vague and unsupported. It is puzzling that the STN activates more strongly to contraversive turns, but when stimulated, it evokes ipsiversive turns; however, it seems a stretch to speculate that this is related to avoidance. In the last experiments of the paper, the axons from the STN to the GPe and to the midbrain are selectively stimulated. Do these evoke ipsiversive turns similarly?

      Optogenetic excitation is no longer part of the study. 

      (10) In the discussion, the authors claim that the STN is essential for modulating action timing in response to demands, but their data really only show this in one direction. The STN stimulation reliably increases the speed of response in all conditions (except maximum speed conditions such as escapes). It seems to be over-interpreting the data to say this is an inability to modulate the speed of the task, especially as clear learning and speed modulation do occur under STN lesion conditions, as shown in Figure 12B. The mice learn to avoid and increase their latency in AA2 vs AA1, though the overall avoids and latency are different from controls. The more parsimonious conclusion would be that STN stimulation biases movement speed (increasing it) and that this is true in many different conditions.

      Optogenetic excitation is no longer part of the study.

      (11)  In the discussion, the authors claim that the STN projections to the midbrain tegmentum directly affect the active avoidance behavior, while the STN projections to the SNr do not affect it. This seems counter to their results, which show STN projections to either area can alter active avoidance behavior. What is the laser power used in these terminal experiments? If it is high (3mW), the authors may be causing antidromic action potentials in the STN somas, resulting in glutamate release in many brain areas, even when terminals are only stimulated in one area. The authors could use low (0.25mW) laser power in the terminals to reduce the chance of antidromic activation and spatially restrict the optical stimulation.

      Optogenetic excitation is no longer part of the study. 

      (12) Was normality tested for data prior to statistical testing?

      Yes, although we now use mixed models.

      (13) Why are there no error bars on Figure 5B, black circles and orange triangles?

      When error bars are not visible, they are smaller than the trace thickness or bar line—for example, in Figure 5B, the black circles and orange triangles include error bars, but they are smaller than the symbol size.

      Reviewer #3 (Public review):

      (1) I really don't understand or accept this idea that delayed movement is necessarily indicative of cautious movements. Is the distribution of responses multi-modal in a way that might support this idea, or do the authors simply take a normal distribution and assert that the slower responses represent 'caution'? Even if responses are multi-modal and clearly distinguished by 'type', why should readers think that delayed responses imply cautious responding instead of, say, habituation or sensitization to cue/shock; variability in attention, motivation, or stress; or merely uncertainty, which seems plausible given what I understand of the task design where the same mice are repeatedly tested in changing conditions? This relates to a major claim (i.e., in the work's title).

      In our study, “caution” is defined operationally as the tendency to delay initiation of an avoidance response in demanding situations (e.g., taking more time or care before crossing a busy street). The increase in avoidance latency with task difficulty is highly robust, as we have shown previously through detailed analyses of timing distributions and direct comparisons with appetitive behaviors (e.g., Zhou et al., 2022 JNeurosci). Moreover, we used the tracked movement time series to statistically classify responses into cautious modes, which is likely novel. This definition can dissociate cautious responding from broader constructs listed by a reviewer, such as attention, motivation, or stress, which must be explicitly defined to be rigorously considered in this context, including the likelihood that they covary with caution without being equivalent to it. 

      Cue-evoked orienting responses at CS onset are directly measured, and their habituation and sensitization have been characterized in our prior work (e.g., Zhou et al., 2023 JNeurosci). US-evoked escapes are also measured in the present study and directly compared with avoidance responses. Together, these analyses provide a rigorous and consistent framework for defining and quantifying caution within our behavioral procedures.

      Importantly, mice exhibit cautious responding as defined here across different tasks, making it more informative to classify avoidance responses by behavioral mode rather than by task alone. Accordingly, in the miniscope, single-neuron, and mixed-effects model analyses, we classified active avoids into distinct modes reflecting varying levels of caution. Although these modes covary with task contingencies, their explicit classification improves model predictability and interpretability with respect to cautious responding.

      (2) Related to the last, I'm struggling to understand the rationale for dividing cells into 'types' based on their physiological responses in some experiments (e.g., Figure 7).

      This section has now been expanded into 3 figures (Fig. 7-9) with new modeling approaches that should make the rationale more straightforward.

      By emphasizing the mixed-effects modeling results and integrating these analyses directly into the figures, the revised manuscript now more clearly delineates what is encoded at the population and single-neuron levels. Including movement and baseline covariates allowed us to dissociate motor-related modulation from other neural signals, substantially clarifying the distinction between movement encoding and other task-related variables, which we focus on in the paper. These analyses confirm the strong role of the STN in representing movement while revealing additional signals related to aversive stimulation and cautious responding that persist after accounting for motor effects. These signals arise from distinct neuronal populations that can be differentiated by their movement sensitivity and activation patterns across avoidance modes, reflecting varying levels of caution. At the same time, several effects that initially reflected orienting-related activity at CS-onset (note that our movement tracking captures both head position and orientation as a directional vector) dissipated once movement and baseline covariates were included in the models, emphasizing the utility of the analytical improvements in the revision.

      (3) The description and discussion of orienting head movements were not well supported, but were much discussed in the avoidance datasets. The initial speed peaks to cue seem to be the supporting data upon which these claims rest, but nothing here suggests head movement or orientation responses.

      As described in the methods (and noted above), we track the head and decompose the movement into rotational and translational components. With the new approach, several effects that initially reflected orienting-related activity at CS-onset (note that our movement tracking captures both head position and orientation as a directional vector) dissipated once movement and baseline covariates were included in the models, emphasizing the utility of the analytical improvements in the revision.
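
      For readers unfamiliar with this kind of decomposition, a minimal sketch follows. It assumes tracking yields a head position (x, y) and a heading angle per video frame at a constant frame rate; the function name, units, and sign convention are hypothetical and are not taken from the authors' Methods.

```python
import math


def decompose_head_movement(xs, ys, headings, fps):
    """Return per-frame translational speed and angular velocity.

    xs, ys: head position per frame (arbitrary length units)
    headings: head direction per frame, in radians
    fps: frames per second (assumed constant)
    """
    translational, rotational = [], []
    for i in range(1, len(xs)):
        # Translational component: displacement of the head position.
        dist = math.hypot(xs[i] - xs[i - 1], ys[i] - ys[i - 1])
        translational.append(dist * fps)
        # Rotational component: heading change, wrapped to [-pi, pi)
        # so that crossing the +/-pi boundary is not read as a full turn.
        dtheta = headings[i] - headings[i - 1]
        dtheta = (dtheta + math.pi) % (2 * math.pi) - math.pi
        rotational.append(dtheta * fps)
    return translational, rotational


# Frame 0 -> 1: pure translation; frame 1 -> 2: pure rotation (quarter turn).
trans, rot = decompose_head_movement(
    xs=[0.0, 3.0, 3.0],
    ys=[0.0, 4.0, 4.0],
    headings=[0.0, 0.0, math.pi / 2],
    fps=10.0,
)
print(trans, rot)
```

      Keeping the two components separate is what allows an orienting turn at CS onset (mostly rotational) to be entered as a covariate distinct from locomotion toward the safe compartment (mostly translational).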

      (4) Similar to the last, the authors note in several places, including abstract, the importance of STN in response timing, i.e., particularly when there must be careful or precise timing, but I don't think their data or task design provides a strong basis for this claim.

      The avoidance modes and the measured latencies directly support the relation to action timing. The portion of the previous manuscript on optogenetic excitation, apparently the main source of this criticism, is no longer part of the present study.

      (5) I think that other reports show that STN calcium activity is recruited by inescapable foot shock as well. What do these authors see? Is shock, independent of movement, contributing to sharp signals during escapes?

      The question, “Is shock, independent of movement, contributing to sharp signals during escapes?” is now directly addressed in the revised analyses. By incorporating movement and baseline covariates into the mixed-effects models, we dissociate STN activity related to aversive stimulation from that associated with motor output. The results show that shock-evoked STN activation persists even after controlling for movement within defined neuronal populations, supporting a specific nociceptive contribution independent of motor dynamics—a dissociation that appears to be new in this field.

      (6) In particular, and related to the last point, the following work is very relevant and should be cited:  Note that the focus of this other paper is on a subset of VGLUT2+ Tac1 neurons in paraSTN, but using VGLUT2-Cre to target STN will target both STN and paraSTN.

      We appreciate the reviewer’s reference to the recent preprint highlighting the role of the para-subthalamic nucleus in avoidance learning. However, our study focused specifically on performance in well-trained mice rather than on learning processes. Behavioral learning is inherently more variable and can be disrupted by less specific manipulations, whereas our experiments targeted the stable execution of learned avoidance behaviors. Future work will extend these findings to the learning phase and examine potential contributions of subthalamic subdivisions, which our current Vglut2-based manipulations do not dissociate. We will consider this and related work more closely in those studies.

      (7) In multiple other instances, claims that were more tangential to the main claims were made without clearly supporting data or statistics. E.g., claim that STN activation is related to translational more than rotational movement; claim that GCaMP and movement responses to auditory cues were small; claims that 'some animals' responded differently without showing individual data.

      We have adjusted the text accordingly.

      (8) In several figures, the number of subjects used was not described. This is necessary. Also necessary is some assessment of the variability across subjects. The only measure of error shown in many figures relates to trial-to-trial or event variability, which is minimal because, in many cases, it appears that hundreds of trials may have been averaged per animal, but this doesn't provide a strong view of biological variability. When bar/line plots are used to display data, I recommend showing individual animals where feasible.

      All experiments report number of mice and sessions. Wherever feasible, we display individual data points (e.g., Figures 1 and 2) to convey variability directly. However, in cases where figures depict hundreds of paired (repeated-measures) data points, showing all points without connecting them would not be appropriate, while linking them would make the figures visually cluttered and uninterpretable. All plots and traces include measures of variability (SEM), and the raw data will be shared on Dryad. When error bars are not visible, they are smaller than the trace thickness or bar line—for example, in Figure 5B, the black circles and orange triangles include error bars, but they are smaller than the symbol size.

      Also, to minimize visual clutter, only a subset of relevant comparisons is highlighted with asterisks, whereas all relevant statistical results, comparisons, and mouse/session numbers are fully reported in the Results section, with statistical analyses accounting for the clustering of data within subjects and sessions.

      (9) Can the authors consider the extent to which calcium imaging may be better suited to identify increases compared to decreases and how this may affect the results, particularly related to the GRIN data when similar numbers of cells show responses in both directions (e.g., Figure 3)?

      This is an interesting issue related to a widely used technique beyond the scope of our study.

      (10) Raw example traces are not provided.

      We do not think raw traces are useful here. All figures contain average traces to reflect the activity of the estimated population.

      (11) The timeline of the spontaneous movement and avoidance sessions was not clear, nor was the number of events or sessions per animal nor how this was set. It is not clear if there was pre-training or habituation, if many or variable sessions were combined per animal, or what the time gaps between sessions were, or if or how any of these parameters might influence interpretation of the results.

      We have expanded the description of the sessions, including the number of animals and sessions. Sessions were run daily, and each animal within a group of experiments completed the same number of sessions. As noted, sessions are included among the random effects in the model.

      (12) It is not clear if or how the spread of expression outside of the target STN was evaluated, and if or how many mice were excluded due to spread or fiber placements.

      The STN is anatomically well-confined, with its borders and the overlying zona incerta (composed of GABAergic neurons) providing protection against off-target expression in most neighboring forebrain regions. All viral injections were histologically verified and did not extend into thalamic or hypothalamic areas. As described in the Methods, we employed an app we developed (Brain Atlas Analyzer, available on OriginLab) that aligns serial histological sections with the Allen Brain Atlas to precisely assess viral spread and confirm targeting accuracy. The experiments included in the revised manuscript now focus on optogenetic inhibition and irreversible lesion approaches, three complementary methods that consistently targeted the STN and yielded similar behavioral effects.

      Recommendations for the authors:

      Reviewing Editor Comments:

      The primary feedback agreed upon by all the reviewers was that the manuscript requires significant streamlining as it is currently overly long and convoluted.

      We thank the reviewers and editors for their thoughtful and constructive feedback. In response to the primary comment that “the manuscript requires significant streamlining as it is currently overly long and convoluted,” we have substantially revised and refocused the paper. Specifically, we streamlined the included data and enhanced the analyses to emphasize the central findings: the encoding of movement, cautious responding, and punishment in the STN during avoidance behavior. We also focused the causal component of the study by including only the loss-of-function experiments—both optogenetic inhibition and irreversible viral/electrolytic lesions—that establish the critical role of STN circuits in generating active avoidance. Together, these revisions enhance clarity, tighten the narrative focus, and align the manuscript more closely with the reviewers’ recommendations.

      Major revisions include the addition of mixed-effects modeling to dissociate the contributions of movement from other STN-encoded signals related to caution and punishment. This modeling approach allowed us to reveal that these components are statistically separable, demonstrating that movement, cautious responding, and aversive input are encoded by neuronal subsets. To streamline the manuscript and address reviewer concerns, we removed the optogenetic excitation experiments. As revised, the paper presents a more concise and cohesive narrative showing that STN neurons differentially encode movement, caution, and aversive stimuli, and that this circuitry is essential for generating active avoidance behavior.

      Many of the specific points raised by reviewers now fall outside the scope of the revised manuscript. This is primarily because the revised version omits data and analyses related to optogenetic excitation and associated control experiments. By removing these components, the paper now presents a streamlined and internally consistent dataset focused on how the STN encodes movement, cautious responding, and aversive outcomes during avoidance behavior, as well as on loss-of-function experiments demonstrating its necessity for generating active avoidance. Below, we address the points that remain relevant across reviews.

      Following extensive revisions, the current manuscript differs in several important ways from what the assessment describes:

      The description that the study “uses fiber photometry, implantable lenses, and optogenetics” is more accurately represented as using both fiber photometry and single-neuron calcium imaging with miniscopes, combined with optogenetic and irreversible lesion approaches.

      The phrase stating that “active but not passive avoidance depends in part on STN projections to substantia nigra” is better characterized as “STN projections to the midbrain,” since our data show that optogenetic inhibition of STN terminals in both the mesencephalic reticular tegmentum (MRT) and substantia nigra pars reticulata (SNr) produces equivalent effects, and thus these sites are combined in the study.

      Finally, the original concern that evidence for STN involvement in cautious responding or avoidance speed was incomplete no longer applies. The revised focus on encoding, through the inclusion of mixed-effects modeling, now dissociates movement-related, cautious, and aversive components of STN activity. By removing the optogenetic excitation data, we no longer claim that the STN controls caution but rather that it encodes cautious responding, alongside movement and punishment signals. Furthermore, loss-of-function experiments demonstrate that silencing STN output abolishes active avoidance entirely, supporting an essential role for the STN in generating goal-directed avoidance behavior—a behavioral domain that, unlike appetitive responding, is fundamentally defined by caution and the need to balance action timing under threat.

      Reviewer #2 (Recommendations for the authors):

      (1) Show individual data points on bar plots.

      Wherever feasible, we display individual data points (e.g., Figures 1 and 2) to convey variability directly. However, in cases where figures depict hundreds of paired (repeated-measures) data points, showing all points without connecting them would not be appropriate, while linking them would make the figures visually cluttered and uninterpretable. All plots and traces include measures of variability (SEM), and the raw data will be shared on Dryad. When error bars are not visible, they are smaller than the trace thickness or bar line; for example, in Figure 5B, the black circles and orange triangles include error bars, but they are smaller than the symbol size.

      Also, to minimize visual clutter, only a subset of relevant comparisons is highlighted with asterisks, whereas all relevant statistical results, comparisons, and mouse/session numbers are fully reported in the Results section, with statistical analyses accounting for the clustering of data within subjects and sessions.

      (2) The active avoidance experiments are confusing when they are introduced in the results section. More explanation of what paradigms were used and what each CS means at the time these are introduced would add clarity. For example, AA1, AA2, etc, are explained only with references to other papers, but a brief description of each protocol and a schematic figure would really help.

      The avoidance protocols (AA1–4) are now described briefly but clearly in the Results section (second paragraph of “STN neurons activate during goal-directed avoidance contingencies”) and in greater detail in the Methods section. As stated, these tasks were conducted sequentially, and mice underwent the same number of sessions per procedure, which are indicated. All relevant procedural information has been included in these sections. Mice underwent daily sessions and learnt these tasks within 1-2 sessions, progressing sequentially across tasks with an equal number of sessions per task (7 per task), and the resulting data were combined and clustered by mouse/session in the statistical models.

      (3) How do the Class 1, 2, 3 avoids relate to Class 1, 2, 3 neural types established in Figure 3? It seems like they are not related, and if that is the case, they should be named something different from each other to avoid confusion. (4) Similarly, having 3 different cell types (a,b,c) in the active avoidance seems unrelated to the original classification of cell types (1,2,3), and these are different for each class of avoid. This is very confusing, and it is unclear how any of these types relate to each other. Presumably, the same mouse has all three classes of avoids, so there are recordings from each cell during each type of avoid.

      The terms class, mode, and type are now clearly distinguished throughout the manuscript. Modes refer to distinct patterns of avoidance behavior that differ in the level of cautious responding (Mode 3 is most cautious). Within each mode, types denote subgroups of neurons identified based on their ΔF/F activity profiles. In contrast, classes categorize neurons according to their relationship to movement, determined by cross-correlation analyses between ΔF/F and head speed (Class 1-4; Fig. 7 is a new analysis) or head turns (Class A-C, renamed from 1-3). This updated terminology clarifies the analytic structure, highlighting distinct neuronal populations within each analysis. For example, during avoidance behaviors, these classifications distinguish neurons encoding movement-, caution-, and outcome-related signals. Comparisons are conducted within each analytical set, within classes (A-C or 1-4 separately), within avoidance modes, or within mode-specific neuronal types.
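
      To make the cross-correlation step concrete, the sketch below computes a lagged Pearson correlation between a ΔF/F trace and a head-speed trace and reports the lag with the strongest relationship. It is an illustration under stated assumptions, not the authors' pipeline: the traces are synthetic, the function names are hypothetical, and no class boundaries are implied by the numbers.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [v - mx for v in x]
    dy = [v - my for v in y]
    sxx = sum(d * d for d in dx)
    syy = sum(d * d for d in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("constant signal has no defined correlation")
    return sum(p * q for p, q in zip(dx, dy)) / math.sqrt(sxx * syy)


def cross_correlation(dff, speed, max_lag):
    """Correlation of dff[t + lag] with speed[t] for lag in -max_lag..max_lag.

    A positive peak lag means the neural signal follows movement.
    """
    n = len(dff)
    out = {}
    for lag in range(-max_lag, max_lag + 1):
        idx = [t for t in range(n) if 0 <= t + lag < n]
        out[lag] = pearson([dff[t + lag] for t in idx], [speed[t] for t in idx])
    return out


def peak_lag(xcorr):
    """Lag with the largest absolute correlation, and that correlation."""
    lag = max(xcorr, key=lambda k: abs(xcorr[k]))
    return lag, xcorr[lag]


# Synthetic example: dF/F is the speed trace delayed by two samples.
speed = [0, 0, 1, 3, 1, 0, 0, 0, 0, 2, 4, 2, 0, 0, 0, 0, 0, 0, 0, 0]
dff = [0, 0] + speed[:-2]

xcorr = cross_correlation(dff, speed, max_lag=5)
best_lag, best_r = peak_lag(xcorr)
print("peak lag (samples):", best_lag, "correlation:", best_r)
```

      The sign and lag of the peak are the kind of summary that could then be used to group neurons by their relationship to movement; how those groups are delimited is specific to the manuscript's Methods and is not reproduced here.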

      …So the authors could compare one cell during each avoid and determine whether it relates to movement or sound, or something else. It is interesting that types a,b, and c have the exact same proportions in each class of avoid, and makes it important to investigate if these are the exact same cells or not.

      That previous table with the a,b,c % in the three figure panels was a placeholder, which was not updated in the included figure. It has now been correctly updated. They do not have the same proportions as shown in Fig. 9, although they are similar.

      Also, these mice could be recorded during the open field, so the original neural classification (class 1, 2, 3) could be applied to these same cells, and then the authors can see whether each cell type defined in the open field has a different response to the different avoid types. As it stands, the paper simply finds that during movement and during avoidance behaviors, different cells in the STN do different things.

      We included a new analysis in Fig. 7 that classifies neurons based on the cross-correlation with movement. The inclusion of the models now clearly assigns variance to movement versus the other factors, and this analysis leads to the classification based on avoid modes. 

      (5) The use of the same colors to mean two different things in Figure 9 is confusing. AA1 vs AA2 shouldn't be the same colors as light-naïve vs light signaling CS.

      Optogenetic excitation is no longer part of the study.

      (6) The exact timeline of the optogenetics experiments should be presented as a schematic for understanding. It is not clear which conditions each mouse experienced in which order. This is critical to the interpretation of Figure 9 and the reduction of passive avoids during STN stimulation. Did these mice have the CS1+STN stimulation pairing or the STN+US pairing prior to this experiment? If they did, the stimulation of the STN could be strongly associated with either punishment or with the CS1 that predicts punishment. If that is the case, stimulating the STN during CS2 could be like presenting CS1+CS2 at the same time and could be confusing. The authors should make it clear whether the mice were naïve during this passive avoid experiment or whether they had experienced STN stimulation paired with anything prior to this experiment.

      Optogenetic excitation is no longer part of the study.

      (20) Similarly, the duration of the STN stimulation should be made clear on the plots that show behavior over time (e.g., Figure 9E).

      Optogenetic excitation is no longer part of the study.

      (21) There is just so much data and so many conditions for each experiment here. The paper is dense and difficult to read. It would really benefit readability if the authors put only the key experiments and key figure panels in the main text and moved much of the repetitive figure panels to supplemental figures. The addition of schematic drawings for behavioral experiment timing and for the different AA1, AA2, and AA3 conditions would also really improve clarity.

      By focusing the study, we believe it has substantially improved clarity and readability. 

      Reviewer #3 (Recommendations for the authors):

      (1) Minor error in results 'Cre-AAV in the STN of Vglut2-Cre' Fixed.

      (2) In some Figure 2 panels, the peaks appear to be cut off, and blue traces are obscured by red.

      In Fig. 2, the peaks of movement (speed) traces are intentionally truncated to emphasize the rising phase of the turn, which would otherwise be obscured if the full y-axis range were displayed (peaks and other measures are statistically compared). This adjustment enhances clarity without omitting essential detail and is now noted in the legend.

    1. Sam Altman, the head of OpenAI, debuted his Sora app, which creates alarmingly realistic videos of fake scenes.

      Dowd introduces Sora as evidence that A.I. media can convincingly fabricate realistic videos, expanding the argument beyond Hollywood labor to the problem of misinformation and deception.

    1. During World War I he oversaw voluntary rationing as the head of the U.S. Food Administration and after the armistice served as the director-general of the American Relief Association in Europe.

      Hoover had experience helping people during crises, so Americans thought he could handle the Depression.

    1. Reviewer #1 (Public review):

      In this important study, the authors characterized the transformation of neural representations of olfactory stimuli from primary sensory cortex to multisensory regions in the medial temporal lobe and investigated how they were affected by non-associative learning. The authors used high-density silicon probe recordings from five different cortical regions while familiar vs. novel odors were presented to a head-restrained mouse. This is a timely study because unlike other sensory systems (e.g., vision), the progressive transformation of olfactory information is still poorly understood. The authors report that both odor identity and experience are encoded by all of these five cortical areas but nonetheless, some themes emerge. Single neuron tuning of odor identity is broad in the sensory cortices but becomes narrowly tuned in hippocampal regions. Furthermore, while experience affects neuronal response magnitudes in early sensory cortices, it changes the proportion of active neurons in hippocampal regions. Thus, this study is an important step forward in the ongoing quest to understand how olfactory information is progressively transformed along the olfactory pathway.

      The study is well-executed. The direct comparison of neuronal representations from five different brain regions is impressive. Conclusions are based on single neuronal level as well as population level decoding analyses. Among all the reported results, one stands out for being remarkably robust. The authors show that the anterior olfactory nucleus (AON), which receives direct input from the olfactory bulb output neurons, was far superior at decoding odor identity as well as novelty compared to all the other brain regions. This is perhaps surprising because the other primary sensory region - the piriform cortex - has been thought to be the canonical site for representing odor identity. A vast majority of studies have focused on aPCx, but direct comparisons between odor coding in the AON and aPCx are rare. The experimental design of this current study allowed the authors to do so and the AON was found to convincingly outperform aPCx. Although this result goes against the canonical model, it is consistent with a few recent studies including one that predicted this outcome based on anatomical and functional comparisons between the AON-projecting tufted cells vs. the aPCx-projecting mitral cells in the olfactory bulb.

      Future experiments are needed to probe the circuit mechanisms underlying the differential importance of the two primary olfactory cortices, as well as their potential causal roles in odor identification. Moreover, future work should test whether the decoding accuracy of odor identity and experience from neural data (as reported here) can predict the causal contributions of these regions, as revealed through perturbations during behavioral tasks that explicitly probe odor identification and/or experience.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public reviews:

      Reviewer #1 (Public review):

      In this important study, the authors characterized the transformation of neural representations of olfactory stimuli from the primary sensory cortex to multisensory regions in the medial temporal lobe and investigated how they were affected by non-associative learning. The authors used high-density silicon probe recordings from five different cortical regions while familiar vs. novel odors were presented to a head-restrained mouse. This is a timely study because unlike other sensory systems (e.g., vision), the progressive transformation of olfactory information is still poorly understood. The authors report that both odor identity and experience are encoded by all of these five cortical areas but nonetheless some themes emerge. Single neuron tuning of odor identity is broad in the sensory cortices but becomes narrowly tuned in hippocampal regions. Furthermore, while experience affects neuronal response magnitudes in early sensory cortices, it changes the proportion of active neurons in hippocampal regions. Thus, this study is an important step forward in the ongoing quest to understand how olfactory information is progressively transformed along the olfactory pathway.

      The study is well-executed. The direct comparison of neuronal representations from five different brain regions is impressive. Conclusions are based on single neuronal level as well as population level decoding analyses. Among all the reported results, one stands out for being remarkably robust. The authors show that the anterior olfactory nucleus (AON), which receives direct input from the olfactory bulb output neurons, was far superior at decoding odor identity as well as novelty compared to all the other brain regions. This is perhaps surprising because the other primary sensory region - the piriform cortex - has been thought to be the canonical site for representing odor identity. A vast majority of studies have focused on aPCx, but direct comparisons between odor coding in the AON and aPCx are rare. The experimental design of this current study allowed the authors to do so and the AON was found to convincingly outperform aPCx. Although this result goes against the canonical model, it is consistent with a few recent studies including one that predicted this outcome based on anatomical and functional comparisons between the AON-projecting tufted cells vs. the aPCx-projecting mitral cells in the olfactory bulb (Chae, Banerjee et al. 2022). Future experiments are needed to probe the circuit mechanisms that generate this important difference between the two primary olfactory cortices as well as their potential causal roles in odor identification.

      The authors were also interested in how familiarity vs. novelty affects neuronal representation across all these brain regions. One weakness of this study is that neuronal responses were not measured during the process of habituation. Neuronal responses were measured after four days of daily exposure to a few odors (familiar) and then some other novel odors were introduced. This creates a confound because the novel vs. familiar stimuli are different odorants and that itself can lead to drastic differences in evoked neural responses. Although the authors try to rule out this confound by doing a clever decoding and Euclidean distance analysis, an alternate more straightforward strategy would have been to measure neuronal activity for each odorant during the process of habituation.

      Reviewer #2 (Public review):

      This manuscript investigates how olfactory representations are transformed along the cortico-hippocampal pathway in mice during a non-associative learning paradigm involving novel and familiar odors. By recording single-unit activity in several key brain regions (AON, aPCx, LEC, CA1, and SUB), the authors aim to elucidate how stimulus identity and experience are encoded and how these representations change across the pathway.

      The study addresses an important question in sensory neuroscience regarding the interplay between sensory processing and signaling novelty/familiarity. It provides insights into how the brain processes and retains sensory experiences, suggesting that the earlier stations in the olfactory pathway, the AON and aPCx, play a central role in detecting novelty and encoding odor, while areas deeper into the pathway (LEC, CA1 & Sub) are more sparse and encode odor identity but not novelty/familiarity. However, there are several concerns related to methodology, data interpretation, and the strength of the conclusions drawn.

      Strengths:

      The authors combine the use of modern tools to obtain high-density recordings from large populations of neurons at different stages of the olfactory system (although mostly one region at a time) with elegant data analyses to study an important and interesting question.

      Weaknesses:

      (1) The first and biggest problem I have with this paper is that it is very confusing, and the results seem to be all over the place. In some parts, it seems like the AON and aPCx are more sensitive to novelty; in others, it seems the other way around. I find their metrics confusing and unconvincing. For example, the example cells in Figure 1C show an AON neuron with a very low spontaneous firing rate and a CA1 with a much higher firing rate, but the opposite is true in Figure 2A. So, what are we to make of Figure 2C that shows the difference in firing rates between novel vs. familiar odors measured as a difference in spikes/sec. This seems nearly meaningless. The authors could have used a difference in Z-scored responses to normalize different baseline activity levels. (This is just one example of a problem with the methodology.)

      We appreciate the reviewer’s concerns regarding clarity and methodology. It is less clear why all neurons in a given brain area should have similar firing rates. Anatomically defined brain areas typically comprise multiple cell types, which can have diverse baseline firing rates. Since we computed absolute firing rate differences per neuron (i.e., novel vs. familiar odor responses within the same neuron), baseline differences across neurons do not have a major impact.

      The suggestion to use Z-scores instead of absolute firing rate differences is well taken. However, Z-scoring assumes that the underlying data are normally distributed, which is not the case in our dataset. Specifically, when analyzing odor-evoked firing rates on a per-neuron basis, only 4% of neurons exhibit a normal distribution. In cases of skewed distributions, Z-scoring can distort the data by exaggerating small variations, leading to misleading conclusions. While we acknowledge that different analysis methods exist, we believe that our chosen approach best reflects the properties of the dataset and avoids potential misinterpretations introduced by inappropriate normalization techniques.
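      The concern about exaggerated small variations can be illustrated with a minimal sketch. The baseline rates below are invented toy values, and `z_score` is a hypothetical helper, not the authors' analysis code. Two neurons receive the same absolute increase of 1 spike/s, but the nearly silent neuron ends up with a much larger Z-score because its baseline standard deviation is small.

```python
from statistics import mean, pstdev


def z_score(evoked_rate, baseline_rates):
    """Z-score of an evoked rate relative to per-trial baseline rates."""
    sd = pstdev(baseline_rates)
    if sd == 0:
        raise ValueError("baseline has zero variance; Z-score is undefined")
    return (evoked_rate - mean(baseline_rates)) / sd


# Toy baseline rates (spikes/s) across 10 trials; values are illustrative only.
quiet_baseline = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]             # nearly silent, skewed
active_baseline = [18, 22, 20, 19, 21, 20, 18, 22, 21, 19]  # high-rate neuron

# Both neurons increase their rate by exactly 1 spike/s above their baseline mean.
quiet_evoked = mean(quiet_baseline) + 1
active_evoked = mean(active_baseline) + 1

z_quiet = z_score(quiet_evoked, quiet_baseline)
z_active = z_score(active_evoked, active_baseline)

print(f"quiet neuron:  delta = 1 spike/s, z = {z_quiet:.2f}")
print(f"active neuron: delta = 1 spike/s, z = {z_active:.2f}")
```

      In this toy case the absolute difference treats both neurons identically, whereas the Z-score ranks the quiet neuron as far more responsive.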

      (2) There are a lot of high-level data analyses (e.g., decoding, analyzing decoding errors, calculating mutual information, calculating distances in state space, etc.) but very little neural data (except for Figure 2C, and see my comment above about how this is flawed). So, if responses to novel vs. familiar odors are different in the AON and aPCx, how are they different? Why is decoding accuracy better for novel odors in CA1 but better for familiar odors in SUB (Figure 3A)? The authors identify a small subset of neurons that have unusually high weights in the SVM analyses that contribute to decoding novelty, but they don't tell us which neurons these are and how they are responding differently to novel vs. familiar odors.

      We performed additional analyses to address the reviewer’s feedback (Figures 2C-E and lines 118-132) and added more single-neuron data (Figures 1, S3 and S4).

      (3) The authors call AON and aPCx "primary sensory cortices" and LEC, CA1, and Sub "multisensory areas". This is a straw man argument. For example, we now know that PCx encodes multimodal signals (Poo et al. 2021, Federman et al., 2024; Kehl et al., 2024), and LEC receives direct OB inputs, which has traditionally been the criterion for being considered a "primary olfactory cortical area". So, this terminology is outdated and wrong, and although it suits the authors' needs here in drawing distinctions, it is simplistic and not helpful moving forward.

      We appreciate the reviewer’s concern regarding the classification of brain regions as “primary sensory” versus “multisensory.” Of note, the cited studies (Poo et al., 2021; Federman et al., 2024; Kehl et al., 2024) focus on posterior PCx (pPCx), while our recordings were conducted in a very anterior section of anterior PCx. The aPCx and pPCx have distinct patterns of connectivity, both anatomically and functionally. To the best of our knowledge, there is no evidence for multimodal responses in aPCx, whereas there is for LEC, CA1 and SUB. Furthermore, our distinction is not based on a connectivity argument, as the reviewer suggests, but on differences in the α-Poisson ratio (Figure 1E and F).

      To avoid confusion due to definitions of what constitutes a “primary sensory” region, we adopted a more neutral description throughout the manuscript.

      (4) Why not simply report z-scored firing rates for all neurons as a function of trial number? (e.g., Jacobson & Friedrich, 2018). Figure 2C is not sufficient.

      Regarding z-scores, please see our response to (1). We further added a figure showing the responses of all neurons to novel stimuli (using ROC instead of z-scoring, as described previously; e.g. Cohen et al., Nature 2012). We added this figure to the supplementary material for completeness of the analysis (S2E).

      For example, in the Discussion, they say, "novel stimuli caused larger increases in firing rates than familiar stimuli" (L. 270), but what does this mean?

      This means that on average, the population of neurons exhibit higher firing rates in response to novel odors compared to familiar ones.

      Odors typically increase the firing in some neurons and suppress firing in others. Where does the delta come from? Is this because novel odors more strongly activate neurons that increase their firing or because familiar odors more strongly suppress neurons?

      We thank the reviewer for this valuable feedback and extended the characterization of firing rate properties, including a separate analysis of neurons i) significantly excited by odorants, ii) significantly inhibited by odorants and iii) not responsive to odorants. We added the analysis and corresponding discussion to the main manuscript (Figures 2C-E and lines 118-132).

      (5) Lines 122-124 - If cells in AON and aPCx responded the same way to novel and familiar odors, then we would say that they only encode for odor and not at all for experience. So, I don't understand why the authors say these areas code for a "mixed representation of chemical identity and experience." "On the other hand," if LEC, CA1, and SUB are odor selective and only encode novel odors, then these areas, not AON and aPCx, are the ones jointly encoding chemical identity and experience. Also, I do not understand why, here, they say that AON and PCx respond to both while LEC, CA1, and SUB were selective for novel stimuli, but the authors then go on to argue that novelty is encoded in the AON and PCx, but not in the LEC, CA1, and SUB.

      We appreciate the reviewer’s request for clarification. Throughout the brain areas we studied, odorant identity and experience can be decoded. However, the way information is represented is different between regions. We acknowledge that “mixed” representation is a misleading term and removed it from the manuscript.

      In AON and aPCx, neurons significantly respond to both novel and familiar odors. However, the magnitude of their responses to novel and familiar odors is sufficiently distinct to allow for decoding of odor experience (i.e., whether an odor is novel or familiar). Moreover, novelty engages more neurons in encoding the stimulus (Figure 2D). In neural space, the position of an odor’s representation in AON and aPCx shifts depending on whether it is novel or familiar, meaning that experience modifies the neural representation of odor identity. This suggests that in these regions the two representations are intertwined.

      In contrast, some neurons in LEC, CA1, and SUB exhibit responses to novel odors, but few neurons respond to familiar odors at all. This suggests a more selective encoding of novelty.

      (6) Lines 132-140 - As presented in the text and the figure, this section is poorly written and confusing. Their use of the word "shuffled" is a major source of this confusion, because this typically is the control that produces outcomes at the chance level. More importantly, they did the wrong analysis here. The better and, I think, the only way to do this analysis correctly is to train on some of the odors and test on an untrained odor (i.e., what Bernardi et al., 2021 called "cross-condition generalization performance"; CCGP).

      We appreciate the feedback and thank the reviewer for the recommendation to implement cross-condition generalization performance (CCGP) as used in Bernardi et al., 2020. We acknowledge that the term "shuffled" may have caused confusion, as it typically refers to control analyses producing chance-level outcomes. In our case, we shuffled the identity of novel and familiar odors to assess how much the decoder relies on odor identity when distinguishing novelty. This test provided insight into whether novelty-based structure exists within neural activity beyond random grouping, but it does not directly assess generalization.

      As suggested, we used CCGP to measure how well novelty-related representations generalize across different odors. Our findings show that in AON and aPCx, novelty-related information is indeed highly generalizable, supporting the idea that these regions encode novelty in a less odor-selective manner (Figure 2K).
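      The logic of CCGP (train a novelty decoder on some odors, test it on odors it has never seen) can be sketched as follows. This is a minimal illustration under stated assumptions: the toy population responses, the odor names, and the functions `ccgp` and `toy_population` are hypothetical, and a nearest-centroid classifier stands in for the SVM decoder used in the manuscript.

```python
from itertools import product


def centroid(vectors):
    """Element-wise mean of a list of equal-length population vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ccgp(trials_by_odor, novel_odors, familiar_odors):
    """Mean accuracy of a novel-vs-familiar decoder tested on held-out odors.

    For every pair (one novel odor, one familiar odor) the decoder is trained
    on the remaining odors only and tested on the held-out pair.
    Requires at least two odors per category.
    """
    accuracies = []
    for held_n, held_f in product(novel_odors, familiar_odors):
        train_n = [v for o in novel_odors if o != held_n for v in trials_by_odor[o]]
        train_f = [v for o in familiar_odors if o != held_f for v in trials_by_odor[o]]
        c_n, c_f = centroid(train_n), centroid(train_f)
        correct = total = 0
        for odor, is_novel in ((held_n, True), (held_f, False)):
            for v in trials_by_odor[odor]:
                predicted_novel = sq_dist(v, c_n) < sq_dist(v, c_f)
                correct += int(predicted_novel == is_novel)
                total += 1
        accuracies.append(correct / total)
    return sum(accuracies) / len(accuracies)


def toy_population(novelty_gain):
    """Three toy 'neurons': one novelty axis plus two odor-identity axes."""
    identity = {"N1": (1, 0), "N2": (-1, 0), "N3": (0, 1),
                "F1": (0, -1), "F2": (1, 1), "F3": (-1, -1)}
    trials = {}
    for odor, (a, b) in identity.items():
        gain = novelty_gain if odor.startswith("N") else 0.0
        # Three trials per odor with a small deterministic jitter.
        trials[odor] = [[gain + j, a + j, b + j] for j in (-0.1, 0.0, 0.1)]
    return trials


trials = toy_population(novelty_gain=4.0)
score = ccgp(trials, ["N1", "N2", "N3"], ["F1", "F2", "F3"])
print(f"CCGP = {score:.2f}")
```

      Because the test odors never contribute to the training centroids, a high score here can only come from a novelty signal that is shared across odors, which is the property the reviewer asked to be tested.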

      Reviewer #3 (Public review):

      In this manuscript, the authors investigate how odor-evoked neural activity is modulated by experience within the olfactory-hippocampal network. The authors perform extracellular recordings in the anterior olfactory nucleus (AON), the anterior piriform (aPCx) and lateral entorhinal cortex (LEC), the hippocampus (CA1), and the subiculum (SUB), in naïve mice and in mice repeatedly exposed to the same odorants. They determine the response properties of individual neurons and use population decoding analyses to assess the effect of experience on odor information coding across these regions.

      The authors' findings show that odor identity is represented in all recorded areas, but that the response magnitude and selectivity of neurons are differentially modulated by experience across the olfactory-hippocampal pathway.

      Overall, this work represents a valuable multi-region data set of odor-evoked neural activity. However, limitations in the interpretability of odor experience of the behavioral paradigm, and limitations in experimental design and analysis, restrict the conclusions that can be drawn from this study.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Some suggestions, in no particular order, to further improve the manuscript:

      (1) The example neuronal responses for CA1 and SUB in Figure 1 are not very inspiring. To my eyes, the odor period response is not that different from the baseline period. In general, a thorough characterization of firing rate properties during the odor period between the different brain regions would be informative.

      We thank the reviewer for this valuable feedback. We have replaced the example neurons from CA1 and SUB in Figure 1C. We further extended the characterization of firing rate properties, including a separate analysis of neurons i) significantly excited by odorants, ii) significantly inhibited by odorants and iii) not responsive to odorants. We added the analysis and corresponding discussion to the main manuscript (Figures 2C-E and lines 118-132).

      (2) For the summary in Figure 1, why not show neuronal responses as z-scored firing rates as opposed to auROC?

      We chose to use auROC instead of z-scored firing rates due to the non-normality of the dataset, which can distort results when using z-scores. Specifically, z-scoring can exaggerate small deviations in neurons with low responsiveness, potentially leading to misleading conclusions. auROC provides a more robust measure of response change that is less sensitive to these distortions because it does not assume any specific distribution. This approach has been used previously (e.g. Cohen et al. 2012, Nature).
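      For readers unfamiliar with the measure, the sketch below computes auROC directly from its rank-based definition: the probability that a randomly chosen stimulus-period value exceeds a randomly chosen baseline value, with ties counted as one half. The spike counts are invented toy data and `auroc` is a hypothetical helper, not the authors' code. A value of 0.5 means no discriminable change, values near 1 indicate excitation, and values near 0 indicate suppression, with no assumption about the shape of the distributions.

```python
def auroc(baseline, evoked):
    """Area under the ROC curve for evoked vs. baseline values.

    Equivalent to the normalized Mann-Whitney U statistic:
    P(evoked > baseline) + 0.5 * P(evoked == baseline).
    """
    if not baseline or not evoked:
        raise ValueError("both samples must be non-empty")
    wins = 0.0
    for e in evoked:
        for b in baseline:
            if e > b:
                wins += 1.0
            elif e == b:
                wins += 0.5
    return wins / (len(baseline) * len(evoked))


# Toy per-trial spike counts (illustrative values only).
baseline_counts = [2, 3, 1, 2, 4]
excited_counts = [5, 6, 4, 7, 5]
suppressed_counts = [0, 1, 0, 1, 0]

print(auroc(baseline_counts, excited_counts))     # excitation: close to 1
print(auroc(baseline_counts, suppressed_counts))  # suppression: close to 0
print(auroc(baseline_counts, baseline_counts))    # no change: exactly 0.5
```

      Note that, as Reviewer #2 points out elsewhere in this exchange, auROC quantifies how discriminable two distributions are rather than the size of the firing rate change.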

      (3) To study novelty, the authors presented odorants that were not used during four days of habituation. But this design makes it hard to dissociate odor identity from novelty. Why not track the response of the same odorants during the habituation process itself?

      We respectfully disagree with the argument that using different stimuli as novel and familiar constitutes a confound in our analysis. In our study, we used multiple different, structurally dissimilar single molecule chemicals which were randomly assigned to novel and familiar categories in each animal. If individual stimuli did cause “drastic differences in evoked neural responses”, these would be evenly distributed between novel and familiar stimuli. It is therefore extremely unlikely that the clear differences we observed between novel and familiar conditions and between brain areas can be attributed to the contribution of individual stimuli, in particular given that our analyses were performed at the population level. In fact, we observed that responses between novel and familiar conditions were qualitatively very similar in the short time window after odor onset (Figure 1G and H).

      Importantly, the goal of this study was to investigate the impact of long-term habituation over more than 4 days, rather than short term habituation during one behavioral session. However, tracking the activity of large numbers of neurons across multiple days presents a significant technical challenge, due to the difficulty of identifying stable single-unit recordings over extended periods of time with sufficient certainty. Tools that facilitate tracking have recently been developed (e.g. Yuan AX et al., Elife. 2024) and it will be interesting to apply them to our dataset in the future.

      (4) Since novel odors lead to greater sniffing and sniffing strongly influences firing rates in the olfactory system, the authors decided to focus on a 400 ms window with similar sniffing rates for both novel vs. familiar odors. Although I understand the rationale for this choice, I worry that this is too restrictive, and it may not capture the full extent of the phenomenology.

      Could the authors model the effect of sniffing on firing rates of individual neurons from the data, and then check whether the odor response for novel context can be fully explained just by increased sniffing or not?

      It is an interesting suggestion to extend the window of analysis and observe how responses evolve with sniffing (and other behavioral reactions). To address this, we added an additional figure to the supplementary material, showing the mean responses of all neurons to novel stimuli during the entire odor presentation window (Fig. S1B).

      As suggested, we further created a Generalized Linear Model (GLM) for the entire 2s odor stimulation period, incorporating sniffing and novelty as independent variables. As expected, sniffing had a dominant impact on firing rate in all brain areas. A smaller proportion of neurons was modulated by novelty or by the interaction between novelty x breathing, suggesting the entrainment of neural activity by sniffing during the response to novel odors. These results support our decision to focus the analysis on the early 400ms window in order to dissociate the effects of novelty and behavioral responses. Taken together, our results suggest that odorant responses are modulated by novelty early during odorant processing, whereas at later stages sniffing becomes the predominant factor driving firing (Figure S2C-D).

      (5) The authors conclude that aPCx has a subset of neurons dedicated to familiar odors based on the distribution of SVM weights in Figure 3D. To me, this is the weakest conclusion of the paper because although significant, the effect size is paltry; the central tendencies are hardly different for the two conditions in aPCx. Could the authors show the PSTHs of some of these neurons to make this point more convincing?

      We appreciate the reviewer’s concern regarding the effect size. To strengthen our conclusion, we now include PSTHs of representative neurons in the bottom 10% and top 10% of the neuronal population based on the SVM analysis (Figures S3 and S4). We hope this provides more clarity and support for the interpretation that there is a subset of neurons in aPCx that show greater sensitivity to familiar odors, despite the relatively modest central tendency differences.

      In the revised manuscript, we discuss the effect size more explicitly in the text to provide context for its significance (lines 193 - 195).

      Reviewer #2 (Recommendations for the authors):

      (1) The authors only talk about "responsive" neurons. Does this include neurons whose activity increases significantly (activated) and neurons whose activity decreases (suppressed)?

      Yes, the term "responsive" refers to neurons whose activity either increases significantly (excited) or decreases (inhibited) in response to the odor stimuli. We performed additional analyses to characterize responses separately for the different groups (Figure 2C-E and lines 118-132).

      (2) Line 54 - The Schoonover paper doesn't show that cells lose their responses to odors, but rather that the population of cells that respond to odors changes with time. That is, population responses don't become more sparse.

      The fact that “the population of cells that respond to odors changes with time” implies that some neurons lose their responsiveness (e.g. unit 2 in Figure 1 of Schoonover et al., 2021), while others become responsive (e.g. unit 1 in Figure 1 of Schoonover et al., 2021). Frequent responses reduce drift rate (Figure 4 of Schoonover et al., 2021), thus fewer neurons lose or gain responsiveness. We have revised the manuscript to clarify this.

      (3) Line 104 - "Recurrent" is incorrectly used here. I think the authors mean "repeated" or something more like that.

      Thank you for pointing this out. We replaced "recurrent" with "repeated".

      (4) Figure 3D - What is the scale bar here?

      We apologize for the accidental omission. The scale bar has been added to Figure 3D in the revised version of the manuscript.

      (5) Line 377 - They say they lowered their electrodes to "200 um/s per second." This must be incorrect. Is this just a typo, or is it really 200 um/s, because that's really fast?

      Thank you for pointing this out. It was 20 to 60 um/s, the change has been made in the manuscript.

      (6) Line 431: The authors say they used auROC to calculate changes in firing rates (which I think is only shown in Figure 1D). Note that auROC measures the discriminability of two distributions, not the strength or change in the strength of response.

      Indeed, we used auROC to measure the discriminability of firing between the baseline and stimulus response periods. We have corrected the wording in the methods.

      (7) Figure 1B: The anatomical locations of the five areas they recorded from are straightforward, and this figure is not hugely helpful. However, the reader would benefit tremendously by including an experimental schematic. As is, we needed to scour the text and methods sections to understand exactly what they did when.

      We thank the reviewer for this suggestion. We included an experimental schematic in the supplementary material.

      (8) Figure 1F (left): This plot is much less useful without showing a pre-odor window, even if only times after the odor onset were used for calculating alpha.

      We appreciate this concern; however, the goal of Figure 1F is to illustrate the meaning of the alpha value itself. We chose not to include a pre-odor window comparison to avoid confusing the reader.

      (9) Figure 2A: What are the bar plots above the raster plots? Are these firing rates? Are the bars overlaid or stacked? Where is the y-axis scale bar?

      The bar plots above the raster plots represent a histogram of the spike count/trials over time, with a bin width of 50 ms. These bars are overlaid on the raster plot. We will include a y-axis scale bar in the revised figure to clarify the presentation.
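      As a rough sketch of what such a histogram contains, the code below bins spike times from several trials into 50 ms bins and divides by the number of trials, giving the mean spike count per trial in each bin. The spike times, the analysis window, and the function name `psth` are illustrative assumptions rather than the authors' pipeline. Times are kept as integer milliseconds to avoid floating-point bin-edge effects.

```python
def psth(spike_times_by_trial, t_start_ms, t_stop_ms, bin_width_ms=50):
    """Mean spike count per trial in consecutive bins of bin_width_ms.

    spike_times_by_trial: list of trials, each a list of integer spike
    times in ms relative to odor onset. Spikes outside the window are ignored.
    """
    n_bins = (t_stop_ms - t_start_ms) // bin_width_ms
    counts = [0] * n_bins
    for trial in spike_times_by_trial:
        for t in trial:
            idx = (t - t_start_ms) // bin_width_ms
            if 0 <= idx < n_bins:
                counts[idx] += 1
    n_trials = len(spike_times_by_trial)
    return [c / n_trials for c in counts]


# Toy spike times (ms relative to odor onset) for three trials.
toy_trials = [
    [-120, 30, 60, 75, 410],
    [10, 55, 90, 300],
    [40, 45, 220],
]

histogram = psth(toy_trials, t_start_ms=-200, t_stop_ms=500)
for i, value in enumerate(histogram):
    left_edge = -200 + i * 50
    print(f"{left_edge:5d} ms: {value:.2f} spikes/trial")
```

      Dividing each bin by the bin width in seconds (0.05 s here) would convert these values to firing rates in spikes/s, which is one way to supply the y-axis scale the reviewer asks for.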

      (10) Figure 4G: This makes no sense. First, the Y axis is supposed to measure standard deviation, but the axis label is spikes/s. Second, if responses in the AON are much less reliable than responses in "deeper" areas, why is odor decoding in AON so much better than in the other areas?

      We acknowledge the error in the axis label, and we will correct it to indicate the correct units. AON has a larger response variability but also larger response magnitudes, which can explain the higher decoding accuracy.

      (11) From the model and text, one predicts that the lifetime sparseness increases along the pathway. The authors should use this metric as well/instead of "odor selectivity" because of problems with arbitrary thresholding.

      We acknowledge that lifetime sparseness, often computed using lifetime kurtosis, can be an informative measure of selectivity. However, we believe it has limitations that make it less suitable for our analysis. One key issue is that lifetime sparseness does not account for the stability of responses across multiple presentations of the same stimulus. In contrast, our odor selectivity measure incorporates trial-to-trial variability by considering responses over 10 trials and assessing significance using a Wilcoxon test compared to baseline. While the choice of a p-value threshold (e.g., 0.05) is somewhat arbitrary, it is a widely accepted statistical convention. Additionally, lifetime sparseness does not distinguish between excitatory and inhibitory responses. For example, if a neuron X is strongly inhibited by odor A, strongly excited by odor B, and unresponsive to odors C and D, lifetime sparseness would classify it as highly selective for odor B, without capturing its inhibitory selectivity for odor A. The lifetime sparseness will be higher than if X were simply unresponsive to A.

      Our odor selectivity measure addresses this by considering both excitation and inhibition as potential responses. Thus, while lifetime sparseness could provide a useful complementary perspective in another type of dataset, it does not fully capture the dynamics of odor selectivity here.
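      The neuron X example can be made concrete with a small sketch. It assumes the widely used sparseness index of Vinje and Gallant rather than the kurtosis-based estimator mentioned in the response, and the firing rates are invented; `lifetime_sparseness` is a hypothetical helper. With a baseline of 5 spikes/s, inhibition by odor A drives the rate toward zero, which raises the index relative to a neuron that simply does not respond to A, even though the index says nothing about the inhibitory response itself. Other sparseness estimators may behave differently on the same data.

```python
def lifetime_sparseness(rates):
    """Sparseness index in [0, 1] over one neuron's mean rates per odor.

    Assumes the Vinje-Gallant form: (1 - mean(r)^2 / mean(r^2)) / (1 - 1/N).
    0 means equal responses to all odors; 1 means a response to a single odor.
    """
    n = len(rates)
    if n < 2:
        raise ValueError("need at least two stimuli")
    mean_rate = sum(rates) / n
    mean_sq = sum(r * r for r in rates) / n
    if mean_sq == 0:
        return 0.0  # neuron is silent for every odor
    return (1 - mean_rate ** 2 / mean_sq) / (1 - 1 / n)


# Toy mean rates (spikes/s) for odors A, B, C, D; baseline assumed to be 5.
inhibited_by_a = [0, 10, 5, 5]     # inhibited by A, excited by B
unresponsive_to_a = [5, 10, 5, 5]  # unresponsive to A, excited by B

s_inhibited = lifetime_sparseness(inhibited_by_a)
s_unresponsive = lifetime_sparseness(unresponsive_to_a)

print(f"inhibited by A:    {s_inhibited:.3f}")
print(f"unresponsive to A: {s_unresponsive:.3f}")
```

      A significance-based selectivity count, as described by the authors, would instead register this neuron as responding to two odors (one excitatory, one inhibitory).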

      Author response 1.

      Lifetime Kurtosis distribution per region.

      Reviewer #3 (Recommendations for the authors):

      Main points:

      (1) The authors use a non-associative learning paradigm - repeated odor exposure - to test how experience modulates odor responses along the olfactory-hippocampal pathway. While repeated odor exposure clearly modulates odor-evoked neural activity, the relevance of this modulation and its differential effect across different brain areas are difficult to assess in the absence of any behavioral read-outs.

      Our experimental paradigm involves a robust, reliable behavioral readout of non-associative learning. Novel olfactory stimuli evoke a well-characterized orienting reaction, which comprises a multitude of physiological reactions, including exploratory sniffing, facial movements and pupil dilation (Modirshanechi et al., Trends Neuroscience 2023). In our study, we focused on exploratory sniffing.

      Compared to associative learning, non-associative learning might have received less attention. However, it is critically important because it forms the foundation for how organisms adapt to their environment through experience without forming associations. This is highlighted by the fact that non-instrumental stimuli can be remembered in large numbers (Standing, 1973) and with remarkable detail (Brady et al., 2008). While non-associative learning can thus create vast, implicit memory of stimuli in the environment, it is unclear how stimulus representations reflect this memory. Our study contributes to answering this question. We describe the impact of experience on olfactory sensory representations and reveal a transformation of representations from olfactory cortical to hippocampal structures. Our findings also indicate that sensory responses to familiar stimuli persist within sensory cortical and hippocampal regions, even after spontaneous orienting behaviors have habituated. Further studies involving experimental manipulation techniques are needed to elucidate the causal mechanisms underlying the formation of stimulus memory during non-associative learning.

      (2) The authors discuss the olfactory-hippocampal pathway as a transition from primary sensory (AON, aPCx) to associative areas (LEC, CA1, SUB). While this is reasonable, given the known circuit connectivity, other interpretations are possible. For example, AON, aPCx, and LEC receive direct inputs from the olfactory bulb ('primary cortex'), while CA1 and SUB do not; AON receives direct top-down inputs from CA1 ('associative cortex'), while aPCx does not. In fact, the data presented in this manuscript does not appear to support a consistent, smooth transformation from sensory to associative, as implied by the authors (e.g. Figure 4A, F, and G).

      Thank you for this insightful comment. Indeed, there are complexities in the circuitry, and the relationships between different areas are not linear. We believe that AON and aPCx are distinctly different from LEC, CA1 and SUB, as the latter areas have been shown to integrate multimodal sensory information. To avoid confusion due to definitions of what constitutes a “primary sensory” region, we adopted a more neutral description throughout the manuscript. We also removed the term “gradual” to describe the transition of neural representations from olfactory cortical to hippocampal areas.

      (3) The analysis of odor-evoked responses is focused on a 400 ms window to exclude differences in sniffing behavior. This window spans 200 ms before and after the first inhalation after odor onset. Inhalation onset initiates neural odor responses - why do the authors include neural data before inhalation onset?

      The reason to include a brief time window prior to odor onset is to account for what are often called “partial” sniffs. In our experimental setup, odor delivery is not triggered by the animal’s inhalation. Therefore, it can happen that an animal has just begun to inhale when the stimulus is delivered. In this case, the animal is exposed to odorant molecules prior to the first complete inhalation after odor onset. We acknowledge that this limits the temporal resolution of our measurements, but it does not affect the comparison of sensory representations between different brain areas.
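
      The window definition discussed in this point can be sketched as follows. This is an illustration only, not the study's code: the function names and example times are hypothetical, times are in milliseconds, and inhalation onsets are assumed to be sorted. The window is centred on the first inhalation at or after odor onset and extends 200 ms to either side, so it can begin before that inhalation, and even before odor onset, which is how spikes evoked by a partial sniff are captured.

```python
from bisect import bisect_left

HALF_WIDTH_MS = 200  # 200 ms before and after inhalation onset, as described in the text

def response_window(odor_onset_ms, inhalation_onsets_ms, half_width_ms=HALF_WIDTH_MS):
    """Return (start, end) of the window centred on the first inhalation at or after odor onset.

    inhalation_onsets_ms must be sorted in ascending order.
    """
    idx = bisect_left(inhalation_onsets_ms, odor_onset_ms)
    if idx == len(inhalation_onsets_ms):
        raise ValueError("no inhalation at or after odor onset")
    centre = inhalation_onsets_ms[idx]
    return centre - half_width_ms, centre + half_width_ms

def count_spikes(spike_times_ms, window):
    """Count spikes falling in the half-open interval [start, end)."""
    start, end = window
    return sum(1 for t in spike_times_ms if start <= t < end)

# Hypothetical trial: odor onset at 300 ms, inhalations at 100, 450 and 800 ms.
window = response_window(300, [100, 450, 800])
n_spikes = count_spikes([200, 260, 300, 500, 640, 700], window)
print(window, n_spikes)
```

      In this example the window opens at 250 ms, 50 ms before odor onset, so the spike at 260 ms is counted even though it precedes odor delivery.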

      It would also be interesting to explore the effect of sniffing behavior (see point 2) on odor-evoked neural activity.

      Thank you for your comment. We performed additional analyses, including a GLM, to address this question (Figure S2C-D).

      Minor points:

      (4) Figure 2A represents raster plots for 2 neurons per area - it is unclear how to distinguish between the 2 neurons in the plots.

      Figure 2A shows one example neuron per brain area. Each neuron has two raster plots, which show responses to either a novel (orange) or a familiar (blue) stimulus. We have revised the figure caption for clarity.

      (5) Overall, axes should be kept consistent and labeled in more detail. For example, Figure 2H and I are difficult to compare, given that the y-axis changes and that decoding accuracies are difficult to estimate without additional marks on the y-axis.

      The axes are indeed different, because the chance level of decoding accuracy differs between those two panels. Decoding novel versus familiar odors has a chance level of 0.5, while decoding odor identity has a chance level of 0.1 (there are 10 odors to decode the identity from).
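
      The two chance levels follow from the number of classes in each decoding problem. A minimal sketch, assuming balanced classes (an assumption not stated in the response) and using a hypothetical helper name:

```python
def chance_accuracy(n_classes: int) -> float:
    """Expected accuracy of random guessing when all classes are equally frequent."""
    if n_classes < 1:
        raise ValueError("n_classes must be at least 1")
    return 1.0 / n_classes

novel_vs_familiar = chance_accuracy(2)   # two classes: novel, familiar
odor_identity = chance_accuracy(10)      # ten odors
print(novel_vs_familiar, odor_identity)
```

      With unbalanced classes, the appropriate reference would instead be the frequency of the most common class.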

      (6) Some parts of the discussion seem only loosely related to the data presented in this manuscript. For example, the statement that 'AON rather than aPCx should be considered as the primary sensory cortex in olfaction' seems out of context. Similarly, it would be helpful to provide data on the stability of subpopulations of neurons tuned to familiar odors, rather than simply speculate that they could be stable. The authors could summarize more speculative statements in an 'Ideas and Speculation' subsection.

      Thank you for your comment. We appreciate your perspective on our hypotheses. We have revised the discussion accordingly. Specifically, we removed the discussion of stable subpopulations, since we have not performed longitudinal tracking in this study.

      (7) The authors should try to reference relevant published work more comprehensively.

      Thank you for your comment. We attempted to include relevant published work without exceeding the limit for references, but we might have overlooked important contributions. We apologize to our colleagues whose relevant work might not have been cited.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Manuscript number: RC-2025-03206

      Corresponding author(s): Teresa M. Przytycka

      General Statements

      We thank all the reviewers for their time and their constructive criticism, based on which we have revised our manuscript. All review comments are in italics. Our responses are in normal font, except excerpts from the manuscript, which are shown within double quotes and in italics. The line numbers indicated here refer to those in the revised manuscript.

      Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This paper addresses the interesting question of how cell size may scale with organ size in different tissues. The approach is to mine data from the fly single cell atlas (FCA) which despite its name is a database of gene expression levels in single isolated nuclei. Using this data, they infer cell size based on ribosomal protein gene expression, and based on this approach infer that there are tissue and sex specific differences in scaling, some of which may be driven by differences in ribosomal protein gene expression.

      Response: Indeed, using the FCA dataset, we infer sex-specific differences in both cell size and cell number, which we validated with targeted experiments. We show that Drosophila cell types scale through distinct strategies (via cell size, cell number, or a mix of both) in an allometric rather than uniform fashion. We further propose that these scaling differences are driven, at least in part, by variation in translational activity, reflected in the expression of ribosomal proteins, translation elongation factors, and Myc.

      -----------------------------------------------------------

      I think the idea of mining this database is a clever one; however, there are a number of concerns about whether the existing data can really be used to draw the conclusions that are stated.

      Response: We are pleased to see that the reviewer found the question and our approach interesting.

      -----------------------------------------------------------

      *One concern has to do with the assumption that RP (ribosome protein) expression is a proxy for cell size. It is well established that ribosome abundance scales with cell size, but is there reason to believe that ribosome nuclear gene EXPRESSION correlates with ribosome abundance? *

      I'm not saying that this can't be true, but it seems like a big assumption that needs to be justified with some data. Maybe this is well known in the Drosophila literature, but in that case the relevant literature really needs to be cited.

      Response: To avoid any misunderstanding: we use sex-biased RP expression as an indicator of sex differences in cell size only within the same cell type or subtype, as defined by expression-based clustering in the FCA, not as a general estimator of cell size. This measure is applied strictly within the same clusters, never between different ones. To prevent overinterpretation, we replaced the term 'proxy' with 'indicator,' since the earlier wording might have implied that ribosomal gene expression was being used to estimate cell size more broadly.

      We should have begun by providing more background on the well-established link between ribosomal protein gene dosage and cell growth. This context was missing from the introduction, so we have now added a full paragraph outlining what is known about this connection:

      *Added at line 85: *

      "Cell growth, which supports both cell enlargement and cell division, demands elevated protein synthesis, accomplished by boosting translation rates. Indeed, ribosome abundance is known to scale with cell size in many organisms (Schmoller and Skotheim 2015; Cadart and Heald 2022; Serbanescu et al. 2022). Long before it was known that DNA was the carrier of genetic information, Drosophila researchers had identified a large class of mutations known as "Minutes" (Schultz 1929). These were universally haplo-insufficient. A single wild type copy resulted in a tiny slowly growing fly, and the homozygous loss-of-function alleles were lethal. In clones, the Minute cells are clearly smaller and compete poorly with surrounding wild type cells. We now know that most of the Minute loci encode ribosomal proteins (Marygold et al. 2007). Similarly, the Drosophila diminutive locus, also characterized by small flies almost a century ago, is now known to encode the Myc oncogene (Gallant 2013). This is significant as Myc is a regulator of ribosomal protein encoding genes in metazoans, including Drosophila (Grewal et al. 2005). The ribosome is assembled in a specialized nuclear structure called the nucleolus (Ponti 2025). Across species, including Drosophila (Diegmiller et al. 2021) and C. elegans (Ma et al. 2018), nucleolar size scales with cell size and is broadly correlated with growth in cell size and/or cell number, processes that are directly relevant to sex-specific allometry. Collectively, these and many other studies offer compelling evidence that ribosomal biogenesis is positively associated with cell size and growth, underscoring the value of measuring ribosome biogenesis as a metric."

      We understand that the reviewer is asking whether reduced RP mRNA expression directly leads to reduced functional ribosome assembly. We do not have a definitive answer to that specific question. However, we directly measured translation in fat body cells (section: Female bias in ribosomal gene expression in fat body cells leads to sex-biased protein synthesis), and the results show a clear correlation between RP gene expression and biosynthetic activity, even though we did not track every step from transcription to ribosome assembly to polysome loading across all cell types. This would indeed be an excellent direction for future work, including polysome profiling and related assays. Importantly, we did examine the nucleolus (Figure 4), where ribosome assembly occurs, and showed that nucleolar volume scales with RP gene expression. This strongly supports the presence of sex-specific differences in ribosome biogenesis.

      Added at line 115:

      "Building on the earlier studies noted above, as well as our direct measurements of translation bias in the fat body, nucleolar size, and cell size, we used sex-biased expression of ribosomal proteins as an indicator of sex differences in per-nucleus cell size."

      -----------------------------------------------------------

      Second, the interpretation of RP expression as a proxy for cell size seems potentially at odds with the fact that some cells are multi-nucleate. Those cells are big because of multiple nuclei, and so they might not show any increase in ribosome expression per nucleus. Presumably, for multi-nucleate cells, RP expression, if it reflects anything at all, would be something to do with cell size PER nucleus.

      Response: Yes, this is a very important point, and this is why we chose multinucleated indirect flight muscles for our direct experimental analysis. We show that in indirect flight muscle cells, adult cell size is greatly influenced by the sex-specific number of nuclei per cell. The female muscle cells are larger and have more nuclei per cell. Additionally, they have higher expression of ribosomal protein-coding genes. As the latter data are from the single-nucleus sequencing atlas, this already demonstrates what the reviewer is asking for: per nucleus, female muscle cells express more ribosomal protein-coding mRNAs.

      -----------------------------------------------------------

      *Third, it is well known that many tissues in Drosophila are polyploid or polytene. I don't know enough about the methodology used to produce the FCA to know whether this is somehow normalized. Otherwise, my hypothesis would be that nuclei showing higher RP expression might just be polyploid or polytene. You might say that this could be controlled by asking if all genes are similarly upregulated, but that isn't the case since at least in polytene chromosomes it is well known that only a small number of genes are expressed at a given time, while many are silent. *

      Response: Yes, this is an excellent point. As noted above, our study does not distinguish among the different potential causes of sex differences in ribosomal mRNA copy number, as these may vary across cell types. We now explicitly acknowledge this in the discussion (line 327). Importantly, even in cases where ribosomal gene expression bias primarily reflects differences in DNA content, this still represents a plausible mechanistic route linking ribosomal gene expression to increased nucleolar ribosome biogenesis and, ultimately, larger cell size. This possibility does not alter our main conclusions.

      -----------------------------------------------------------

      Overall, I think a lot more foundational work would need to be done in order to allow the inference of cell size from RP expression. In a way, it is a bit unfortunate that they chose to do this work in Drosophila where so many cells are polyploid, although I gather that even in humans some tissues have this issue, for example large neurons in the brain.

      Response: We acknowledge that we did not clearly reference some of the foundational work in the literature. To address this, we have expanded the introduction to provide additional background and context. We also clarify that our fat body experiment offers independent support for the relationship between ribosomal gene expression bias, nuclear size bias, and corresponding biases in protein synthesis, thereby reinforcing the use of sex-specific ribosomal gene expression as an indicator of sex-specific cell size. Importantly, we assess this bias only within clusters, not between them. These clusters are derived from gene-expression-based clustering and are therefore relatively homogeneous. For example, as discussed in our response to Reviewer #3, the fat body contains several clusters that correspond to expression-defined subtypes of fat body cells. Our previous terminology may have inadvertently implied that we were using ribosomal gene expression to estimate cell size more broadly, which was not our intention.

      As for the choice of organism, most of the authors are Drosophila researchers, and we benefit from the unique, highly replicated data from the whole head and whole body of both sexes. Such data are necessary for an unbiased estimation of the differences in nuclear number.

      -----------------------------------------------------------

      *Reviewer #1 (Significance (Required)):

      The idea that gene regulatory networks could "program" differences in scaling by changing levels of ribosomal protein gene expression is a tremendously important one if it can be established, because it would show a simple way for size scaling to be placed under control of developmental regulatory pathways. My original concern when I first looked at the abstract was going to be that yeah the results are interesting but a mechanism is not provided, but as I read it, that concern went away. Showing that RP gene expression, which could be programmed by various driving pathways, can affect allometric scaling would be extremely impactful and really change how we think about scaling, by putting it into the framework of gene expression networks that control other aspects of development. It would not be necessary to show which pathways actually drive these expression differences; the fact that they are different would be interesting enough to make everyone want to read this paper. But, as discussed above, I am not convinced by the evidence presented here. So while I think it would be very significant if true, I am not convinced that the conclusion is well supported. This doesn't mean I have a reason to think it is false, just that it's not well supported for the reasons I have given.*

      Response: We are grateful to the reviewer for this positive assessment of our findings despite the lack of a specific mechanism. We also regret that our initial writing did not clearly situate our work within the foundational literature on the relationship between ribosomal biogenesis and scaling. The key contribution of our study is to demonstrate that sex-biased ribosomal biogenesis plays a role in allometric scaling, providing a basis for future mechanistic exploration. We hope that the revised manuscript now offers clear and compelling support for the conclusion that RP gene expression bias can influence allometric scaling.

      -----------------------------------------------------------

      I hasten to point out that I could be entirely wrong if the missing bits of logic (i.e. that RP expression matches ribosome abundance and that gene expression in the FCA dataset isn't influenced by ploidy of the nucleus) turn out to hold. If suitable references can be provided to support these underlying assumptions, then in fact I think these concerns could be answered with very little effort. Otherwise, I think experiments would be needed to support these assumptions, and that might be non-trivial to do in a reasonable time frame. For that reason, in the next question I have put "cannot tell" for the time estimate.

      Response: While gene expression in some FCA cell types may indeed be influenced by ploidy, our analysis does not depend on distinguishing among the possible sources of gene expression bias, which may vary across cell types. Rather, our key point is that, regardless of its origin, an increase in ribosomal gene expression is associated with enhanced ribosome biogenesis in the nucleolus and, ultimately, larger cell size. Thus, our main conclusions do not rely on any specific mechanism underlying RP gene expression upregulation. We now include additional references supporting the relationship between RP expression bias and cell size bias. We also strengthen the link between ribosomal gene expression and biosynthetic activity by clarifying its relationship with sex-biased Myc expression and the strong correlation with expression bias of EF1.

      We thank the reviewer for their thoughtful and constructive comments, which have prompted us to clarify both our reasoning and the relevant literature more fully.

      -----------------------------------------------------------

      *Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      The authors analyzed the FlyAtlas single-nucleus dataset to identify sex differences in gene expression and cell numbers. This led them to focus on muscles, cardiomyocytes, and fat body cells. They then measured cell and nucleolus size across different tissues and showed that reducing Myc function decreases sex differences in fat body cells. Overall, the manuscript provides a characterization of dimorphic differences in cell and organ size across three tissues.*

      Response: This is a nice synopsis of the work.

      -----------------------------------------------------------

      Major Comments: The major claims of the manuscript are well supported by the reported experiments and analyses. Statistical analyses appear adequate.

      Response: We agree, and we are glad that the reviewer found our work well supported.

      -----------------------------------------------------------

      *Minor Comments: The following minor issues should be addressed through textual edits: In the Introduction:

      "Disruptions in proportionality, whether due to undergrowth or overgrowth, can lead to reduced fitness or diseases such as cancer." Could the authors provide a reference for this statement, particularly for the claim that disruptions in proportion*

      Response: We apologize for this omission. The following explanation is now included starting at line 39:

      "For example, scaled cell growth is a driver of symmetry in Myc-dependent scaling of bone growth in the skeleton by chondrocyte proliferation (Ota et al. 2007; Zhou et al. 2011). Increased nucleolus size is a well known marker of cancer progression in a histopathological setting (Pianese 1896; Derenzini et al. 1998; Elhamamsy et al. 2022)."

      -----------------------------------------------------------

      *The authors state:

      "This study offers a comprehensive, cellular-resolution analysis of sexual size dimorphism in a model organism, uncovering how differences in cell number and size contribute to sex-specific body plans."*

      The study cannot be considered comprehensive, as not all organs were examined.

      Response: Indeed, "comprehensive" is a loaded word, and we have omitted it from the revised manuscript.

      -----------------------------------------------------------

      *The following sentence from the abstract is unclear:

      "By uncovering how a conserved developmental system produces sex-specific proportions through distinct cellular strategies..."*

      * What do the authors mean by a conserved developmental system? Do they refer to a commonly used developmental model, or to a developmental system that is evolutionarily conserved?*

      Response: We acknowledge that the use of the word 'conserved' was inappropriate, and we have therefore removed it from the statement.

      -----------------------------------------------------------

      *Reviewer #2 (Significance (Required)):

      The manuscript presents a relevant exploration of sex-specific differences in cell size and cell number in Drosophila males and females. The limitations of the study are clearly acknowledged in the "Limitations" section. The work does not provide mechanistic insight into the causes or functional consequences of the observed differences. Nonetheless, the study extends our understanding of sexual dimorphism and establishes a foundation for future investigations into the autonomous and systemic mechanistic factors that regulate these differences.*

      Response: Thank you.

      -----------------------------------------------------------

      *Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      The manuscript by Pal and colleagues addresses an important question: the cellular mechanisms underlying sex differences in organ size. By leveraging single-nucleus transcriptomic data from the adult Drosophila Cell Atlas, the authors show that different cell types adopt distinct strategies to achieve sex differences in organ size-either by increasing cell size or by altering cell number. They then focus on three organs-the indirect flight muscles, the heart, and the fat body-and provide supporting evidence for their transcriptomic analyses.*

      Response: This is a nice summary of the study. Thank you.

      -----------------------------------------------------------

      This study tackles a highly relevant and often overlooked question, as our understanding of the molecular and cellular events driving sex differences remains incomplete. The work presents interesting observations; however, it is largely descriptive, establishing correlations without providing functional evidence or mechanistic insight.

      Response: We agree that this is an often overlooked problem that has been difficult to address experimentally without single-cell genomics. Our work aims to help fill this gap. While the paper does contain descriptive elements, we believe such characterization is important at the early stages of developing a new area of inquiry. The study explores a unique dataset and includes experimental validation to support key observations. We also propose how allometry may be shaped by cell division and cell size, drawing on well-established molecular mechanisms. Thus, the reviewer's comment regarding a lack of mechanistic insight likely pertains to the absence of a direct connection to the sex-determination pathway, which is beyond the scope of the current study.

      -----------------------------------------------------------

      Below are four main points that should be addressed before publication:

      1. Introduction and contextualisation of prior work

      The introduction does not adequately present the current state of knowledge. Several key studies are missing or insufficiently discussed. In particular, the following works should be included and integrated into the introduction:

      - PMID: 26710087 - shows that the sex determination gene transformer regulates male-female differences in Drosophila body size.
      - PMID: 28064166 - describes how differences in Myc gene dosage contribute to sex differences in body size.
      - PMID: 26887495 - demonstrates that the intrinsic sexual identity of adult stem cells can control sex-biased organ size through sex-biased proliferation.
      - PMID: 28976974 - reveals that Sxl modulates body growth through both tissue-autonomous and non-autonomous mechanisms.
      - PMID: 39138201 - shows that transformer drives sex differences in organ size and body weight.

      Incorporating and discussing these references would provide a more comprehensive and up-to-date framework for the study.

      Response: We agree that the literature suggested by the reviewer strengthens the introduction and improves the contextualization of prior work relevant to our study. Although much of it was previously included in the discussion section on cell-autonomous and hormonal regulation, it has now been moved to the introduction, along with the discussion of the papers suggested by the reviewer (beginning at line 58).

      "In Drosophila melanogaster, adult females are substantially larger than males (Fig. 1A1), yet both sexes develop from genetically similar zygotes and share most organs and cell types. In wild type flies, sex is determined by the number of X chromosomes in embryos, with XX flies developing as females and X(Y) flies developing as males due to the activation and stable expression of Sex-lethal only in XX flies (Erickson and Quintero 2007). While it is not entirely clear how sexually dimorphic size is regulated, the sex determination pathway is implicated in size regulation. Sex-reversed flies often show a size based on the X chromosome number rather than sexual morphology. Female Sex-lethal contributes to larger female size independently of sexual identity (Cline 1984), and Sex-lethal expression in insulin producing neurons in the brain also impacts body size (Sawala and Gould 2017). Female-specific Transformer protein is produced as a consequence of female-specific Sex-lethal and also contributes to increased female size (Rideout et al. 2015). This size scaling also applies to individual organs. For example, the Drosophila female gut is longer than the male gut due to Transformer activity (Hudry et al. 2016). It has also been suggested that Myc dose (it is X-linked) is a regulator of body size (Mathews et al. 2017), although the failed dosage compensation model proposed has not been demonstrated."

      And again at line 74:

      "These studies show that size is regulated, but they do not address whether scaling is uniform or non-uniform and the mechanism for sexual size differences (SSD). The origins of SSD can, in principle, arise from differences in (i) gene expression, (ii) the presence of sex-specific cell types, (iii) the number of cell-specific nuclei, or (iv) the size (per nucleus) of those cells. Previous research in Drosophila has largely focused on gene expression in sex-specific organs like the gonads (Arbeitman et al. 2002; Parisi et al. 2004; Graveley et al. 2011; Pal et al. 2023), which are governed by a well-characterized sex-determination pathway (Salz and Erickson 2010; Clough and Oliver 2012; Raz et al. 2023). However, whether and how scaling differences in shared, non-sex-specific tissues are achieved via changes in cell size and number remains largely unexamined (Fig. 1A2)."

      -----------------------------------------------------------

      2. Use of ribosomal gene expression as a proxy for cell size

      The authors use ribosomal gene expression levels as a proxy for cell size, but this assumption is not adequately justified. The cited references (refs. 20-22) focus on unicellular organisms (bacteria and yeast) or cleavage divisions in frog embryos, which are fundamentally different from adult Drosophila tissues. The authors should provide evidence that ribosome abundance scales with cell size across the distinct adult Drosophila cell types. Given that most adult fly tissues are post-mitotic, it is more likely that ribosomal gene expression reflects protein synthesis activity rather than cell size, particularly in secretory cell types.

      Response: Reviewer 1 raised a similar point, and we agree. We recognize that the term "proxy" may have been misleading. We use this measure only in the context of sex bias within homogeneous cell clusters, and not between clusters, even when such clusters share the same cell-type annotation. To avoid overinterpretation, we changed "proxy" to "indicator".

      In response to the reviewer's concern, we have expanded our discussion of the relevant supporting literature (additional text starting at line 75). We have also directly measured translation in the fat body cells (section: Female bias in ribosomal gene expression in fat body cells leads to sex-biased protein synthesis), which clearly demonstrates a correlation between ribosomal protein gene expression and biosynthetic activity. Although we have not traced the chain of events from expression to ribosome assembly to polysome loading in all cell types, we did examine the nucleolus (Figure 4), where ribosomes are assembled, and we emphasize that the volume of the nucleolus scales with ribosomal protein gene expression. This provides strong evidence for sex-specific ribosome biogenesis contributing to cell size.

      Furthermore, the observation that ribosomal gene expression likely reflects protein synthesis activity is not at odds with increased cell size: biosynthesis increases in larger cells (Schmoller and Skotheim 2015). We have added a panel to Figure 4 showing the relationship between ribosomal gene expression bias and the average expression bias of Eukaryotic Elongation Factor 1 (eEF1).

      -----------------------------------------------------------

      3. Relationship between Myc and sex-biased Rp expression The proposed link between Myc and sex-biased Rp expression is unclear. Panels D and E of Figure 1 show no consistent relationship: some cell types with strong Rp sex bias exhibit either high or low female Myc bias, or even a male bias. The linear regression in Figure 4I (R = 0.07, p = 0.59) confirms the lack of correlation. The authors should clarify this point and adopt a more cautious interpretation regarding Myc as a potential regulator of sex-biased Rp expression and cell size differences. Experimentally, using Myc hypomorph or heterozygous conditions would be more appropriate than complete knockdown to test its role.

      Response: Thank you for noting that the relationship between Myc expression bias and sex-biased RP expression required clarification. This response was prepared in consultation with Myc expert Dr. David Levens.

      We demonstrate that both Myc and RP gene expression exhibit an overall female bias in the body. The absence of a strong correlation across cell clusters does not invalidate this conclusion. Myc is a well-established master regulator of ribosome biogenesis, but its quantitative effects are complex. According to recent models of Myc-mediated gene regulation (Nie et al. 2012; Lin et al. 2012), Myc upregulates all actively transcribed genes. Because this regulation is global, the relationship between changes in Myc expression and corresponding changes in ribosomal protein gene expression depends on cell type. Moreover, Lorenzin et al. (2016) demonstrated that ribosomal protein genes saturate at relatively low levels of Myc, which helps explain why we observe a correlation in head cell clusters, where Myc expression is lower, but not in body clusters.

      Importantly, on average, the female-specific Myc expression bias is stronger in body cell clusters than in head cell clusters, consistent with the stronger female bias in ribosomal protein gene expression observed in the head relative to the body.

      To make this relationship more transparent, we combined the head and body clusters, which yielded a strong overall correlation (Fig. 4J, replacing the previous Fig. 4H).

      To further strengthen the evidence linking ribosomal gene expression to cell size, we also examined the relationship between ribosomal gene expression bias and Elongation Factor 1 (eEF1) expression bias, a key component of protein biosynthesis during the elongation step of translation. The resulting correlation exceeds 0.9 (new Fig. 4H, added as an additional panel in Fig. 4).
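      The kind of comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the response does not specify how expression bias is computed, so a log2 female-to-male ratio of mean expression per cluster is assumed here, and the cluster names and values are hypothetical toy numbers, not data from the study.

```python
import math


def log2_sex_bias(female_expr, male_expr, pseudocount=1e-9):
    """Assumed bias metric: log2(female / male). Positive means female-biased."""
    return math.log2((female_expr + pseudocount) / (male_expr + pseudocount))


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-cluster mean expression as (female, male) pairs.
# These are toy values for illustration only, not measurements from the study.
toy_clusters = {
    "cluster_a": {"rp": (12.0, 10.0), "eef1": (6.5, 5.0)},
    "cluster_b": {"rp": (8.0, 8.0), "eef1": (4.0, 4.0)},
    "cluster_c": {"rp": (9.0, 12.0), "eef1": (3.0, 4.0)},
}

# One bias value per cluster for each gene set, then correlate across clusters.
rp_bias = [log2_sex_bias(*c["rp"]) for c in toy_clusters.values()]
eef1_bias = [log2_sex_bias(*c["eef1"]) for c in toy_clusters.values()]
r = pearson_r(rp_bias, eef1_bias)
print(f"Pearson r across toy clusters: {r:.3f}")
```

      With real data, each cluster would contribute one point, so the strength of the correlation depends on how many clusters are included and how the bias is defined.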

      -----------------------------------------------------------

      4. Conclusions about fat body cell number I have concerns about drawing conclusions on sex differences in fat body cell number from single-nucleus transcriptomic data for two reasons:

      1- Drosophila fat body tissue is heterogeneous, comprising distinct subpopulations (e.g., visceral fat cells, subcuticular fat cells), some of which are sex-specific, such as fat cells associated with the spermathecae in females.

      Response: Thank you for giving us the opportunity to clarify our analysis of the FCA data. Our approach does account for subpopulations within the fat body as well as within other cell types. Based on gene expression profiles, we identify three fat body clusters, all of which are reported in Table S3. One small female-specific cluster (

      When all fat body clusters are combined into a single supercluster, this supercluster still shows a male bias. We have now clarified this point in the manuscript (line 113). Note that both subclusters of fat body are already shown in Fig. 1C and 1D.

      -----------------------------------------------------------

      2- Adult fat body cells can be multinucleated (PMID: 13723227). Apparent sex differences in nucleus number may reflect differences in specific subpopulations or degrees of multinucleation rather than true differences in cell number. To strengthen the conclusions, the analysis should be performed at the level of fat body subpopulations, distinguishing clusters where possible. Additionally, quantifying nuclei relative to actual cell number-as done for muscle tissue-would clarify whether observed sex differences reflect true variation in cell number or differences in nuclear content per cell.

      Response: Yes, some cells can be multinucleate. We specifically address this in the context of muscle cells, where multinucleation is prominent, and we also conducted experimental validation in this tissue. As noted above, our analysis is performed at the subpopulation level, since clusters are defined by expression similarity (Leiden resolution 4.0) rather than by annotation.

      Because our work relies on single-nucleus data, each nucleus is treated as an individual unit of analysis. Nevertheless, we observe genuine nuclear differences within each cluster. Importantly, the presence of multinucleated cells does not alter our conclusions; it simply represents one form of variation in cell number that can be thought of as a subcomponent of cell/nuclei number.
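      The relationship between nuclei counts and cell counts described above can be written out explicitly. The following is only a sketch: the symbols are introduced here for illustration and do not appear in the manuscript. Let $N_{\mathrm{nuc}}$ be the number of nuclei in a cluster, $N_{\mathrm{cell}}$ the number of cells, and $\bar{n}$ the mean number of nuclei per cell, for each sex $s \in \{F, M\}$:

```latex
\begin{aligned}
N_{\mathrm{nuc}}^{(s)} &= N_{\mathrm{cell}}^{(s)} \, \bar{n}^{(s)}, \\
\log_2 \frac{N_{\mathrm{nuc}}^{(F)}}{N_{\mathrm{nuc}}^{(M)}}
  &= \log_2 \frac{N_{\mathrm{cell}}^{(F)}}{N_{\mathrm{cell}}^{(M)}}
   + \log_2 \frac{\bar{n}^{(F)}}{\bar{n}^{(M)}}.
\end{aligned}
```

      Under this decomposition, a sex bias in nuclei number reflects a bias in cell number, a bias in nuclei per cell, or both. Single-nucleus counts estimate only the left-hand side.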

      -----------------------------------------------------------

      Minor corrections/points: 1-The term body size in the title does not accurately reflect the content of the paper. I recommend replacing it with organ size to better align with the study's focus.

      Response: Thank you for the suggestion.

      -----------------------------------------------------------

      2-The term sexual size dimorphism is somewhat inaccurate in this context. Sex differences in size would be more appropriate. The term sexual dimorphism typically refers to traits that exhibit two distinct forms in males and females-such as primary or secondary sexual characteristics like sex organs or sex combs. In contrast, size is a quantitative trait that follows a normal distribution. Although the average female may be larger than the average male, the distributions overlap, making the term dimorphism imprecise.

      Response: Thank you for the suggestion.

      -----------------------------------------------------------

      3-In Figure 2E, there appears to be an inconsistency between the text, figure legend, and the data presented. The text and legend state that the total volume of dorsal longitudinal flight muscle cells was quantified, whereas the graph indicates measurements of nuclear size. This discrepancy should be clarified.

      Response: Thank you for pointing this out. The Y-axis label in the graph was incorrect and has now been corrected.

      -----------------------------------------------------------

      4-The authors proposed: "This increased biosynthetic activity in fat body cells may contribute to cell size differences, but also to the regulation of body size via production of factors that mediate body growth via interorgan communication". Please note that this hypothesis has already been tested functionally in PMID: 39138201 and was shown to be incorrect. Sex differences in body size are completely independent of fat body sexual identity or any intrinsic sex differences within fat cells.

      Response: We thank the reviewer for the opportunity to discuss why the data shown in PMID 39138201 (Hérault et al. 2024) do not rule out a model in which the fat body contributes to the sex-specific regulation of body size via interorgan communication. The main reason the data in Hérault et al. cannot rule out such a model is that they use wing size as a proxy for body size. This is in contrast to prior studies, such as Rideout et al. (2015), in which pupal volume was used to directly measure body size and show a non-autonomous effect of the sex determination gene transformer on body size. Measuring body size directly is a more precise readout of growth during the larval stages of development, as opposed to using adult wing area, which reflects the growth of a single organ. It is also important to note that the diets used to rear flies in Hérault et al. and Rideout et al. differ, which is an important consideration as females do not achieve their maximal size without high dietary protein levels (Millington et al. 2021). To ensure all these points are communicated to readers, we added text to this effect in the revised version of our manuscript.

      Added at line 254:

      "This increased biosynthetic activity in fat body cells may contribute to cell size differences, but also to the regulation of body size via production of factors that mediate body growth via interorgan communication (Colombani et al. 2003; Géminard et al. 2009; Rajan and Perrimon 2012; Sano et al. 2015; Koyama and Mirth 2016). Indeed, one study showed the sexual identity of the fat body influenced pupal volume, which is an accurate readout of larval growth (Rideout et al. 2015; Delanoue et al. 2010). While a recent study suggests that male-female differences in body size were regulated independently of fat body sexual identity (Hérault et al. 2024), this study measured the growth of a single organ, the wing, as a proxy for body size. Additional studies are therefore needed to resolve whether fat body protein synthesis plays an important role in regulating sex differences in body size."

      -----------------------------------------------------------

      *5-The authors state: "This demonstrate that Myc plays a key role in regulating the sex difference in nucleolar size." This is an overstatement given the functional data presented. The claim should be toned down to reflect the limited evidence.

      **Referee cross-commenting**

      I completely agree with the main comments of Reviewer 1, as they address the paper's core.*

      Response: We have addressed the comments of Reviewer 1 in the response to reviewer's comments above.

      -----------------------------------------------------------

      *Reviewer #3 (Significance (Required)):

      The main novelty and strongest aspect of this study is its use of single-nucleus transcriptomic data from the adult Drosophila Cell Atlas to investigate how different cell types adopt distinct strategies to generate sex differences in organ size-either by increasing cell size or by altering cell number. Previous studies have largely focused on specific tissues, whereas this work provides a comprehensive, organism-wide view that encompasses all tissues, enabling direct cross-comparison between organs. This represents a clear advance in the field, primarily from a technical perspective, by leveraging organism-wide single-cell transcriptomics. The main limitations lie in the lack of functional experiments and mechanistic insights. Moreover, the proposed mechanism-differences in Myc gene dosage or expression levels-is not entirely novel, as Myc dosage has previously been implicated in contributing to sex differences in body size (PMID: 28064166).*

      Response: We do include some functional testing in three tissues (flight muscle, heart, and fat body); however, providing mechanistic insights is beyond the scope of this paper. The paper cited by the reviewer is one attempt to provide such a mechanism, and probably not the only one. We hope that the rich data we have assembled in this paper provide a resource for generating hypotheses and stimulate further research.

      -----------------------------------------------------------

      References

      Cadart, Clotilde, and Rebecca Heald. 2022. "Scaling of Biosynthesis and Metabolism with Cell Size." Molecular Biology of the Cell 33 (9): pe5. https://doi.org/10.1091/mbc.E21-12-0627.

      Diegmiller, Rocky, Caroline A. Doherty, Tomer Stern, Jasmin Imran Alsous, and Stanislav Y. Shvartsman. 2021. "Size Scaling in Collective Cell Growth." Development (Cambridge, England) 148 (18): dev199663. https://doi.org/10.1242/dev.199663.

      Gallant, Peter. 2013. "Myc Function in Drosophila." Cold Spring Harbor Perspectives in Medicine 3 (10): a014324. https://doi.org/10.1101/cshperspect.a014324.

      Grewal, Savraj S., Ling Li, Amir Orian, Robert N. Eisenman, and Bruce A. Edgar. 2005. "Myc-Dependent Regulation of Ribosomal RNA Synthesis during Drosophila Development." Nature Cell Biology 7 (3): 295-302. https://doi.org/10.1038/ncb1223.

      Hérault, Chloé, Thomas Pihl, and Bruno Hudry. 2024. "Cellular Sex throughout the Organism Underlies Somatic Sexual Differentiation." Nature Communications 15 (1): 6925. https://doi.org/10.1038/s41467-024-51228-6.

      Lin, Charles Y., Jakob Lovén, Peter B. Rahl, et al. 2012. "Transcriptional Amplification in Tumor Cells with Elevated C-Myc." Cell 151 (1): 56-67. https://doi.org/10.1016/j.cell.2012.08.026.

      Lorenzin, Francesca, Uwe Benary, Apoorva Baluapuri, et al. 2016. "Different Promoter Affinities Account for Specificity in MYC-Dependent Gene Regulation." eLife 5 (July): e15161. https://doi.org/10.7554/eLife.15161.

      Ma, Tian-Hsiang, Po-Hsiang Chen, Bertrand Chin-Ming Tan, and Szecheng J. Lo. 2018. "Size Scaling of Nucleolus in Caenorhabditis Elegans Embryos." Biomedical Journal 41 (5): 333-36. https://doi.org/10.1016/j.bj.2018.07.003.

      Marygold, Steven J., John Roote, Gunter Reuter, et al. 2007. "The Ribosomal Protein Genes and Minute Loci of Drosophila Melanogaster." Genome Biology 8 (10): R216. https://doi.org/10.1186/gb-2007-8-10-r216.

      Millington, Jason W., George P. Brownrigg, Charlotte Chao, et al. 2021. "Female-Biased Upregulation of Insulin Pathway Activity Mediates the Sex Difference in Drosophila Body Size Plasticity." eLife 10 (January): e58341. https://doi.org/10.7554/eLife.58341.

      Nie, Zuqin, Gangqing Hu, Gang Wei, et al. 2012. "C-Myc Is a Universal Amplifier of Expressed Genes in Lymphocytes and Embryonic Stem Cells." Cell 151 (1): 68-79. https://doi.org/10.1016/j.cell.2012.08.033.

      Ponti, Donatella. 2025. "The Nucleolus: A Central Hub for Ribosome Biogenesis and Cellular Regulatory Signals." International Journal of Molecular Sciences 26 (9): 4174. https://doi.org/10.3390/ijms26094174.

      Rideout, Elizabeth J., Marcus S. Narsaiya, and Savraj S. Grewal. 2015. "The Sex Determination Gene Transformer Regulates Male-Female Differences in Drosophila Body Size." PLOS Genetics 11 (12): e1005683. https://doi.org/10.1371/journal.pgen.1005683.

      Schmoller, Kurt M., and Jan M. Skotheim. 2015. "The Biosynthetic Basis of Cell Size Control." Trends in Cell Biology 25 (12): 793-802. https://doi.org/10.1016/j.tcb.2015.10.006.

      Schultz, J. 1929. "The Minute Reaction in the Development of DROSOPHILA MELANOGASTER." Genetics 14 (4): 366-419. https://doi.org/10.1093/genetics/14.4.366.

      Serbanescu, Diana, Nikola Ojkic, and Shiladitya Banerjee. 2022. "Cellular Resource Allocation Strategies for Cell Size and Shape Control in Bacteria." The FEBS Journal 289 (24): 7891-906. https://doi.org/10.1111/febs.16234.

    1. Author response:

      The following is the authors’ response to the previous reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      This research group has consistently performed cutting-edge research aiming to understand the role of hormones in the control of social behaviors, specifically by utilizing the genetically-tractable teleost fish, medaka, and the current work is no exception. The overall claim they make, that estrogens modulate social behaviors in males and females is supported, with important caveats. For one, there is no evidence these estrogens are generated by "neurons" as would be assumed by their main claim that it is NEUROestrogens that drive this effect. While indeed the aromatase they have investigated is expressed solely in the brain, in most teleosts, brain aromatase is only present in glial cells (astrocytes, radial glia). The authors should change this description so as not to mislead the reader. Below I detail more specific strengths and weaknesses of this manuscript.

      We thank the reviewer for this positive evaluation of our work and for the helpful comments and suggestions. Regarding the concern that the term “neuroestrogens” may be misleading, we addressed this in the previous revision by consistently replacing it throughout the manuscript with “brain-derived estrogens” or “brain estrogens.”

      In addition, the following sentence was added to the Introduction (line 61): “In teleost brains, including those of medaka, aromatase is exclusively localized in radial glial cells, in contrast to its neuronal localization in rodent brains (Forlano et al., 2001; Diotel et al., 2010; Takeuchi and Okubo, 2013).”

      Strengths:

      Excellent use of the medaka model to disentangle the control of social behavior by sex steroid hormones 

      The findings are strong for the most part because deficits in the mutants are restored by the molecule (estrogens) that was no longer present due to the mutation 

      Presentation of the approach and findings are clear, allowing the reader to make their own inferences and compare them with the authors' 

      Includes multiple follow-up experiments, which leads to tests of internal replication and an impactful mechanistic proposal 

      Findings are provocative not just for teleost researchers, but for other species since, as the authors point out, the data suggest mechanisms of estrogenic control of social behaviors may be evolutionarily ancient 

      We thank the reviewer again for their positive evaluation of our work.

      Weaknesses:

      As stated in the summary, the authors are attributing the estrogen source to neurons and there isn't evidence this is the case. The impact of the findings doesn't rest on this either

      As mentioned above, we addressed this in the previous revision by replacing “neuroestrogens” with “brain-derived estrogens” or “brain estrogens” throughout the manuscript. In addition, the following sentence was added to the Introduction (line 61): “In teleost brains, including those of medaka, aromatase is exclusively localized in radial glial cells, in contrast to its neuronal localization in rodent brains (Forlano et al., 2001; Diotel et al., 2010; Takeuchi and Okubo, 2013).”

      The d4 versus d8 esr2a mutants showed different results for aggression. The meaning and implications of this finding are not discussed, leaving the reader wondering

      This comment is the same as one raised in the first review (Reviewer #1’s comment 2 on weaknesses), which we already addressed in our initial revision. For the reviewer’s convenience, we provide the response below:

      Line 300: As the reviewer correctly noted, circles were significantly reduced in mutant males of the Δ8 line, whereas no significant reduction was observed in those of the Δ4 line. However, a tendency toward reduction was evident in the Δ4 line (P = 0.1512), and both lines showed significant differences in fin displays. Based on these findings, we believe our conclusion that esr2a<sup>−/−</sup> males exhibit reduced aggression remains valid. To clarify this point and address potential reader concerns, we have revised the text as follows: “esr2a<sup>−/−</sup> males exhibited significantly fewer fin displays (P = 0.0461 and 0.0293 for Δ8 and Δ4 lines, respectively) and circles (P = 0.0446 and 0.1512 for Δ8 and Δ4 lines, respectively) than their wild-type siblings (Fig. 5L; Fig. S8E), suggesting less aggression” was edited to read “esr2a<sup>−/−</sup> males from both the Δ8 and Δ4 lines exhibited significantly fewer fin displays than their wild-type siblings (P = 0.0461 and 0.0293, respectively). Circles followed a similar pattern, with a significant reduction in the Δ8 line (P = 0.0446) and a comparable but non-significant decrease in the Δ4 line (P = 0.1512) (Figure 5L, Figure 5—figure supplement 3E), showing less aggression.”

      Lack of attribution of previous published work from other research groups that would provide the proper context of the present study

      This comment is also the same as one raised in the first review (Reviewer #1’s comment 3 on weaknesses). In our previous revision, in response to this comment, we cited the relevant references (Hallgren et al., 2006; O’Connell and Hofmann, 2012; Huffman et al., 2013; Jalabert et al., 2015; Yong et al., 2017; Alward et al., 2020; Ogino et al., 2023) in the appropriate sections. We also added the following new references and revised the Introduction and Discussion accordingly:

      (2) Alward BA, Laud VA, Skalnik CJ, York RA, Juntti SA, Fernald RD. 2020. Modular genetic control of social status in a cichlid fish. Proceedings of the National Academy of Sciences of the United States of America 117:28167–28174. DOI: https://doi.org/10.1073/pnas.2008925117

      (39) O’Connell LA, Hofmann HA. 2012. Social status predicts how sex steroid receptors regulate complex behavior across levels of biological organization. Endocrinology 153:1341–1351. DOI: https://doi.org/10.1210/en.2011-1663

      (54) Yong L, Thet Z, Zhu Y. 2017. Genetic editing of the androgen receptor contributes to impaired male courtship behavior in zebrafish. Journal of Experimental Biology 220:3017–3021. DOI: https://doi.org/10.1242/jeb.161596

      There are a surprising number of citations not included; some of the ones not included argue against the authors' claims that their findings were "contrary to expectation"

      In our previous revision, we cited the relevant references (Hallgren et al., 2006; O’Connell and Hofmann, 2012; Huffman et al., 2013; Jalabert et al., 2015) in the Introduction. We also revised the text to remove phrases such as “contrary to expectation” and “unexpected.”

      The experimental design for studying aggression in males has flaws. A standard test like a resident-intruder test should be used.

      Following this comment, we have attempted additional aggression assays using the resident-intruder paradigm. However, these experiments did not produce consistent or interpretable results. As noted in our previous revision, medaka naturally form shoals and exhibit weak territoriality, and even slight differences in dominance between a resident and an intruder can markedly increase variability, reducing data reliability. Therefore, we believe that the approach used in the present study provides a more suitable assessment of aggression in medaka, regardless of territorial tendencies. We will continue to explore potential refinements in future studies and respectfully ask the reviewer to evaluate the present work based on the assay used here.

      While they investigate males and females, there are fewer experiments and explanations for the female results, making it feel like a small addition or an aside

      While we did not adopt this comment in our previous revision, we have carefully reconsidered the reviewers’ feedback and have now decided to remove the female data. This change allows us to present a more focused and cohesive story centered on males. The specific revisions are outlined below:

      Abstract

      Line 25: The text “, thereby revealing a previously unappreciated mode of action of brain-derived estrogens. We additionally show that female fish lacking Cyp19a1b are less receptive to male courtship and conversely court other females, highlighting the significance of brain-derived estrogens in establishing sex-typical behaviors in both sexes.” has been revised to “. Taken together, these findings reveal a previously unappreciated mode of action of brain-derived estrogens in shaping male-typical behaviors.”

      Results

      Line 88: The text “Loss of cyp19a1b function in these fish was verified by measuring brain and peripheral levels of sex steroids. As expected, brain estradiol-17β (E2) in both male and female homozygous mutants (cyp19a1b<sup>−/−</sup>) was significantly reduced to 16% and 50%, respectively, of the levels in their wild-type (cyp19a1b<sup>+/+</sup>) siblings (P = 0.0037, males; P = 0.0092, females) (Fig. 1, A and B). In males, brain E2 in heterozygotes (cyp19a1b<sup>−/−</sup>) was also reduced to 45% of the level in wild-type siblings (P = 0.0284) (Fig. 1A), indicating a dosage effect of cyp19a1b mutation. In contrast, peripheral E2 levels were unaltered in both cyp19a1b<sup>−/−</sup> males and females (Fig. S1, C and D), consistent with the expected functioning of Cyp19a1b primarily in the brain. Strikingly, brain levels of testosterone, as opposed to E2, increased 2.2-fold in cyp19a1b<sup>−/−</sup> males relative to wild-type siblings (P = 0.0006) (Fig. 1A). Similarly, brain 11KT levels in cyp19a1b<sup>−/−</sup> males and females increased 6.2- and 1.9-fold, respectively, versus wild-type siblings (P = 0.0007, males; P = 0.0316, females) (Fig. 1, A and B). These results show that cyp19a1b-deficient fish have reduced estrogen levels coupled with increased androgen levels in the brain, confirming the loss of cyp19a1b function. They also suggest that the majority of estrogens in the male brain and half of those in the female brain are synthesized locally in the brain. In addition, peripheral 11KT levels in cyp19a1b<sup>−/−</sup> males and females increased 3.7- and 1.8-fold, respectively (P = 0.0789, males; P = 0.0118, females) (Fig. S1, C and D), indicating peripheral influence in addition to central effects.” has been revised to “Loss of cyp19a1b function in these fish was verified by measuring brain and peripheral levels of sex steroids in males. 
As expected, brain estradiol-17β (E2) in homozygous mutants (cyp19a1b<sup>−/−</sup>) was significantly reduced to 16% of the levels in wild-type (cyp19a1b<sup>+/+</sup>) siblings (P = 0.0037) (Figure 1A). Brain E2 in heterozygotes (cyp19a1b<sup>+/−</sup>) was also reduced to 45% of wild-type levels (P = 0.0284) (Figure 1A), indicating a dosage effect of the cyp19a1b mutation. In contrast, peripheral E2 levels were unaltered in cyp19a1b<sup>−/−</sup> males (Figure 1B), consistent with the expected functioning of Cyp19a1b primarily in the brain. Strikingly, brain testosterone levels, as opposed to E2, increased 2.2-fold in cyp19a1b<sup>−/−</sup> males relative to wild-type siblings (P = 0.0006) (Figure 1A). Similarly, brain 11KT levels increased 6.2-fold (P = 0.0007) (Figure 1A). These results indicate that cyp19a1b-deficient males have reduced estrogen coupled with elevated androgen levels in the brain, confirming the loss of cyp19a1b function. They also suggest that the majority of estrogens in the male brain are synthesized locally in the brain. Peripheral 11KT levels also increased 3.7-fold in cyp19a1b<sup>−/−</sup> males (P = 0.0789) (Figure 1B), indicating peripheral influence in addition to central effects.”

      Line 211: “expression of vt in the pNVT of cyp19a1b<sup>−/−</sup> males was significantly reduced to 18% as compared with cyp19a1b<sup>+/+</sup> males (P = 0.0040), a level comparable to that observed in females” has been revised to “expression of vt in the pNVT of cyp19a1b<sup>−/−</sup> males was significantly reduced to 18% as compared with cyp19a1b<sup>+/+</sup> males (P = 0.0040).”

      The subsection entitled “cyp19a1b-deficient females are less receptive to males and instead court other females,” which followed line 311, has been removed.

      Discussion

      The two paragraphs between lines 373 and 374, which addressed the female data, have been removed.

      Materials and methods

      Line 433: “males and females” has been changed to “males”.

      Line 457: “focal fish” has been changed to “focal male”.

      Line 458: “stimulus fish” has been changed to “stimulus female”.

      Line 458: “Fig. 6, E and F, ” has been deleted.

      Line 460: “; wild-type males in Fig. 6, A to C” has been deleted.

      Line 466: The text “The period of interaction/recording was extended to 2 hours in tests of courtship displays received from the stimulus esr2b-deficient female and in tests of mating behavior between females, because they take longer to initiate courtship (12). In tests using an esr2b-deficient female as the stimulus fish, where the latency to spawn could not be calculated because these fish were unreceptive to males and did not spawn, the sexual motivation of the focal fish was instead assessed by counting the number of courtship displays and wrapping attempts in 30 min. The number of these mating acts was also counted in tests to evaluate the receptivity of females. In tests of mating behavior between two females, the stimulus female was marked with a small notch in the caudal fin to distinguish it from the focal female.” has been revised to “In tests using an esr2b-deficient female as the stimulus fish, the latency to spawn could not be calculated because the female was unreceptive to males and did not spawn. Therefore, the sexual motivation of the focal male was assessed by counting the number of courtship displays and wrapping attempts in 30 min. To evaluate courtship displays performed by stimulus esr2b-deficient females toward focal males, the recording period was extended to 2 hours, as these females take longer to initiate courtship (Nishiike et al., 2021). In all video analyses, the researcher was blind to the fish genotype and treatment.”

      Line 499: “brains dissected from males and females of the cyp19a1b-deficient line (analysis of ara, arb, vt, gal, npba, and esr2b) and males of the esr1-, esr2a-, and esr2b-deficient lines” has been revised to “male brains from the cyp19a1b-deficient line (analysis of ara, arb, vt, and gal) and from the esr1-, esr2a-, and esr2b-deficient lines.”

      Line 504: “After color development for 15 min (gal), 40 min (npba), 2 hours (vt), or overnight (ara, arb, and esr2b)” has been revised to “After color development for 15 min (gal), 2 hours (vt), or overnight (ara and arb).”

      Line 516: “Thermo Fisher Scientific, Waltham, MA” has been changed to “Thermo Fisher Scientific” to avoid redundancy.

      Line 565: The subsection entitled “Measurement of spatial distances between fish” has been removed.

      Line 585: “6/10 cyp19a1b<sup>+/+</sup>, 3/10 cyp19a1b<sup>+/−</sup>, and 6/10 cyp19a1b<sup>−/−</sup> females were excluded in Fig. 6B;” has been deleted.

      References

      The following references have been removed:

      Capel B. 2017. Vertebrate sex determination: evolutionary plasticity of a fundamental switch. Nature Reviews Genetics 18:675–689. DOI: https://doi.org/10.1038/nrg.2017.60

      Hiraki T, Nakasone K, Hosono K, Kawabata Y, Nagahama Y, Okubo K. 2014. Neuropeptide B is female-specifically expressed in the telencephalic and preoptic nuclei of the medaka brain. Endocrinology 155:1021–1032. DOI: https://doi.org/10.1210/en.2013-1806

      Juntti SA, Hilliard AT, Kent KR, Kumar A, Nguyen A, Jimenez MA, Loveland JL, Mourrain P, Fernald RD. 2016. A neural basis for control of cichlid female reproductive behavior by prostaglandin F2α. Current Biology 26:943–949. DOI: https://doi.org/10.1016/j.cub.2016.01.067

      Kimchi T, Xu J, Dulac C. 2007. A functional circuit underlying male sexual behaviour in the female mouse brain. Nature 448:1009–1014. DOI: https://doi.org/10.1038/nature06089

      Kobayashi M, Stacey N. 1993. Prostaglandin-induced female spawning behavior in goldfish (Carassius auratus) appears independent of ovarian influence. Hormones and Behavior 27:38–55. DOI: https://doi.org/10.1006/hbeh.1993.1004

      Liu H, Todd EV, Lokman PM, Lamm MS, Godwin JR, Gemmell NJ. 2017. Sexual plasticity: a fishy tale. Molecular Reproduction and Development 84:171–194. DOI: https://doi.org/10.1002/mrd.22691

      Munakata A, Kobayashi M. 2010. Endocrine control of sexual behavior in teleost fish. General and Comparative Endocrinology 165:456–468. DOI: https://doi.org/10.1016/j.ygcen.2009.04.011

      Nugent BM, Wright CL, Shetty AC, Hodes GE, Lenz KM, Mahurkar A, Russo SJ, Devine SE, McCarthy MM. 2015. Brain feminization requires active repression of masculinization via DNA methylation. Nature Neuroscience 18:690–697. DOI: https://doi.org/10.1038/nn.3988

      Shaw K, Therrien M, Lu C, Liu X, Trudeau VL. 2023. Mutation of brain aromatase disrupts spawning behavior and reproductive health in female zebrafish. Frontiers in Endocrinology 14:1225199. DOI: https://doi.org/10.3389/fendo.2023.1225199

      Stacey NE. 1976. Effects of indomethacin and prostaglandins on the spawning behaviour of female goldfish. Prostaglandins 12:113–126. DOI: https://doi.org/10.1016/s0090-6980(76)80010-x

      Figure 1

      Panel B, which originally showed steroid levels in female brains, has been replaced with steroid levels in the periphery of males, originally presented in Figure S1, panel C. Accordingly, the legend “(A and B) Levels of E2, testosterone, and 11KT in the brain of adult cyp19a1b<sup>+/+</sup>, cyp19a1b<sup>+/−</sup>, and cyp19a1b<sup>−/−</sup> males (A) and females (B) (n = 3 per genotype and sex).” has been revised to “(A, B) Levels of E2, testosterone, and 11KT in the brain (A) and periphery (B) of adult cyp19a1b<sup>+/+</sup>, cyp19a1b<sup>+/−</sup>, and cyp19a1b<sup>−/−</sup> males (n = 3 per genotype).”

      Figure 3

      The female data have been deleted from Figure 3. The revised Figure 3 is presented.

      The corresponding legend text has been revised as follows:

      Line 862: “males and females (n = 4 and 5 per genotype for males and females, respectively)” has been changed to “males (n = 4 per genotype)”.

      Line 864: “males and females (n = 4 except for cyp19a1b<sup>+/+</sup> males, where n = 3)” has been changed to “males (n = 3 and 4, respectively)”.

      Figure 6

      Figure 6 and its legend have been removed.

      Figure 1—figure supplement 1

      Panel C, showing male data, has been moved to Figure 1B, as described above, while panel D, showing female data, has been deleted. The corresponding legend “(C and D) Levels of E2, testosterone, and 11KT in the periphery of adult cyp19a1b<sup>+/+</sup>, cyp19a1b<sup>+/−</sup>, and cyp19a1b<sup>−/−</sup> males (C) and females (D) (n = 3 per genotype and sex). Statistical differences were assessed by Bonferroni’s post hoc test (C and D). Error bars represent SEM. *P < 0.05.” has also been removed.

      Line 804: Following this change, the figure title has been updated from “Generation of cyp19a1b-deficient medaka and evaluation of peripheral sex steroid levels” to “Generation of cyp19a1b-deficient medaka.”

      The statistics comparing "experimental to experimental" and "control to experimental" isn't appropriate 

      This comment is the same as one raised in the first review (Reviewer #1’s comment 7 on weaknesses), which we already addressed in our initial revision. For the reviewer’s convenience, we provide the response below:

      The reviewer raised concerns about the statistical analysis used for Figures 4C and 4E, suggesting that Bonferroni’s test should be used instead of Dunnett’s test. However, Dunnett’s test is commonly used to compare treatment groups to a reference group that receives no treatment, as in our study. Since we do not compare the treated groups with each other, we believe Dunnett’s test is the most appropriate choice.

      Line 576: The reviewer’s concern may have arisen from the phrase “comparisons between control and experimental groups” in the Materials and methods. We have revised it to “comparisons between untreated and E2-treated groups in Figure 4C and D” for clarity.

      Reviewer #3 (Public Review):

      Summary:

      Taking advantage of the existence in fish of two genes coding for estrogen synthase, the enzyme aromatase, one mostly expressed in the brain (Cyp19a1b) and the other mostly found in the gonads (Cyp19a1a), this study investigates the role of brain-derived estrogens in the control of sexual and aggressive behavior in medaka. The constitutive deletion of Cyp19a1b markedly reduced brain estrogen content in males and to a lesser extent in females. These effects are accompanied by reduced sexual and aggressive behavior in males and reduced preference for males in females. These effects are reversed by adult treatment with E2, supporting a role for estrogens. The deletion of Cyp19a1b is associated with a reduced expression of the genes coding for the two androgen receptors, ara and arb, in brain regions involved in the regulation of social behavior. The analysis of the gene expression and behavior of mutants of estrogen receptors indicates that these effects are likely mediated by the activation of the esr1 and esr2a isoforms. These results provide valuable insight into the role of estrogens in social behavior in the most abundant vertebrate taxon; however, the conclusion of brain-derived estrogens awaits definitive confirmation.

      We thank this reviewer for their positive evaluation of our work and comments that have improved the manuscript.

      Strength:

      Evaluation of the role of brain "specific" Cyp19a1 in male teleost fish, which as a taxon are more abundant and yet proportionally less studied than the most common birds and rodents. Therefore, evaluating the generalizability of results from higher vertebrates is important. This approach also offers great potential to study the role of brain estrogen production in females, an understudied question in all taxa.

      Results obtained from multiple mutant lines converge to show that estrogen signaling, likely synthesized in the brain, drives aspects of male sexual behavior.

      The comparative discussion of the age-dependent abundance of brain aromatase in fish vs mammals and its role in organization vs activation is important beyond the study of the targeted species.

      The authors have made important corrections to tone down some of the conclusions which are more in line with the results.

      We thank the reviewer again for their positive evaluation of our work and the revisions we have made.

      Weaknesses:

      No evaluation of the mRNA and protein products of Cyp19a1b and ESR2a is presented, such that there is no proper demonstration that the mutation indeed leads to aromatase reduction. The conclusion that these effects depend on brain-derived estrogens is therefore only supported by measures of E2 with an EIA kit that is not validated. No discussion of these shortcomings is provided in the discussion, thus further weakening the conclusions of the manuscript.

      In response to this and other comments, we have now provided direct validation that the cyp19a1b mutation in our medaka leads to loss of function. Real-time PCR analysis showed that cyp19a1b transcript levels in the brain were reduced by approximately half in cyp19a1b<sup>+/−</sup> males and were nearly absent in cyp19a1b<sup>−/−</sup> males, consistent with nonsense-mediated mRNA decay.

      In addition, AlphaFold 3-based structural modeling indicated that the mutant Cyp19a1b protein lacks essential motifs, including the aromatic region and heme-binding loop, and exhibits severe conformational distortion (see figure; key structural features are annotated as follows: membrane helix (blue), aromatic region (red), and heme-binding loop (orange)). 

      Results:

      Line 101: The following text has been added: “Loss of cyp19a1b function was further confirmed by measuring cyp19a1b transcript levels in the brain and by predicting the three-dimensional structure of the mutant protein. Real-time PCR revealed that transcript levels were reduced by half in cyp19a1b<sup>+/−</sup> males and were nearly undetectable in cyp19a1b<sup>−/−</sup> males, presumably as a result of nonsense-mediated mRNA decay (Lindeboom et al., 2019) (Figure 1C). The wild-type protein, modeled by AlphaFold 3, exhibited a typical cytochrome P450 fold, including the membrane helix, aromatic region, and heme-binding loop, all arranged in the expected configuration (Figure 1—figure supplement 1C). The mutant protein, in contrast, was severely truncated, retaining only the membrane helix (Figure 1—figure supplement 1C). The absence of essential domains strongly indicates that the allele encodes a nonfunctional Cyp19a1b protein. Together, transcript and structural analyses consistently demonstrate that the mutation generated in this study causes a complete loss of cyp19a1b function.”

      Materials and methods

      Line 438: A subsection entitled “Real-time PCR” has been added. The text of this subsection is as follows: “Total RNA was isolated from the brains of cyp19a1b<sup>+/+</sup>, cyp19a1b<sup>+/−</sup>, and cyp19a1b<sup>−/−</sup> males using the RNeasy Plus Universal Mini Kit (Qiagen, Hilden, Germany). cDNA was synthesized with the SuperScript VILO cDNA Synthesis Kit (Thermo Fisher Scientific, Waltham, MA). Real-time PCR was performed on the LightCycler 480 System II using the LightCycler 480 SYBR Green I Master (Roche Diagnostics). Melting curve analysis was conducted to verify that a single amplicon was obtained in each sample. The β-actin gene (actb; GenBank accession number NM_001104808) was used to normalize the levels of target transcripts. The primers used for real-time PCR are shown in Supplementary file 2.”

      Line 448: A subsection entitled “Protein structure prediction” has been added. The text of this subsection is as follows: “Structural predictions of Cyp19a1b proteins were conducted using AlphaFold 3 (Abramson et al., 2024). Amino acid sequences corresponding to the wild-type allele and the mutant allele generated in this study were submitted to the AlphaFold 3 prediction server. The resulting models were visualized with PyMOL (Schrödinger, New York, NY), and key structural features, including the membrane helix, aromatic region, and heme-binding loop, were annotated.”

      References

      The following two references have been added:

      Abramson J, Adler J, Dunger J, Evans R, Green T, Pritzel A, Ronneberger O, Willmore L, Ballard AJ, Bambrick J, Bodenstein SW, Evans DA, Hung CC, O'Neill M, Reiman D, Tunyasuvunakool K, Wu Z, Žemgulytė A, Arvaniti E, Beattie C, Bertolli O, Bridgland A, Cherepanov A, Congreve M, Cowen-Rivers AI, Cowie A, Figurnov M, Fuchs FB, Gladman H, Jain R, Khan YA, Low CMR, Perlin K, Potapenko A, Savy P, Singh S, Stecula A, Thillaisundaram A, Tong C, Yakneen S, Zhong ED, Zielinski M, Žídek A, Bapst V, Kohli P, Jaderberg M, Hassabis D, Jumper JM. 2024. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630:493–500. DOI: https://doi.org/10.1038/s41586-024-07487-w

      Lindeboom RGH, Vermeulen M, Lehner B, Supek F. 2019. The impact of nonsense-mediated mRNA decay on genetic disease, gene editing and cancer immunotherapy. Nature Genetics 51:1645–1651. DOI: https://doi.org/10.1038/s41588-019-0517-5

      Figure 1

      The real-time PCR results described above have been incorporated in Figure 1, panel C, with the corresponding legend provided below (line 788).

      (C) Brain cyp19a1b transcript levels in cyp19a1b<sup>+/+</sup>, cyp19a1b<sup>+/−</sup>, and cyp19a1b<sup>−/−</sup> males (n = 6 per genotype). Mean value for cyp19a1b<sup>+/+</sup> males was arbitrarily set to 1.

      The subsequent panels have been renumbered accordingly. The entirety of the revised Figure 1 is presented.

      Figure 1—figure supplement 1

      The AlphaFold 3-generated structural models described above have been incorporated in Figure 1—figure supplement 1, panel C, with the corresponding legend provided below (line 811).

      (C) Predicted three-dimensional structures of wild-type (left) and mutant (right) Cyp19a1b proteins. Key structural features are annotated as follows: membrane helix (blue), aromatic region (red), and heme-binding loop (orange).

      The entirety of the revised Figure 1—figure supplement 1 is presented.

      The information on the primers used for real-time PCR has been included in Supplementary file 2.

      The functional deficiency of esr2a was already addressed in the previous revision. For clarity, we have reproduced the relevant information here.

      A previous study reported that female medaka lacking esr2a fail to release eggs due to oviduct atresia (Kayo et al., 2019, Sci Rep 9:8868). Similarly, in this study, some esr2a-deficient females exhibited spawning behavior but were unable to release eggs, although the sample size was limited (Δ8 line: 2/3; Δ4 line: 1/1). In contrast, this was not observed in wild-type females (Δ8 line: 0/12; Δ4 line: 0/11). These results support the effective loss of esr2a function. To incorporate this information into the manuscript, the following text has been added to the Materials and methods (line 423): “A previous study reported that esr2a-deficient female medaka cannot release eggs due to oviduct atresia (Kayo et al., 2019). Likewise, some esr2a-deficient females generated in this study, despite the limited sample size, exhibited spawning behavior but were unable to release eggs (Δ8 line: 2/3; Δ4 line: 1/1), while such failure was not observed in wild-type females (Δ8 line: 0/12; Δ4 line: 0/11). These results support the effective loss of esr2a function.”

      Most experiments are weakly powered (low sample size).

      This comment is essentially the same as one raised in the first review (Reviewer #3’s comment 7 on weaknesses). We acknowledge the reviewer’s concern that the histological analyses were weakly powered due to the limited sample size. In our earlier revision, we responded as follows:

      Histological analyses were conducted with a relatively small sample size, as our previous experience suggested that interindividual variability in the results would not be substantial. Since significant differences were detected in many analyses, further increasing the sample size was deemed unnecessary.

      The variability of the mRNA content for a same target gene between experiments (genotype comparison vs E2 treatment comparison) raises questions about the reproducibility of the data (apparent disappearance of genotype effect).

      This comment is the same as one raised in the first review (Reviewer #3’s comment 8 on weaknesses), which we already addressed in our initial revision. For the reviewer’s convenience, we provide the response below:

      As the reviewer pointed out, the overall area of ara expression is larger in Figure 2J than in Figure 2F. However, the relative area ratios of ara expression among brain nuclei are consistent between the two figures, indicating the reproducibility of the results. Thus, this difference is unlikely to affect the conclusions of this study.

      Additionally, the differences in ara expression in pPPp and arb expression in aPPp between wild-type and cyp19a1b-deficient males appear less pronounced in Figures 2J and 2K than in Figures 2F and 2H. This is likely attributable to the smaller sample size used in the experiments for Figures 2J and 2K, resulting in less distinct differences. However, as the same genotype-dependent trends are observed in both sets of figures, the conclusion that ara and arb expression is reduced in cyp19a1b-deficient male brains remains valid.

      Conclusions:

      Overall, the claims regarding role of estrogens originating in the brain on male sexual behavior is supported by converging evidence from multiple mutant lines. The role of brain-derived estrogens on gene expression in the brain is weaker as are the results in females. 

      We appreciate the reviewer’s positive evaluation of our findings on male behavior. The concern regarding the role of brain-derived estrogens in gene expression has been addressed in our rebuttal, and the female data have been removed so that the analysis now focuses on males. The specific revisions for removing the female data are described in Response to reviewer #1’s comment 6 on weaknesses.

      Recommendations For The Authors:

      Reviewer #1 (Recommendations For The Authors):

      The manuscript is improved slightly. I am thankful the authors addressed some concerns, but for several concerns the referees raised, the authors acknowledged them yet did not make corresponding changes to the manuscript or disagreed that they were issues at all without explanation. All reviewers had issues with the imbalanced focus on males versus females and the male aggression assay. Yet, they did not perform additional experiments or even make changes to the framing and scope of the manuscript. If the authors had removed the female data, they may have had a more cohesive story, but then they would still be left with inadequate behavior assays in the males. If the authors don't have the time or resources to perform the additional work, then they should have said so. However, the work would be incomplete relative to the claims. That is a key point here. If they change their scope and claims, the authors avoid overstating their findings. I want to see this work published because I believe it moves the field forward. But the authors need to be realistic in their interpretations of their data. 

      In response to this and related comments, we have removed the female data and focused the manuscript on analyses in males. The specific revisions are described in Response to reviewer #1’s comment 6 on weaknesses. Additionally, we have validated that the cyp19a1b mutation in our medaka leads to loss of function (see Response to reviewer #3’s comment 1 on weaknesses), which further strengthens the reliability of our conclusions regarding male behavior.

      I agree with the reviewer who said we need to see validation of the absence of functional cyp19a1b in the brain. However, the results from staining for the protein and performing in situ could be quizzical. Indeed, there aren't antibodies that could distinguish between aromatase a and b, and it is not uncommon for expression of a mutated gene to be normal. One approach they could do is measure aromatase activity, but they are *sort of* doing that by measuring brain E2. It's not perfect, but we teleost folks are limited in these areas. At the very least, they should show the predicted protein structure of the mutated aromatase alleles. It could show clearly that the tertiary structure is utterly absent, giving more support to the fact that their aromatase gene is non-functional.

      As noted above, we have further validated the loss of cyp19a1b function by measuring cyp19a1b transcript levels in the brain and predicting the three-dimensional structure of the mutant protein. These analyses confirmed that cyp19a1b function is indeed lost, thereby increasing the reliability of our conclusions. For further details, please refer to Response to reviewer #3’s comment 1 on weaknesses.

      With all of this said, the work is important, and it is possible that with a reframing of the impact of their work in the context of their findings, I could consider the work complete. I think with a proper reframing, the work is still impactful. 

      In accordance with this feedback, and as described above, we have reframed the manuscript by removing the female data and focusing exclusively on males. This revision clarifies the scope of our study and reinforces the support for our conclusions. For further details, please refer to Response to reviewer #1’s comment 6 on weaknesses.

      (1) Clearly state in the Figure 1 legend that each data point for male aggressive behaviors represents the total # of behaviors calculated over the 4 males in each experimental tank.

      In response to this comment, we have revised the legend of Figure 1K (line 797). The original legend, “(K) Total number of each aggressive act observed among cyp19a1b<sup>+/+</sup>, cyp19a1b<sup>+/−</sup>, or cyp19a1<sup>−/−</sup> males in the tank (n = 6, 7, and 5, respectively),” has been updated to “(K) Total number of each aggressive act performed by cyp19a1b<sup>+/+</sup>, cyp19a1b<sup>+/−</sup>, and cyp19a1b<sup>−/−</sup> males. Each data point represents the sum of acts recorded for the 4 males of the same genotype in a single tank (n = 6, 7, and 5 tanks, respectively).” This clarifies that each data point reflects the total behaviors of the 4 males within each tank.

      (2) The authors wrote under "Response to reviewer #1's major comment "...the development of male behaviors may require moderate neuroestrogen levels that are sufficient to induce the expression of ara and arb, but not esr2b, in the underlying neural circuitry": "This may account for the lack of aggression recovery in E2-treated cyp19a1b-deficient males in this study.".

      What is meant by the latter statement? What accounts for the lack of aggression? The lack of increase in esr2b? Please clarify. 

      Line 365: In response to this comment, “This may account for the lack of aggression recovery in E2-treated cyp19a1b-deficient males in this study.” has been revised to “Considering this, the lack of aggression recovery in E2-treated cyp19a1b-deficient males in this study may be explained by the possibility that the E2 dose used was sufficient to induce not only ara and arb but also esr2b expression in aggression-relevant circuits, which potentially suppressed aggression.”

      This revision clarifies that, while moderate brain estrogen levels are sufficient to promote male behaviors via induction of ara and arb, the E2 dose used in this study may have additionally induced esr2b in circuits relevant to aggression, potentially underlying the lack of aggression recovery.

      (3) This is a continuation of my comment/concern directly above. If the induction of ara and arb aren't enough, then how can, as the authors state, androgen signaling be the primary driver of these behaviors? 

      In response to this follow-up comment, we would like to clarify that, as described above, the lack of aggression recovery in E2-treated cyp19a1b-deficient males is not due to insufficient induction of ara and arb, but instead is likely because esr2b was also induced in aggression-relevant circuits, which may have suppressed aggression. Therefore, the concern that androgen signaling cannot be the primary driver of these behaviors is not applicable.

      (4) The authors' point about sticking with the terminology for the ar genes as "ara" and "arb" is not convincing. The whole point of needing a change to match the field of neuroendocrinology as a whole (that is, across all vertebrates) is researchers, especially those with high standing like the Okubo group, adopt the new terminology. Indeed, the Okubo group is THE leader in medaka neuroendocrinology. It would go a long way if they began adopting the new terminology of "ar1" and "ar2". I understand this may be laborious to a degree, and each group can choose to use their terminology, but I'd be remiss if I didn't express my opinion that changing the terminology could help our field as a whole. 

      We sincerely appreciate the reviewer’s thoughtful comments regarding nomenclature consistency in vertebrate neuroendocrinology. We understand the motivation behind the suggestion to adopt ar1 and ar2. However, we consider the established nomenclature of ara and arb to be more appropriate for the following reasons.

      First, adopting the ar1/ar2 nomenclature would introduce a discrepancy between gene and protein symbols. According to the NCBI International Protein Nomenclature Guidelines (Section 2B. Abbreviations and symbols; https://www.ncbi.nlm.nih.gov/genbank/internatprot_nomenguide/), the ZFIN Zebrafish Nomenclature Conventions (Section 2. PROTEINS: https://zfin.atlassian.net/wiki/spaces/general/pages/1818394635/ZFIN+Zebrafish+Nomenclature+Conventions), and the author guidelines of many journals (e.g., https://academic.oup.com/molehr/pages/Gene_And_Protein_Nomenclature), gene and protein symbols should be identical (with proteins designated in non-italic font and with the first letter capitalized). Maintaining consistency between gene and protein symbols helps avoid unnecessary confusion. The ara/arb nomenclature allows this, whereas ar1/ar2 does not.

      Second, the two androgen receptor genes in teleosts are paralogs derived from the third round of whole-genome duplication that occurred early in teleost evolution. For such duplicated genes, the ZFIN Zebrafish Nomenclature Conventions (Section 1.2. Duplicated genes) recommend appending the suffixes “a” and “b” to the approved symbol of the human or mouse ortholog. This convention clearly indicates that these genes are whole-genome duplication paralogs and provides an intuitive way to represent orthologous and paralogous relationships between teleost genes and those of other vertebrates. As a result, it has been widely adopted, and we consider it logical and beneficial to apply the same principle to androgen receptors.

      In light of these considerations, we respectfully maintain that the ara/arb nomenclature is more suitable for the present manuscript than the alternative ar1/ar2 system.

      (5) In the discussion please discuss these potentially unexpected findings.

      (a) gal was unaffected in female cyp19a1 mutants, but they exhibit mating behaviors towards females. Given gal is higher in males and these females act like females, what does this mean about the function of gal/its utility in being a male-specific marker (is it one??)? 

      (b) esr2b expression is higher in female cyp19a1 mutants. this is unexpected as well given esr2b is required for female-typical mating and is higher in females compared to males and E2 increases esr2b expression. please explain...well, what this means for our idea of what esr2b expression tell us. 

      We thank the reviewer for the insightful comments. As the female data have been removed from the manuscript, discussion of these findings in female cyp19a1b mutants is no longer necessary.

      Reviewer #3 (Recommendations For The Authors):

      The authors have addressed a number of the reviewers' comments; notably, they provided missing methodological information and rephrased the text. However, the authors have not addressed the main issues raised by the reviewers. Notably, it is regrettable that the reduced amount of brain aromatase cannot be confirmed; this seems to be the primary step when validating a new mutant. Even if protein products of the two genes may not be discriminated (which I can understand), it should be possible to evaluate the expression of a common messenger and/or peptide and confirm that aromatase expression is reduced in the brain. Since Cyp19a1b is relatively more abundant in the brain than Cyp19a1a, this would strengthen the conclusion and provide confidence that the mutant indeed does silence aromatase expression in the brain. Although these shortcomings are acknowledged in the rebuttal letter, this is not mentioned in the discussion. Doing so would make the manuscript more transparent and clearer.

      As noted in Response to reviewer #3’s comment 1 on weaknesses, we have validated the loss of Cyp19a1b function by measuring its transcript levels in the brain and predicting the three-dimensional structure of the mutant protein. These analyses confirmed that Cyp19a1b function is indeed lost, thereby increasing the reliability of our conclusions.

      FigS1 - panels C&D please indicate in which tissue were hormones measured. Blood?

      We thank the reviewer for pointing this out. In our study, “peripheral” refers to the caudal half of the body excluding the head and visceral organs, not blood. Accordingly, we have revised the figure legend and the description in the Materials and Methods section as follows:

      Legend for Figure 1B (line 787) now reads: “Levels of E2, testosterone, and 11KT in the brain (A) and peripheral tissues (caudal half of the body) (B) of adult cyp19a1b<sup>+/+</sup>, cyp19a1b<sup>+/−</sup>, and cyp19a1b<sup>−/−</sup> males (n = 3 per genotype).”

      Materials and methods (line 431): The sentence “Total lipids were extracted from the brain and peripheral tissues (from the caudal half) of” has been revised to “Total lipids were extracted from the brain and from peripheral tissues, specifically the caudal half of the body excluding the head and visceral organs, of.”

      Additional Alterations:

      We have reformatted the text and supporting materials to comply with the journal’s Author Guidelines. The following changes have been made:

      (1) Figures and supplementary files are now provided separately from the main text.

      (2) The title page has been reformatted without any changes to its content.

      (3) In-text citations have been changed from numerical references to the author–year format.

      (4) Figure labels have been revised from “Fig. 1,” “Fig. S1,” etc., to “Figure 1,” “Figure 1—figure supplement 1,” etc.

      (5) Table labels have been revised from “Table S1,” etc., to “Supplementary file 1,” etc.

      (6) Line 324: The typo “is” has been corrected to “are”.

      (7) Line 382: The section heading “Materials and Methods” has been changed to “Materials and methods” (lowercase “m”).

      (8) Line 383: The Key Resources Table has been placed at the beginning of the Materials and methods section.

      (9) Line 389: The sentence “Sexually mature adults (2–6 months) were used for experiments, and tissues were consistently sampled 1–5 hours after lights on.” has been revised to “Sexually mature adults (2–6 months) were used for experiments and assigned randomly to experimental groups. Tissues were consistently sampled 1–5 hours after lights on.”

      (10)  Line 393: The sentence “All fish were handled in accordance with the guidelines of the Institutional Animal Care and Use Committee of the University of Tokyo.” has been removed.

      (11)  Line 589: The following sentence has been added: “No power analysis was conducted due to the lack of relevant data; sample size was estimated based on previous studies reporting inter-individual variation in behavior and neural gene expression in medaka.”

      (12)  Line 598: The reference list has been reordered from numerical sequence to alphabetical order by author.

      (13)  In the figure legends, notations such as “A and B” have been revised to “A, B.”

  6. pressbooks.library.torontomu.ca pressbooks.library.torontomu.ca
    1. Paul Volcker was appointed chairman of the Federal Reserve, essentially the head of the United States central bank, in 1979, and his appointment signaled a dramatic shift in US economic governance

      for - economic history - 1979 - Paul A. Volcker appointed chairman of Federal Reserve - Volcker Shock - shift - from employment to inflation - raised interest rates to an astounding 20%, intentionally causing a recession

    1. When the system is ready to perform a provide operation,

      it pops all CIDs from the same region as the head of the queue and provides them together in a single batch
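      The batching step described in this note can be sketched roughly as follows. This is only an illustration under stated assumptions: the queue is modelled as (region, CID) pairs in arrival order, "same region" is taken to mean every queued entry sharing the head's region rather than only a contiguous run, and the names `pop_batch`, `queue`, and the region/CID strings are hypothetical and do not come from the annotated source.

```python
from collections import deque


def pop_batch(queue):
    """Remove and return all CIDs whose region matches the head of the queue.

    `queue` is a deque of (region, cid) tuples (a hypothetical layout).
    Entries from other regions keep their relative order.
    """
    if not queue:
        return None, []
    head_region = queue[0][0]
    batch = []
    remaining = deque()
    while queue:
        region, cid = queue.popleft()
        if region == head_region:
            batch.append(cid)                 # goes into this provide batch
        else:
            remaining.append((region, cid))   # waits for a later batch
    queue.extend(remaining)
    return head_region, batch


# Hypothetical data: two regions are interleaved in the queue.
queue = deque([("r1", "cidA"), ("r2", "cidB"), ("r1", "cidC"), ("r3", "cidD")])
region, batch = pop_batch(queue)
```

      After the call, `batch` holds the CIDs for the head's region and `queue` retains the rest, so a later provide operation would start from the next region in line.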

    1. our Red Pill Boyfriend Will Ruin Your Life is that he was socially inept enough to actually tell her that “women hit the wall at thirty,” among other copy-paste red pill sayings. He wasn’t just a jerk: he was a red pill jerk with no filter.

      Men can lack enough theory of mind to say to their romantic partner that "Women hit the wall at 30". Shake my head, how does a guy like that get a date?

    1. Reviewer #2 (Public review):

      A long-standing debate in the field of Pavlovian learning relates to the phenomenon of timescale invariance in learning i.e. that the rate at which an animal learns about a Pavlovian CS is driven by the relative rate of reinforcement of the cue (CS) to the background rate of reinforcement. In practice, if a CS is reinforced on every trial, then the rate of acquisition is determined by the relative duration of the CS (T) and the ITI (C = inter-US-interval = duration of CS + ITI), specifically the ratio of C/T. Therefore, the point of acquisition should be the same with a 10s CS and a 90s ITI (T = 10; C = 90 + 10 = 100, C/T = 100/10 = 10) and with a 100s CS and a 900s ITI (T = 100; C = 900 + 100 = 1000, C/T = 1000/100 = 10). That is to say, the rate of acquisition is invariant to the absolute timescale as long as this ratio is the same. This idea has many other consequences, but is also notably different from more popular prediction-error based associative learning models such as the Rescorla-Wagner model. The initial demonstrations that the ratio C/T predicts the point of acquisition across a wide range of parameters (both within and across multiple studies) was conducted in Pigeons using a Pavlovian autoshaping procedure. What has remained under contention is whether or not this relationship holds across species, particularly in the standard appetitive Pavlovian conditioning paradigms used in rodents. The results from rodent studies aimed at testing this have been mixed, and often the debate around the source of these inconsistent results focuses on the different statistical methods used to identify the point of acquisition for the highly variable trial-by-trial responses at the level of individual animals.
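      The arithmetic behind this timescale-invariance example can be checked with a few lines of Python. This is a minimal sketch that assumes every CS trial is reinforced, as the paragraph above does; the function name `c_over_t` is a hypothetical label, not something taken from the manuscript.

```python
def c_over_t(cs_duration, iti_duration):
    """Return C/T, where T is the CS duration and C = CS duration + ITI.

    Assumes every CS trial ends in reinforcement, so C is the
    inter-US interval (illustrative helper, not the authors' code).
    """
    c = cs_duration + iti_duration   # inter-US interval, in seconds
    return c / cs_duration


short_schedule = c_over_t(10, 90)     # T = 10 s, C = 100 s
long_schedule = c_over_t(100, 900)    # T = 100 s, C = 1000 s
same_ratio = short_schedule == long_schedule
```

      Both schedules give the same ratio even though the second runs on a tenfold longer timescale, which is the sense in which the predicted point of acquisition is invariant to absolute duration.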

      The authors successfully replicate the same effect found in pigeon autoshaping paradigms decades ago (with almost identical model parameters) in a standard Pavlovian appetitive paradigm in rats. They achieve this through a clever change to the experimental design, using a convincingly wide range of parameters across 14 groups of rats, and by a thorough and meticulous analysis of these data. It is also interesting to note that the two authors have published on opposing sides of this debate for many years, and as a result have developed and refined many of the ideas in this manuscript through this process.

      Main findings

      (1) The present findings demonstrate that the point of initial acquisition of responding is predicted by the C/T ratio.

      (2) The terminal rates of responding to the CS appear to be related to the reinforcement rate of the CS (T; specifically, 1/T) but not its relation to the reinforcement rate of the context (i.e. C or C/T). In the present experiment, all CS trials were reinforced so it is also the case that the terminal rate of responding was related to the duration of the CS.

      (3) An unexpected finding was that responding during the ITI was similarly related to the rate of contextual reinforcement (1/C). This novel finding suggests that the terminal rate of responding during the ITI and the CS are related to their corresponding rates of reinforcement. This finding is surprising as it suggests that responding during the ITI is not being driven by the probability of reinforcement during the ITI.

      (4) Finally, the authors characterised the nature of increased responding from the point of initial acquisition until responding peaks at a maximum. Their analyses suggest that the nature of this increase was best described as linear in the majority of rats, as opposed to the non-linear increase that might be predicted by prediction error learning models (e.g. Rescorla-Wagner). However, more detailed analyses revealed that these changes can be quite variable across rats, and more variable when the CS had lower informativeness (defined as C/T).
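
      To make the contrast in point (4) concrete, the sketch below compares a Rescorla-Wagner style update with a simple linear ramp. It is not the authors' analysis: the learning rate, asymptote, slope, and function names are all hypothetical, and it assumes that responding tracks associative strength directly, which is itself a simplification.

```python
def rescorla_wagner(n_trials: int, alpha: float = 0.1, lam: float = 1.0) -> list[float]:
    """Associative strength after each reinforced trial: V += alpha * (lam - V)."""
    v = 0.0
    strengths = []
    for _ in range(n_trials):
        v += alpha * (lam - v)  # the prediction error shrinks as V approaches lam
        strengths.append(v)
    return strengths


def linear_ramp(n_trials: int, slope: float = 0.05, ceiling: float = 1.0) -> list[float]:
    """Constant increase per trial until a ceiling is reached."""
    return [min(ceiling, slope * (trial + 1)) for trial in range(n_trials)]


rw = rescorla_wagner(20)
lin = linear_ramp(20)

# Trial-to-trial gains: shrinking under Rescorla-Wagner, constant under the ramp.
rw_gains = [later - earlier for earlier, later in zip(rw, rw[1:])]
lin_gains = [later - earlier for earlier, later in zip(lin, lin[1:])]
print(rw_gains[:3], lin_gains[:3])
```

      Under these assumptions the prediction-error model yields a negatively accelerated curve, whereas a linear description implies equal gains on every trial until the maximum is reached.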

      Strengths and Weaknesses:

      There is an inherent paradox regarding the consistency of the acquisition data from Gibbon & Balsam's (1981) meta-analysis of autoshaping in pigeons, and the present results in magazine response frequency in rats. This consistency is remarkable and impressive, and is suggestive of a relatively conserved or similar underlying learning principle. However, the consistency is also surprising given some significant differences in how these experiments were run. Some of these differences might reasonably be expected to lead to differences in how these different species respond. For example:

      In the autoshaping procedure commonly used to generate these data, the pigeons were pretrained to retrieve rewards from a grain hopper with an instrumental contingency between head entry into the hopper and grain availability. During Pavlovian training, pecking the key light also elicited an auditory click feedback stimulus, and when the grain hopper was made available, the hopper was also illuminated.

      In the present experimental procedure, the rats were not given contextual exposure to the pellet reinforcers in the magazine (e.g. a magazine training session is typically found in similar rodent procedures). The Pavlovian CS was a cue light within the magazine itself.

      These design features in the present rodent experiment are clearly intentional. Pretraining with the reinforcer in the testing chambers would reasonably alter the background rate of reinforcement (parameter), so it makes sense not to include this, but this differs from the paradigm used in pigeons. Having the CS inside the magazine where pellets are delivered provides an effective way to reduce any potential response competition between CS and US directed responding and combines these all into the same physical response. This makes the magazine approach response more like the pecking of the light stimulus in the pigeon autoshaping paradigm. However, the location of the CS and US is separated in pigeon autoshaping, raising questions about why the findings across species are consistent despite these differences.

      Intriguingly, when the insertion of a lever is used as a Pavlovian cue in rodent studies, CS directed responding (sign-tracking) often develops over training such that eventually all animals bias their responding towards the lever rather than towards the US (goal-tracking at the magazine). However, the nature of this shift highlights the important point that these CS and US directed responses can be quite distinct physically as well as psychologically. Therefore, by conflating the development of these different forms of responding, it is not clear whether the relationship between C/T and the acquisition of responding describes the sum of all Pavlovian responding or predominantly CS or US directed responding.

      Another interesting aspect of these findings is that there is a large amount of variability that scales inversely with C/T. A potential account of the source of this variability is related to the absence of preexposure to the reward pellets. This is normally done within the animals' homecage as a form of preexposure to reduce neophobia. If some rats take longer to notice and then approach and finally consume the reward pellets in the magazine, the impact of this would systematically differ depending on the length of the ITI. For animals presented with relatively short CSs and ITIs, they may essentially miss the first couple of trials and/or attribute uneaten pellets accumulating in the magazine to the background/contextual rate of reinforcement. What is not currently clear is whether this was accounted for in some way by confirming when the rats first started retrieving and consuming the rewards from the magazine.

      While the generality of these findings across species is impressive, the very specific set of parameters employed to generate these data raises questions about the generality of these findings across other standard Pavlovian conditioning parameters. While this is obviously beyond the scope of the present experiment, it is important to consider that the present study explored a situation with 100% reinforcement on every trial, with a variable duration CS (drawn from a uniform distribution), with a single relatively brief CS (maximum of 122s) and a single US. Again, the choice of these parameters in the present experiment is appropriate and very deliberately based on refinements from many previous studies from the authors. This includes a number of criteria used to define magazine response frequency which includes discarding specific responses (discussed and reasonably justified clearly in the methods section). Similarly, the finding that terminal rates of responding are reliably related to 1/T is surprising, and it is not clear whether this might be a property specific to this form of variable duration CS, the use of a uniform sampling distribution, or the use of only a single CS. However, it is important to keep these limitations in mind when considering some of the claims made in the discussion section of this manuscript that go beyond what these data can support.

    1. Reviewer #1 (Public review):

      One of the roadblocks in PfEMP1 research has been the challenges in manipulating var genes to incorporate markers to allow the transport of this protein to be tracked and to investigate the interactions taking place within the infected erythrocyte. In addition, the ability of Plasmodium falciparum to switch to different PfEMP1 variants during in vitro culture has complicated studies due to parasite populations drifting from the original (manipulated) var gene expression. Cronshagen et al have provided a useful system with which they demonstrate the ability to integrate a selectable drug marker into several different var genes that allows the PfEMP1 variant expression to be 'fixed'. This on its own represents a useful addition to the molecular toolbox and the range of var genes that have been modified suggests that the system will have broad application. As well as incorporating a selectable marker, the authors have also used selection-linked integration (SLI) to introduce markers to track the transport of PfEMP1, investigate the route of transport and probe interactions with PfEMP1 proteins in the infected host cell.

      One of the major strengths of this paper is that the authors have not only put together a robust system for further functional studies, but they have used it to produce a range of interesting findings including:

      Co-activation of rif and var genes when in a head-to-head orientation.

      The reduced control of expression of var genes in the 3D7-MEED parasite line.

      More support for the PTEX transport route for PfEMP1.

      Identification of new proteins involved in PfEMP1 interactions in the infected erythrocyte, including some required for cytoadherence.

      In most cases the experimental evidence is straightforward, and the data support the conclusions strongly. The authors have been very careful in the depth of their investigation, and where unexpected results have been obtained, they have looked carefully at why these have occurred.

      A weakness of the paper is, as mentioned above, that the results are sometimes not as clear as might have been expected, for example, in the requirement for panning modified parasites to produce binding to EPCR. Where this has happened, the authors take a robust and thoughtful approach, and acknowledge that (as in most research) there are more questions to address. Being able to select specific var gene switches using drug markers will provide some useful starting points to understand how switching happens in P. falciparum. However, our trypanosome colleagues might remind us that forcing switches may show us some mechanisms, but perhaps not all.

      Despite these sometimes complicated findings, the authors have achieved their aim as stated in the title of the paper, and in doing so have provided an excellent resource to themselves and other researchers in the field to answer some important questions.

      Overall, the authors have produced a useful and robust system to support functional studies on PfEMP1, which provides a platform for future studies manipulating the domain content in var genes. They have used this system to produce a range of interesting findings and to support its use by the research community.

      Comments on revisions:

      I have no further recommendations for changes by the authors. They have addressed my concerns, and the paper reads very well.

    2. Reviewer #2 (Public review):

      Summary

      Cronshagen et al develop a range of tools based on selection-linked integration (SLI) to study PfEMP1 function in P. falciparum. PfEMP1 is encoded by a family of ~60 var genes subject to mutually exclusive expression. Switching expression between different family members can modify the binding properties of the infected erythrocyte while avoiding the adaptive immune response. Although critical to parasite survival and malaria disease pathology, PfEMP1 proteins are difficult to study owing to their large size and variable expression between parasites within the same population. The SLI approach previously developed by this group for genetic modification of P. falciparum is employed here to selectively and stably activate expression of target var genes at the population level. Using this strategy, the binding properties of specific PfEMP1 variants were measured for several distinct var genes with a novel semi-automated pipeline to increase throughput and reduce bias. Activation of similar var genes in both the common lab strain 3D7 and the cytoadhesion competent FCR3/IT4 strain revealed higher binding for several PfEMP1 IT4 variants with distinct receptors, indicating this strain provides a superior background for studying PfEMP1 binding. SLI also enables modifications to target var gene products to study PfEMP1 trafficking and identify interacting partners by proximity-labeling proteomics, revealing two novel exported proteins required for cytoadherence. Overall, the data demonstrate a range of SLI-based approaches for studying PfEMP1 that will be broadly useful for understanding the basis for cytoadhesion and parasite virulence.

      Comments:

      While the capability of SLI to activate selected var gene expression was initially reported by Omelianczyk et al., the present study greatly expands the utility of this approach. Several distinct var genes are activated in two different P. falciparum strains and shown to modify the binding properties of infected RBCs to distinct endothelial receptors; development of SLI2 enables multiple SLI modifications in the same parasite line; SLI is used to modify target var genes to study PfEMP1 trafficking and determine PfEMP1 interactomes with BioID. Along the way, the authors also demonstrate a new selection marker for P. falciparum transfection (a mutant FNT lactate transporter that provides resistance to the compound BH267.meta). Curiously, Omelianczyk et al activated a single var (Pf3D7_0421300) and observed elevated expression of an adjacent var arranged in a head-to-tail manner, possibly resulting from local chromatin modifications enabling expression of the neighboring gene. In contrast, the present study observed activation of neighboring genes with head-to-head but not head-to-tail arrangement, which may be the result of shared promoter regions. The reason for these differing results is unclear although it should be noted that the two studies examined different var loci.

      The IT4var19 panned line that became binding-competent showed increased expression of both paralogs of ptp3 (as well as a phista and gbp), suggesting that overexpression of PTP3 may improve PfEMP1 display and binding. Interestingly, IT4 appears to be the only known P. falciparum strain (only available in PlasmoDB) that encodes more than one ptp3 gene (PfIT_140083100 and PfIT_140084700). PfIT_140084700 is almost identical to the 3D7 PTP3 (except for a ~120 residue insertion in 3D7 beginning at residue 400). In contrast, while the C-terminal region of PfIT_140083100 shows near perfect conservation with 3D7 PTP3 beginning at residue 450, the N-terminal regions between the PEXEL and residue 450 are quite different. This may indicate the generally stronger receptor binding observed in IT4 relative to 3D7 results from increased PTP3 activity due to multiple isoforms or that specialized trafficking machinery exists for some PfEMP1 proteins.

      Revisions:

      The authors thoughtfully addressed all the reviewer comments.

    3. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment:

      This study introduces an important approach using selection linked integration (SLI) to generate Plasmodium falciparum lines expressing single, specific surface adhesins PfEMP1 variants, enabling precise study of PfEMP1 trafficking, receptor binding, and cytoadhesion. By moving the system to different parasite strains and introducing an advanced SLI2 system for additional genomic edits, this work provides compelling evidence for an innovative and rigorous platform to explore PfEMP1 biology and identify novel proteins essential for malaria pathogenesis including immune evasion.

      Reviewer #1 (Public review):

      One of the roadblocks in PfEMP1 research has been the challenges in manipulating var genes to incorporate markers to allow the transport of this protein to be tracked and to investigate the interactions taking place within the infected erythrocyte. In addition, the ability of Plasmodium falciparum to switch to different PfEMP1 variants during in vitro culture has complicated studies due to parasite populations drifting from the original (manipulated) var gene expression. Cronshagen et al have provided a useful system with which they demonstrate the ability to integrate a selectable drug marker into several different var genes that allows the PfEMP1 variant expression to be 'fixed'. This on its own represents a useful addition to the molecular toolbox and the range of var genes that have been modified suggests that the system will have broad application. As well as incorporating a selectable marker, the authors have also used selection-linked integration (SLI) to introduce markers to track the transport of PfEMP1, investigate the route of transport, and probe interactions with PfEMP1 proteins in the infected host cell.

      What I particularly like about this paper is that the authors have not only put together what appears to be a largely robust system for further functional studies, but they have used it to produce a range of interesting findings including:

      Co-activation of rif and var genes when in a head-to-head orientation.

      The reduced control of expression of var genes in the 3D7-MEED parasite line.

      More support for the PTEX transport route for PfEMP1.

      Identification of new proteins involved in PfEMP1 interactions in the infected erythrocyte, including some required for cytoadherence.

      In most cases the experimental evidence is straightforward, and the data support the conclusions strongly. The authors have been very careful in the depth of their investigation, and where unexpected results have been obtained, they have looked carefully at why these have occurred.

      We thank the reviewer for the kind assessment and the comments to improve the paper.

      (1) In terms of incorporating a drug marker to drive mono-variant expression, the authors show that they can manipulate a range of var genes in two parasite lines (3D7 and IT4), producing around 90% expression of the targeted PfEMP1. Removal of drug selection produces the expected 'drift' in variant types being expressed. The exceptions to this are the 3D7-MEED line, which looks to be an interesting starting point to understand why this variant appears to have impaired mutually exclusive var gene expression, and the EPCR-binding IT4var19 line. This latter finding was unexpected and the modified construct required several rounds of panning to produce parasites expressing the targeted PfEMP1 and binding to EPCR. The authors identified a PTP3 deficiency as the cause of the lack of PfEMP1 expression, which is an interesting finding in itself but potentially worrying for future studies. What was not clear was whether the selected IT4var19 line retained specific PfEMP1 expression once receptor panning was removed.

      We do not have systematic long-term data for the Var19 line but do have medium-term data. After panning the Var19 line, the binding assays were done within 3 months without additional panning. The first binding assay was 2 months after the panning and the last binding assays three weeks later, totaling about 3 months without panning. While there is inherent variation in these assays that precludes detection of smaller changes, the last assay showed the highest level of binding, giving no indication for rapid loss of the binding phenotype. Hence, we can say that the binding phenotype appears to be stable for many weeks without panning the cells again and there was no indication for a rapid loss of binding in these parasites.

      Systematic long-term experiments to assess how long the Var19 parasites retain binding would be interesting, but given that the binding-phenotype appears to remain stable over many weeks or even months, this would only make sense if done over a much longer time frame. Such data might arise if the line is used over extended times for a specific project in which case it might be advisable to monitor continued binding. We included a statement in the discussion that the binding phenotype was stable over many weeks but that if long-term work with this line is planned, monitoring the binding phenotype might be advisable: “In the course of this work the binding phenotype of the IT4var19 expressor line remained stable over many weeks without further panning. However, given that initial panning had been needed for this particular line, it might be advisable for future studies to monitor the binding phenotype if the line is used for experiments requiring extended periods of cultivation.”

      (2) The transport studies using the mDHFR constructs were quite complicated to understand but were explained very clearly in the text with good logical reasoning.

      We are aware of this being a complex issue and are glad this was nevertheless understandable.

      (3) By introducing a second SLI system, the authors have been able to alter other genes thought to be involved in PfEMP1 biology, particularly transport. An example of this is the inactivation of PTP1, which causes a loss of binding to CD36 and ICAM-1. It would have been helpful to have more insight into the interpretation of the IFAs as the anti-SBP1 staining in Figure 5D (PTP-TGD) looks similar to that shown in Figure 1C, which has PTP intact. The anti-EXP2 results are clearly different.

      We realize the description of the PTP1-TGD IFA data and that of the other TGDs (see also response to Recommendation to authors point 4 and reviewer 2, major points 6 and 7) was rather cursory. The previously reported PTP1 phenotype is a fragmentation of the Maurer’s clefts into what in IFA appear to be many smaller pieces (Rug et al 2014, referenced in the manuscript). The control in Fig. 5D has 13 Maurer’s cleft spots (previous work indicates an average of ~15 MC per parasite, see e.g. the originally co-submitted eLife preprint doi.org/10.7554/eLife.103633.1 and references therein). The control mentioned by the reviewer in Fig. 1C has about 22 Maurer’s clefts foci, at the upper end of the typical range, but not unusual. In contrast, the PTP1-TGD in Fig. 5D, has more than 30 foci with an additional cytoplasmic pool and additional smaller, difficult to count foci. This is consistent with the published phenotype in Rug et al 2014. The EXP1 stained cell has more than 40 Maurer’s cleft foci, again beyond what typically is observed in controls. Therefore, these cells show a difference to the control in Fig. 5 but also to Fig. 1C. Please note that we are looking at two different strains, in Fig. 1 it is 3D7 and in Fig. 5 IT4. While we did not systematically assess this, the Maurer’s clefts number per cell seemed to be largely comparable between these strains (Fig. 10C and D in the other eLife preprint doi.org/10.7554/eLife.103633.1). 

      Overall, as the PTP1 loss phenotype has already been reported, we did not go into more experimental detail. However, we now modified the text to more clearly describe how the phenotype in the PTP1-TGD parasites was different to control: “IFAs showed that in the PTP1-TGD parasites, SBP1 and PfEMP1 were found in many small foci in the host cell that exceeded the average number of ~ 15 Maurer’s clefts typically found per infected RBC [66] (Fig. 5D). This phenotype resembled the previously reported Maurer’s clefts phenotype of the PTP1 knock out in CS2 parasites [39].”

      (4) It is good to see the validation of PfEMP1 expression includes binding to several relevant receptors. The data presented use CHO-GFP as a negative control, which is relevant, but it would have been good to also see the use of receptor mAbs to indicate specific adhesion patterns. The CHO system is fine for expression validation studies, but due to the high levels of receptor expression on these cells, moving to the use of microvascular endothelial cells would be advisable. This may explain the unexpected ICAM-1 binding seen with the panned IT4var19 line.

      We agree with the reviewer that it is desirable to have better binding systems for studying individual binding interactions. As the main purpose of this paper was to introduce the system and provide proof of principle that the cells show binding, we did not move to more complicated binding systems. However, we would like to point out that the CSA binding was done on receptor alone in addition to the CSA-expressing HBEC-5i cells and was competed successfully with soluble CSA. In addition, apart from the additional ICAM1-binding of the Var19 line, all binding phenotypes conformed to expectations. We therefore hope the tools used for binding studies are acceptable at this stage of introducing the system while future work interested in specific PfEMP1 receptor interactions may use better systems, tailored to the specific question (e.g. endothelial organoid models and engineered human capillaries and inhibitory antibodies or relevant recombinant domains for competition).

      (5) The proxiome work is very interesting and has identified new leads for proteins interacting with PfEMP1, as well as suggesting that KAHRP is not one of these. The reduced expression seen with BirA* in position 3 is a little concerning but there appears to be sufficient expression to allow interactions to be identified with this construct. The quantitative impact of reduced expression for proxiome experiments will clearly require further work to define it.

      This is a valid point. Clearly there seems to be some impact on binding when BirA* is placed in the extracellular domain (either through reduced presentation or direct reduction of binding efficiency of the modified PfEMP1; please see also minor comment 10 reviewer 2). The exact quantitative impact on the proxiome is difficult to assess but we note that the relative enrichment of hits to each other is rather similar to the other two positions (Fig. 6H-J). We therefore believe the BioIDs with the 3 PfEMP1-BirA* constructs are sufficient to provide a general coverage of proteins proximal to PfEMP1 and hope this will aid in the identification of further proteins involved in PfEMP1 transport and surface display as illustrated with two of the hits targeted here.

      The impact of placing a domain on the extracellular region of PfEMP1 will have to be further evaluated if needed in other studies. But the finding that a large folded domain can be placed into this part at all, even if binding was reduced, in our opinion is a success (it was not foreseeable whether any such change would be tolerated at all).

      (6) The reduced receptor binding results from the TryThrA and EMPIC3 knockouts were very interesting, particularly as both still display PfEMP1 on the surface of the infected erythrocyte. While care needs to be taken in cross-referencing adhesion work in P. berghei and whether the machinery truly is functionally orthologous, it is a fair point to make in the discussion. The suggestion that interacting proteins may influence the "correct presentation of PfEMP1" is intriguing and I look forward to further work on this.

      We hope future work will be able to shed light on this.

      Overall, the authors have produced a useful and reasonably robust system to support functional studies on PfEMP1, which may provide a platform for future studies manipulating the domain content in the exon 1 portion of var genes. They have used this system to produce a range of interesting findings and to support its use by the research community. Finally, a small concern. Being able to select specific var gene switches using drug markers could provide some useful starting points to understand how switching happens in P. falciparum. However, our trypanosome colleagues might remind us that forcing switches may show us some mechanisms but perhaps not all.

      Point noted! From non-systematic data with the Var01 line that has been cultured for extended periods of time (several years), it seems other non-targeted vars remain silent in our SLI “activation” lines but how much SLI-based var-expression “fixing” tampers with the integrity of natural switching mechanisms is indeed very difficult to gauge at this stage. We now added a statement to the discussion that even if mutually exclusive expression is maintained, it is not certain the mechanisms controlling var expression all remain intact: “However, it should be noted that it is not known whether all mechanisms controlling mutually exclusive expression and switching remain intact in parasites with SLI-activated var genes.”

      Reviewer #2 (Public review):

      Summary

      Cronshagen et al develop a range of tools based on selection-linked integration (SLI) to study PfEMP1 function in P. falciparum. PfEMP1 is encoded by a family of ~60 var genes subject to mutually exclusive expression. Switching expression between different family members can modify the binding properties of the infected erythrocyte while avoiding the adaptive immune response. Although critical to parasite survival and malaria disease pathology, PfEMP1 proteins are difficult to study owing to their large size and variable expression between parasites within the same population. The SLI approach previously developed by this group for genetic modification of P. falciparum is employed here to selectively and stably activate the expression of target var genes at the population level. Using this strategy, the binding properties of specific PfEMP1 variants were measured for several distinct var genes with a novel semi-automated pipeline to increase throughput and reduce bias. Activation of similar var genes in both the common lab strain 3D7 and the cytoadhesion competent FCR3/IT4 strain revealed higher binding for several PfEMP1 IT4 variants with distinct receptors, indicating this strain provides a superior background for studying PfEMP1 binding. SLI also enables modifications to target var gene products to study PfEMP1 trafficking and identify interacting partners by proximity-labeling proteomics, revealing two novel exported proteins required for cytoadherence. Overall, the data demonstrate a range of SLI-based approaches for studying PfEMP1 that will be broadly useful for understanding the basis for cytoadhesion and parasite virulence.

      We thank the reviewer for the kind assessment and the comments to improve the paper.

      Comments

      (1) While the capability of SLI to actively select var gene expression was initially reported by Omelianczyk et al., the present study greatly expands the utility of this approach. Several distinct var genes are activated in two different P. falciparum strains and shown to modify the binding properties of infected RBCs to distinct endothelial receptors; development of SLI2 enables multiple SLI modifications in the same parasite line; SLI is used to modify target var genes to study PfEMP1 trafficking and determine PfEMP1 interactomes with BioID. Curiously, Omelianczyk et al activated a single var (Pf3D7_0421300) and observed elevated expression of an adjacent var arranged in a head-to-tail manner, possibly resulting from local chromatin modifications enabling expression of the neighboring gene. In contrast, the present study observed activation of neighboring genes with head-to-head but not head-to-tail arrangement, which may be the result of shared promoter regions. The reason for these differing results is unclear although it should be noted that the two studies examined different var loci.

      The point that we are looking at different loci is very valid and we realize this is not mentioned in the discussion. We now added to the discussion that it is unclear if our results and those cited may be generalized and that different var gene loci may respond differently:

      “However, it is unclear if this can be generalized and it is possible that different var loci respond differently.”

      (2) The IT4var19 panned line that became binding-competent showed increased expression of both paralogs of ptp3 (as well as a phista and gbp), suggesting that overexpression of PTP3 may improve PfEMP1 display and binding. Interestingly, IT4 appears to be the only known P. falciparum strain (only available in PlasmoDB) that encodes more than one ptp3 gene (PfIT_140083100 and PfIT_140084700). PfIT_140084700 is almost identical to the 3D7 PTP3 (except for a ~120 residue insertion in 3D7 beginning at residue 400). In contrast, while the C-terminal region of PfIT_140083100 shows near-perfect conservation with 3D7 PTP3 beginning at residue 450, the N-terminal regions between the PEXEL and residue 450 are quite different. This may indicate the generally stronger receptor binding observed in IT4 relative to 3D7 results from increased PTP3 activity due to multiple isoforms or that specialized trafficking machinery exists for some PfEMP1 proteins.

      We thank the reviewer for pointing this out. The exact differences between the two PTP3s of IT4 and that of other strains definitely should be closely examined if the function of these proteins in PfEMP1 binding is analysed in more detail.

      It is an interesting idea that the PTP3 duplication could be a reason for the superior binding of IT4. We always assumed that IT4 had better binding because it was less culture adapted but this does not preclude that PTP3(s) is(are) a reason for this. However, at least in our 3D7, PTP3 can’t be the reason for the poor binding, as our 3D7 still has PfEMP1 on the surface while in the unpanned IT4-Var19 line and in the Maier et al., Cell 2008 ptp3 KO (PMID: 18614010) PfEMP1 is not on the surface anymore.

      Testing the impact of having two PTP3s would be interesting, but given the “mosaic” similarity of the two PTP3 isoforms, a simple add-on experiment might not be informative. Nevertheless, it will be interesting in future work to explore this in more detail.

      Reviewer #3 (Public review):

      Summary:

      The submission from Cronshagen and colleagues describes the application of a previously described method (selection linked integration) to the systematic study of PfEMP1 trafficking in the human malaria parasite Plasmodium falciparum. PfEMP1 is the primary virulence factor and surface antigen of infected red blood cells and is therefore a major focus of research into malaria pathogenesis. Since the discovery of the var gene family that encodes PfEMP1 in the late 1990s, there have been multiple hypotheses for how the protein is trafficked to the infected cell surface, crossing multiple membranes along the way. One difficulty in studying this process is the large size of the var gene family and the propensity of the parasites to switch which var gene is expressed, thus preventing straightforward gene modification-based strategies for tagging the expressed PfEMP1. Here the authors solve this problem by forcing the expression of a targeted var gene by fusing the PfEMP1 coding region with a drug-selectable marker separated by a skip peptide. This enabled them to generate relatively homogenous populations of parasites all expressing tagged (or otherwise modified) forms of PfEMP1 suitable for study. They then applied this method to study various aspects of PfEMP1 trafficking.

      Strengths:

      The study is very thorough, and the data are well presented. The authors used SLI to target multiple var genes, thus demonstrating the robustness of their strategy. They then perform experiments to investigate possible trafficking through PTEX, they knock out proteins thought to be involved in PfEMP1 trafficking and observe defects in cytoadherence, and they perform proximity labeling to further identify proteins potentially involved in PfEMP1 export. These are independent and complementary approaches that together tell a very compelling story.

      We thank the reviewer for the kind assessment and the comments to improve the paper.

      Weaknesses:

      (1) When the authors targeted IT4var19, they were successful in transcriptionally activating the gene; however, they did not initially obtain cytoadherent parasites. To observe binding to ICAM-1 and EPCR, they had to perform selection using panning. This is an interesting observation and potentially provides insights into PfEMP1 surface display, folding, etc. However, it also raises questions about other instances in which cytoadherence was not observed. Would panning of these other lines have successfully selected for cytoadherent infected cells? Did the authors attempt panning of their 3D7 lines? Given that these parasites do export PfEMP1 to the infected cell surface (Figure 1D), it is possible that panning would similarly rescue binding. Likewise, the authors knocked out PTP1, TryThrA, and EMPIC3 and detected a loss of cytoadhesion, but they did not attempt panning to see if this could rescue binding. To ensure that the lack of cytoadhesion in these cases is not serendipitous (as it was when they activated IT4var19), they should demonstrate that panning cannot rescue binding.

      These are very important considerations. Indeed, we had repeatedly attempted to pan 3D7 when we failed to get the SLI-generated 3D7 PfEMP1 expressor lines to bind, but this had not been successful. The lack of binding had been a major obstacle that had held up the project and was only solved when we moved to IT4 which readily bound (apart from Var19 which was created later in the project). After that we made no further efforts to understand why 3D7 does not bind but the fact that PfEMP1 is on the surface indicates this is not a PTP3 issue because loss of PTP3 also leads to loss of PfEMP1 surface display. Also, as the parent 3D7 could not be panned, we assumed this issue is not easily fixed in the SLI var lines we made in 3D7.

      Panning the TGD lines: we see the reasoning for conducting panning experiments with the TGD lines. However, on second thought, we are unsure this should be attempted. The outcome might not be easily interpretable as at least two forces will contribute to the selection in panning experiments with TGD lines that do not bind anymore:

      Firstly, panning would work against the SLI of the TGD, resulting in a tug of war between the TGD-SLI and binding. This is because a small number of parasites will loop out the TGD plasmid (revert) and would normally be eliminated during standard culturing due to the SLI drug used for the TGD. These revertant cells would bind and the panning would enrich them. Hence, panning and SLI are opposed forces in the case of a TGD abolishing binding. It is unclear how strong this effect would be, but this would for sure lead to mixed populations that complicate interpretations. 

      The second selecting force is possible compensatory changes to restore binding. These can be due to different causes: (i) reversal of potential independent changes that may have occurred in the TGD parasites and that are in reality causing the binding loss (such as ptp3 loss or similar, the concern of the reviewer) or (ii) new changes to compensate for the loss of the TGD target (in this case the TGD is the cause of the binding loss but a different change ameliorates it, for instance by increasing PfEMP1 expression or surface display). As both TGDs show some residual binding and have VAR01 on the surface to at least some extent, it is possible that new compensatory changes might indeed occur that indirectly increase binding again.

      In summary, even if more binding occurs after panning of the lines, it is not clear whether this is due to a compensatory change ameliorating the TGD, reversal of an unrelated change, or counter-selection against the SLI. To determine the cause, the panned TGD lines would need to be subjected to a complex and time-consuming analysis (WGS, RNASeq, possibly Maurer’s clefts phenotype) to find out whether they were SLI-revertants, had an unrelated change that was reverted, or had a new compensatory change that helps binding. This might be further muddled if a mix of cells comes out of the selection that have different changes of the options indicated above. In that case, it might even require scRNASeq to make sense of the panning experiment. Due to the envisaged difficulty in interpreting the outcome, we did not attempt this panning.

      To exclude loss of ptp3 expression as the reason for binding loss (something we would not have seen in the WGS if it is only due to a transcriptional change), we now carried out RNASeq with the TGD lines that have a binding phenotype. While we did not generate replicates to obtain quantitative data, the results show that both ptp3 copies were expressed in these TGDs at levels comparable to other parasite lines that do bind with the same SLI-activated var gene, indicating that the effect is not due to ptp3 (see response to point 4 on PTP3 expression in the Recommendations for the authors). While we can’t fully exclude other changes in the TGDs that might affect binding, the WGS did not show any obvious alterations that could be responsible for this.

      (2) The authors perform a series of trafficking experiments to help discern whether PfEMP1 is trafficked through PTEX. While the results were not entirely definitive, they make a strong case for PTEX in PfEMP1 export. The authors then used BioID to obtain a proxiome for PfEMP1 and identified proteins they suggest are involved in PfEMP1 trafficking. However, it seemed that components of PTEX were missing from the list of interacting proteins. Is this surprising and does this observation shed any additional light on the possibility of PfEMP1 trafficking through PTEX? This warrants a comment or discussion.

      This is an interesting point and we agree that this warrants discussion. A likely reason why PTEX components are not picked up as interactors is that BirA* is expected to be unfolded when it passes through the channel and in that state can’t biotinylate. Labelling would likely only be possible if PfEMP1 lingered at the PTEX translocation step before BirA* became unfolded to go through the channel, which we would not expect under physiological conditions. We added the following sentences to the discussion: “While our data indicates PfEMP1 uses PTEX to reach the host cell, this could be expected to have resulted in the identification of PTEX components in the PfEMP1 proxiomes, which was not the case. However, as BirA* must be unfolded to pass through PTEX, it likely is unable to biotinylate translocon components unless PfEMP1 is stalled during translocation. For this reason, a lack of PTEX components in the PfEMP1 proxiomes does not necessarily exclude passage through PTEX.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Most of my comments are in the public section. I would just highlight a few things:

      (1) In the binding studies section you talk about "human brain endothelial cells (HBEC-5i)". These cells do indeed express CSA but this is a property of their immortalisation rather than being brain endothelium, which does not express CSA. I think this could be confusing to readers so I think you might want to reword this sentence to focus on the cell line expressing CSA rather than other features.

      We thank the reviewer for pointing this out. We have now modified the sentence to focus on the fact that these are CSA-expressing cells and provided a reference for it.

      (2) As I said in the public section, CHO cells are great for proof of concept studies, but they are not endothelium. Not a problem for this paper.

      Noted! Please also see our response to the public review.

      (3) I wonder whether your comment about how well tolerated the BirA* insertion is may be a bit too strong. I might say "Nonetheless, overall the BirA* modified PfEMP1 were functional."

      Changed as requested.

      (4) I'm not sure how you explain the IFA staining patterns to the uninitiated, but perhaps you could explain some of the key features you are looking for.

      We apologise for not giving an explanation of the IFA staining patterns in the first place. Please see detailed response to public review of this reviewer (point 3 on PTP1-TGD phenotype) and to reviewer 2 (Recommendations to the authors, points 6 and 7 on better explaining and quantifying the Maurer’s clefts phenotypes). For this we now also generated parasites that episomally express mCherry tagged SBP1 in the TGD parasites with the reduced binding phenotype. This resulted in amendments to Fig. S7, addition of a Fig. S8 and updated results to better explain the phenotypes. 

      This is a great paper - I just wish I'd had this system before.

      Thank you!

      Reviewer #2 (Recommendations for the authors):

      Major Comments

      (1) Does the RNAseq analysis of 3D7var0425800 and 3D7MEEDvar0425800 (Figure 1G, H) reveal any differential gene expression that might suggest a basis for loss of mutually exclusive var expression in the MEED line?

      We now carried out a thorough analysis of these RNASeq experiments to look for an underlying cause for the phenotype. This was added as new Figure 1J and new Table S3. This analysis again illustrated the increased transcript levels of var genes. In addition, it showed that transcripts of a number of other exported proteins, including members of other gene families, were up in the MEED line. 

      One hit that might be causal of the phenotype was sip2, which was down by close to 8-fold (pAdj 0.025). While recent work in P. berghei found this ApiAP2 to be involved in the expression of merozoite genes (Nishi et al., Sci Advances 2025 (PMID: 40117352)), previous work in P. falciparum showed that it binds heterochromatic telomere regions and certain var upstream regions (Flück et al., PlosPath 2010 (PMID: 20195509), now cited in the manuscript). The other notable change was an upregulation of the non-coding RNA ruf6 which had been linked with impaired mono-allelic var expression (Guizetti et al., NAR 2016 (PMID: 27466391), now also cited in the manuscript). While it would go beyond this manuscript to follow this up, it is conceivable that alterations in chromosome end biology due to sip2 downregulation or upregulation of ruf6 are causes of the observed phenotype.

      We have now added a paragraph on the more comprehensive analysis of the RNA Seq data of the MEED vs non-MEED lines at the end of the second results section.

      (2) Could the inability of the PfEMP1-mDHFR fusion to block translocation (Fig 2A) reflect unique features of PfEMP1 trafficking, such as the existence of a soluble, chaperoned trafficking state that is not fully folded? Was a PfEMP1-BPTI fusion ever tested as an alternative to mDHFR?

      This is an interesting suggestion. The PfEMP1-BPTI was never tested. However, a chaperoned trafficking state would likely also affect BPTI. Given that both domains (mDHFR and BPTI) in principle do the same when folded and would block when the construct is in the PV, it is not so likely that using a different blocking domain would make a difference. Therefore, the scenario where BPTI would block when mDHFR does not is not that probable. The opposite would be possible (mDHFR blocking while BPTI does not, because only the latter depends on the redox state). However, this would only happen if the block occurred before the construct reaches the PV.

      At present, we believe the lacking block to be due to the organization of the domains in the construct. In the PfEMP1-mDHFR construct in this manuscript the position of the blocking domain is further away from the TMD compared to all other previously tested mDHFR fusions. Increased distance to the TMD has previously been found to be a factor impairing the blocking function of mDHFR (Mesen-Ramirez et al., PlosPath 2016 (PMID: 27168322)). Hence, we suspect that this, rather than the type of blocking domain, is the reason for the lacking block with the PfEMP1-mDHFR. However, the latter option can’t be fully excluded and we might test BPTI in future work.

      (3) The late promoter SBP1-mDHFR is 2A fused with the KAHRP reporter. Since 2A skipping efficiency varies between fusion contexts and significant amounts of unskipped protein can be present, it would be helpful to include a WB to determine the efficiency of skipping and provide confidence that the co-blocked KAHRP in the +WR condition (Fig 2D) is not actually fused to the C-terminus of SBP1-mDHFR-GFP.

      Fortunately, this T2A fusion (crt_SBP1-mDHFR-GFP-2A-KAHRP-mScarlet<sup>epi</sup>) was used before in work that included a Western blot showing its efficient skipping (S3 A Fig in Mesen-Ramirez et al., PlosPath 2016). In agreement with this Western blot result, fluorescence microscopy showed very limited overlap of SBP1-mDHFR-GFP and KAHRP-mCherry in absence of WR (Fig. 3B in Mesen-Ramirez et al., PlosPath 2016 and Fig. 2 in this manuscript) which would not be the case if these two constructs were fused together. Please note that KAHRP is known to transiently localize to the Maurer’s clefts before reaching the knobs (Wickham et al., EMBOJ 2001, PMID: 11598007), and therefore occasional overlap with SBP1 at the Maurer’s clefts is expected. However, we would expect much more overlap if a substantial proportion of the construct population were not skipped, and therefore the co-blocked KAHRP-mCherry in the +WR sample is unlikely to be due to inefficient skipping and attachment to SBP1-mDHFR-GFP.

      (4) Does comparison of RNAseq from the various 3D7 and IT4 lines in the study provide any insight into PTP3 expression levels between strains with different binding capacities? Was the expression level of ptp3a/b in the IT4var19 panned line similar to the expression in the parent or other activated IT4 lines? Could the expanded ptp3 gene number in IT4 indicate that specialized trafficking machinery exists for some PfEMP1 proteins (ie, IT4var19 requires the divergent PTP3 paralog for efficient trafficking)?

      PTP3 in the different IT4 lines that bind:

      In those parasite lines that did bind, the intrinsic variation in the binding assays, the different binding properties of different PfEMP1 variants and the variation in RNA Seq experiments to compare different parasite lines preclude a correlation of binding level vs ptp3 expression. For instance, if a PfEMP1 variant has lower binding capacity, ptp3 may still be higher but binding would be lower than if comparing to a parasite line with a better binding PfEMP1 variant. Studying the effect of PTP3 levels on binding could probably be done by overexpressing PTP3 in the same PfEMP1 SLI expressor line and assessing how this affects binding, but this would go beyond this manuscript.

      PTP3 in panned vs unpanned Var19:

      We did some comparisons between the IT4 parent and the panned and unpanned IT4-Var19 lines (see Author response table 1). This did not reveal any clear associations. While the parent had somewhat lower ptp3 transcript levels, they were still clearly higher than in the unpanned Var19 line, and other lines also had ptp3 levels comparable to the panned IT4-Var19 (see Author response table 2).

      PTP3 in the TGDs and possible reason for binding phenotype:

      A key point is whether PTP3 could have influenced the lack of binding in the TGD lines (see also weakness section and point 1 of public review of reviewer 3: ptp3 may be an indirect cause resulting in lacking binding in TGD parasites). We now did RNA Seq to check for ptp3 expression in the relevant TGD lines although we did not do a systematic quantitative comparison (which would require 3 replicates of RNASeq), but we reasoned that loss of expression would also be evident in one replicate. There was no indication that the TGD lines had lost PTP3 expression (see Author response table 2) and this is unlikely to explain the binding loss in a similar fashion to the Var19 parasites. Generally, the IT4 lines showed expression of both ptp3 genes and only in the Var19 parasites before panning were the transcript levels considerably lower:

      Author response table 1.

      Parent vs IT4-Var19 panned and unpanned

      Author response table 2.

      TGD lines with binding phenotype vs parent

      The absence of an influence of PTP3 on the binding phenotype in the cell lines in this manuscript (besides Var19) is further supported by its role in PfEMP1 surface display. Previous work has shown that KO of ptp3 leads to a loss of VAR2CSA surface display (Maier et al., Cell 2008). The unpanned Var19 parasite also lacked PfEMP1 surface display, and panning and the resulting appearance of the binding phenotype were accompanied by surface display of PfEMP1. As both the EMPIC3-TGD and TryThrA-TGD lines still had at least some PfEMP1 on the surface, this also (in addition to the RNA Seq above) speaks against PTP3 being the cause of the binding phenotype. The same applies to 3D7, which despite the poor binding displays PfEMP1 on the host cell surface (Figure 1D). This indicates that the binding phenotype in 3D7 is also not due to PTP3 expression loss, as this would have abolished PfEMP1 surface display.

      The idea about PTP3 paralogs for specific PfEMP1s is intriguing. In the future it might be interesting to test the frequency of parasites with two PTP3 paralogs in endemic settings and correlate it with the PfEMP1 repertoire, variant expression and potentially disease severity. 

      (5) The IT4var01 line shows substantially lower binding in Figure 5F compared with the data shown in Figure 4E and 6F. Does this reflect changes in the binding capacity of the line over time or is this variability inherent to the assay?

      There is some inherent variability in these assays. While we did not systematically assess this, we had no indication that this was due to the parasite line changing. The Var01 line was cultured for months and was frozen down and thawed more than once without a clear gradual trend for more or less binding. While we can’t exclude some variation from the parasite side, we suspect it is more a factor of the expression of the receptor on the CHO cells the iRBCs bind to. 

      Specifically, the assays in Fig. 6F and 4E mentioned by the reviewer both had an average binding to CD36 of around 1000 iE/mm², only the experiments in Fig. 5F are different (~ 500 iE/mm²) but these were done with a different batch of CHO cells at a different time to the experiments in Fig. 6F and 4E.

      (6) In Figure S7A, TryThrA and EMPIC3 show distinct localization as circles around the PfEMP1 signal while PeMP2 appears to co-localize with PfEMP1 or as immediately adjacent spots (strong colocalization is less apparent than SBP1, and the various PfEMP1 IFAs throughout the study). Does this indicate that TryThrA and EMPIC3 are peripheral MC proteins? Does this have any implications for their function in PfEMP1 binding? Some discussion would help as these differences are not mentioned in the text. For the EMPIC3 TGD IFAs, localization of SBP1 and PfEMP1 is noted to be normal but REX1 is not mentioned (although this also appears normal).

      We apologise for the lacking description of the candidate localisations and cursory description of the Maurer’s clefts phenotypes (next point). Our original intent was to not distract too much from the main flow of the manuscript as almost every part of the manuscript could be followed up with more details. However, we fully agree that this is unsatisfactory and now provided more description (this point) and more data (next point).

      Localisation of TryThrA and EMPIC3 compared to PfEMP1 at the Maurer’s clefts: the circular pattern is reminiscent of the results with Maurer’s clefts proteins reported by McMillan et al using 3D-SIM in 3D7 parasites (McMillan et al., Cell Microbiology 2014 (PMID: 23421990)). In that work SBP1 and MAHRP1 (both integral TMD proteins) were found in foci but REX1 (no TMD) in circular structures around these foci similar to what we observed here for TryThrA and EMPIC3 which both also lack a TMD. The SIM data in McMillan et al indicated that also PfEMP1 is “more peripheral”, although it did only partially overlap with REX1. The conclusion from that work was that there are sub-compartments at the Maurer’s clefts. In our IFAs (Fig. S7A) PfEMP1 is also only partially overlapping with the TryThrA and EMPIC3 circles, potentially indicating similar subcompartments to those observed by 3D-SIM. We agree with the reviewer that this might be indicative of peripheral MC proteins, fitting with a lack of TMD in these candidates, but we did not further speculate on this in the manuscript.

      We now added enlargements of the ring-like structures to better illustrate this observation in Fig. S7A. In addition, we now specifically mention the localization data and the ring like signal with TryThrA and EMPIC3 in the results and state that this may be similar to the observations by McMillan et al., Cell Microbiology 2014.

      We also thank the reviewer for pointing out that we had forgotten to mention REX1 in the EMPIC3-TGD; this was amended.

      (7) The atypical localization in TryThrA TGD line claimed for PfEMP1 and SBP1 in Fig S7B is not obvious. While most REX1 is clustered into a few spots in the IFA staining for SBP1 and REX1, SBP1 is only partially located in these spots and appears normal in the above IFA staining for SBP1 and HA. The atypical localization of PfEMP1-HA is also not obvious to me. The authors should clarify what is meant by "atypical" localization and provide support with quantification given the difference between the two SBP1 images shown.

      We apologise for the inadequate description of these IFA phenotypes. The abnormal signal for SBP1, REX1 and PfEMP1 in the TryThrA-TGD included two phenotypes found with all 3 proteins: 

      (1) a dispersed signal for these proteins in the host cell in addition to foci (the control and the other TGD parasites have only dots in the host cell with no or very little detectable dispersed signal). 

      (2) foci of disproportionally high intensity and size, that we assumed might be aggregation or enlargement of the Maurer’s clefts or of the detected proteins.

      The reason for the difference between the REX1 (aggregation) phenotype and the PfEMP1 and SBP1 (dispersed signal, more smaller foci) phenotypes in the images in Fig. S7B is that both phenotypes were seen with all 3 proteins but we chose a REX1 stained cell to illustrate the aggregation phenotype (the SBP1 signal in the same cell is similar to the REX1 signal, illustrating that this phenotype is not REX1 specific; please note that this cell also has a dispersed pool of REX1 and SBP1). 

      Based on the IFAs 66% (n = 106 cells) of the cells in the TryThrA-TGD parasites had one or both of the observed phenotypes. We did not include this into the previous version of the manuscript because a description would have required detouring from the main focus of this results section. In addition, IFAs have some limitations for accurate quantifications, particularly for soluble pools (depending on fixing efficiency and agent, more or less of a soluble pool in the host cell can leak out). 

      To answer the request to better explain and quantify the phenotype and given the limitations of IFA, we now transfected the TryThrA-TGD parasites with a plasmid mediating episomal expression of SBP1-mCherry, permitting live cell imaging and a better classification of the Maurer’s clefts phenotype. Due to the two SLI modifications in these parasites (using up 4 resistance markers) we had to use a new selection marker (mutated lactate transporter PfFNT, providing resistance to BH267.meta (Walloch et al., J. Med. Chem. 2020 (PMID: 32816478))) to transfect these parasites with an additional plasmid. 

      These results are now provided as Fig. S8 and detailed in the last results section. The new data shows that the majority of the TryThrA-TGD parasites contain a dispersed pool of SBP1 in the host cell. About a third of the parasites also showed disproportionally strong SBP1 foci that may be aggregates of the Maurer’s clefts. We also transfected the EMPIC3-TGD parasites with the FNT plasmid mediating episomal SBP1-mCherry expression and observed only few cells with a cytoplasmic pool or aggregates (Fig. S8). Overall these findings agree with the previous IFA results. As the IFA suggests similar results also for REX1 and PfEMP1, this defect is likely not SBP1 specific but more general (Maurer’s clefts morphology; association or transport of multiple proteins to the Maurer’s clefts). This gives a likely explanation for the cytoadherence phenotype in the TryThrA-TGD parasites. The reason for the EMPIC3-TGD phenotype remains to be determined as we did not detect obvious changes of the Maurer’s clefts morphology or in the transport of proteins to these structures in these experiments. 

      Minor comments

      (1) Italicized numbers in parentheses are present in several places in the manuscript but it is not clear what these refer to (perhaps differently formatted citations from a previous version of the manuscript). Figure 1 legend: (121); Figure S3 legend: (110), (111); Figure S6 legend: (66); etc.

      We thank the reviewer for pointing out this issue with the references; this was amended.

      (2) Figure 5A and legend: "BSD-R: BSD-resistance gene". Blasticidin-S (BS) is the drug while Blasticidin-S deaminase (BSD) is the resistance gene.

      We thank the reviewer for pointing this out; the legend and figure were changed.

      (3) Figure 5E legend: µ-SBP1-N should be α-SBP1-N.

      This was amended.

      (4) Figure S5 legend: "(Full data in Table S1)" should be Table S3.

      This was amended.

      (5) Figure S1G: The pie chart shows PF3D7_0425700 accounts for 43% of rif expression in 3D7var0425800 but the text indicates 62%.

      We apologize for this mistake; the text was corrected. We also improved the citations to Fig. S1G and H in this section.

      (6) "most PfEMP1-trafficking proteins show a similar early expression..." The authors might consider including a table of proteins known to be required for EMP1 trafficking and a graph showing their expression timing. Are any with later expressions known?

      Most exported proteins are expressed early, which is nicely shown in Marti et al 2004 (cited for the statement) in a graph of the expression timing of all PEXEL proteins (Fig. 4B in that paper). PNEPs also have a similar profile (Grüring et al 2011, also cited for that statement), further illustrated by using early expression as a criterion to find more PNEPs (Heiber et al., 2013 (PMID: 23950716)). Together this includes most if not all of the known PfEMP1 trafficking proteins. The originally co-submitted paper (Blancke-Soares & Stäcker et al., eLife preprint doi.org/10.7554/eLife.103633.1) analysed several later expressed exported proteins (Pf332, MSRP6) but their disruption, while influencing Maurer’s clefts morphology and anchoring, did not influence PfEMP1 transport. However, there are some conflicting results for Pf332 (referenced in Blancke-Soares & Stäcker et al). This illustrates that it may not be so easy to decide which proteins are bona fide PfEMP1 trafficking proteins. We therefore did not add a table and hope it is acceptable for the reader to rely on the provided 3 references to back this statement.

      (7)  Figure S1J: The predominate var in the IT4 WT parent is var66 (which appears to be syntenic with Pf3D7_0809100, the predominate var in the 3D7 WT parent). Is there something about this locus or parasite culture conditions that selects for these vars in culture? Is this observed in other labs as well?

      This is a very interesting point (although we are not certain these vars are indeed syntenic, they are on different chromosomes). As far as we know at least Pf3D7_0809100 is commonly a dominant var transcribed in other labs and was found expressed also in sporozoites (Zanghì et al. Cell Rep. 2018). However, it is unclear how uniform this really is. For IT4 we do not know in full but have also here commonly observed centromeric var genes to be dominating transcripts in unselected parasite cultures. It is possible that transcription drifts to centromeric var genes in cultured parasites. However, given the anecdotal evidence, it is unknown to which extent this is related to an inherent switching and regulation regime or a consequence of faulty regulation following prolonged culturing.

      (8) Figure 4B, C: Presumably the asterisks on the DNA gels indicate non-specific bands but this is not described in the legend. Why are non-specific bands not consistent between parent and integrated lanes?

      We apologize for not mentioning this in the legend; this has been amended.

      It is not clear why the non-specific bands differ between the lines but in part this might be due to different concentrations and quality of DNA preps. A PCR can also behave differently depending on whether the correct primer target is present or not. If present, the PCR will run efficiently and other spurious products will be outcompeted, but in absence of the correct target, they might become detectable.  

      Overall, we do not think the non-specific bands indicate anything untoward with the lines. For instance, in Fig. 4B the high band in the 5’ integration lane of the IT4 line (which does not occur anywhere else) cannot be due to a genomic change, as this is the parental line and does not contain the plasmid for integration. In the same gel, the ori locus band of incorrect size (likely due to cross-reaction of the primers with another var gene, which is not always fully avoidable given the high similarity of the ATS region) is present in both the parent IT4 and the integrant line and is therefore also not of concern. In C there are a couple of bands of incorrect size in the integration line. One of these is very faint, and both are too large; they are therefore likely other vars that are inefficiently picked up by these primers. They are not seen in the parent line because there the correct primer binding site is present, which efficiently produces a product that outcompetes the products derived from non-optimally matching primers; these products hence appear only in the Int line, where the correct match is no longer present. For these reasons we believe these bands are not of any concern.

      (9) Figure 4C: Is there a reason KAHRP was used as a co-marker for the IFA detecting IT4var19 expression instead of SBP1 which was used throughout the rest of the study?

      This is a coincidence as this line was tested when other lines were tested for KAHRP. As there were foci in the host cell we were satisfied that the HA-tagged PfEMP1 is produced and the localization deemed plausible. 

      (10) Figure 6: Streptavidin labeling for the IT4var01-BirA position 3 line is substantially less than the other two lines in both IFA and WB. Does the position 3 fusion reduce PfEMP1 protein levels or is this a result of the context or surface display of the fusion? Interestingly, the position 3 trypsin cleavage product appears consistently more robust compared with the other two configurations. Does this indicate that positioning BirA upstream of the TM increases RBC membrane insertion and/or makes the surface localized protein more accessible to trypsin?

      It is possible that RBC membrane insertion or trypsin accessibility is increased for the position 3 construct. But there could also be other explanations:

      The reason for the more robustly detected protected fragment for the position 3 construct in the WB might also be its smaller size (in contrast to the other two versions, it does not contain BirA*) which might permit more efficient transfer to the WB membrane. In that case the more robust band might not (only) be due to better membrane insertion or better trypsin accessibility.

      The lower biotinylation signal with the position 3 construct might also be explained by the farther distance of BirA* to the ATS (compared to position 1 and 2), the region where interactors are expected to bind. The position 1 and 2 constructs may therefore generally be more efficient (as closer) to biotinylate ATS proximal proteins. Further, in the final destination (PfEMP1 inserted into the RBC membrane) BirA* would be on the other side of the membrane in the position 3 construct while in the position 1 and 2 constructs BirA* would be on the side of the membrane where the ATS anchors PfEMP1 in the knob structure. In that case, labelling with position 3 would come from interactions/proximities during transport or at the Maurer’s clefts (if there indeed PfEMP1 is not membrane embedded) and might therefore be less.

      Hence, while alterations in trypsin accessibility and RBC membrane insertion are possible explanations, other explanations exist. At present, we do not know which of these explanations apply and therefore did not mention any of them in the manuscript. 

      Reviewer #3 (Recommendations for the authors):

      (1) In the abstract and on page 8, the authors mention that they generate cell lines binding to "all major endothelial receptors" and "all known major receptors". This is a pretty all-encompassing statement that might not be fully accepted by others who have reported binding to other receptors not considered in this paper (e.g. VCAM, TSP, hyaluronic acid, etc). It would be better to change this statement to something like "the most common endothelial receptors" or "the dominant endothelial receptors", or something similar.

      We agree with the reviewer that these statements are too all-encompassing and changed them to “the most common endothelial receptors” (introduction) and “the most common receptors” (results).

      (2) The authors targeted two rif genes for activation and in each case the gene became the most highly expressed member of the family. However, unlike var genes, there were other rif genes also expressed in these lines and the activated copy did not always make up the majority of rif mRNAs. The authors might wish to highlight that this is inconsistent with mutually exclusive expression of this gene family, something that has been discussed in the past but not definitively shown.

      We thank the reviewer for highlighting this. We have now added the following statement to this section: “While SLI-activation of rif genes also led to the dominant expression of the targeted rif gene, other rif genes still took up a substantial proportion of all detected rif transcripts, speaking against a mutually exclusive expression in the manner seen with var genes.”

      (3) In Figure 6, H-J, the authors display volcano plots showing proteins that are thought to interact with PfEMP1. These are labeled with names from the literature, however, several are named simply "1, 2, 3, 4, 5, or 6". What do these numbers stand for?

      We apologize for not clarifying this and thank the reviewer for pointing this out. There is a legend for the numbered proteins in what is now Table S4 (previously Table S3). We have now amended the legend of Figure 6 to explain the numbers and to point the reader to Table S4 for the accessions.

    1. Reviewer #1 (Public review):

      Summary:

      This unique study reports original and extensive behavioral data collected by the authors on 21 living mammal taxa in zoo conditions (primates, tree shrew, rodents, carnivorans, and marsupials) on how descent along a vertical substrate can be done effectively and securely using gait variables. Ten morphological variables reflecting head size and limb proportions are examined in relationship to vertical descent strategies and then applied to reconstruct modes of vertical descent in fossil mammals.

      Strengths:

      This is a broad and data-rich comparative study, which requires a good understanding of the mammal groups being compared and how they are interrelated, the kinematic variables that underlie the locomotion used by the animals during vertical descent, and the morphological variables that are associated with vertical descent styles. Thankfully, the study presents data in a cogent way with clear hypotheses at the beginning, followed by results and a discussion that addresses each of those hypotheses using the relevant behavioral and morphological variables, always keeping in mind the relationships of the mammal groups under investigation. As pointed out in the study, there is a clear phylogenetic signal associated with vertical descent style. Strepsirrhine primates much prefer descending tail first, platyrrhine primates descend sideways when given a choice, whereas all other mammals (with the exception of the raccoon) descend head first. Not surprisingly, all mammals descending a vertical substrate do so in a more deliberate way, by reducing speed, and by keeping the limbs in contact for a longer period (i.e., higher duty factors).

    2. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment:

      This valuable study examines how mammals descend effectively and securely along vertical substrates. The conclusions from comparative analyses based on behavioral data and morphological measurements collected from 21 species across a wide range of taxa are convincing, making the work of interest to all biologists studying animal locomotion.

      We would like to greatly thank the two reviewers for their time in reviewing this work, and for their valuable comments and suggestions that will help to improve this manuscript.

      Overall, we agree with the weaknesses raised, which are mainly areas for consideration in future studies: to study more species, and in a natural habitat context.

      We will nevertheless add a few modifications to improve the manuscript, notably by making certain figures more readable, and adding definitions and bibliography in the main text concerning gait characteristics.

      We also provide brief comments on each point of weakness raised by the reviewers below, in blue.

      Reviewer #1 (Public review):

      Summary:

      This unique study reports original and extensive behavioral data collected by the authors on 21 living mammal taxa in zoo conditions (primates, tree shrew, rodents, carnivorans, and marsupials) on how descent along a vertical substrate can be done effectively and securely using gait variables. Ten morphological variables reflecting head size and limb proportions are examined in relationship to vertical descent strategies and then applied to reconstruct modes of vertical descent in fossil mammals.

      Strengths:

      This is a broad and data-rich comparative study, which requires a good understanding of the mammal groups being compared and how they are interrelated, the kinematic variables that underlie the locomotion used by the animals during vertical descent, and the morphological variables that are associated with vertical descent styles. Thankfully, the study presents data in a cogent way with clear hypotheses at the beginning, followed by results and a discussion that addresses each of those hypotheses using the relevant behavioral and morphological variables, always keeping in mind the relationships of the mammal groups under investigation. As pointed out in the study, there is a clear phylogenetic signal associated with vertical descent style. Strepsirrhine primates much prefer descending tail first, platyrrhine primates descend sideways when given a choice, whereas all other mammals (with the exception of the raccoon) descend head first. Not surprisingly, all mammals descending a vertical substrate do so in a more deliberate way, by reducing speed, and by keeping the limbs in contact for a longer period (i.e., higher duty factors).

      Weaknesses:

      The different gait patterns used by mammals during vertical descent are a bit more difficult to interpret. It is somewhat paradoxical that asymmetrical gaits such as bounds, half bounds, and gallops are more common during descent since they are associated with higher speeds and lower duty factors. Also, the arguments about the limb support polygons provided by DSDC vs. LSDC gaits apply for horizontal substrates, but perhaps not as much for vertical substrates.

      We analyzed gait patterns using methods commonly found in the literature and discussed our results accordingly. However, the study of limb support polygons was indeed developed specifically for locomotion on horizontal supports and may not be applicable to vertical locomotion, which is in fact a type of locomotion shared by all arboreal species. In the future, it would be interesting to consider new methods for analyzing vertical gaits.

      The importance of body mass cannot be overemphasized as it affects all aspects of an animal's biology. In this case, larger mammals with larger heads avoid descending head-first. Variation in trunk/tail and limb proportions also covaries with different vertical descent strategies. For example, a lower intermembral index is associated with tail-first descent. That said, the authors are quick to acknowledge that the five lemur species of their sample are driving this correlation. There is a wide range of intermembral indices among primates, and this simple measure of forelimb over hindlimb has vital functional implications for locomotion: primates with relatively long hindlimbs tend to emphasize leaping, primates with more even limb proportions are typically pronograde quadrupeds, and primates with relatively long forelimbs tend to emphasize suspensory locomotion and brachiation. Equally important is the fact that the intermembral index has been shown to increase with body mass in many primate families as a way to keep functional equivalence for (ascending) climbing behavior (see Jungers, 1985). Therefore, the manner in which a primate descends a vertical substrate may just be a by-product of limb proportions that evolved for different locomotor purposes. Clearly, more vertical descent data within a wider array of primate intermembral indices would clarify these relationships. Similarly, vertical descent data for other primate groups with longer tails, such as arboreal cercopithecoids, and particularly atelines with very long and prehensile tails, should provide more insights into the relationship between longer tail length and tail-first descent observed in the five lemurs. The relatively longer hallux of lemurs correlates with tail-first descent, whereas the more evenly grasping autopods of platyrrhines allow for all four limbs to be used for sideways descent. In that context, the pygmy loris offers a striking contrast. Here is a small primate equipped with four pincer-like, highly grasping autopods and a tail reduced to a short stub. Interestingly, this primate is unique within the sample in showing the strongest preference for head-first descent, just like other non-primate mammals. Again, a wider sample of primates should go a long way in clarifying the morphological and behavioral relationships reported in this study.

      We agree with this statement. In the future, we plan to study other species, particularly large-bodied ones with varied intermembral indexes.

      Reconstruction of the ancient lifestyles, including preferred locomotor behaviors, is a formidable task that requires careful documentation of strong form-function relationships from extant species that can be used as analogs to infer behavior in extinct species. The fossil record offers challenges of its own, as complete and undistorted skulls and postcranial skeletons are rare occurrences. When more complete remains are available, the entire evidence should be considered to reconstruct the adaptive profile of a fossil species rather than a single ("magic") trait.

      We completely agree with this, and we would like to emphasize that our intention here was simply to conduct a modest inference test, the purpose of which is to provide food for thought for future studies, and whose results should be considered in light of a comprehensive evolutionary model.

      Reviewer #2 (Public review):

      Summary:

      This paper contains kinematic analyses of a large comparative sample of small to medium-sized arboreal mammals (n = 21 species) traveling on near-vertical arboreal supports of varying diameter. These data are paired with morphological measures from the extant sample to reconstruct potential behaviors in a selection of fossil euarchontoglires. This research is valuable to anyone working in mammal locomotion and primate evolution.

      Strengths:

      The experimental data collection methods align with best research practices in this field and are presented with enough detail to allow for reproducibility of the study as well as comparison with similar datasets. The four predictions in the introduction are well aligned with the design of the study to allow for hypothesis testing. Behaviors are well described and documented, and Figure 1 does an excellent job in conveying the variety of locomotor behaviors observed in this sample. I think the authors took an interesting and unique angle by considering the influence of encephalization quotient on descent and the experience of forward pitch in animals with very large heads.

      Weaknesses:

      The authors acknowledge the challenges that are inherent with working with captive animals in enclosures and how that might influence observed behaviors compared to these species' wild counterparts. The number of individuals per species in this sample is low; however, this is consistent with the majority of experimental papers in this area of research because of the difficulties in attaining larger sample sizes.

      Yes, that is indeed the main cost/benefit trade-off with this type of study. Working with captive animals allows for large comparative studies, but there is a risk of variations in locomotor behavior among individuals in the natural environment, as well as few individuals per species in the dataset. That is why we plan and encourage colleagues to conduct studies in the natural environment to compare with these results. However, this type of study is very time-consuming and requires focusing on a single species at a time, which limits the comparative aspect.

      Figure 2 is difficult to interpret because of the large amount of information it is trying to convey.

      We agree that this figure is dense. One possible solution would be to combine species by phylogenetic groups to reduce the amount of information, as we did with Fig. 3 on the dataset relating to gaits. However, we believe that this would be unfortunate in the case of speed and duty factor because we would have to provide the complete figure in SI anyway, as the species-level information is valuable. We therefore prefer to keep this comprehensive figure here and we will enlarge the data points to improve their visibility, and provide the figure with a sufficiently high resolution to allow zooming in on the details.

      Reviewer #1 (Recommendations for the authors):

      As indicated in the first section above, this is a strong comparative study that addresses important questions, relative to the evolution of arboreal locomotion in primates and close mammal relatives. My recommendations should be taken in the context of improving a manuscript that is already generally acceptable.

      (1) The terms symmetrical and asymmetrical gaits should be briefly defined in the main text (not just in the Methods section) by citing work done by Hildebrand and other relevant studies. To that effect, the statement on lines 96-97 about the convergence of symmetrical gaits is unclear. What does "Symmetrical gaits have evolved convergently in rodents, scandentians, carnivorans, and marsupials" mean? Symmetrical gaits such as the walk, run, trot, etc., are pretty much the norm in most mammals and were likely found in metatherians and basal eutherians. This needs clarification. On line 239, the term "ambling" is used in the context of related asymmetrical gaits. To be clear, the amble is a type of running gait involving no whole-body aerial phase and is therefore a symmetrical gait (see Schmitt et al., 2006).

      We have added a definition of the terms symmetrical and asymmetrical gaits and added references in the introduction such as: “Symmetrical gaits are defined as locomotor patterns in which the footfalls of a girdle (a pair of fore- or hindlimbs) are evenly spaced in time, with the right and left limbs of a pair of limbs being approximately 50% out of phase with each other (Hildebrand, 1966, 1967). Symmetrical gaits can be further divided into two types: diagonal-sequence gaits, in which a hindlimb footfall is followed by that of the contralateral forelimb, and lateral-sequence gaits, in which a hindlimb footfall is followed by that of the ipsilateral forelimb (Hildebrand, 1967; Shapiro and Raichlen, 2005; Cartmill et al., 2007b). In contrast, asymmetrical gaits are characterized by unevenly spaced footfalls within a girdle, with the right and left limbs moving in near synchrony (Hildebrand, 1977).” Now found in lines 87-94.
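
      The definitions quoted above can be illustrated with a minimal sketch. It assumes footfall times expressed as fractions of one stride cycle and a hypothetical tolerance of 0.1 around the 50% phase offset; neither the function name nor the threshold comes from the manuscript, and simultaneous fore- and hindlimb footfalls (as in a trot or pace) are simply reported as undetermined.

```python
def classify_gait(lh, lf, rh, rf, tolerance=0.1):
    """Classify one stride from touchdown times given as fractions of the cycle.

    lh, lf, rh, rf: left hind, left fore, right hind, right fore touchdowns.
    tolerance: hypothetical allowed deviation from a 0.5 left/right phase offset.
    """
    # Phase offset between the left and right limb of each pair (girdle).
    hind_phase = (rh - lh) % 1.0
    fore_phase = (rf - lf) % 1.0
    if abs(hind_phase - 0.5) > tolerance or abs(fore_phase - 0.5) > tolerance:
        return "asymmetrical"

    # Delay from the left hindlimb touchdown to each forelimb touchdown.
    ipsilateral_delay = (lf - lh) % 1.0
    contralateral_delay = (rf - lh) % 1.0
    if min(ipsilateral_delay, contralateral_delay) < 1e-9:
        # A forelimb lands together with the hindlimb: sequence not defined here.
        return "symmetrical (sequence undetermined)"
    if ipsilateral_delay < contralateral_delay:
        return "lateral-sequence"
    return "diagonal-sequence"


lateral = classify_gait(lh=0.0, lf=0.25, rh=0.5, rf=0.75)
diagonal = classify_gait(lh=0.0, lf=0.75, rh=0.5, rf=0.25)
bound_like = classify_gait(lh=0.0, lf=0.5, rh=0.1, rf=0.6)
```

      In this sketch the first stride is classified as lateral-sequence, the second as diagonal-sequence, and the third, in which left and right limbs of each pair land in near synchrony, as asymmetrical.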

      We corrected the sentence such as “Symmetrical gaits are also common in rodents, scandentians, etc..” Now found in line 107.

      Thank you for pointing this out. We indeed did not use the right term to mention related asymmetrical gaits with increased duty factors. We removed the term « ambling » and the associated reference here. Now found in line 256.

      (2) Correlations are used in the paper to examine how brain mass scales with body mass. It is correct to assume that a correlation significantly different from 0 is indicative of allometry (in this case, positive). That said, lines are used in Figure S2 that go through the bivariate scatter plot. The vast majority of scaling studies rely on regression techniques to calculate and compare slopes, which are different statistically from correlations. In this case, a slope not significantly different from 1.0 would support the hypothesis of isometry based on geometric similarity (as brain mass and body mass are two volumes). The authors could refer to the work of Bob Martin and the 1985 edited book by Jungers and contributions therein. These studies should also be cited in the paper.

      Thank you for recommending this better-suited method. We replaced the correlations with major axis orthogonal regressions, as recommended by Martin and Barbour 1989. We found a positive slope for all species that is significantly different from 1 (0.36), indicating a negative allometry (we realized we were mistaken about the allometry terminology, initially reporting a “positive allometry” instead of a positive correlation).

      We corrected in the manuscript in the Results and Methods sections, and cited Martin and Barbour 1989 such as:

      “To ensure that the EQs of the different species studied are comparable and meaningful, we tested the allometry between the brain and body masses in our dataset following [84] and found a significant and positive slope for all species (major axis orthogonal regression on log transformed values: slope = 0.36, r<sup>2</sup> = 0.92, p = 5.0 × 10<sup>-12</sup>), indicating a negative allometry (r = 0.97, df = 19, p = 2.0 × 10<sup>-13</sup>), and similar allometric coefficients when restricting the analysis to phylogenetic groups (Fig. S2).” Now found in lines 289-298.

      - “To control that brain allometry is homogeneous among all phylogenetic groups, to be able to compare EQ between species, we computed major axis orthogonal regressions, following the recommendation of Martin and Barbour [84], between the Log transformed brain and body masses, over all species and by phylogenetic group using the sma package in R (Fig. S2).” Now found in lines 336-338.

      We also changed Figure S2 in Supplementary Information accordingly.
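
      For readers unfamiliar with the method, a major axis (orthogonal) regression slope on log-transformed masses can be sketched as follows. This is an illustration only, not the authors' R analysis: the function name is hypothetical and the masses are synthetic values constructed to lie exactly on a power law with exponent 0.36.

```python
import math


def major_axis_slope(x, y):
    """Return (slope, intercept) of the major axis (orthogonal) regression line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
    s_yy = sum((yi - mean_y) ** 2 for yi in y) / (n - 1)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
    if s_xy == 0:
        raise ValueError("slope is undefined when the covariance is zero")
    # Closed-form slope of the first principal axis of the 2x2 covariance matrix.
    slope = (s_yy - s_xx + math.sqrt((s_yy - s_xx) ** 2 + 4 * s_xy ** 2)) / (2 * s_xy)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic (not measured) masses in grams: brain = 0.1 * body ** 0.36.
body_mass_g = [10.0, 100.0, 1000.0, 10000.0]
brain_mass_g = [0.1 * m ** 0.36 for m in body_mass_g]

log_body = [math.log10(m) for m in body_mass_g]
log_brain = [math.log10(b) for b in brain_mass_g]
slope, intercept = major_axis_slope(log_body, log_brain)
```

      On log-log axes a slope below 1, as recovered here for the synthetic data, corresponds to negative allometry, whereas a slope of 1 would indicate isometry between the two masses.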

      (3) Trunk length is used as the denominator for many of the indices used in the study. In this way, trunk length is considered to be a proxy for body size. There should be a demonstration that trunk length scales isometrically with body mass in all of the mammals compared. If not the case, some of the indices may not be directly comparable.

      We did not use trunk length as a proxy for body mass, but to compute geometric body proportions in order to test whether intrinsic body proportions could be related to vertical descent behaviors, namely the length of the tail and of the fore- and hindlimbs relative to the animal. We chose those indices to quantify the capability of limbs to act as levers or counterweights to rotate the animals for this specific question of vertical descent behavior. We therefore do not think that body mass allometry with respect to trunk length is relevant to compare these indices across species here. Also, we don’t expect that trunk length (which is a single dimension) would scale isometrically with body mass, which scales more as a volume.

      (4) Given the numerous comparisons done in this study, a Bonferroni correction method should be considered to mitigate type I error (accepting a false positive).

      We had already corrected all our statistical tests using the Benjamini-Hochberg method to control for false positives; see the SuppTables Excel file for the complete results of the statistical analyses. We chose this method over the Bonferroni correction because the more modern and balanced Benjamini-Hochberg procedure is better suited for analyses involving a large number of hypotheses.
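
      The difference between the two corrections can be sketched in a few lines. This is a generic illustration with made-up p-values, not the authors' analysis; the function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def bonferroni(p_values):
    """Return Bonferroni adjusted p-values (each multiplied by the test count)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Made-up p-values for illustration only.
raw_p = [0.01, 0.04, 0.03, 0.005]
bh_adjusted = benjamini_hochberg(raw_p)
bonferroni_adjusted = bonferroni(raw_p)
```

      With these illustrative values, all four tests stay below 0.05 after the Benjamini-Hochberg adjustment, while only two do after the Bonferroni adjustment, which shows why the latter is considered more conservative when many hypotheses are tested.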

      (5) The terms "arm" and "leg" used in the main text and Table 1 are anatomically incorrect. Instead, the terms "forelimb" and "hindlimb" should be used as they include the length sum of the stylopod, zeugopod, and autopod.

      Indeed, thank you for pointing that out. We have corrected this error within the manuscript as well as in Figures 4 and S3.

      (6) On p. 14, the authors make the statement that the postcranial anatomy of Adapis and Notharctus remains undescribed. The authors should consult the work of Dagosto, Covert, Godinot and others.

      We did not state that the postcranial remains of Adapis and Notharctus have not been described. However, we were unfortunately unable to find published illustrations of the known postcranial elements that could be reliably used in this study. To avoid any misunderstanding, we replaced the sentence with: “However, we could not find suitable illustrations of the known postcranial elements of these species in the literature that could be reliably incorporated into this study. Thus, we only included their reconstructed body mass and EQ,..”. Now found in lines 393-397.

      Reviewer #2 (Recommendations for the authors):

      (1) Line 65/69 - Perchalski et al. 2021 is a single-author publication, so no et al. or w/ colleagues.

      Indeed. This has been corrected in the manuscript, now found in lines 65 and 70.

      (2) Lines 96-98 - Is it appropriate to say that the use of symmetrical gaits are examples of convergent evolution? There's less burden of evidence to state that these are shared behaviors, rather than suggesting they independently evolved across all those groups.

      We agree with this and corrected the sentence such as “Symmetrical gaits are also common in rodents, scandentians, etc..” Now found in line 107.

      (3) Line 198 - I am confused by how to interpret (-16,36 %) compared to how other numbers are presented in the rest of the paragraph.

      To avoid confusion, we rephrased this sentence as follows: “In contrast, primates did not significantly reduce their speed compared to ascents when descending sideways or tail-first (Fig. 2A, SuppTables B).” Now found in lines 207-209.

    1. In religious art, a halo often hovers over a saint's head, bathing the individual in a heavenly light to create the impression that that person is good.

      This is interesting because the halo effect was not only expressed in literature but also deliberately used in other areas like art.

    1. Reviewer #3 (Public review):

      The goal of the work is to establish the linkage between the spatial transcription factors (STFs) that function transiently to establish the identities of the individual NBs and the terminal selector genes (typically homeodomain genes) that appear in the newborn post-mitotic neurons. How is the identity of the NB maintained and carried forward after the spatial genes have faded away? Focusing on a single neuroblast (NB 7-1), the authors present evidence that the fork-head transcription factor, fd4, provides a bridge linking the transient spatial cues that initially specified neuroblast identity with the terminal selector genes that establish and maintain the identity of the stem cell's progeny.

      The study is systematic, concise, and takes full advantage of 40+ years of work on the molecular players that establish neuronal identities in the Drosophila CNS. In the embryonic VNC, fd4 is expressed only in the NB 7-1 and its lineage. They show that Fd4 appears in the NB while the latter is still expressing the Spatial Transcription Factors and continues after the expression of the latter fades out. Fd4 is maintained through the early life of the neuronal progeny but then declines as the neurons turn on their terminal selector genes. Hence, fd4 expression is compatible with it being a bridging factor between the two sets of genes.

      Experimental support for the "bridging" role of Fd4 comes from a set of loss-of-function and gain-of-function manipulations. The loss of function of Fd4, and the partially redundant gene Fd5, from lineage 7-1 does not affect the size of the lineage, but terminal markers of late-born neuronal phenotypes, like Eve and Dbx, are reduced or missing. By contrast, ectopic expression of fd4, but not fd5, results in ectopic expression of the terminal markers eve and Dbx throughout diverse VNC lineages.

      A detailed test of fd4's expression was then carried out using lineages 7-3 and 5-6, two well-characterized lineages in Drosophila. Lineage 7-3 is much smaller than 7-1 and continues to be so when subjected to fd4 misexpression. However, under the influence of ectopic Fd4 expression, the lineage 7-3 neurons lost their expected serotonin and corazonin expression and showed Eve expression as well as motoneuron phenotypes that partially mimic the U motoneurons of lineage 7-1.

      Ectopic expression of Fd4 also produced changes in the 5-6 lineage. Expression of apterous, a feature of lineage 5-6, was suppressed, and expression of the 7-1 marker, Eve, was evident. Dbx expression was also evident in the transformed 5-6 lineages, but extremely restricted as compared to a normal 7-1 lineage. Considering the partial redundancy of fd4 and fd5, it would have been interesting to express both genes in the 5-6 lineage. The anatomical changes that are exhibited by motoneurons in response to Fd4 expression confirm that these cells do, indeed, show a shift in their cellular identity.

    2. Author response:

      Reviewer #1 (Public Review):

      Lai and Doe address the integration of spatial information with temporal patterning and genes that specify cell fate. They identify the Forkhead transcription factor Fd4 as a lineage-restricted cell fate regulator that bridges transient spatial transcription factors to terminal selector genes in the developing Drosophila ventral nerve cord. The experimental evidence convincingly demonstrates that Fd4 is both necessary for late-born NB7-1 neurons and sufficient to transform other neural stem cell lineages toward the NB7-1 identity. This work addresses an important question that will be of interest to developmental neurobiologists: How can cell identities defined by initial transient developmental cues be maintained in the progeny cells, even if the molecular mechanism remains to be investigated? In addition, the study proposes a broader concept of lineage identity genes that could be utilized in other lineages and regions in the Drosophila nervous system and in other species.

      Thanks for the accurate summary and positive comments!

      While the spatial factors patterning the neuroepithelium to define the neuroblast lineages in the Drosophila ventral nerve cord are known, these factors are sometimes absent or not required during neurogenesis. In the current work, Lai and Doe identified Fd4 in the NB7-1 lineage that bridges this gap and explains how NB7-1 neurons are specified after Engrailed (En) and Vnd cease their expression. They show that Fd4 is transiently co-expressed with En and Vnd and is present in all nascent NB7-1 progenies. They further demonstrate that Fd4 is required for later-born NB7-1 progenies and sufficient for the induction of NB7-1 markers (Eve and Dbx) while repressing markers of other lineages when force-expressed in neural progenitors, e.g., in the NB5-6 lineage and in the NB7-3 lineage. They also demonstrate that, when Fd4 is ectopically expressed in NB7-3 and NB5-6 lineages, this leads to the ectopic generation of dorsal muscle-innervating neurons. The inclusion of functional validation using axon projections demonstrates that the transformed neurons acquire appropriate NB7-1 characteristics beyond just molecular markers. Quantitative analyses are thorough and well-presented for all experiments.

      Thanks for the positive comments!

      (1) While Fd4 is required and sufficient for several later-born NB7-1 progeny features, a comparison between early-born (Hb/Eve) and later-born (Run/Eve) appears missing for pan-progenitor gain of Fd4 (with sca-Gal4; Figure 4) and for the NB7-3 lineage (Figure 6). Having a quantification for both could make it clearer whether Fd4 preferentially induces later-born neurons or is sufficient for NB7-1 features without temporal restriction.

      We quantified the percentage of Hb+ and Runt+ cells among Eve+ cells with sca-gal4, and the results are shown in Figure 4-figure supplement 1. We found that the proportion of early-born cells is slightly reduced but the proportion of later-born cells remains similar. Interestingly, we also found a subset of Eve+ cells with a mixed fate (Hb+Runt+), but the reason remains unclear.

      (2) Fd4 and Fd5 are shown to be partially redundant, as Fd4 loss of function alone does not alter the number of Eve+ and Dbx+ neurons. This information is critical and should be included in Figure 3.

      Because every hemisegment in an fd4 single mutant is normal, we just added it as the following text: “In fd4 mutants, we observe no change in the number of Eve+ neurons or Dbx+ neurons (n=40 hemisegments).”

      (3) Several observations suggest that lineage identity maintenance involves both Fd4-dependent and Fd4-independent mechanisms. In particular, the fact that fd4-Gal4 reporter remains active in fd4/fd5 mutants even after Vnd and En disappear indicates that Fd4's own expression, a key feature of NB7-1 identity, is maintained independently of Fd4 protein. This raises questions about what proportion of lineage identity features require Fd4 versus other maintenance mechanisms, which deserves discussion.

      We agree, thanks for raising this point. We have added the following text to the Discussion: “Interestingly, the fd4 fd5 mutant maintains expression of fd4:gal4, suggesting that the fd4/fd5 locus may have established a chromatin state that allows “permanent” expression in the absence of Vnd, En, and Fd4/Fd5 proteins.”

      (4) Similarly, while gain of Fd4 induces NB7-1 lineage markers and dorsal muscle innervation in NB5-6 and NB7-3 lineages, drivers for the two lineages remain active despite the loss of molecular markers, indicating some regulatory elements retain activity consistent with their original lineage identity. It is therefore important to understand the degree of functional conversion in the gain-of-function experiments. Sparse labeling of Fd4 overexpressing NB5-6 and NB7-3 progenies, as was done in Seroka and Doe (2019), would be an option.

      We agree it is interesting that the NB7-3 and NB5-6 drivers remain on following Fd4 misexpression. To explore this, we used sca-gal4 to overexpress Fd4 and observed that Lbe expression persisted while Eg was largely repressed (see Author response image 1 below). The results show that Lbe and Eg respond differently to Fd4. A non-mutually exclusive possibility is that the continued expression of lbe-Gal4 UAS-GFP or eg-Gal4 UAS-GFP may be due to the lengthy perdurance of both Gal4 and GFP.

      Author response image 1.

      (5) The less-penetrant induction of Dbx+ neurons in NB5-6 with Fd4-overexpression is interesting. It might be worth the authors discussing whether it is an Fd4 feature or an NB5-6 feature by examining Dbx+ neuron number in NB7-3 with Fd4-overexpression.

      In the NB7-3 lineages misexpressing Fd4, only 5 lineages generated Dbx+ cells (0.1±0.4, n=64 hemisegments), suggesting that the low penetrance of Dbx+ induction is an intrinsic feature of Fd4 rather than lineage context. We have added this information in the results section. 

      (6) It is logical to hypothesize that spatial factors specify early-born neurons directly, so only late-born neurons require Fd4, but it was not tested. The model would be strengthened by examining whether Fd4-Gal4-driven Vnd rescues the generation of later-born neurons in fd4/fd5 mutants.

      When we used the en-gal4 driver to express UAS-vnd in the fd4/fd5 mutant background, we found an average of 7.4±2.2 Eve+ cells per hemisegment (n=36), significantly higher than in the fd4/fd5 mutant alone (3.9±0.8 cells, n=52, p=2.6x10<sup>-11</sup>) (Figure 3J). In addition, 0.2±0.5 Eve+ cells were ectopic Hb+ (excluding U1/U2), indicating that Vnd-En integration is sufficient to generate both early-born and late-born Eve+ cells in the fd4/fd5 mutants. We have added the results to the text.

      (7) It is mentioned that Fd5 is not sufficient for the NB7-1 lineage identity. The observation is intriguing in how similar regulators serve distinct roles, but the data are not shown. The analysis in Figure 4 should be performed for Fd5 as supplemental information.

      Thanks for the suggestion. Because the results are exactly the same as the wild type, we don’t think it is necessary to provide additional images or analysis as supplemental information.

      Reviewer #2 (Public review):

      Via a detailed expression analysis, they find that Fd4 is selectively expressed in embryonic NB7-1 and newly born neurons within this lineage. They also undertake a comprehensive genetic analysis to provide evidence that fd4 is necessary and sufficient for the identity of NB7-1 progeny. 

      Thanks for the accurate summary!

      The analysis is both careful and rigorous, and the findings are of interest to developmental neurobiologists interested in molecular mechanisms underlying the generation of neuronal diversity. Great care was taken to make the figures clear and accessible. This work takes great advantage of years of painstaking descriptive work that has mapped embryonic neuroblast lineages in Drosophila. 

      Thanks for the positive comments!

      The argument that Fd4 is necessary for NB7-1 lineage identity is based on a Fd4/Fd5 double mutant. Loss of fd4 alone did not alter the number of NB7-1-derived Eve+ or Dbx+ neurons. The authors clearly demonstrate redundancy between fd4 and fd5, and the fact that the LOF analysis is based on a double mutant should be better woven through the text.

      The authors generated an Fd5 mutant. I assume that Fd5 single mutants do not display NB7-1 lineage defects, but this is not stated. The focus on Fd4 over Fd5 is based on its highly specific expression profile and the dramatic misexpression phenotypes. But the LOF analysis demonstrates redundancy, and the conclusions in the abstract and through the results should reflect the existence of Fd5 in the conclusions of this manuscript.

      We agree, and have added new text to clarify the single mutant phenotypes (there are none) and the double mutant phenotype (loss of NB7-1 molecular and morphological features). The following text is added to the manuscript: “Not surprisingly, we found that fd4 single mutants or fd5 single mutants had no phenotype (Eve+ neurons were all normal). Thus, to assess their roles, we generated a fd4 and fd5 double mutant. Because many Eve+ and Dbx+ cells are generated outside of NB7-1 lineage, it was also essential to identify the Eve+ or Dbx+ cells within NB7-1 lineage in wild type and fd4 mutant embryos. To achieve this, we replaced the open reading frame of fd4 with gal4 (called fd4-gal4) (see Methods); this stock simultaneously knocked out both fd4 and fd5 (called fd4/fd5 mutant hereafter) while specifically labeling the NB7-1 lineage. For the remainder of this paper we use the fd4/fd5 double mutant to assay for loss of function phenotypes.”

      It is notable that Fd4 overexpression can rewire motor circuits. This analysis adds another dimension to the changes in transcription factor expression and, importantly, demonstrates functional consequences. Could the authors test whether U4 and U5 motor axon targeting changes in the fd4/fd5 double mutant? To strengthen claims regarding the importance of fd4/fd5 for lineage identity, it would help to address terminal features of U motorneuron identity in the LOF condition.

      Thanks for raising this important point. We examined the axon targeting on body wall muscles in both wild type and in fd4/fd5 mutant background and added the results in Figure 3-figure supplement 2. We found that the axon targeting in the late-born neuron region (LL1) is significantly reduced, suggesting that the loss of late-born neurons in fd4/fd5 mutant leads to the absence of innervation of corresponding muscle targets.

      Reviewer #3 (Public review):

      The goal of the work is to establish the linkage between the spatial transcription factors (STFs) that function transiently to establish the identities of the individual NBs and the terminal selector genes (typically homeodomain genes) that appear in the newborn postmitotic neurons. How is the identity of the NB maintained and carried forward after the spatial genes have faded away? Focusing on a single neuroblast (NB 7-1), the authors present evidence that the fork-head transcription factor, fd4, provides a bridge linking the transient spatial cues that initially specified neuroblast identity with the terminal selector genes that establish and maintain the identity of the stem cell's progeny. 

      Thanks for the positive comments!

      The study is systematic, concise, and takes full advantage of 40+ years of work on the molecular players that establish neuronal identities in the Drosophila CNS. In the embryonic VNC, fd4 is expressed only in the NB 7-1 and its lineage. They show that Fd4 appears in the NB while the latter is still expressing the Spatial Transcription Factors and continues after the expression of the latter fades out. Fd4 is maintained through the early life of the neuronal progeny but then declines as the neurons turn on their terminal selector genes. Hence, fd4 expression is compatible with it being a bridging factor between the two sets of genes. 

      Thanks for the accurate summary!

      Experimental support for the "bridging" role of Fd4 comes from a set of loss-of-function and gain-of-function manipulations. The loss of function of Fd4, and the partially redundant gene Fd5, from lineage 7-1 does not affect the size of the lineage, but terminal markers of late-born neuronal phenotypes, like Eve and Dbx, are reduced or missing. By contrast, ectopic expression of fd4, but not fd5, results in ectopic expression of the terminal markers eve and Dbx throughout diverse VNC lineages.

      Thanks for the accurate summary!

      A detailed test of fd4's expression was then carried out using lineages 7-3 and 5-6, two well-characterized lineages in Drosophila. Lineage 7-3 is much smaller than 7-1 and continues to be so when subjected to fd4 misexpression. However, under the influence of ectopic Fd4 expression, the lineage 7-3 neurons lost their expected serotonin and corazonin expression and showed Eve expression as well as motoneuron phenotypes that partially mimic the U motoneurons of lineage 7-1.

      Thanks for the positive comments!

      Ectopic expression of Fd4 also produced changes in the 5-6 lineage. Expression of apterous, a feature of lineage 5-6, was suppressed, and expression of the 7-1 marker, Eve, was evident. Dbx expression was also evident in the transformed 5-6 lineages, but extremely restricted as compared to a normal 7-1 lineage. Considering the partial redundancy of fd4 and fd5, it would have been interesting to express both genes in the 5-6 lineage. The anatomical changes that are exhibited by motoneurons in response to Fd4 expression confirm that these cells do, indeed, show a shift in their cellular identity.

      We appreciate the positive comments. We agree double misexpression of Fd4 and Fd5 might give a stronger phenotype (as the reviewer says) but the lack of this experiment does not change the conclusions that Fd4 can promote NB7-1 molecular and morphological aspects at the expense of NB5-6 molecular markers.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript addresses the important problem of the uncoupling of oxidative phosphorylation due to hypoxia-ischemia injury in the neonatal brain and provides insight into the neuroprotective mechanisms of hypothermia treatment.

      Strengths:

      The authors used a combination of in vivo imaging of awake P10 mice and experiments on isolated mitochondria to assess various key parameters of brain metabolism during hypoxia-ischemia with and without hypothermia treatment. This unique approach resulted in a comprehensive data set that provides solid evidence to support the derived conclusions.

      Weaknesses:

      Several potential weaknesses were identified in the original submission, which the authors subsequently addressed in the revised manuscript. Here is the brief list of the questions:

      (1) Is it possible that the observed relatively low baseline OEF and trends of increased OEF and CBF over several hours after the imaging start were partially due to slow recovery from anesthesia?

      (2) What was the pain management, and is there a possibility that some of the observations were influenced by the pain-reducing drugs or their absence?

      (3) Were P10 mice significantly stressed during imaging in the awake state because they didn't have head-restraint habituation training?

      (4) Considering high metabolism and blood flow in the cortex, it could be potentially challenging to predict cortical temperature based on the skull temperature, particularly in the deeper part of the cortex.

      (5) The map of estimated CMRO2 looks quite heterogeneous across the brain surface. Could this be partially resulting from the measurement artefact?

      (6) It would be beneficial to provide more detailed justification for using P10 mice in the experiments.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review)

      (1) This manuscript addresses an important problem of the uncoupling of oxidative phosphorylation due to hypoxia-ischemia injury of the neonatal brain and provides insight into the neuroprotective mechanisms of hypothermia treatment.

      The authors used a combination of in vivo imaging of awake P10 mice and experiments on isolated mitochondria to assess various key parameters of the brain metabolism during hypoxia-ischemia with and without hypothermia treatment. This unique approach resulted in a comprehensive data set that provides solid evidence for the derived conclusions.

      We thank the reviewer for the positive feedback.

      (2) The experiments were performed acutely on the same day when the surgery was performed. There is a possibility that the physiology of mice at the time of imaging was still affected by the previously applied anesthesia. This is particularly of concern since the duration of anesthesia was relatively long. Is it possible that the observed relatively low baseline OEF (~20%) and trends of increased OEF and CBF over several hours after the imaging start were partially due to slow recovery from prolonged anesthesia? The potential effects of long exposure to anesthesia before imaging experiments were not discussed.

      We thank the reviewer for this important comment and for pointing out the potential influence of anesthesia on the physiological state of the animals. We apologize for any confusion. To clarify, all PAM imaging experiments were conducted in awake animals. Isoflurane anesthesia was used only during two brief surgical procedures: (1) the installation of the head-restraint plastic head plate and (2) the right common carotid artery (CCA) ligation. Each anesthesia session lasted less than 20 minutes.

      We have revised the Methods section to provide additional details:

      For the subsection Procedures for PAM Imaging on page 17, we clarified the sequence of procedures during the head plate installation, as well as the corresponding anesthesia duration:

      “After the applied glue was solidified (~20 min), the animal was first returned to its cage for full recovery from anesthesia, and then carefully moved to the treadmill and secured to the metal arm-piece with two #4–40 screws for awake PAM imaging. The total duration of anesthesia, including preparation and glue solidification, was approximately 20 minutes.”

      For the subsection Neonatal Cerebral HI and Hypothermia Treatment on page 19, we also clarified the CCA ligation procedure:

      “Briefly, P10 mice of both sexes anesthetized with 2% isoflurane were subjected to the right CCA-ligation. To manage pain, 0.25% Bupivacaine was administered locally prior to the surgical procedures, which took less than 10 minutes. After a recovery period for one hour, awake mice were exposed to 10% O<sub>2</sub> for 40 minutes in a hypoxic chamber at 37 °C.”

      Regarding the reviewer’s concern about the observed trends in OEF and CBF, we agree that residual effects of anesthesia could, in principle, influence physiological parameters. However, we believe this is unlikely in this study for the following reasons. First, all imaging was conducted in awake animals after a clearly defined recovery period. Second, the trend of increasing OEF and CBF over time was consistent across animals and aligned with expected physiological responses following hypoxic-ischemic injury. In particular, the relatively low baseline OEF (0.21 at 37°C) is consistent with our previous study (0.25; (Cao et al., 2018)). The gradual increase in CBF and OEF reflects metabolic compensation and reperfusion following hypoxia-ischemia, as previously described (Lin and Powers, 2018). Therefore, we believe the observed changes are of physiological origin rather than anesthesia-related artifacts.

      (3) The Methods Section does not provide information about drugs administered to reduce the pain. If pain was not managed, mice could be experiencing significant pain during experiments in the awake state after the surgery. Since the imaging sessions were long (my impression based on information from the manuscript is that imaging sessions were ~4 hours long or even longer), the level of pain was also likely to change during the experiments. It was not discussed how significant and potentially evolving pain during imaging sessions could have affected the measurements (e.g., blood flow and CMRO<sub>2</sub>). If mice received pain management during experiments, then it was not discussed if there are known effects of used drugs on CBF, CMRO<sub>2</sub>, and lesion size after 24 hr.

      We thank the reviewer for this valuable comment regarding pain management. We confirm that local analgesia was administered to all animals prior to surgical procedures. Specifically, 0.25% Bupivacaine was applied locally before both the head-restraint plate installation and the CCA ligation. These details have now been clarified in the Methods section:

      For the subsection Procedures for PAM Imaging on page 16, we added:

      “To manage pain, 0.25% Bupivacaine was administered locally prior to the surgical procedures.”

      For the subsection Neonatal Cerebral HI and Hypothermia Treatment on page 18, we added:

      “To manage pain, 0.25% Bupivacaine was administered locally prior to the surgical procedures, which took less than 10 minutes.”

      To our knowledge, Bupivacaine has minimal systemic effects at the dose used and is unlikely to significantly alter CBF, CMRO<sub>2</sub>, or lesion development (Greenberg et al., 1998). No other analgesics (e.g., NSAIDs or opioids) were administered unless distress symptoms were observed—which did not occur in this study.

      Additionally, although imaging sessions were extended (up to 2 hours), animals remained calm and showed no signs of pain or distress during or after the procedures. Throughout the experimental period (up to 24 hours post-surgery), animals were monitored for signs of discomfort (e.g., abnormal activity, breathing, or weight gain), but no additional analgesia was required. The neonatal HI procedures are considered minimally invasive, and based on our protocol and prior experience, local Bupivacaine provides effective analgesia during and after the brief surgeries. We have added a corresponding note in the Discussion section (newly added subsection: Limitations in this study, the last paragraph) on page 15:

      “We observed no signs of distress or pain and did not use stress- or pain-reducing drugs during imaging. However, potential effects of stress or residual pain on CBF and CMRO<sub>2</sub> cannot be fully ruled out. Future studies could incorporate more detailed pain assessment and stress-mitigation strategies to further enhance physiological reliability.”

      (4) Animals were imaged in the awake state, but they were not previously trained for the imaging procedure with head restraint. Did animals receive any drugs to reduce stress? Our experience with well-trained young-adult as well as old mice is that they can typically endure 2 and sometimes up to 3 hours of head-restrained awake imaging with intermittent breaks for receiving the rewards before showing signs of anxiety. We do not have experience with imaging P10 mice in the awake state. Is it possible that P10 mice were significantly stressed during imaging and that their stress level changed during the imaging session? This concern about the potential effects of stress on the various measured parameters was not discussed.

      We thank the reviewer for this important comment regarding the potential effects of stress during awake imaging. The neonatal mice used in our study were P10, a stage at which animals are still physiologically immature and relatively inactive. Due to their small size and limited mobility, these animals did not struggle or show signs of distress during the imaging sessions. All animals remained calm and stable throughout the procedure, and no stress-reducing drugs were administered.

      We agree that, unlike older animals, P10 mice are not amenable to prior behavioral training. However, their underdeveloped motor activity and natural docility at this stage allowed for stable head-restrained imaging without inducing overt stress responses. Although no behavioral signs of stress were observed, we acknowledge that subtle physiological effects cannot be entirely excluded. We have added a brief discussion in the Discussion section (newly added subsection: Limitations in this study, the last paragraph) on page 15:

      “Lastly, for awake imaging, the small size of neonatal mice at P10 aids stability during awake PAM imaging, though it limits the feasibility of prior training, which is typically possible in older animals.”

      (5) The temperature of the skull was measured during the hypothermia experiment by lowering the water temperature in the water bath above the animal's head. Considering high metabolism and blood flow in the cortex, it could be challenging to predict cortical temperature based on the skull temperature, particularly in the deeper part of the cortex.

      We thank the reviewer for this helpful comment and for highlighting an important technical consideration. We acknowledge that we did not directly measure intracortical tissue temperature during the hypothermia experiments. While we recognize that relying on skull temperature may have limitations—particularly in reflecting temperature changes in deeper cortical regions—this approach is consistent with clinical practice, where intracortical temperature is typically not measured. Moreover, prior studies have shown that skull or brain surface temperature generally reflects cortical thermal dynamics to a reasonable extent under controlled conditions (Kiyatkin, 2007). We have added the following note in the Discussion section (newly added subsection: Limitations in this study, the 2<sup>nd</sup> paragraph) on page 14:

      “A technical limitation is the absence of direct intracortical temperature measurements during hypothermia; we relied on skull temperature, which may not fully capture temperature dynamics in deeper cortical layers. However, this approach aligns with clinical practice, where intracortical temperature is not typically measured. Future studies could benefit from more precise intracortical assessments.”

      (6) The map of estimated CMRO<sub>2</sub> (Fig. 4B) looks very heterogeneous across the brain surface. Is it a coincidence that the highest CMRO<sub>2</sub> is observed within the central part of the field of view? Is there previous evidence that CMRO<sub>2</sub> in these parts of the mouse cortex could vary a few folds over a 1-2 mm distance?

      We appreciate the reviewer’s insightful observation regarding the spatial heterogeneity observed in the estimated CMRO<sub>2</sub> map (Fig. 4B). This heterogeneity is not a result of scanning bias, as uniform contour scanning was performed across the entire field of view. The higher CMRO<sub>2</sub> values observed in the central region are unlikely to be artifacts and more likely reflect underlying physiological variability.

      Our CMRO<sub>2</sub> estimation is based on an algorithm we previously developed and validated in other tissues. Specifically, we have successfully applied this algorithm to assess oxygen metabolism in the mouse kidney (Sun et al., 2021) and to monitor vascular adaptation and tissue oxygen metabolism during cutaneous wound healing (Sun et al., 2022). These studies demonstrated the algorithm's capability to capture spatial variations in oxygen metabolism. Although the current application to the brain is novel, the algorithm has been validated in controlled experimental settings and shown to produce consistent results. We acknowledge that the observed range of CMRO<sub>2</sub> appears relatively broad across a 1–2 mm distance; however, such heterogeneity may arise from local differences in vascular density, metabolic demand, or tissue oxygenation — all of which can vary across cortical regions, even within small spatial scales. We have added a brief note in the Discussion (Subsection: Optical CMRO<sub>2</sub> detection in neonatal care) on page 13 to acknowledge this point:

      “Additionally, the spatial heterogeneity in estimated CMRO<sub>2</sub> observed in our data may reflect underlying physiological variability, including differences in vascular structure or metabolic demand across cortical regions. Future studies will aim to further validate and interpret these spatial patterns.”

      (7) The justification for using P10 mice in the experiments has not been well presented in the manuscript.

      We thank the reviewer for pointing out the need to clarify our choice of developmental stage. We chose P10 mice for our hypoxia-ischemia injury model because this stage is widely recognized as developmentally comparable to human term infants in terms of brain maturation. This approach has been validated by several previous studies (Clancy et al., 2007; Mallard and Vexler, 2015; Sheldon et al., 2018). We have added the following clarification to the Methods section (Subsection: Neonatal Cerebral HI and Hypothermia Treatment) on page 18:

      “P10 mice were chosen for our experiments as they are widely used to model near-term infants in humans. At this developmental stage, the brain maturation in mice closely parallels that of near-term infants, making them an appropriate model for studying neonatal brain injury and therapeutic interventions (Clancy et al., 2007; Mallard and Vexler, 2015; Sheldon et al., 2018).”

      (8) It was not discussed how the observations made in this manuscript could be affected by the potential discrepancy between the developmental stages of P10 mice and human babies regarding cellular metabolism and neurovascular coupling.

      We thank the reviewer for raising this important point regarding developmental differences between P10 mice and human infants. We have discussed this issue by adding the following statement to the Discussion section (newly added subsection: Limitations in this study, the 1<sup>st</sup> paragraph) on page 15, where we summarize the overall study design and model selection:

      “While P10 mice are widely used to model near-term human infants, developmental differences in cellular metabolism and neurovascular coupling may affect the observed outcomes and limit direct clinical translation (Clancy et al., 2007; Mallard and Vexler, 2015; Sheldon et al., 2018). Nevertheless, the P10 model remains a valuable and widely accepted tool for studying neonatal hypoxia-ischemia mechanisms and evaluating therapeutic interventions.”

      (9) Regarding the brain temperature measurements, the authors should use a new cohort of mice, implant the miniature thermocouples 1 mm, 0.5 mm, and immediately below the skull in different mice, and verify the temperature in the brain cortex under conditions applied in the experiments. The same approach could be applied to a few mice undergoing 4-hr-long hypothermia treatment in a chamber, which will provide information about the brain temperature that resulted in observed protection from the injury.

      We thank the reviewer for this helpful recommendation. We fully agree that direct intracortical temperature measurement would provide more accurate insight into thermal dynamics during hypothermia treatment. However, the primary aim of this study was not to characterize the precise intracortical temperature response under hypothermic conditions, but rather to examine the effects of hypothermia on CMRO<sub>2</sub> and mitochondrial function. Due to the substantial time and resources required to perform direct intracortical temperature monitoring—and considering the technical focus of the current work—we respectfully suggest reserving such investigations for a future study specifically focused on thermal dynamics in hypoxia-ischemia models.

      We have acknowledged this limitation in the subsection Limitations in this study of the Discussion on page 15, noting that skull temperature was used as an approximation of brain temperature and that this approach is consistent with clinical practice, where intracortical temperature is typically not measured. We also note that future studies may benefit from more precise assessments using intracortical probes.

      (10) The mean values presented in Fig. 4G are much lower than the peak values in the 2D panels and potentially were calculated as the average values over the entire field of view. Please provide more details on how CMRO<sub>2</sub> was estimated and if the validity of the measurements is expected across the entire field of view. If there are parts of the field of view where the estimation of CMRO<sub>2</sub> is more reliable for technical reasons, maybe one way to compute the mean values is to restrict the usable data to the more centralized part of the field of view.

      We thank the reviewer for this thoughtful comment. We confirm that CMRO<sub>2</sub> values shown in Figure 4G were calculated as spatial averages over the entire field of view (FOV; ~5 × 3 mm<sup>2</sup>) encompassing both hemicortices, as shown in Figure 1C. Regarding the observed CMRO<sub>2</sub> values, the apparent difference likely reflects a comparison between two different post-HI time points. Specifically, the ~0.5 value shown for the 37°C ipsilateral group in Figure 4G reflects the average CMRO<sub>2</sub> measured 24 hours after HI, while the ~1.5 value in Figure 2D (red line) corresponds to CMRO<sub>2</sub> during the early 0–2 hour post-HI period. The temporal difference accounts for the apparent discrepancy in magnitude. We understand the importance of consistency across the field of view and have clarified this point in the subsection Procedures for PAM Imaging in the Methods on page 17: “For the imaging field covering both hemicortices between the Bregma and Lambda of the neonatal mouse (5 × 3 mm<sup>2</sup> as shown in Figure 1C, with each hemicortex measuring 2.5 × 3 mm<sup>2</sup>)”, as well as in the Figure 4 legend on page 34: “Correlation of CMRO<sub>2</sub> and post-HI brain infarction in mouse neonates at 24 hours”.

      In our model and setup, CMRO<sub>2</sub> estimation is spatially robust across the FOV under standard imaging conditions. We recognize, however, that certain peripheral regions may be more prone to signal attenuation. Future refinement of region selection could further improve spatial averaging strategies. For the current study, full-FOV averaging was used consistently across all groups to maintain comparability.

      (11) Minor: Results presented in Supplementary Tables have too many significant digits.

      Thank you for the helpful suggestion. We have revised Supplementary Tables S1 and S2 to reduce the number of significant digits and improve clarity.

      Reviewer #2 (Public review)

      (1) In this study, the authors hypothesize that mitochondrial injury in HIE is caused by OXPHOS-uncoupling, which is the cause of secondary energy failure in HI. In addition, therapeutic hypothermia rescues secondary energy failure. The methodologies used are state-of-the-art and include the PAM technique in live animals, bioenergetic studies in isolated mitochondria, and others.

      The study is comprehensive and impressive. The article is well written and statistical analyses are appropriate.

      We thank the reviewer for the positive feedback.

      (2) The manuscript does not discuss the limitation of this animal model study in view of the clinical scenario of neonatal hypoxia-ischemia.

      We thank the reviewer for this valuable feedback. In response, we have added a dedicated “Limitations in this study” subsection in the Discussion, where we address the potential limitations of this animal model in the context of the clinical scenario of neonatal hypoxia-ischemia in the first paragraph on page 14, including the developmental differences between P10 mice and human infants.

      (3) I see many studies on Pubmed on bioenergetics and HI. Hence, it is unclear what is novel and what is known.

      We thank the reviewer for this important comment regarding the novelty of our study in the context of existing research on bioenergetics and hypoxia-ischemia (HI). To better clarify the novel aspects of our work, we have highlighted the relevant content in the Abstract (page 4) and Introduction (page 5). Specifically, while many studies have explored HI-related bioenergetic dysfunction, the mechanisms by which therapeutic hypothermia modulates CMRO<sub>2</sub> and mitochondrial function post-HI remain poorly understood.

      Abstract on page 4: “However, it is unclear how post-HI hypothermia helps to restore the balance, as cooling reduces CMRO<sub>2</sub>. Also, how transient HI leads to secondary energy failure (SEF) in neonatal brains remains elusive. Using photoacoustic microscopy, we examined the effects of HI on CMRO<sub>2</sub> in awake 10-day-old mice, supplemented by bioenergetic analysis of purified cortical mitochondria.”

      Introduction on page 5: “The use of awake mouse neonates avoided the confounding effects of anesthesia on CBF and CMRO<sub>2</sub> (Cao et al., 2017; Gao et al., 2017; Sciortino et al., 2021; Slupe and Kirsch, 2018). In addition, we measured the oxygen consumption rate (OCR), reactive oxygen species (ROS), and the membrane potential of mitochondria that were immediately purified from the same cortical area imaged by PAM. This dual-modal analysis enabled a direct comparison of cerebral oxygen metabolism and cortical mitochondrial respiration in the same animal. Moreover, we compared the effects of therapeutic hypothermia on oxygen metabolism and mitochondrial respiration, and correlated the extent of CMRO<sub>2</sub>-reduction with the severity of infarction at 24 hours after HI. Our results suggest that blocking HI-induced OXPHOS-uncoupling is an acute effect of hypothermia and that optical detection of CMRO<sub>2</sub> may have clinical applications in HIE.”

      In this study, we propose that uncoupled oxidative phosphorylation (OXPHOS) underlies the secondary energy failure observed after HI, and we demonstrate that hypothermia suppresses this pathological CMRO<sub>2</sub> surge, thereby protecting mitochondrial integrity and preventing injury. Additionally, our use of photoacoustic microscopy (PAM) in awake neonatal mice represents a novel, non-invasive approach to track cerebral oxygen metabolism, with potential clinical relevance for guiding hypothermia therapy.

      (4) What are the limitations of ex-vivo mitochondrial studies?

      We thank the reviewer for this insightful comment. We acknowledge that ex-vivo mitochondrial assays do not fully replicate in vivo physiological conditions, as they lack systemic factors such as blood flow, cellular interactions, and intact tissue architecture. However, these assays are well-established and widely accepted in the field for evaluating mitochondrial function under controlled conditions (Caspersen et al., 2008; Niatsetskaya et al., 2012). Despite their limitations, they enable direct comparisons of mitochondrial activity across experimental groups and provide valuable mechanistic insights that complement in vivo observations.

      (5) PAM technique limits the resolution of the image beyond 500-750 micron depth. Assessing basal ganglia may not be possible with this approach?

      We thank the reviewer for this important comment. We agree that the imaging depth of PAM is limited and may not allow assessment of deeper brain structures such as the basal ganglia. However, in our neonatal HI model—as in many clinical cases of HIE—cortical injury is typically more severe and represents a major focus for mechanistic and therapeutic investigations. The cortical regions assessed with PAM are thus highly relevant to the pathophysiology of neonatal HI. We have now acknowledged this depth limitation in the third paragraph of the newly added Limitations in this study subsection of the Discussion on page 15:

      “Another limitation of this study is the restricted imaging depth of the PAM technique, which is typically less than 1 mm and therefore does not allow assessment of deeper brain structures such as the basal ganglia. However, in both our neonatal HI model and most clinical cases of neonatal hypoxia-ischemia, cortical injury tends to be more prominent and functionally significant. As such, our cortical measurements remain highly relevant for investigating the mechanisms of injury and evaluating therapeutic interventions.”

      (6) Hypothermia in the present study reduces the brain temperature from 37 to 29-32 degrees centigrade. In the clinical setting, head temperature is reduced to 33-34.5 degrees in neonatal hypoxia-ischemia. Hence, a drop in temperature to 29 degrees is much lower than in clinical practice. How can the present study, with its greater drop in head temperature, be interpreted for understanding the pathophysiology of therapeutic hypothermia in neonatal HIE? Moreover, starting the HIE model at 37 degrees and dropping to 29 seems much different from the clinical scenario. Please discuss.

      We thank the reviewer for raising this important point regarding temperature ranges in our study. In Figure 1, we used a broader temperature range (down to 29°C) to explore the general relationship between temperature and CMRO<sub>2</sub> in uninjured neonatal mice. This experiment was not intended to model therapeutic hypothermia directly, but rather to characterize the baseline physiological responses.

      For all experiments involving hypothermia as a therapeutic intervention following HI, we consistently maintained a brain temperature of 32°C, which falls within the clinically accepted mild hypothermia range for neonatal HIE (typically 33–34.5°C). We believe this temperature closely mimics clinical practice and supports the translational relevance of our findings.

      (7) NMR was assessed ex-vivo. How does it relate to in vivo assessment? Infants admitted to the neonatal intensive care unit frequently get MRI with spectroscopy. How do the MRS findings in human newborns with HIE correlate with the ex-vivo evaluation of metabolites?

      We thank the reviewer for this insightful question. While our study assessed brain metabolites ex vivo, similar metabolic changes have been observed in vivo using proton magnetic resonance spectroscopy (¹H-MRS) in infants with HIE. Specifically, reductions in N-acetylaspartate (NAA) — a marker of neuronal integrity — have been reported in neonates with severe brain injury, aligning with our ex vivo findings. This correlation between in vivo and ex vivo assessments supports the translational relevance of our model for studying metabolic disruption in neonatal HIE. We have added this point to the subsection Using Optically Measured CMRO<sub>2</sub> to Detect Neonatal HI Brain Injury of the Results on page 8, along with a supporting clinical reference (Lally et al., 2019):

      “In addition, in vivo proton MRS in infants with HIE has also shown a reduction in NAA, particularly in cases of severe injury (Lally et al., 2019). This reduction in NAA, observed in neonatal intensive care settings, reflects neuronal and axonal loss or dysfunction and serves as a biomarker for injury severity. The alignment between our ex vivo observations and in vivo MRS findings in clinical studies reinforces the translational relevance of our model for investigating metabolic disturbances in neonatal HIE.”

      Reviewer #3 (Public review)

      (1) Sun et al. present a comprehensive study using a novel photoacoustic microscopy setup and mitochondrial analysis to investigate the impact of hypoxia-ischemia (HI) on brain metabolism and the protective role of therapeutic hypothermia. The authors elegantly demonstrate three connected findings: (1) HI initially suppresses brain metabolism, (2) subsequently triggers a metabolic surge linked to oxidative phosphorylation uncoupling and brain damage, and (3) therapeutic hypothermia mitigates HI-induced damage by blocking this surge and reducing mitochondrial stress.

      The study's design and execution are great, with a clear presentation of results and methods. Data is nicely presented, and methodological details are thorough.

      We thank the reviewer for the positive feedback.

      (2) However, a minor concern is the extensive use of abbreviations, which can hinder readability. As all the abbreviations are introduced in the text, their overuse may render the text hard to read to non-specialist audiences. Additionally, sharing the custom Matlab and other software scripts online, particularly those used for blood vessel segmentation, would be a valuable resource for the scientific community. In addition, while the study focuses on the short-term effects of HI, exploring the long-term consequences and definitively elucidating HI's impact on mitochondria would further strengthen the manuscript's impact.

      We thank the reviewer for these valuable suggestions. Please find our point-by-point responses below:

      Abbreviations: To improve readability, we have added a List of Abbreviations on page 3 to help readers, especially non-specialists, navigate the terminology more easily.

      MATLAB Code Availability: The methodology for blood vessel segmentation was described in detail in our previous publication (Sun et al., 2020). We have now updated the subsection Quantification of Cerebral Hemodynamics and Oxygen Metabolism by PAM of the Methods on page 18 to provide additional details and have indicated that the MATLAB scripts are available upon request.

      “Briefly, this process involves generating a vascular map using signal amplitude from the Hilbert transformation, selecting a region slightly larger than the vessel of interest, and applying Otsu’s thresholding method to remove background pixels. Isolated or spurious boundary fragments are then removed to improve boundary smoothness. The customized MATLAB code used for vessel segmentation is available upon request.”
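
      As a rough illustration of the thresholding step quoted above, the following is a minimal Python sketch of Otsu's method applied to a list of signal amplitudes. It is not the authors' MATLAB code (which is available only on request); the function names `otsu_threshold` and `segment`, the bin count, and the toy amplitudes are all assumptions made here, and the Hilbert transformation and the removal of spurious boundary fragments are deliberately left out.

```python
def otsu_threshold(values, bins=64):
    """Return the cut that maximises between-class variance (Otsu's method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # flat input: nothing to separate
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # clamp the maximum value
        hist[idx] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centers, hist))

    best_var, best_cut = -1.0, lo
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]                  # pixels at or below the candidate cut
        sum0 += centers[i] * hist[i]
        w1 = total - w0                # pixels above the candidate cut
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_cut = var_between, lo + (i + 1) * width
    return best_cut


def segment(amplitudes):
    """Mark amplitudes above the Otsu cut as vessel (True), the rest as background."""
    cut = otsu_threshold(amplitudes)
    return [a > cut for a in amplitudes]


# Toy example (hypothetical values): four background pixels, three vessel pixels.
toy = [0.10, 0.12, 0.09, 0.11, 0.90, 0.95, 0.88]
mask = segment(toy)
```

      In a real pipeline the input would be a 2D amplitude map cropped around the vessel of interest, and a library routine would normally replace the hand-written histogram loop.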

      Long-Term Effects of Hypothermia: We agree that exploring long-term outcomes would enhance the broader impact of this research. While our study focuses on the acute phase following HI, prior studies have shown long-term neuroprotective benefits of therapeutic hypothermia, such as enhanced white matter development (Koo et al., 2017). We have added this point to the fourth paragraph in the subsection Limitations in this study of the Discussion on page 15:

      “While our study focuses on the acute effects of hypothermia, previous research has shown long-term neuroprotective benefits, including improved white matter development post-injury (Koo et al., 2017). These findings highlight hypothermia's potential for both immediate and extended recovery, warranting further study of long-term outcomes.”

      (3) Extensive use of abbreviations.

      Thank you for the helpful suggestion. To improve readability for a broader audience, we have added a List of Abbreviations on page 3 of the manuscript to assist readers in navigating terminology used throughout the text. This has been included as Response #2 to Reviewer #3.

      (4) Share code used to conduct the study.

      Thank you for the suggestion. The methodology for vessel segmentation was previously published (Sun et al., 2020), and we have noted in the subsection Quantification of Cerebral Hemodynamics and Oxygen Metabolism by PAM of the Methods on page 18 that the MATLAB code is available upon request. This has also been included as Response #2 to Reviewer #3.

      Reference:

      Cao R, Li J, Kharel Y, Zhang C, Morris E, Santos WL, Lynch KR, Zuo Z, Hu S. 2018. Photoacoustic microscopy reveals the hemodynamic basis of sphingosine 1-phosphate-induced neuroprotection against ischemic stroke. Theranostics 8:6111–6120. doi:10.7150/thno.29435

      Caspersen CS, Sosunov A, Utkina-Sosunova I, Ratner VI, Starkov AA, Ten VS. 2008. An Isolation Method for Assessment of Brain Mitochondria Function in Neonatal Mice with Hypoxic-Ischemic Brain Injury. Developmental Neuroscience 30:319–324. doi:10.1159/000121416

      Clancy B, Kersh B, Hyde J, Darlington RB, Anand KJS, Finlay BL. 2007. Web-based method for translating neurodevelopment from laboratory species to humans. Neuroinformatics 5:79–94. doi:10.1385/ni:5:1:79

      Greenberg RS, Zahurak M, Belden C, Tunkel DE. 1998. Assessment of oropharyngeal distance in children using magnetic resonance imaging. Anesth Analg 87:1048–1051. doi:10.1097/00000539-199811000-00014

      Kiyatkin EA. 2007. Brain temperature fluctuations during physiological and pathological conditions. Eur J Appl Physiol 101:3–17. doi:10.1007/s00421-007-0450-7

      Koo E, Sheldon RA, Lee BS, Vexler ZS, Ferriero DM. 2017. Effects of therapeutic hypothermia on white matter injury from murine neonatal hypoxia-ischemia. Pediatr Res 82:518–526. doi:10.1038/pr.2017.75

      Lally PJ, Montaldo P, Oliveira V, Soe A, Swamy R, Bassett P, Mendoza J, Atreja G, Kariholu U, Pattnayak S, Sashikumar P, Harizaj H, Mitchell M, Ganesh V, Harigopal S, Dixon J, English P, Clarke P, Muthukumar P, Satodia P, Wayte S, Abernethy LJ, Yajamanyam K, Bainbridge A, Price D, Huertas A, Sharp DJ, Kalra V, Chawla S, Shankaran S, Thayyil S, MARBLE consortium. 2019. Magnetic resonance spectroscopy assessment of brain injury after moderate hypothermia in neonatal encephalopathy: a prospective multicentre cohort study. Lancet Neurol 18:35–45. doi:10.1016/S1474-4422(18)30325-9

      Lin W, Powers WJ. 2018. Oxygen metabolism in acute ischemic stroke. J Cereb Blood Flow Metab 38:1481–1499. doi:10.1177/0271678X17722095

      Mallard C, Vexler Z. 2015. Modeling ischemia in the immature brain: how translational are animal models? Stroke 46:3006–3011. doi:10.1161/STROKEAHA.115.007776

      Niatsetskaya ZV, Sosunov SA, Matsiukevich D, Utkina-Sosunova IV, Ratner VI, Starkov AA, Ten VS. 2012. The Oxygen Free Radicals Originating from Mitochondrial Complex I Contribute to Oxidative Brain Injury Following Hypoxia–Ischemia in Neonatal Mice. J Neurosci 32:3235–3244. doi:10.1523/JNEUROSCI.6303-11.2012

      Sheldon RA, Windsor C, Ferriero DM. 2018. Strain-Related Differences in Mouse Neonatal Hypoxia-Ischemia. Dev Neurosci 40:490–496. doi:10.1159/000495880

      Sun N, Bruce AC, Ning B, Cao R, Wang Y, Zhong F, Peirce SM, Hu S. 2022. Photoacoustic microscopy of vascular adaptation and tissue oxygen metabolism during cutaneous wound healing. Biomed Opt Express 13:2695–2706. doi:10.1364/BOE.456198

      Sun N, Ning B, Bruce AC, Cao R, Seaman SA, Wang T, Fritsche-Danielson R, Carlsson LG, Peirce SM, Hu S. 2020. In vivo imaging of hemodynamic redistribution and arteriogenesis across microvascular network. Microcirculation 27:e12598. doi:10.1111/micc.12598

      Sun N, Zheng S, Rosin DL, Poudel N, Yao J, Perry HM, Cao R, Okusa MD, Hu S. 2021. Development of a photoacoustic microscopy technique to assess peritubular capillary function and oxygen metabolism in the mouse kidney. Kidney International 100:613–620. doi:10.1016/j.kint.2021.06.018

    1. Reviewer #3 (Public review):

      Summary:

      This study investigates two main questions:

      (1) whether brain activity recorded during immersive virtual reality can differentiate facial expressions and stereoscopic depth, and

      (2) whether depth cues modulate facial information processing.

      The results show that both expression and depth information can be decoded from multivariate EEG recorded in a head-mounted VR setup. However, the results show that the decoding performance of facial expressions does not benefit from depth information.

      Strengths:

      The study is technically strong and well executed. EEG data are of high quality despite the challenges of recording inside a head-mounted VR system. The work effectively combines stereoscopic stimulus presentation, eye-tracking to monitor gaze behavior, and time-resolved multivariate decoding techniques. Together, these elements provide an exemplary demonstration of how to collect and analyze high-quality EEG data in immersive VR environments.

      Weaknesses:

      The major limitation concerns the theoretical question about how stereoscopic depth modulates facial expression processing. While previous work has suggested that stereoscopic depth cues can shape natural face perception and emphasize the importance of binocular information in recognizing facial expressions (lines 95-97), the present study reports a null effect of depth. However, the stimulus configuration they used likely constrained the ability to detect any depth-related effects. All facial stimuli were static, frontal, and presented at a fixed distance. This design leads to near-ceiling behavioral performance and no behavioral effect of depth on expression recognition. It makes the null modulation of depth on expression processing unsurprising and limits the theoretical reach of the study. If the authors aim to advance a strong theoretical claim about the role of binocular disparity, they should add more subtle or naturalistic features (such as various viewing angles and dynamic expressions) to the stimulus set. Alternatively, they could reframe the work as a technical validation of EEG decoding in this context.

      Another issue relates to the claim that eye movements cannot explain the EEG decoding results. It is a real challenge to remove eye-movement-related artifacts and confounds, as the VR setup tends to encourage viewers to explore the environment freely. However, nearly half of the eye-tracking datasets were lost (usable in only 17 of 33 participants), which substantially weakens the evidence for EEG-gaze dissociation. Moreover, it would be almost impossible to decode facial information from only two-dimensional gaze direction, given that with 60 EEG channels, the decoding accuracy was modest (AUC ≈ 0.60). These two factors together limited the strength of the reported null correlation between neural and eye-data decoding.

      The decoding analysis appears to use all 60 EEG channels as input features. I wonder why the authors did not examine using more spatially specific channel subsets. Facial expression and depth cues are known to preferentially engage occipito-temporal regions (e.g., N170-related sites), yet the current approach treats all sensors equally. Including all the channels may add noise and irrelevant signals to facial information decoding. Besides, using a subset of spatial-specific channels would align more directly with the subsequent source reconstruction.

    1. eLife Assessment

      The authors ask whether a simple whole-head spectral power analysis of human magnetoencephalography data recorded at rest in a large cohort of adults shows robust effects of age, and their results provide compelling evidence that it does. The relative simplicity of the analysis is a major strength of the paper, and the authors are careful to control for many different confounds - although perhaps highly correlated factors like brain anatomy still pose a slight issue. The paper provides a valuable power analysis framework that should inform researchers across the broader neuroimaging community.

    2. Reviewer #1 (Public review):

      Summary:

      This is a careful, well-powered treatment of age effects in resting-state MEG. Rather than extracting (say) complex connectivity measures, the authors look at the 'simplest possible thing': changes in the overall power spectrum across age.

      Strengths:

      They find significant age-related changes at different frequency bands: broadly, attenuation at low-frequency (alpha) and increased beta. These patterns are identified in a large dataset (CamCAN) and then verified in other public data.

      Weaknesses:

      Some secondary interpretations (what is "unique" to age vs global anatomy) may go beyond what the statistics strictly warrant in the current form, but these can be tightened with (I think, fairly quick) additions already foreshadowed by the authors' own analyses.

      Aims:

      The authors set out to replace piecemeal, band-by-band ageing claims with t-maps, and Cohen's f2 over sensors×frequency ("GLM-Spectrum").

      On CamCAN, six spatio-spectral peaks survive relatively strict statistical controls. The larger effects are in low-frequency and upper-alpha/beta ranges (f2 approx 0.2-0.3), while lower-alpha and gamma reach significance but with small practical impact (f2 < 0.075). A nice finding is that the same qualitative profile appears in three additional independent datasets.
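
      For readers unfamiliar with the effect-size measure quoted above, here is a minimal sketch of the textbook definition of Cohen's f², assuming the standard formula f² = (R²_full − R²_reduced) / (1 − R²_full). The function names are chosen here for illustration, and the review does not state whether the reported values are simple or partial effects, so the conversion to R² below is only indicative.

```python
def cohens_f2(r2_full, r2_reduced=0.0):
    """Cohen's f^2 for the predictors added between a reduced and a full model.

    With r2_reduced = 0 this is the simple (non-partial) effect size
    R^2 / (1 - R^2).
    """
    if not 0.0 <= r2_reduced <= r2_full < 1.0:
        raise ValueError("need 0 <= r2_reduced <= r2_full < 1")
    return (r2_full - r2_reduced) / (1.0 - r2_full)


def r2_from_f2(f2):
    """Invert the simple form: the R^2 that corresponds to a given f^2."""
    if f2 < 0.0:
        raise ValueError("f2 must be non-negative")
    return f2 / (1.0 + f2)


# Under the simple form, f^2 of 0.2 to 0.3 corresponds to roughly 17-23% of
# variance explained, and f^2 below 0.075 to less than about 7%.
explained = [r2_from_f2(f2) for f2 in (0.2, 0.3, 0.075)]
```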

      Two analyses are especially interesting. First, the authors show a difference between absolute and relative spectral magnitude (basically, within-subject normalization). Relative scaling sharpens the spectral specificity of the spatial maps, while absolute magnitude is dominated by a broad spatial mode that correlates positively across frequencies, likely reflecting head-position/field-spread factors. The replication of the main age profile is robust to preprocessing decisions (e.g., SSS movement compensation choices) - the bigger determinant of the effect is whether they apply sensor normalization (relative vs absolute).

      Second, lots of brain-related things might be related to age, and the authors spend some time trying to back out confounds/covariates. This section is handled transparently (in general, I found the writing style very clear throughout) - they examine single covariates (sex, BP, GGMV, etc.) and compare simple vs partial age effects. For example, aging is correlated with reductions in global grey-matter volume (GGMV), but it would be nice to find a measure that is independent of this: controlling for GGMV (via a linear model) reduces age-related effect sizes heterogeneously across space/frequency but does not eliminate them, a nuance the authors treat carefully.

      This is a nice paper, and I have only a few concrete suggestions:

      (1) High-gamma:

      There can be a lot of EMG / eye movement contamination (I know these were RS eyes closed data, but still..) above 30-40 Hz, and these effects are the weakest anyway. Could you add an analysis (e.g., ICA/label-based muscle component removal) and show the gamma band's sensitivity to that step? Or just note this point more clearly?

      (2) GGMV confound control:

      Controlling for GGMV reduces, but does not eliminate, age effects. I have a few questions about this: a) Could we see the residuals as a function of age? I wonder if there are non-linear effects or something else that the regression is not accounting for. Also, b) GGMV and age are highly collinear - is this an issue? Can regression really split them apart robustly? I think by some cunning orthogonalisation, you can compute the effect of age independent of GGMV. I don't think this is the same as the effect 'adjusted' for GGMV (which is what is shown here if I'm reading it correctly). Finally, of course, GGMV might actually be the thing you want to look at (because it might more accurately reflect clinical issues) - so strong correlations are not really a problem: I think really the focus might even be on using MEG to predict GGMV and controlling for age.

    3. Reviewer #2 (Public review):

      This paper describes the application of the "GLM-Spectrum" mass univariate approach to examine the effects of age on M/EEG power spectra. Its strengths include promotion of the unbiased approach, suitable for future meta/mega-analyses, and the provision of effect sizes for powering future studies. These are useful contributions to the literature. What is perhaps lacking is a discussion of the limitations of this approach, in comparison to other methods.

      An analogy is the mass univariate approach to spatial localisation of effects in fMRI/PET images. This approach is unbiased by prior assumptions about the organisation of the brain, but potentially also less sensitive, by ignoring that prior knowledge. For example, a voxelwise univariate approach is less sensitive to detecting effects in functionally homogeneous brain regions, where SNR can be increased by averaging over voxels. In the context of power spectra, the authors' approach deliberately ignores knowledge about the dominant frequency bands/oscillations in human power spectra. This is in contrast to approaches like FOOOF and IRASA, which explicitly parametrise frequency components. I am not saying these methods are better; I just think that the authors should acknowledge that these approaches have advantages over their mass univariate approach (in sensitivity and interpretation; see below). I guess it is a type of bias-sensitivity trade-off: the authors want to avoid bias, but they should acknowledge the corresponding loss of sensitivity, as well as loss of interpretation compared to model-based approaches (i.e., models that parameterise frequency; I don't mean the statistical models for each frequency separately).

      An example of the interpretational loss can be seen in the authors' observation of opposite-signed effects of age around the alpha peak. While the authors acknowledge that this pattern can arise from a reduction in alpha frequency with age, this is an indirect inference, and a direct (and likely much more sensitive) approach would be to parametrise and estimate the peak alpha frequency directly for each participant, as done with FOOOF for example (possibly with group priors, as in Medrano et al, 2025, EJN). The authors emphasise the nonlinear effects of age in Figure 2A, but their approach cannot test this directly (e.g., in terms of plotting effects of age on frequency, magnitude, and width for each participant), so for me, this figure illustrates a weakness of their approach, not a strength.

      Then I think the section "Two dissociable and opposite effects in the alpha range" in the Discussion section is confusing, because if there is a single reduction in alpha peak frequency and magnitude with age, then there is only one "effect", not "two dissociable" ones. If the authors do want to claim that there are two dissociable age effects within the alpha range, then they need to do a statistical test, e.g., that the topographies of low and high alpha are significantly different. This then reveals another limitation of the mass univariate approach - that space (channel) is not parametrised either - so one cannot test for significant channel x effect interactions within this framework, as necessary to really claim a dissociation (e.g., in underlying neural generators).

      While the authors show that normalisation of each person's power spectra by the sum across frequencies helps improve some statistics, they might want to say more about disadvantages of this approach, e.g., loss of sensitivity to any effects (e.g., of age) that are broadly distributed across the majority of frequencies, loss of real SI units (absolute effect sizes) (as well as problems if normalisation were used for techniques like FOOOF, where the 1/f exponent would be affected).
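
      The first disadvantage named above can be seen in a few lines. The sketch below, with made-up spectra and a hypothetical helper `relative_spectrum`, divides each spectrum by its sum across frequencies and shows that a purely broadband difference between two participants vanishes after normalisation; it is an illustration of the general point, not the authors' pipeline.

```python
def relative_spectrum(power):
    """Divide each frequency bin by the total power across all bins."""
    total = sum(power)
    if total <= 0:
        raise ValueError("total power must be positive")
    return [p / total for p in power]


# Hypothetical absolute spectra (arbitrary units) over four frequency bins.
younger = [4.0, 2.0, 1.0, 1.0]
# Same spectral shape, but half the power at every frequency (a broadband effect).
older = [0.5 * p for p in younger]

abs_diff = [a - b for a, b in zip(younger, older)]      # non-zero everywhere
rel_diff = [a - b for a, b in zip(relative_spectrum(younger),
                                  relative_spectrum(older))]  # all zeros
```

      The absolute spectra differ in every bin, whereas the relative spectra are identical, so any group effect of this kind would be invisible to an analysis run only on normalised data.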

      The authors should give more information on how artifactual ICs were defined. This may be important for cardiac artefacts, since Schmidt et al (2024, eLife) have pointed out how "standard" ICA thresholds can fail to remove all cardiac effects. This is very important for the effects of age, given that age affects cardiac dynamics (even though the focus of Schmidt et al is the 1/f exponent, could residual cardiac effects cause artifactual age effects in current results, even above ~1Hz?).

      The authors should clarify the precise maxfilter arguments, and explain what "reference" was used for the "trans" option - e.g., did the authors consider transforming the data to match a sphere at the centre of the helmet, which might not only remove some of the global power differences due to different head positions, but also be best for generalisation of the effect sizes they report to future studies (assuming the centre of the helmet is the most likely location on average)? And on that matter, did head positions actually differ by age at all?

    1. a message to[…]terms, the writer aims to get information into the r[…]lem, of course, is that filling a reader's head with inform[…]simple as filling a glass with water. Readers "process" linguistic input into a conceptual code that must be int[…]ion already stored

      Kind of like a computer. The informational perspective relies on being clear and coherent for ease of audience absorption.

    1. Heat is the term used to describe the energy transfer into or out of an object due to temperature differences between that object and its surroundings. For example, holding a hot cup of tea will warm your hand as the heat transfers from the cup to your hand. Heat flows from a region of higher temperature to a region of lower temperature, so if you hold an ice cube, the heat flows from your hand to the ice cube. Heat flow stops when temperatures are equal.

      Heat is the word we use to describe energy transfer; heat moves from hot to cold.

  7. read.amazon.com read.amazon.com
    1. Walker later joined GlaxoSmithKline, the pharmaceutical and biotechnology company, as head of scientific computing. The first thing he did was build a data center cluster by using thousands of GeForce gaming cards that cost only about $800 apiece. This caught the attention of Nvidia’s vice president of health care, Kimberly Powell, who called Walker and said, “You’re at GSK now. You need to be buying our enterprise products.” “No,” countered Walker. “I should be doing what’s best for my employer. That’s my job.”

    1. In recent times, decade of the 40s of the twentieth century, another of the pioneers of Astrobiology was the Soviet astronomer Gavriil Adrianovich Tikhov, who laid the foundations of an incipient "Astrobotany". Tikhov studied the albedo formations of Mars, speculating that the origin of chromatic and brightness changes on the Martian surface were caused by seasonal cycles of falling leaves in forests populated by deciduous trees [7], (see Fig. 1). Figure 1. Albedo formations of Mars during the great opposition of Mars in 2003. (Source: Rafael Balaguer Rosa, Astrogirona, Astronomical Society of Girona). 3. Astrobiology in ancestral societies. But these conceptions are very modern. Perhaps the idea that life thrives in the entire universe, and that maybe the inhabitants of Earth are sons of an extraterrestrial life, are rooted in our deepest psyche from the very beginning of our species, Homo sapiens, (and maybe other human species, too), more than 200,000 years ago. This idea is based on the fact that many ancestral cultures, different and located throughout the planet, have interpreted that our human origins, and the very origin of life on Earth, is actually of extraterrestrial origin. This certainty is born of the shamanic experience of the altered states of consciousness, where the subjective experiences (and then shared and collective) suggest the real existence of spiritual or higher beings, who descend from the sky, from space.

      Soviet astronomer Gavriil Tikhov speculated about life on Mars on the basis of albedo changes. He became one of the first pioneers of astrobiology and astrobotany, and was appointed head of astrobotany in Alma-Ata to investigate life on planets in the Solar System. He was also an astronomer at the Pulkovo Observatory from 1906 until 1941.

    1. In Rust, the tooling can answer a lot more questions for me. What type is cookie_token? A simple hover in any code editor with an LSP tells me, definitively, that it’s Option<String>.

      It's common enough to wonder about function behavior on null, undefined, etc., so consider "an LSP" (read: editor plugin) that could synthesize these annotations and insert the appropriate disclosure/disclaimer at the head of the function like this.

    1. 530-B HARKLE ROAD, STE 100, Santa Fe, NM, 87505, USA

      Meshtastic LLC is registered in Santa Fe, New Mexico, USA. That information does not appear on the Meshtastic website.

      The head office address given is that of a registered agent, though: New Mexico Registered Agent by High Desert Corporate Filings LLC.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      (1) How is this simplified model representative of what is observed biologically? A bump model does not naturally produce oscillations. How would the dynamics of a rhythm generator interact with this simplistic model?

      Bump models naturally produce sequential activity, and can be engineered to repeat this sequential activity periodically (Zhang, 1996; Samsonovich and McNaughton, 1997; Murray and Escola, 2017). This is the basis for the oscillatory behavior in the model presented here. As we describe in our paper, such a model is consistent with numerous neurobiological observations about cell-type-specific connectivity patterns. The reviewer is, however, correct to point out that our model does not incorporate other key neurobiological features--in particular, intracellular dynamical properties--that have been shown to play important roles in rhythm generation. Our aim in this work is to establish a circuit-level mechanism for rhythm generation, complementary to classical models that rely on intracellular dynamics for rhythm generation. Whether and how these mechanisms work together is something that we plan to explore in future work, and we have added a sentence to the Discussion to this effect.

      (2) Would this theoretical construct survive being expressed in a biophysical model? It seems that it should, but even a simple biological model with the basic patterns of connectivity shown here would greatly increase confidence in the biological plausibility of the theory.

      We thank the reviewer for pointing out this way to strengthen our paper. We implemented the connectivity developed in the rate models in a spiking neuron model that used EI-balanced Poisson noise as input drive. We found that we could reproduce all the main results of our analysis. In particular, with a realistic number of neurons, we observed swimming activity characterized by (i) left-right alternation, (ii) rostral-caudal propagation, and (iii) variable speed control with constant phase lag. The spiking model demonstrates that the connectivity-motif-based mechanisms for rhythmogenesis that we propose are robust in a biophysical setting.

      We included these results in the updated manuscript in a new Results subsection titled “Robustness in a biophysical model.”

      (3) How stable is this model in its output patterns? Is it robust to noise? Does noise, in fact, smooth out the abrupt transitions in frequency in the middle range?

      The newly added spiking model implementation of the network demonstrates that the core mechanisms of our models are robust to noise, since the connectivity is randomly chosen and the input drive is Poisson noise.

      To test the effect of noise as it is parametrically varied, we also added noise directly to the rate models in the form of white noise input to each unit. Namely, the rate model was adapted to obey the stochastic differential equation

      \[

      \tau_i \frac{dr_i(t)}{dt} = -r_i(t) + \left[ \sum_j W_{ij} r_j(t - \Delta_{ij}) + D_i + \sigma\xi_t \right]_+

      \]

      Here $\xi_t$ is standard Gaussian white noise and $\sigma$ sets the strength of the noise. We found that the swimming patterns were robust at all frequencies up to $\sigma = 0.05$. Above this level, coherent oscillations started to break down for some swim frequencies. To investigate whether the noise smoothed out abrupt transitions, we swept through different values of noise and modularity of excitatory connections. The results showed very minor improvement in controllability (see figure below), but this was not significant enough to include in the manuscript.

      Author response image 1.
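
      As a rough illustration of the noisy rate equation in response (3), the following is a minimal Python sketch of a forward-Euler integration. It is not the authors' code: the function name `simulate_rate_model`, the integer delay matrix `delay_steps` (the delays $\Delta_{ij}$ expressed in time steps), the zero initial history, and the choice to draw one independent Gaussian sample per unit per time step (without rescaling by the step size) are all assumptions made for this example.

```python
import numpy as np

def simulate_rate_model(W, D, tau, delay_steps, sigma, dt=0.1, n_steps=2000, seed=0):
    """Forward-Euler sketch of
        tau_i dr_i/dt = -r_i + [sum_j W_ij r_j(t - Delta_ij) + D_i + sigma * xi]_+
    W: (n, n) weights; D: (n,) tonic drive; tau: (n,) time constants;
    delay_steps: (n, n) integer delays in units of dt (hypothetical discretization).
    Returns an (n_steps, n) array of rates.
    """
    rng = np.random.default_rng(seed)
    n = len(D)
    max_delay = int(delay_steps.max())
    # Rate history; activity is assumed to be zero before the simulation starts.
    r = np.zeros((n_steps + max_delay + 1, n))
    cols = np.arange(n)
    for t in range(max_delay, n_steps + max_delay):
        # delayed[i, j] = r_j(t - Delta_ij)
        delayed = r[t - delay_steps, cols]
        recurrent = (W * delayed).sum(axis=1)
        # Assumption: one independent standard normal sample per unit per step.
        xi = rng.standard_normal(n)
        drive = np.maximum(recurrent + D + sigma * xi, 0.0)  # rectification [.]_+
        r[t + 1] = r[t] + dt / tau * (-r[t] + drive)
    return r[max_delay + 1:]
```

      With `W` set to zero and `sigma=0.0`, each unit simply relaxes toward its rectified tonic drive, which gives a quick sanity check before adding connectivity or noise.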

      (4) All figure captions are inadequate. They should have enough information for the reader to understand the figure and the point that was meant to be conveyed. For example, Figure 1 does not explain what the red dot is, what is black, what is white, or what the gradations of gray are. Or even if this is a representative connectivity of one node, or if this shows all the connections? The authors should not leave the reader guessing.

      All figure captions have been updated to enhance clarity and address these concerns.

      Reviewer #2 (Public review):

      (1) Figure 1A, if I interpret Figure 1B correctly, should there not be long descending projections as well that don't seem to be illustrated?

      Thank you for highlighting this potential point of confusion. The diagram in question was only intended to be a rough schematic of the types of connections present in the model. We have added descending connections as requested.

      (2) Page 5, It would be good to define what is meant by slow and fast here, as this definition changes with age in zebrafish (what developmental age)?

      We have updated the manuscript to include the sentence: “These values were chosen to coincide with observed ranges from larval zebrafish.” with an appropriate citation.

      Reviewer #3 (Public review):

      (1) The authors describe a single unit as a neuron, be it excitatory or inhibitory, and the output of the simulation is the firing rate of these neurons. Experimentally and in other modeling studies, motor neurons are incorporated in the model, and the output of the network is based on motor neuron firing rate, not the interneurons themselves. Why did the authors choose to build the model this way?

      We chose to leave out the motor neurons from our models for a few reasons. While motor neurons read out the rhythmic activity generated by the interneurons and may provide some feedback, they are not required for rhythmogenesis. In fact, interneuron activity (especially in the excitatory V2a neurons (Agha et al., 2024)) is highly correlated with the ventral root bursts within the same segment. This suggests that motor neurons are primarily a local readout of the rhythmic activity of interneurons; therefore, the rhythmic swimming activity can be deduced directly from the interneurons themselves.

      Moreover, there is a lack of experimental observation of the connectivity between all the cell types considered in our model and motor neurons. Hence, it was unclear how we should include them in the model. To address this, we are currently developing a data-driven approach that will determine the proper connectivity between the motor neurons and the interneurons, including intrasegmental connections.

      (2) In the single population model (Figure 1), the authors use ipsilateral inhibitory connections that are long-range in an ascending direction. Experimentally, these connections have been shown to be local, while long-range ipsilateral connections have been shown to be descending. What were the reasons the authors chose this connectivity? Do the authors think local ascending inhibitions contribute to rostrocaudal propagation, and how?

      The long-range ascending ipsilateral inhibitory connections arise from a limitation of our modeling framework. The V1 neurons that provide these connections have been shown experimentally to fire later than other neurons (especially descending V2a neurons) within the same hemisegment (Jay et al., J Neurosci, 2023); however, our model can only produce synchronized local activity. Hence, we replace local phase offsets with spatial offsets to produce correctly structured recurrent phasic inputs. We are currently investigating a data-driven method for determining intrasegmental connectivity, which should be able to produce the local phase offset and address this concern; however, this is beyond the scope of the current paper.

      (3) In the two-population model, the authors show independent control of frequency and rhythm, as has been reported experimentally. However, in these previous experimental studies, frequency and amplitude are regulated by different neurons, suggesting different networks dedicated to frequency and amplitude control. However, in the current model, the same population with the same connections can contribute to frequency or amplitude depending on relative tonic drive. Can the authors please address these differences either by changes in the model or by adding to the Discussion?

      Our prior experimental results that suggested a separation of frequency and amplitude control circuits focused on motor neuron recruitment rather than interneuron activity (Jay et al., J Neurosci 2023; Menelaou and McLean, Nat Commun 2019). To avoid potential confusion about amplitudes of interneurons vs. of motor neurons, we have removed the results from Figure 3 about control of amplitude in the 2-population model, instead focusing this figure on the control of frequency via speed-module recruitment. For the same reason, we have removed the panel showing the effects of targeted ablations on interneuron amplitudes in Figure 7. We have kept the result about amplitude control in our Supplemental Figure S2 for the 8-population model, but we try to make it clear in the text that any relationship between interneuron amplitude and motor neuron amplitude would depend on how motor neurons are modeled, which we do not pursue in this work.

      (4) It would be helpful to add a paragraph in the Discussion on how these results could be applicable to other model systems beyond zebrafish. Cell intrinsic rhythmogenesis is a popular concept in the field, and these results show an interesting and novel alternative. It would help to know if there is any experimental evidence suggesting such network-based propagation in other systems, invertebrates, or vertebrates.

      We have expanded a paragraph in the Discussion to address these questions. In particular, we highlight how a recent study of mouse locomotor circuits produced a model with similar key features (Komi et al., 2024). These authors made direct use of experimentally determined connectivity structure and cell-type distributions, which informed a model that produced purely network-based rhythmogenesis. We also point out that inhibition-dominated connectivity has been used for understanding oscillatory behavior in neural circuits outside the context of motor control (Zhang, 1996; Samsonovich and McNaughton, 1997; Murray and Escola, 2017). Finally, we address a study that used the cell-type-specific connectivity within the C. elegans locomotor circuit as the architecture for an artificial motor control system and found that the resulting system could learn motor control tasks more efficiently than general machine learning architectures (Bhattasali et al., 2022). Like our model, the Komi et al. and Bhattasali et al. models generate rhythm via structured connectivity motifs rather than via intracellular dynamical properties, suggesting that these may be a key mechanism underlying locomotion across species.

      Reviewer #1 (Recommendations for the authors):

      (1) Express this modeling construct in a simple biophysical model.

      See the new Results subsection titled “Robustness in a biophysical model.”

      (2) Please cite the classic models of Kopell, Ermentrout, Williams, Sigvardt etc., especially where you say "classic models".

      We have added relevant citations including the mentioned authors.

      (3) "Rhythmogenesis remain incompletely understood" changed to "Rhythmogenesis remains incompletely understood".

      We chose not to make this change since the ‘remain’ refers to the plural ‘core mechanisms’ not the singular ‘rhythmogenesis’.

      Reviewer #3 (Recommendations for the authors):

      (1) The figures are well made; however, it would help to add more details to the figure legends. For example, what neuron's firing rate is shown in Figure 1C? What is the red dot in 1B? Figures 3E,F,G: what is being plotted? Mean and SD? Blue dot in Figure 5C?

      All figure captions have been updated to enhance clarity and address these concerns.

      (2) A, B text missing in Figure 7.

      We have revised this figure and its caption; please see our response to Comment 3 above.

      (3) It would be nice to see the tonic drive pattern that is fed to the model for each case, along with the different firing rates in the figures. It would help understand how the tonic drive is changed to rhythmic activity.

      The tonic drive in the rate models is implemented as a constant excitatory input that is uniform across all units within the same speed-population. There is no patterning in time or location to this drive.
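
      To make the description of the tonic drive concrete, here is a minimal Python sketch of how such a drive vector might be assembled. The function name `tonic_drive`, the population labels, and the drive values in the usage note are hypothetical and are not taken from the manuscript.

```python
import numpy as np

def tonic_drive(population_labels, drive_per_population):
    """Return one constant drive value per unit.

    Every unit in the same speed-population receives the same value, and the
    vector does not depend on time or on the unit's position along the body.
    """
    return np.array(
        [drive_per_population[label] for label in population_labels],
        dtype=float,
    )
```

      For example, `tonic_drive(["slow", "slow", "fast"], {"slow": 1.0, "fast": 0.2})` assigns the same drive to both slow units and a different constant to the fast unit; any rhythm in the output then has to come from the network itself, not from the input.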

      References

      (1) Moneeza A Agha, Sandeep Kishore, and David L McLean. Cell-type-specific origins of locomotor rhythmicity at different speeds in larval zebrafish. eLife, July 2024.

      (2) Nikhil Bhattasali, Anthony M Zador, and Tatiana Engel. Neural circuit architectural priors for embodied control. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, editors, Advances in Neural Information Processing Systems, volume 35, pages 12744–12759. Curran Associates, Inc., 2022.

      (3) Salif Komi, August Winther, Grace A. Houser, Roar Jakob Sørensen, Silas Dalum Larsen, Madelaine C. Adamssom Bonfils, Guanghui Li, and Rune W. Berg. Spatial and network principles behind neural generation of locomotion. bioRxiv, 2024.

      (4) James M Murray and G Sean Escola. Learning multiple variable-speed sequences in striatum via cortical tutoring. eLife, 6:e26084, May 2017.

      (5) Alexei Samsonovich and Bruce L McNaughton. Path integration and cognitive mapping in a continuous attractor neural network model. Journal of Neuroscience, 17(15):5900–5920, 1997.

      (6) K Zhang. Representation of spatial orientation by the intrinsic dynamics of the head-direction cell ensemble: a theory. Journal of Neuroscience, 16(6):2112–2126, 1996.

    1. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors investigate the mechanisms underlying the virulence of OMVs using a Drosophila model. They reveal a complex interplay between host defenses and OMV pathogenicity. Although the study enhances our understanding of Drosophila innate immunity, additional evidence is needed to strengthen the conclusions.

      Strengths:

      (1) In Figure 1, Toll pathway mutants infected with OMVs displayed three distinct phenotypic outcomes: mildly enhanced resistance to OMV infection, a response similar to that of the control, or increased susceptibility. Therefore, in addition to Imd and Kenny mutants from the Imd pathway, further mutants, such as Relish and PGRP-LC, should be examined to assess whether the Imd pathway is involved in host defense against OMVs.

      (2) Plasmatocytes clear particles via phagocytosis or endocytosis. However, flies lacking all hemocytes showed increased resistance to OMV challenge, raising the question of whether hemocytes actually aid the pathogen. To explore this hypothesis, the uptake of fluorescently tagged OMVs should be examined.

      (3) Hayan cleaves PPO into active PO. However, Hayan and PPO mutants exhibit opposite phenotypes upon OMV injection, raising the question of whether OMV-induced pathogenesis is linked to melanization.

      (4) Puckered mRNA levels were used as a read-out for JNK pathway activity. A transient induction of the JNK pathway was observed in head and thorax tissues. It would be beneficial if the authors could directly examine JNK activation in neuronal cells using immunostaining for pJNK.

      (5) In Figure 4B, the kayak was knocked down using the pan-neuronal driver elav-Gal4. To confirm the specificity and validity of this observation, the experiment should be repeated using another neural-specific driver.

      Weaknesses:

      It is unclear how many Serratia marcescens cells a 69 nL injection of 0.1 ng/nL OMVs corresponds to.

    1. Cybertruck driver had likely self-inflicted gunshot wound to the head,

      This headline is different because it immediately states the driver's self-inflicted gunshot wound and does not mention the explosion.